Big data exploration and analytics for CIOs: Oh, the places you'll go
Sastry Chilukuri, a partner at New York-based McKinsey & Company, had an assignment for an audience of technologists: List the five most important IT tools they used in their personal and professional lives. Turns out, the tools they depended on in their personal lives included the darlings of the app economy, such as Skype. But in their professional lives? "It was still Excel, PowerPoint" and so on, he said, ticking off the goodies but oldies of the workplace.
Chilukuri shared this story at the seminar "The next generation of big data," hosted by the Massachusetts Technology Leadership Council (MassTLC). Members of the panel Chilukuri moderated weren't surprised by the audience's attachment to the same old office apps. And it's easy to understand why when you hear people like panelist Jon Pilkington, vice president of products for the Chelmsford, Mass.-based data visualization software provider Datawatch Corp., describe how difficult it is to integrate data across the enterprise. It's a struggle for businesses to get a unified view of the data across large Cognos or SAP implementations, let alone the entire enterprise, he said. The often-used solution? You guessed it. "Excel," he said.
But the corporate love affair with Excel can't last forever. As systems grow, data stores multiply and corporate data strategies mature, businesses will need more sophisticated ways to integrate data, Pilkington said. That's especially the case if they're hoping to rack up big data analytics wins.
Younger companies have proven to be more adept at combining different data streams than established enterprises, said panelist Marilyn Matz, co-founder and CEO of the Waltham, Mass.-based database startup Paradigm4 Inc. -- and not just because they bypass the headache brought on by transitioning legacy systems to the big data economy. They also are unencumbered by traditional approaches to business intelligence and analytics. "When you go with some of these new [big data] technologies," she said, "it's not rip and replace; it's a new paradigm."
Why the talent shortage?
As businesses move into analytics, statistics becomes essential, Chilukuri said, but there is a dearth of talent, and the problem goes way back. "There's a lack of statistics in high school," he said. Students may take one or two classes at best (though that's starting to change), but that isn't enough to instill an understanding of what correlation means. "Training is going to help bridge that gap."
As a business gets deeper into big data and advanced analytics, however, statistics alone won't solve all of its talent issues, Chilukuri added. It will need data scientists, to be sure, but also "data translators." These are practitioners who can cut across IT, analytics and the business, and they will go a long way in helping "create business value."
What is the price of data?
"Each person has data they're willing to give up and data they're not willing to give up," said panelist Bob Zurek, senior vice president of products for Wakefield, Mass.-based marketing company Epsilon. So what happens when people preside over their own data and have the power to share what they want with whom they want? Are businesses willing to pay for that kind of information -- not in loyalty card discounts or coupons, but in financial compensation more akin to a royalty? "We're starting to see conversations within the entrepreneurial community about this," Zurek said.
Here's a challenge for CIOs: Figure out how to streamline the process for sharing ideas. That was the thinking behind Epsilon's Spark program. At Epsilon, an idea could hit a wall or get lost in transit too easily "because everyone's so busy," Zurek said. "So we stepped back and said we need to create an avenue and a process [for this]." With Spark, Epsilon employees formally submit ideas that are then vetted for funding consideration. Zurek, for one, would like the program to eventually be extended. "I'd like to open it up to all of our customers as well," he said.
Cutting through the big data fog
Big data technologies now saturate the market, which can make it difficult for CIOs to figure out what's needed and what to invest in. On top of that, "vendors don't do a good job of explaining what a tool does and doesn't do," said Paradigm4's Matz. To help cut through the noise, CIOs should "focus on the use cases," she said.
Taking data's temperature
Datawatch's Pilkington is seeing more "hot, warm and cold uses of data." The hottest data involves real-time or performance monitoring analytics, where he said in-memory technology is playing a big role. "Warm" data is shelved in relational data stores. And "cold" data? "Businesses are using Hadoop for archiving and storing large data sets," he said, allowing enterprises to call upon years and years of historical data when looking for things such as behavior over time.
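For readers who want to see the hot, warm and cold idea in concrete terms, here is a minimal Python sketch. It is an illustration only and not something Pilkington or Datawatch described: it assumes records are sorted into tiers purely by age, and the cutoffs (one hour and 90 days), the function name `storage_tier` and the sample timestamps are all hypothetical. In practice, the tier depends on how the data is used, not just on how old it is.

```python
from datetime import datetime, timedelta

# Hypothetical cutoffs chosen for illustration; the article gives no numbers.
HOT_WINDOW = timedelta(hours=1)    # recent enough for real-time monitoring
WARM_WINDOW = timedelta(days=90)   # still queried regularly


def storage_tier(event_time, now):
    """Return 'hot', 'warm' or 'cold' based only on the record's age."""
    age = now - event_time
    if age <= HOT_WINDOW:
        return "hot"    # candidate for in-memory, real-time analytics
    if age <= WARM_WINDOW:
        return "warm"   # candidate for a relational data store
    return "cold"       # candidate for a Hadoop-style archive


# Arbitrary reference time so the example is repeatable.
now = datetime(2014, 1, 1, 12, 0)
samples = {
    "five minutes old": now - timedelta(minutes=5),
    "thirty days old": now - timedelta(days=30),
    "three years old": now - timedelta(days=3 * 365),
}
for label, event_time in samples.items():
    print(label, "->", storage_tier(event_time, now))
```

With these assumed cutoffs, the five-minute-old record lands in the hot tier, the 30-day-old record in the warm tier and the three-year-old record in the cold tier.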
Crowdsourcing for the crowdsourced?
Yelp Inc., a platform for crowdsourced business reviews, recently announced its latest Yelp Dataset Challenge for students. The company is opening up data from the Greater Phoenix area that includes more than 300,000 reviews from more than 70,000 users. Students who can come up with "insightful, unique and compelling" data models could win a $5,000 prize -- and more, if their research gets published.