Forget the three V's. According to Michael Alton, applied data scientist at Intel Corp., the telling big data characteristic is complexity. "Data is not getting longer; it's getting wider," he said at the recent Useful Business Analytics Summit in Boston.
True, businesses are collecting more data than ever before, but much of that collection translates into more data points and observations about the same things, Alton said. So the real challenge (and where complexity comes into play) is figuring out how all of these data points and observations -- of products, people, geographies -- fit together.
As Alton put it, "relationships really are the big data problem." And not just for data scientists, but for CIOs as well. They need to build a big data infrastructure on the back end so data scientists and analysts can easily interact with technology and explore how these data microcosms fit together on the front end.
That's a difficult feat during the best of times, but it's complicated even more by how quickly tools and technologies are evolving, Alton said. Continually swapping out back-end technologies, such as Hadoop, for the next best thing (Apache Spark is generating quite the buzz these days) can become a hardship "that slows down your innovation time and data scientists, who become burdened by tools rather than empowered by them," Alton said.
Intel's solution? Application programming interfaces (APIs). The tech giant is looking to use APIs to connect what it sees as a big data must-have -- distributed computing systems on the back end -- with data science tools, such as the programming languages Python and R, on the front end. Alton said the APIs will make upgrading to the next new thing a little more seamless.
"This is the paradigm of how we think we can bridge the gap between data scientist skills and distributed computing skills," he said.
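The idea of putting an API between analyst code and the back end can be illustrated with a minimal Python sketch. This is not Intel's API: the names `ComputeBackend`, `LocalBackend`, `PartitionedBackend` and `revenue_by_region` are hypothetical, and the "distributed" back end here is only simulated in a single process. The point it shows is that analysis code written against an interface does not change when the engine behind it is swapped.

```python
from collections import defaultdict
from typing import Dict, Iterable, Protocol, Tuple

Row = Tuple[str, float]


class ComputeBackend(Protocol):
    """Hypothetical contract that analyst-facing code depends on."""

    def sum_by_key(self, rows: Iterable[Row]) -> Dict[str, float]:
        ...


class LocalBackend:
    """Single-process stand-in for the current back end."""

    def sum_by_key(self, rows: Iterable[Row]) -> Dict[str, float]:
        totals: Dict[str, float] = defaultdict(float)
        for key, value in rows:
            totals[key] += value
        return dict(totals)


class PartitionedBackend:
    """Simulated distributed engine (an assumption for illustration):
    splits rows into partitions, aggregates each one, then merges."""

    def __init__(self, partitions: int = 2) -> None:
        self.partitions = partitions

    def sum_by_key(self, rows: Iterable[Row]) -> Dict[str, float]:
        data = list(rows)
        # Round-robin split into partitions.
        chunks = [data[i::self.partitions] for i in range(self.partitions)]
        merged: Dict[str, float] = defaultdict(float)
        for chunk in chunks:
            partial = LocalBackend().sum_by_key(chunk)
            for key, value in partial.items():
                merged[key] += value
        return dict(merged)


def revenue_by_region(backend: ComputeBackend, sales: Iterable[Row]) -> Dict[str, float]:
    """Analyst code: written once against the interface, not a specific engine."""
    return backend.sum_by_key(sales)


sales = [("east", 10.0), ("west", 5.0), ("east", 2.5)]
old_result = revenue_by_region(LocalBackend(), sales)
new_result = revenue_by_region(PartitionedBackend(), sales)
```

Swapping `LocalBackend()` for `PartitionedBackend()` is the only change needed at the call site, which is the kind of "more seamless" upgrade Alton describes.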
A fix in 90 days or less
Amy Gaskins is part of an elite team at MetLife Inc. When the business needs help, typically with data science or analytics, Gaskins is one of a handful of people brought in to fix the problem -- stat! "Our mandate is 90 days or less, which requires us to win fast or fail fast," she said at the summit.
Gaskins, assistant vice president for the insurance company's data analysis SWAT team, said the policy builds trust with the business, which isn't always easy when talking to a 145-year-old company with a risk-averse culture. "We have to really sell that proposition -- that we can deliver value better than some vendors because we understand you, we can walk you through your business problem, we know the data," she said.
Her team, which she described as "multifunctional," also has to be agile, experimental and "intensely curious," to get the job done in 90 days or less. They rely on prototyping and experimentation -- "even if it means we have to call reinforcements in" -- to deliver on that mandate.
"You can talk round and round," she said, "but if you don't deliver, the trust is gone."
Mark Burgess of Equifax Inc. doesn't call himself a data scientist. "I don't do enough data science study to warrant that," he said at the summit. Nor does he call himself an architect, a programmer or an engineer. Instead, his title is more unusual than that -- informaticist. Sure, the field of informatics has been around for years, but rarely is the title applied to IT outside of the medical profession. Yet, like a nurse informaticist who acts as a liaison between IT systems and clinicians, Burgess fills the gap between "data supply and data consumer."
As a software developer, he builds databases for an independent group of statisticians and analysts as well as toolkits of models and attributes for the "software consultancy" on the business side of the house.
"My job is feeding data scientists," he said.
Thinking outside the industry vertical
Searching for ways to innovate? Take a cue from Thomas Speidel, trend specialist at Suncor Energy Inc. He's breaking down industry silos by looking beyond the oil and gas vertical for inspiration.
One example? The former Alberta Cancer Board employee observed how medical practitioners are using current technology and how these applications might work in the oil and gas industry.
"I just read Google Glass is being tested at [hospitals]," he said at the summit, so emergency room physicians can quickly access data -- vitals in real time, scans, X-rays -- and shrink the decision-making window. "What if we could do some of the same thing for our personnel in the field?" he said. Arming oil and gas employees with the right real-time -- and even predictive -- information on the health of an oil pump could mean huge cost savings.
"There's a song by Sting: 'Every breath you take, every move you make, every bond you break, every step you take, I'll be watching you.' I used to think that was a love song, but he was talking about the Internet and big data and analytics." -- Alfred Essa, vice president, analytics and R&D, McGraw-Hill Education
"If you're trying to build an analytics function from scratch, be willing to accept individuals from any background and leverage the strengths they bring to supplement your team." -- Debra Osborn, senior director, reporting and analytics, Aetna Inc.
"This gets down to whether you have a Fitbit or an ankle monitor. Both collect a lot of data; both are wearables. So what's the big difference between the two? The difference is control." -- Amy Gaskins, assistant vice president, data analysis, SWAT, MetLife Inc.
"A decision is an irrevocable allocation of resources. Once you make it, you can't go back. So you really need to make quality decisions." -- Stephen Sharpe, director of global strategic analytics, Johnson & Johnson Inc.
"Don't entirely rely on data. Data will lie to you. You [also] need subject matter expertise." -- Thomas Speidel, trend specialist, Suncor Energy Inc.