Big data tutorial: Everything you need to know
The buzz around Tesla Motors Inc.'s plan to bring an affordable electric car to market by 2017 is still growing. But within IT circles, CIOs are more likely talking about how Elon Musk, CEO and product architect at Tesla, directed his CIO to do the unorthodox: build an enterprise resource planning system rather than upgrade its software from SAP.
It's a move that's making waves -- and not just in reference to ERP investments. At the MIT Sloan CIO Symposium, panelists applied broader meaning to Tesla's decision -- as well as Facebook's plans to build its own CRM software. They described the unconventional choice as evidence of the market's failure to meet expectations for customized solutions and as a potential signal that the pendulum is swinging away from stack vendors to best-of-breed solutions and beyond, particularly as big data is increasingly factored into business strategy. But the key takeaway for CIOs was perhaps best articulated by Tom Davenport, professor of information technology and management at Babson College in Wellesley, Massachusetts.
"I don't know if Tesla's system is going to compete with SAP or Oracle. That strikes me as sort of nuts, frankly, to say we have to build all of that stuff. But building something that supports a strategic, differentiated aspect of your business has always made sense," he said during the Symposium session he moderated on big data. "Presumably that's still where we are."
As CIOs architect for big data, they're likely to bump up against a common and longstanding IT dilemma: To build or buy? Today, big data infrastructure bottlenecks can be specific and ill-suited for the one-size-fits-all solutions that have dominated the market for years. The better fit may come from technology alternatives such as in-memory or NoSQL databases, cloud, open source or, as is the case for Facebook and Tesla, a custom build. But first, CIOs will have to parse through the ambiguity of the term "big data" itself, juxtaposing what has become a catch-all marketing phrase with the technical pain points the business faces. And in the end, they're likely to make surgical rather than sweeping technology investments.
First, what is big data?
The term big data has been around for years, but its meaning is still muddied. "Big data is an umbrella marketing term that is used to comprise most of what's interesting these days," said Curt Monash, analyst and founder of Monash Research in Acton, Massachusetts.
Worse, as Monash explained in a 2011 blog post titled "Big data has jumped the shark," the language used to define big data (including Gartner's well-known "three Vs") is often misleading, creating market confusion. As Monash put it, to imagine there's one technology stack that can "handle most of what's new in IT is pretty silly."
A blanket use of the term might be expedient for the vendors doing the selling, but it runs the risk of becoming a headache for the potential builder or buyer, according to Darrell Fernandes, CIO of the professional services group at Fidelity Investments. The label "can hurt us at times," he said during a big data panel at the MIT Symposium. That's especially true if the connection between technology investment and business outcome is fuzzy. Unlike the CRM trend of the 1990s, when CRM technology could be tied to an explicit business value, Fernandes said, big data defined broadly produces nothing except generalized anxiety on both the business and the IT side.
Fernandes isn't alone in describing what some experts call the "big data backlash." Forrester Research Inc., for example, deliberately avoids defining big data as a technology or even as a technology problem, because those terms can cause technologists to "miss the broader trend," according to the recent Forrester report, Reset on Big Data. Instead, big data is "closing the gap between the amount of information that's available and your ability to use that information for business insight," Brian Hopkins, an analyst for the Cambridge, Massachusetts-based consultancy and one of the report's authors, said. "It is a journey."
And the journey requires knowing why you're making it in the first place. Forrester advises that before building or buying anything, business and IT leaders drill down into the use cases. Sometimes these use cases will require a technology investment, Hopkins said. "But sometimes your business has to overcome important cultural issues with how they make decisions based on data," he said. Big data assumes that the business model is data-driven. Becoming a more data-driven organization requires a certain level of competency, and CIOs would do well to understand where the business is before making any kind of investment, Hopkins said.
A single vendor likely won't cut it
When a big data technology investment is necessary, Hopkins cautioned CIOs to avoid thinking in name-brand terms (such as Hadoop, a broadly used synonym for big data these days), but instead focus on functionality. He suggested looking for "flexible and affordable ways to capture, store and analyze data," the functions Hadoop indeed provides, he said.
Most likely, a technology stack that can take on all of the modern-day data problems a business encounters won't come from a single vendor. "There are no nice, neat packages you can acquire," Hopkins said. And buying from a single vendor is pricey, according to Monash. "It's an extremely expensive directive," he said. "Not just in terms of money, but also in terms of administrative costs and so on."
That includes offerings from mega vendors such as IBM, Microsoft and Oracle, which have "a complete stack," Hopkins said. "But I don't think it works as nicely as they think, and neither should the buyer." Opting for the open source route, however, comes with its own set of difficulties, Hopkins said. Vendors such as Cloudera, MapR and Hortonworks "are making choices about which versions of the open source project to support." And, in some cases, those same vendors are building new, proprietary features that the open source community then makes redundant. Customers will have to think in terms of "technical building blocks," Hopkins said. "What building blocks do you need to go after business opportunities to close the gap?"
Monash has his own rule of thumb: "Figure out what your toughest technical requirements are," he said. "By toughest, I mean the parts that the fewest vendors can satisfy. And those will be closely related to what your business problems are."
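One way to picture Monash's rule of thumb is as a simple ranking exercise: list each technical requirement alongside the vendors that can meet it, then sort so the requirements with the fewest capable vendors come first. The Python sketch below is only an illustration of that idea. The requirement descriptions, the vendor names and the rank_by_scarcity helper are all hypothetical and do not come from Monash or from any real product evaluation.

```python
# Illustrative sketch only: requirement and vendor names are made-up placeholders.

def rank_by_scarcity(coverage):
    """Order requirements from fewest to most vendors able to satisfy them.

    coverage maps a requirement description to the set of vendors that meet it.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    return sorted(coverage, key=lambda req: (len(coverage[req]), req))


# Hypothetical shortlist: which vendors satisfy which requirement.
coverage = {
    "sub-second queries on streaming data": {"vendor_a"},
    "low-cost storage at petabyte scale": {"vendor_a", "vendor_b", "vendor_c"},
    "SQL access for business analysts": {"vendor_b", "vendor_c"},
}

# The "toughest" requirements, in Monash's sense, are printed first.
for requirement in rank_by_scarcity(coverage):
    print(len(coverage[requirement]), "vendor(s):", requirement)
```

In this made-up example the streaming requirement rises to the top because only one vendor can meet it, which is the kind of requirement Monash suggests will track most closely with the underlying business problem.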
The focus on business problems or business outcomes cannot be overstated. As Fernandes explained, his organization has worked hard to make the concept of big data more concrete. "What we've seen over [the] last several years is a real focus on specific hypothesis, specific use case to get away from big data and get really specific on the outcome," he said. "That's how we've gotten more traction in solving some of these things."
In the second part of this SearchCIO two-part series, get an up-close look at how technology leaders decide when to build, when to buy -- and when to do neither.