about big data becoming the next big management revolution. In part one of this SearchCIO Q&A, Brynjolfsson laid out the argument for big data's potential to fundamentally change how businesses operate and why that data-driven approach might make some CEOs very uneasy.
Here, he outlines the common mistakes companies make when they try to capitalize on big data -- he calls the offenders "pseudo data-driven companies" -- and what keeps him up at night about our race to keep up with big data management (think security and privacy).
In talking to data scientists, we've been told that aside from the many technical challenges of processing big data, there's the business danger of sifting that data to reinforce opinions. So, instead of using big data to identify blind spots, companies can actually use it for the reverse -- to cherry-pick the data to reinforce a preconceived idea.
Erik Brynjolfsson: We definitely see that. We call them 'pseudo data-driven companies.' They pretend to be data-driven, but really they do it kind of backwards. First they make the decision, and then they go out and gather the data to support that decision. We've all seen that -- an awful lot of PowerPoint presentations are done that way, with the executive saying, 'Bring me some bar charts that make the case.' That's backwards. That gives big data a bad name. You have to be genuinely open-minded. The three words that may be the most important for an executive to learn to say may be 'I don't know' when confronted with a question: 'I don't know, I'm not the expert, let's go get the data and find out what the answer is' -- and then to go with what the data says.
You do have to have respect for the facts, for reality.
Brynjolfsson: It's a very different mind-set. It's one where the key is to learn how to be wrong quickly, and learn from that: Go out and test and experiment and always be updating; every hypothesis is a tentative hypothesis, subject to a potential falsification by the data.
Are there other ways big data can be misused?
Brynjolfsson: Yes, another concern is that you just get such an overwhelming amount of data that you see correlations everywhere; there are patterns in almost any place you look -- and you have to prioritize.
I can give people a guideline on how to be luckier in this, and that is, don't start with the data and see where it leads you; start with the problem you have and see if the data exists to answer the question. We've found it to be far more effective to start with the business questions -- the places where you need to add value -- because the data probably exists to help you with that. But if you start with data, you can go down all sorts of wild goose chases because there is just a ton of fun and interesting relationships, but you need to focus on the ones that actually relate to some business goal.
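The remark about seeing correlations everywhere can be illustrated with a minimal sketch. It is not from the interview: the column count, sample size, threshold and seed are arbitrary assumptions, and the "data" is pure random noise with no real relationships in it. The sketch counts how many pairs of unrelated columns still look strongly correlated.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def count_strong_pairs(n_columns=40, n_rows=20, threshold=0.5, seed=0):
    """Generate independent noise columns and count pairs with |r| > threshold.

    All parameter values are illustrative assumptions, not from the article.
    Returns (number of "strong" pairs, total number of pairs examined).
    """
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n_rows)]
               for _ in range(n_columns)]
    pairs = list(combinations(range(n_columns), 2))
    strong = sum(
        1 for i, j in pairs
        if abs(pearson(columns[i], columns[j])) > threshold
    )
    return strong, len(pairs)


strong, total = count_strong_pairs()
print(f"{strong} of {total} pairs of unrelated columns have |r| > 0.5")
```

Because every column is independent noise, any pair that clears the threshold is a coincidence. The more columns you scan without a question in mind, the more such coincidences turn up, which is the case for starting from the business problem.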
We've talked a little bit about what some CEOs might find unsettling about this kind of data-driven approach. Is there anything you find unsettling about this tsunami of data and how it is going to affect businesses?
Brynjolfsson: Absolutely. There are three. The first is the question of causality. Correlation is not causation, so finding these patterns in the data is not sufficient. Managing that is important, and training people is important. People need to understand that just because A and B are correlated doesn't mean that A causes B or even that B causes A -- a third factor could be driving both.
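The third-factor case can be sketched in a few lines. This is a hedged illustration, not something from the interview: the variables `a`, `b` and `z`, the noise level, the sample size and the seed are all assumptions. Here `z` drives both `a` and `b`, and neither `a` nor `b` influences the other.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=2000, seed=1):
    """Hypothetical data: z is a hidden common cause of both a and b."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    a = [zi + 0.5 * rng.gauss(0, 1) for zi in z]  # a depends only on z
    b = [zi + 0.5 * rng.gauss(0, 1) for zi in z]  # b depends only on z
    return a, b, z


def partial_corr(a, b, z):
    """Correlation of a and b after controlling for z (first-order partial)."""
    r_ab = pearson(a, b)
    r_az = pearson(a, z)
    r_bz = pearson(b, z)
    return (r_ab - r_az * r_bz) / sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))


a, b, z = simulate()
print(f"corr(a, b)              = {pearson(a, b):.2f}")
print(f"corr(a, b) controlling z = {partial_corr(a, b, z):.2f}")
```

In this setup the raw correlation between `a` and `b` is strong, but it largely disappears once `z` is controlled for. In real data the third factor is often unmeasured, which is why testing and experimentation matter.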
The other two things that concern me are closely related, call them 2A and 2B, which are privacy and security. We are in a brave new world where you can just know things that you never could before. It's remarkable and for some people, scary, how by analyzing somebody's digital stream -- what they post on Facebook, Twitter, their mobile phones -- you can know an enormous amount about that person that you could not before. Even stuff in the public records -- your mortgage, an arrest record -- which used to be obscure, now, in principle, is available with a couple of clicks. If you have Google Glasses, you have that information in real time as you meet people. That's kind of a weird new world, where we can't rely on the laws of physics to control what you see and don't see anymore. Now those barriers are gone, so we are going to have to rely on other things -- either the laws of Congress or our own ethics and norms. That has not been worked out.
Here is the thing I wonder about: If you have a generation coming of age that doesn't have the same concerns about privacy and information sharing, what sort of impact does that have on the whole economic structure? You're in the business school. When you have a generation that shares everything, that's a type of communism in itself: How does that ultimately impact business?
Brynjolfsson: That's interesting. I hadn't heard that comparison. It is a revolution that is ongoing. We're going to have to come to some norms. I think the technology is evolving so rapidly that you can make different kinds of mistakes. You can legislate too late or you can legislate too early. I'm probably more worried about having us legislating too early and having to undo things. I think we want to keep an eye on it and put more energy into these questions.
But it's evolving very rapidly. I'm an academic, so one of the first things I realize is that we have the academic framework to think through what the pluses and minuses are -- you certainly don't want to block all kinds of information-sharing; most information-sharing is beneficial. So the real art -- and hopefully someday a science -- will be designing mechanisms that encourage the beneficial information-sharing, like having Amazon recommend me a good book, but discourage the bad kind of information-sharing, like having insurance markets collapse because everyone's diseases are known in advance and insurance can't work in that situation. In economics, it is well-known that certain kinds of information actually destroy value and others create value. We may be at the beginnings of a framework for that.
You can hear Prof. Brynjolfsson moderate a panel on the big data management revolution at the upcoming MIT Sloan CIO Symposium on May 22 at the MIT campus in Cambridge, Mass.
Let us know what you think about the story; email Linda Tucci, Executive Editor.