Those attending Gartner Inc.'s Master Data Management (MDM) Summit last month got a surprise on opening day: an appearance by Nate Silver, the analytics expert and New York Times columnist who correctly predicted the 2012 presidential election. Silver was the highly anticipated feature of the last leg of Gartner's Business Intelligence (BI) Summit, the companion show to MDM. The Stamford, Conn.-based consultancy has been hosting the two shows back-to-back for years, but this may have been the first time something like Silver's keynote acted as an unofficial bridge.
More surprising than his appearance, perhaps, was Nate Silver's message to the more than 1,400 business and IT professionals: Whatever you do, don't give in to the big data hype. Silver called his new book, The Signal and the Noise: Why So Many Predictions Fail but Some Don't, the "turd in the big data punchbowl." He pointed out some of the challenges of analyzing big data and cautioned businesses against turning analytics over to the machines.
Interested in hearing more?
Listen to the second part of the series, which covers highlights from the BI and MDM summits, on SearchDataManagement.com.
In this podcast conference recap, the first of a two-part series, SearchCIO's senior news writer Nicole Laskowski and SearchDataManagement reporter Jack Vaughan delve into Silver's concerns about the big data obsession and why it's easy (and dangerous) to swallow the hype. In addition, the podcast explores how analytics and MDM are joined at the hip: Although the message might sound like a broken record these days, proper data management, in general, never goes out of style, and serves as the foundation for trustworthy analytics on data of all sizes.
Read the full transcript from this podcast below:
Scott Peterson: Hi, this is Scott Peterson, editorial director of the business application and architecture group at TechTarget, and I'm here with Nicole Laskowski, news writer with SearchCIO, and Jack Vaughan, news writer for SearchDataManagement. What we have here is representatives from two media groups here at TechTarget. Jack covers the data management part of big data, and Nicole covers analytics and business intelligence from a strategic point of view. They were both at the Gartner conferences recently, one in Grapevine, Texas, on business intelligence and another one on master data management, and we thought we would get them together for this podcast to compare notes and discover what really happened there and what was of note for our readers and listeners. A question for both of you to start out: the combination of business intelligence and master data management, is there a particular reason why Gartner paired these two shows together, and what were the advantages for the attendees there? Nicole?
Nicole Laskowski: I think with analytics and data management in general, you can't have one without the other; they're just very related. You can't have the analytics without good data management, good governance. And I think what's nice about pairing the conferences together is that IT and business professionals who are interested in both topics can kind of hit the analytics conference, or the BI conference, and the data conference at the same time. They don't have to go to separate conferences for that, so they can cover two areas with one trip.
Jack Vaughan: Yeah, one of the truest phrases ever spoken is "garbage in, garbage out," so what the analytics people dice has got to be honest, genuine, truthful at some point. So that is the connection. Certainly the analytics is the sweet spot in a lot of people's minds these days because there seems to be opportunity in web recall and big data often.
Scott Peterson: And in the hierarchy of things, the data management part has to come before the analytics part, am I correct?
Jack Vaughan: Yeah, that's fair to say, although with a lot of the big data stuff people are doing that's getting so much attention, it's not necessarily consistent. The data is somewhat experimental, and as it comes into the enterprise over a period of years, that will change. Of course, the part of the event that I covered very much is the master data management, and that too, even within the data management world, is sort of a rare thing and is somewhat limited to the biggest, biggest companies, companies that have 40 ERP systems. And it's almost an endless journey to try to master that data.
Scott Peterson: What is master data management, and in what cases is it really supposed to be applied?
Jack Vaughan: Well, again, to me, it seems it's for the big companies. The classic case, when you tried to integrate things years ago, is you'd have the meeting to decide what a customer was, and it usually would end with a fistfight and you would forget about that project. And they kept saying it's a journey, it's going to cost, you need buy-in from the business; the Gartner people and others apologized because we've all heard that. Again, you are only as good as your information, and trying to have one view of the truth has been an age-old quest. MDM is a discipline that tries to ensure data quality and consistent use of data across a large organization.
Scott Peterson: Nicole, if that's the case, if it's not necessarily for everybody or everybody maybe isn't even able to do it, what does that say about the data they are bringing into their analytic systems?
Nicole Laskowski: I think that MDM is something that Gartner would argue is for every business. But yes, I think that for larger businesses it's more important, because you have different silos within the business that are working with data, and in order to maintain that consistency that Jack talked about, you need to have a sort of overarching understanding, on a bigger, more strategic level, about how you're talking about things. I think data quality across the board, no matter the size of the company, is absolutely important. I don't think there is a company that would say it isn't.
Scott Peterson: You had the pleasure of hearing one of the great luminaries in the space at the conference, Nate Silver, the New York Times blogger who became famous for correctly picking the 2012 election. What did he have to say? Why was he there and what did he have to say about all of this?
Nicole Laskowski: Yeah, it was a highly anticipated keynote. It was definitely, from my perspective, the busiest session there. His session actually overlapped with the MDM portion of the conference, so you had twice as many attendees possibly attending his talk. And he was there promoting his book, The Signal and the Noise; that was part of the reason, I think, for the appearance. But I thought what he said was really interesting in that he kind of threw out a caution to businesses that are interested in getting into this big data space, and it's basically to proceed with a bit of caution.
One of the things he talked about was that there could be a tendency to kind of rely more on machine learning when you are talking about big data, when you are talking about volume, variety, and velocity in order to sort of deal with that, kind of crunch through it. And he said that machine learning, machines are good for some things but not for everything, one of the things he said was that they are very good at repeatable tasks, they are very good at even finding correlations between the data, but they are not good at finding causation. So it's good to know what happened but if you don't know why or how something happened, then you're limited in how you are looking at a problem.
One of the examples he gave, and it was a great example, was the 1990s chess match between IBM's Deep Blue and Garry Kasparov. In one of those matches, Deep Blue made this move that Garry Kasparov did not get. He thought about it and he just came to the conclusion that the machine was better than the human at playing chess.
What Nate Silver did is, he went back and he talked to Deep Blue's programmer, and what it turns out happened is that this was a timed chess match, and in a timed chess match, if you time out before making a move, you automatically lose the game. So Deep Blue was programmed in cases like that to make a legal random move, and that's what it did. So Deep Blue made a legal random move and that's what Kasparov saw; it was kind of a bug rather than the machine being smarter than the human. And Jack, I know you saw Nate Silver speak as well. I don't know if you had any takeaways from his talk.
Jack Vaughan: Well, I enjoyed your story on SearchCIO and think you nailed it there. I went to see what this fellow had to say about big data from the point of view that it's terribly hyped, and I wondered whether he would disappoint. He did not disappoint me, because he addressed that throughout. He said that you need humility and need to understand uncertainty when you look at results, and that really the human still does play a very important role. And some concern with big data is that, besides being overhyped, it could be dangerous if we cede the power to the machines, so to speak. He said there is an obsession with big data, and he said his recent book was like a turd in the punchbowl, so that was not disappointing to me.
Nicole Laskowski: And that's what he opened with, just quite an opening remark.
Jack Vaughan: And then coming with the...
Scott Peterson: Oh wait, hold on, what exactly is the turd in the punchbowl?
Jack Vaughan: That he is raining on the parade, that he's saying that you have to think, that you can't think that this is a magic bullet, that you really have to put on your thinking cap, use your noggin, when you look at the data.
Scott Peterson: So even with the best, let's say, data analytics project, the data, the results, the answers we can get out of it still require a certain amount of human interpretation?
Jack Vaughan: Yeah, particularly as the volume becomes larger, the chances for making mistakes grow larger, I believe exponentially rather than linearly. He did say, much like we were talking about Watson, that you always have to go through trial and error; you should err and realize that you will. He did have nice things to say about the weather bureau; he said they are getting much better at more localized, more accurate, more on-time predictions. But predictions are difficult, we are still humans, and he even said we still have a caveman brain of sorts. By that he meant that theoretically, in evolution, the humans that survived were good at hearing animals in the woods and reacting as if they were a danger to them. And again, he said that our ability to see patterns is actually overaggressive at times and that, again, was a caution or a turd.
Nicole Laskowski: And one of the things he talked about was building models that kind of chug through this data, and you have to continuously refine and refine and refine those models, and that requires human interaction.
Scott Peterson: That's a good place to pause; we're going to end part one of this discussion here. Part one will reside on SearchCIO, and part two, coming up, will reside on SearchDataManagement. For Nicole Laskowski and Jack Vaughan, this is Scott Peterson, editorial director of the business applications group at TechTarget. Thanks for listening.