Is this the beginning of the end for the vaunted data scientist? That's the clever pitch from Tableau Software, one of a handful of business intelligence vendors pushing the envelope on self-service BI tools.
Because the software's user interface offers drag-and-drop features, even users without a strong math background can build visualizations and interrogate the data, the vendor promises. Tableau isn't the only one with a strong self-service play. More and more vendors are offering packaged analytical applications that mask the complexity of back-end analytics behind easy-to-use features, which raises the question: Are data scientists just experiencing their 15 minutes of fame?
Dan Sommer, a Gartner analyst, doesn't think so. At the Gartner BI and Analytics Summit, he argued that while access to self-service tools may put analytics into the hands of just about every employee, it won't eliminate the role of the data scientist altogether. "You don't give a Ferrari to someone who just got their driver's license," Sommer said.
Or, perhaps more to the point, you don't give a bunch of raw materials to just anybody and say, "Build a Ferrari." That's not a job for the average mechanic; nor is sniffing out never-before-seen relationships from a company's diverse data sources a job for data dabblers.
There are just too many hard-to-detect data traps, and even data smarties like Carmen Reinhart, professor of international finance at Harvard Kennedy School, and Kenneth Rogoff, professor of public policy and economics at Harvard University, fall victim to them. They co-authored Growth in a Time of Debt, a study of the relationship between government debt and economic growth. The paper argues that as countries take on significant debt, their economic growth slows.
When Thomas Herndon, a University of Massachusetts Amherst graduate student in economics, tried to duplicate the findings, however, he "basically found the biggest spreadsheet error in the history of mankind," Sommer said. Turns out, some of the conclusions in the popular paper were based on incomplete data sets. Although the paper's basic finding didn't change with more complete information, Herndon found the conclusion wasn't nearly as black and white.
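To see how incomplete data can quietly shift a result, consider a minimal sketch in Python. It assumes one common way a spreadsheet ends up with partial data: a formula range that stops short of the last row. The country labels and growth figures are invented for illustration and are not taken from the paper or from Herndon's replication.

```python
from statistics import mean

# Hypothetical growth rates (percent) for five made-up countries.
# These numbers are illustrative only, not data from the study.
growth_by_country = {"A": 2.5, "B": -1.0, "C": 3.0, "D": 2.0, "E": 3.5}


def average_growth(data, countries):
    """Average the growth rate over the listed countries only."""
    return mean(data[c] for c in countries)


all_countries = list(growth_by_country)
truncated = all_countries[:2]  # a range that stops short of the full list

full_avg = average_growth(growth_by_country, all_countries)
partial_avg = average_growth(growth_by_country, truncated)

print(f"All {len(all_countries)} countries: {full_avg:.2f}%")
print(f"First {len(truncated)} only:  {partial_avg:.2f}%")
```

The two averages point the same way but differ noticeably in size, and nothing in the shorter calculation flags that rows are missing.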
Traditional data warehousing vendors back in style
One surprising takeaway from the Gartner Magic Quadrant on data warehousing and database management systems? Traditional data warehousing "came back with a vengeance in terms of demand," said Mark Beyer, Gartner analyst, at the BI Summit.
Most traditional vendors seen as "leaders" in this space, including IBM, Teradata and SAP, with its hybrid transaction/analytical processing (HTAP), are building logical data warehouse roadmaps. A term coined by Beyer, the logical data warehouse is a relatively new approach to data management that veers away from the central repository. Instead, data lives where it best resides -- be it in a traditional data warehouse, an analytical database or the Hadoop Distributed File System -- and virtual layers provide views into the data.
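The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any vendor's design: the `Source` and `LogicalView` names, the sample rows and the in-memory "stores" are all hypothetical stand-ins for real systems.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class Source:
    """A hypothetical physical store (warehouse, analytical database, HDFS)."""
    name: str
    fetch: Callable[[], Iterable[Row]]  # reads rows where they already live


class LogicalView:
    """A virtual layer: it copies nothing and only knows where the data lives."""

    def __init__(self, sources: List[Source]):
        self.sources = sources

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        """Ask every source at query time and tag each match with its origin."""
        results = []
        for source in self.sources:
            for row in source.fetch():
                if predicate(row):
                    results.append({**row, "source": source.name})
        return results


# Invented sample data standing in for two separate systems.
warehouse = Source("warehouse", lambda: [{"customer": "acme", "revenue": 1200}])
hadoop = Source("hdfs", lambda: [{"customer": "acme", "clicks": 87},
                                 {"customer": "globex", "clicks": 12}])

view = LogicalView([warehouse, hadoop])
acme_rows = view.query(lambda row: row["customer"] == "acme")
for row in acme_rows:
    print(row)
```

The point of the sketch is that the view holds no rows of its own. Each query reaches into the underlying stores, so there is no central repository to load or keep in sync.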
The traditional vendors are "getting into a title fight and coming after each other," Beyer said. But they also need to watch their backs: Cloud provider Amazon Redshift, Hadoop distributor Cloudera and NoSQL database provider MarkLogic found their way into the quadrant this year. They didn't debut as "leaders," but they didn't give a weak performance, either.
One man embraces the quantified self to determine his (data) worth
Welcome to the self-quantified self. Federico Zannier of Brooklyn, N.Y., data-mined himself and then launched a Kickstarter campaign to hawk his personal data in a project he called "a bit(e) of me."
"I violated my own privacy," he said in his campaign video. Why? U.S. advertisers, known to buy and sell customer data, raked in $30 billion in revenue in 2012, Zannier explained. "In 2012, I personally made zero dollars. ... Is my personal data worthless to me?"
To find out, Zannier put a price on his data. For a mere $2, anyone could buy a day's worth of Zannier's self-quantified self -- bundled into a single folder. The data included websites he visited that day, photos of his face looking at his computer taken every 30 seconds, screenshots of the pages he was looking at, his GPS location, the positions of his mouse and a list of applications he used.
He attracted 213 backers and raised $2,733. Not exactly a goldmine, but more than five times his goal of $500. And he promised to use the funds "to finish a browser extension and an iPhone app that allows you to do the same."