Eight big data myths that need busting

Can CIOs make big data the new normal by 2020? It starts with helping their companies distinguish big data facts from big data fiction, says Gartner analyst Mark Beyer.

ORLANDO, Fla. -- Ask a dozen CIOs to define big data, and you'll likely get a dozen different responses. Gartner analyst Mark Beyer says that is because big data -- for all the hype -- is still not the norm for enterprise IT professionals.

"When something becomes familiar, it starts to feel normal," Beyer said during his talk at this year's Gartner Symposium/ITxpo. "Our job, as IT pros, is to make big data normal by 2020."

CIOs can help their enterprises inch toward normalcy by distinguishing big data facts from big data fiction. "Myths play to anxiety, not actual situations," he said.

Here are Beyer's eight big data myths:

1. Big data starts at 100 TB. Stop looking for a standard size for big data, because there isn't one. "Big data is what I'm doing with the data; it's not how big it is," Beyer said.

2. You have to replace infrastructure if you want to do big data. "If I decide to change my whole infrastructure because I have a new need, I'm risking everything I've done before," Beyer said. His rule of thumb? "You have to figure out if the sacrifice of the [infrastructure] maturity is worth the risk."

3. Eighty percent of all data is unstructured. This might be one of the most frequently quoted big data stats around, and, according to Beyer, it's also inaccurate. "The biggest information assets in the world are machine data. Calling them unstructured because they're not relational is a lie. Machine data is structured data." The bulk of this machine data, by the way, tends to be repeated information confirming everything's fine. "That's what machine data usually says," he said.

4. Tools will replace data scientists. Rest assured, all of the money spent to attract, woo and win over a data scientist is not for naught, said Beyer. "Tools are engineering; engineering is the reuse of a discovered fact. Science is discovering new facts." Tools won't replace data scientists -- at least not until the tools can procreate and evolve.

5. More data fixes data quality issues. "More low-quality data yields more low-quality answers," Beyer said. CIOs should keep their eye on data quality. Take the temperamental geolocation data collected by cell phones, devices some people treat as stand-ins for their owners, he said. A cell phone, however, can be accidentally left at the office, or its GPS function can be turned off at any time. "Cell phones are not people," Beyer said.

6. Real time is just faster. Operating in real time doesn't mean speeding up the data ingestion and cleansing and analysis processes currently in place, Beyer said. It's about "making sure the interval between data collection and decision is as short as possible," he said. Plus, most enterprise data isn't needed for real-time operations.

7. Data volume trumps domain knowledge. Those who think they can simply wash their hands of big data business processes should think again. That's because "a good data scientist must be stopped" from collecting data at some point, Beyer said. Without a business process in place, data scientists will keep collecting long past the point of providing business value. Someone needs to help draw the line.

8. Data models are useless. The statement is a sweeping one. But, Beyer clarified, everything placed into a digital asset has a digital model. "We don't eliminate models because we have big data," he said.

According to BI experts, businesses harbor many myths about big data. Is there one that really drives you crazy?
The worst myth that companies are still believing in is that just having access to big data will give them the answers they need. I mean, are you serious? Really? Big data is often compared to gold mining, and with good reason - it's your ability to interpret the results, to separate the mud from the gold, that makes big data valuable. You have to know what questions to ask before big data can help you.
The article starts by saying 'ask a dozen CIOs to define big data, and you'll likely get a dozen different responses.' What puzzles me is why no one took the effort to write down Gartner's own definition of big data. It ain't rocket science, is it?
Defining big data's tough, as the CIOs and consultancies in this article make clear: Here's a recent definition from our colleagues, which notes that it's an evolving term: