Add semantic analysis to ward off big data/bad analytics syndrome

Big data and semantic analysis, virtual personal assistants for everyone and IT resolutions: The Data Mill reports.

Traditional data analytics has three rungs: descriptive, predictive and prescriptive analytics. But Scott Mongeau doesn't believe the ladder goes far enough -- especially as businesses reach to make better decisions based on big data analytics and data science.

A business analytics consultant for Deloitte Nederland, Mongeau believes both the rigorous testing of models (what he calls diagnostics) and a better way to describe the meaning and context of projects to computers (what he calls semantic analysis) need to be built into the three rungs if data analytics is going to help solve big problems. He argued that without the rigorous testing of analytical models and semantic analysis -- and a social network in which to share the theories behind analytical models -- big data analytics will be susceptible to bias and, in the end, fail to help people make better decisions.

That's because data science -- and science in general -- is hard, said Mongeau in a TEDx talk for Rotterdam School of Management. Humans are predisposed to make bad decisions with data -- even when they aren't doing the computing. "As the economic Nobel Prize winning, experimental psychologist Daniel Kahneman puts it: We were built to win, not necessarily to be right," he said.

So feeding data into a model for analysis doesn't erase the potential for bias and misinterpretation. That's because models themselves are biased -- they are all just representations of reality, he said. People can also engage in overfitting, or placing too much significance on the results from a sample; they can mistake correlation for causation; or they can have a false sense of confidence in "the big data approach" -- namely, the belief that having so much data and so many variables cancels out other factors.
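To make the overfitting point concrete, here is a minimal Python sketch. It is an illustration only, not a method Mongeau presented. The data is synthetic (a straight line plus a fixed wobble standing in for noise), and the names `fit_line`, `fit_memorizer` and `mse` are hypothetical helpers. One model memorizes the sample it was given; the other fits a plain least-squares line. Both are then scored on data held back from the fit.

```python
# Synthetic data: y = 2x plus a fixed +/-1 wobble standing in for noise.
WOBBLE = [1, -1, -1, 1]
points = [(x, 2 * x + WOBBLE[x % 4]) for x in range(40)]
train = points[0::2]    # the sample the models are fitted to
holdout = points[1::2]  # data the models never saw


def fit_line(data):
    """Ordinary least-squares line; returns a prediction function."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_memorizer(data):
    """Overfit model: answer with the y of the closest training x."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]


def mse(model, data):
    """Mean squared error of a model's predictions over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


line = fit_line(train)
memorizer = fit_memorizer(train)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(f"{name:10s} train MSE={mse(model, train):.2f} "
          f"holdout MSE={mse(model, holdout):.2f}")
```

The memorizer scores a perfect zero error on its own sample, which looks impressive until it is compared with the simpler line on the held-back points, where it does noticeably worse. Judging a model only on the sample it was built from is one way of placing too much significance on that sample.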

Even a data science institution like Google Inc. can fall prey to the big data/bad analytics syndrome. Google Flu Trends, for example, relied on 45 key search terms that seemed to predict potential flu outbreaks faster than traditional reporting methods. And then there was an unanticipated H1N1 outbreak.

"Everyone worried about catching the virus went online and began searching," Mongeau, a part-time lecturer in business decision making at Nyenrode New Business School in Amsterdam, said. "As a result, Google Flu Trends overreported outbreaks." Google then did something Mongeau believes businesses should pay attention to: It used the new information to help revise (and strengthen) its model.

"In order for us to have good science, we need to have people who are not driven to prove final theories, but who are striving to have theories attacked … and stimulate debate," Mongeau said. "Knowledge is created in a social framework. We share knowledge and validate it amongst each other."

Models -- like Google's -- need diagnostics, or the use of "proper statistical tests and procedures to ensure models are sound," he said. And to really take things to the next level, CIOs will need to consider semantic analysis, or technology that can better describe to computers the meaning and context of the project.

"If we're able to improve diagnostics with analytics, and we're able to improve semantic analysis," he said, "we'll see tighter integration of decision making with computers in organizations."
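As one example of the kind of statistical test that diagnostics might include, the Python sketch below runs a permutation test on a correlation. This is an assumption about what such a test could look like, not a procedure Mongeau described. The weekly figures are invented for illustration, and `pearson_r` and `permutation_p_value` are hypothetical helper names.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, rounds=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (rounds + 1)


# Invented weekly figures: search volume and reported cases.
searches = [12, 15, 19, 24, 30, 37, 45, 52, 60, 71, 80, 92]
cases = [3, 4, 4, 6, 8, 9, 12, 13, 16, 18, 21, 24]

p = permutation_p_value(searches, cases)
print(f"r = {pearson_r(searches, cases):.3f}, p = {p:.4f}")
```

A small p-value here says only that the link between the two series is unlikely to be a fluke of this sample. It does not show that searches cause or will keep tracking cases, which is the gap the Google Flu Trends episode exposed.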

Big data = personal assistant?

Could everyone have a virtual personal assistant by 2024? That's the prediction of one venture capitalist.

Speaking at the MIT Venture Capital Conference, Kent Bennett, partner at Bessemer Venture Partners, said big data today is being applied to the consumer but isn't really being used by the consumer.

"I think there's a massive opportunity for [a] Siri-like [device] … that takes all of my data -- what I'm purchasing, where I live, what I'm up to -- and proactively gives me good ideas for things I could buy or do," he said. "I am positive that 10 years from now we're all going to have a virtual personal assistant who will know all of our tastes and will help us make decisions."

Uh, where do I sign up?

Analytics, BI and the CIO

Gartner Inc. believes business intelligence (BI) and data analytics will remain a top focus for CIOs through 2017.

"As the cost of acquiring, storing and managing data continues to fall, companies are finding it practical to apply BI and analytics in a far wider range of situations," Roy Schulte, a Gartner analyst, said.

Gartner predicts:

  • By 2015, the majority of BI vendors will make data discovery their prime BI platform offering, shifting BI emphasis from reporting to analysis.
  • By 2017, more than 50% of analytics implementations will use streaming data generated by machines, applications and/or individuals.
  • By 2017, analytic applications offered by software vendors will be indistinguishable from analytic applications offered by service providers.
  • Until 2016, big data confusion will constrain spending on BI and analytics software to single-digit growth.

#ITresolutions

What's your New Year's IT resolution? In SearchCIO's final tweet chat of the year, we asked loyal Twitter followers about IT regrets and resolutions. Read our full recaps or simply enjoy this (lightly edited) tweet snack:

"I resolve to partner more with the business...can't offer solutions unless you know what they want." -- Brian Katz, mobile guy, @bmkatz

"Not mine, but common for IT leaders I talk to -- not doing 'the big thing' earlier e.g. DevOps, Cloud, Mobile Big Data, Social." -- Andi Mann, vice president at CA Technologies, @AndiMann

"The CIO paradox -- how to stay ahead of the business curve in a safe & secure manner with so many toys out there." -- Harvey Koeppel, president of Pictographics Inc., @hrkoeppel

Happy holidays! See you in 2014.

Welcome to The Data Mill, a weekly column devoted to all things data. Heard something newsy (or gossipy)? Email me or find me on Twitter at @TT_Nicole.

This was first published in December 2013

Nicole Laskowski, Senior News Writer, asks:

What's your #ITresolution?
