Tip

Introduction to data science: A crash course for CIOs

Data science is one hot field these days, but what exactly is it? The following definition sheds light on this critical new field: "Data science is the study of where information comes from, what it represents and how it can be turned into a valuable resource in the creation of business and IT strategies."

Phil Simon,
author and speaker

Mining large amounts of structured and unstructured data to identify patterns can help an organization rein in costs, increase efficiency, recognize new market opportunities and sharpen its competitive advantage. Some companies are hiring data scientists to help them turn raw data into information. To be effective, such individuals must possess emotional intelligence in addition to education and experience in data analytics.

Put differently, data scientists are not the same wine in a different bottle. Typically, they are amalgams of data modelers, statisticians, technologists and business analysts. They are able to harness the power of big data -- the vast amounts of unstructured information contained in blog posts, tweets, call detail records, podcasts, videos and the like.

Why am I hearing about data science now?

For many reasons. First, we have entered the era of big data, and big data and data science are cousins. It's safe to say that we wouldn't be hearing much about the latter were it not for the former.

Second, highly visible companies like Amazon, Apple, Facebook, Google, Netflix and Twitter have used big data very effectively and seen tremendous results in the process. For instance, Google Flu Trends has used search queries to estimate flu activity, often ahead of the CDC's official reports.

Third, people like Michael Lewis (author of Moneyball) and statistician Nate Silver have made data cool. The 2012 U.S. presidential election saw unprecedented use of big data and data scientists. Some have even said that data is the new oil.

Do data scientists use legacy enterprise tools and technologies?

In short, no. Unstructured data makes up most of what we now call big data. Traditional relational databases and data warehouses were not designed to store -- much less effectively analyze -- petabytes of unstructured data such as videos, blog posts and tweets. They weren't built with the same level of fault tolerance and parallel processing as distributed file systems, so SQL statements alone struggle at that scale.

For data scientists, the tools of the trade include Hadoop, NewSQL and NoSQL databases, and columnar databases, which store, retrieve and analyze petabytes of semi-structured and unstructured data.
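
To give a feel for the parallel processing that tools like Hadoop rely on, here is a minimal sketch of the map-and-reduce pattern in Python. It is an illustration only, not Hadoop's actual API: the sample documents and the function names map_phase, reduce_phase and word_count are assumptions made up for this example, and everything runs on a single machine.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample of unstructured text (think tweets or blog snippets).
documents = [
    "big data is the new oil",
    "data scientists mine big data",
    "hadoop stores big data across many machines",
]


def map_phase(document):
    """Map step: count the words in one document, independently of the rest."""
    return Counter(document.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(docs):
    # In a real cluster each document could be mapped on a different machine;
    # in this sketch the map step simply runs one document after another.
    partial_counts = [map_phase(doc) for doc in docs]
    return reduce(reduce_phase, partial_counts, Counter())


totals = word_count(documents)
print(totals.most_common(3))
```

Because each document is counted on its own, the map step can be spread across many machines, and only the small partial counts need to be combined at the end. That split is the basic idea behind processing data too large for a single server.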

What's more, many data scientists use a free software environment called R (part of the GNU project) for statistical computing and graphics. Collectively, new solutions like these allow data scientists to work their magic.

From a midsize organization perspective, big data tools were far too expensive five years ago. Today, data storage has become a commodity and midsize organizations are now using Hadoop, NoSQL databases and data contest sites like Kaggle Inc. to harness the power of big data.

What can data scientists do for my organization?

There's no dearth of ways in which data scientists can benefit organizations of all sizes. In other words, big data and data science are not the sole purview of large enterprises. Healthcare organizations like Explorys [featured in Simon's fifth book, Too Big to Ignore: The Business Case for Big Data] are marrying structured and unstructured data, decreasing healthcare costs, providing superior care and saving lives. The ability to discover new and important customer insights is particularly valuable.

In a nutshell, data scientists can find needles in haystacks of big data.

About the author:
Phil Simon is a sought-after speaker and the author of five management books, most recently Too Big to Ignore: The Business Case for Big Data. A recognized technology expert, he advises companies on how to optimize their use of technology. His contributions have been featured on NBC, CNBC, Inc. magazine, BusinessWeek, The Huffington Post, The Globe and Mail, Fast Company, The New York Times, ReadWriteWeb, and many other sites. Write to him at phil@philsimon.com.

This was first published in May 2013
