Data science: Mining available info for hidden value

At Sears Holdings Corp. in Hoffman Estates, Ill., Chief Technology Officer Phil Shelley understands the importance of promoting data science and finding the hidden value in data that most businesses throw away, archive or ignore.

In this video, filmed at the Fusion 2012 CEO-CIO Symposium in Madison, Wis., Features Writer Karen Goulart sits down with Shelley to discuss the hidden value that data science can uncover and why mainstream businesses should use open source programs such as Hadoop.

Shelley explains that there is business value in analyzing data for sales trends and consumer habits. Using programs like Hadoop to conduct data science explorations is valuable for all mainstream businesses, online or not, he says.

Read a partial transcript from this interview below, and watch the Q&A to learn more about what Shelley has to say about data science.

Karen Goulart: You are speaking at the symposium about practical ways of using big data technology for mainstream industry. And you're going to be talking about why businesses have hidden value in the data they throw away or archive or ignore. Can you tell me a little bit more about that?

Phil Shelley: Yes. Sure. I'm also going to be talking about a technology called Hadoop, which really came out of Google a long time ago, about seven years ago now. So, Hadoop is not well understood or known outside the Internet space. So, we're going to spend quite a bit of time explaining the technology that came out of the Hadoop effort and the Hadoop project and how it relates to, let's say, normal, non-Internet businesses.

A lot of businesses in the non-Internet space have a lot of data. They actually don't keep it, they don't analyze it and they don't make business value from it in the same way as, say, some of the better-known names in the Internet space do. So, I'm going to explore that aspect of data in a more normal enterprise and how they can use some of the technologies that originally started in the Internet space, but now can be leveraged by the same regular, normal, non-Internet companies.

Goulart: What are some of the big data opportunities at regular companies that are waiting to be exploited?

Shelley: Medium to large companies have, obviously, a lot of transactional data. It could be a manufacturing company that has manufacturing processes that store data. It could be inventory. It could be supply chain. It could be customer transaction data. Most companies of any size don't keep that data today. They archive it, put it on tape, put it away somewhere and never look at it again.

What's happened in the last few years -- especially pioneered by [companies] like Google, Facebook particularly, Yahoo, Amazon -- is that those people have been using these new technologies to keep every grain of detail. For instance, your Facebook has everything about you that you've ever done in Facebook stored away in Hadoop, in Facebook, that you can dig into and your friends can dig into. They can look for connections between you and anybody they think you might be interested in connecting with. That is not done in non-Internet companies, in most cases. They don't have the tools to keep all that data.

An example might be [the] supply chain. [Manufacturers] don't keep all of the history of all the products for all the years gone by in their supply chain. Most of the time they don't realize that there's any value in it, but now, because the technology's available, you actually can keep that data. And then you can mine it for hidden value.

Let us know what you think about the story; email Karen Goulart, Features Writer.

Join the conversation

1 comment

Q. What kind of science do you get with bad data?

A. Bad science. See also: Economics.

I assume the findings of this 'science' will be used for marketing. I'm pretty sure it won't be for physics, biochem, or engineering. Am I close?