Data virtualization tools have been around for years, but the technology is shifting in importance from tactical to strategic, as businesses look to integrate and access data from across Web sources, social media and the Internet of Things (IoT).
In this Q&A, data management expert Rick Sherman, founder of Athena IT Solutions, explains why data virtualization should be on the CIO radar, the benefits it brings over traditional data integration tools, how it can be used for competitive advantage and which industries the early adopters are in.
Editor's note: The following interview has been edited for clarity and length.
Why is data virtualization an important issue for CIOs?
Rick Sherman: For CIOs, most companies -- small, medium and large -- are trying to access information from a lot of different sources both internally and externally. They might want to get information about their customers, share information with partners, buyers or whatever. In the old days they controlled the data, they could get to the data, they'd integrate the data. But now … data's coming from everywhere, not just from applications. Companies don't have the time or the money to physically integrate the data. Data virtualization is the ability to … extract the data from different sources and virtually be able to look at the data, analyze the data, combine the data with other sources. It's a perfect way in this fast-changing world to be able to get to data in real time … and analyze [and experiment with] the data. … it's perfect for what CIOs are confronted with: the onslaught of data they have.
It sounds like it's a need that didn't exist 10 years ago. Is it just in the past four or five years that data virtualization tools have become necessary, or have they been around longer than that?
Sherman: Well, like a lot of things in analytics, things have been around for a long time but the business need for them and the ability of the environment that we're in -- in terms of the amount of memory people have, network bandwidth -- [wasn't conducive to effective use of its capabilities]. The technology … has existed for a while but the business demand (social media, IoT, sensor devices, machine learning, Web data and a lot of the cloud data) [did not]. A lot of companies use cloud applications … so there's much more demand for this virtualization of the data and there's much more data that's scattered out there. Even though the technology existed before, the need for it has exploded and then the capabilities for that kind of technology to go after the volumes of data -- the unstructured data and structured data -- have all sort of grown based on the demand.
It's like a perfect storm where demand and capability have caught up to where it's a pretty practical and effective way to get the data.
What kind of advantages do data virtualization tools bring over traditional data integration tools or even application integration?
Sherman: Well, traditional data integration -- and I've been doing data integration for a long, long time -- is a great way … [to] get data into the state that you can analyze [it]. [These are] great technologies that have improved over the years. But it takes time to sort of find what that data is, figure out how to cleanse data [and] put big data in one place. There is certainly still a need for data integration, application integration [and] data warehousing.
There [are some] use cases for data virtualization [instead of traditional data integration]. One is [if] it's a new source of data. You may need at some time later on to integrate the data but you want to get to the data now to analyze and look at it, see how useful it is, and you haven't gotten to the point where you can invest in getting it integrated. That's one use case scenario: the precursor of integrating it. There are plenty of other use cases where you never integrate the data with your source of data; you may not own the data. There's social media data, there's Web data, there's data that you might be exchanging between prospects, suppliers, partners and so on, that you may never own or have the ability or desire to integrate with your data. There are plenty of use cases for this data that's out there that you don't need to integrate. Both [scenarios] -- as a precursor to data integration or where you don't need to physically integrate the data but you need to analyze the data and bring it virtually into play -- are great use cases for data virtualization.
Is data virtualization technology something that can be used for competitive advantage or is it more sort of table stakes at this point for any business that's really serious about business intelligence?
Sherman: I think there are a lot of common uses of data virtualization at the tactical level. Applications like call centers use data virtualization all the time. That's tactical, that's just sort of keeping up with your peers. But, on a strategic level, data virtualization is just emerging as something that's more broadly used. To be able to get to the big data sources like social media, Web analytics, Internet of Things -- if you can use data virtualization to get that data fast and analyze it, that does give you a strategic or business edge. It may not be the case five years from now, but at this point from a strategic point of view there is a competitive advantage to using it.
How would a call center use data virtualization?
Sherman: This has been a pretty common application for a while. [Say] you're in a call center in a financial services firm that has different lines of business. [It might] have bank accounts, business bank accounts, 401(k), custodial accounts. Those are all in different applications, so the mortgage data might be in one location, credit cards might be in another, the kids' custodial account might be in another.
[If I call customer service] an account rep or customer support person [can] use virtualization to query across all those sources in real time and find out all the information related to me that they have. …
Some companies are [also integrating] social media data or Web analytics data. They have a customer prospect, they're providing some different services, they can see what products that customer orders. You might go to Amazon for an order; what product did you look at? If you're already a customer they might be looking at some campaign marketing tools [or] social media related to you. It could be related to what data they have on you as a customer or as a prospect and they could extend that to social media or Web analytics. It could extend to all types of data.
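Editor's illustration: The call center scenario Sherman describes can be sketched in a few lines of Python. This is a minimal sketch, not a depiction of any particular data virtualization product. It assumes three hypothetical line-of-business systems (banking, mortgages, cards), each stood in for by a separate in-memory SQLite database; the table names, column names, customer IDs and the `customer_view` function are all invented for illustration. The point it tries to show is that each source is queried at request time and the results are combined on the fly, with nothing copied into a central store.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Hypothetical source systems; schemas and data are assumptions for this sketch.
banking = make_source(
    "CREATE TABLE accounts (customer_id TEXT, kind TEXT, balance REAL)",
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [("c-100", "checking", 2500.0), ("c-200", "checking", 90.0)],
)
mortgages = make_source(
    "CREATE TABLE loans (customer_id TEXT, principal REAL)",
    "INSERT INTO loans VALUES (?, ?)",
    [("c-100", 310000.0)],
)
cards = make_source(
    "CREATE TABLE cards (customer_id TEXT, credit_limit REAL)",
    "INSERT INTO cards VALUES (?, ?)",
    [("c-100", 8000.0), ("c-300", 1500.0)],
)

# The "virtual layer": a mapping from source name to (connection, query).
SOURCES = {
    "banking": (banking, "SELECT kind, balance FROM accounts WHERE customer_id = ?"),
    "mortgages": (mortgages, "SELECT principal FROM loans WHERE customer_id = ?"),
    "cards": (cards, "SELECT credit_limit FROM cards WHERE customer_id = ?"),
}


def customer_view(customer_id):
    """Query every source on demand and return one combined view of the customer."""
    view = {}
    for name, (conn, sql) in SOURCES.items():
        view[name] = conn.execute(sql, (customer_id,)).fetchall()
    return view


if __name__ == "__main__":
    print(customer_view("c-100"))
```

In this sketch, adding a new source (say, a Web analytics feed) would mean adding one more entry to `SOURCES` rather than building a load process, which is roughly the trade-off Sherman draws between virtualization and physical integration.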
Would the strategic use of it be pulling in new pieces of information that people have not, up until now, been using to help in a call center discussion?
Sherman: Yes, and even more strategically than that, so the call center is sort of the tactical, practical use of it. More strategically, people might be using data virtualization to pull in social media, different marketing campaigns. Most companies have one or more marketing tools out there. They might have their sales pipeline information. They might have information that their partners or suppliers are providing relating to you, if their business is to customers directly or to other businesses. There's a lot of different information that can come in and they can analyze how effectively their marketing campaigns are running, their sales campaigns are running.
It can be used in healthcare [to assess] their patient population health. There's a lot of different metrics related to how many times you visit your primary care physician, different specialists, information related to different tests that were run on you in the physician's office or in a hospital. All that information can be brought in using virtualization to analyze your patient population.
There are a lot of different applications to it, more than just the data they have on you now that's in some applications in house. It's the ability to bring all this varied data together from a lot of different sources quickly in real time.
Are there specific verticals that are well-suited for data virtualization over others or does it have broad appeal across industries?
Sherman: Well, it certainly does have broad appeal but … the more information-intensive and information-astute industries are the first ones to use it. Financial services and different industries that do a lot of marketing analytics are two of the profiles that are the early adopters and more advanced users of virtualization. I mentioned healthcare; that [will] be a little bit of a late adopter in that they have to worry about privacy and security and some other issues.
In between … as companies get more into predictive analytics [and] more into analyzing information outside of their on-premises data, the more applicable this becomes. So any company that's using a lot of big data sources is … a prime candidate for it. … As data explodes and as more and more information becomes available, the use of that data and the use of virtualization of that data expands.
Explore different data virtualization technologies and tools in this SearchDataManagement Essential Guide. Then read how Pfizer quickened the pace of information delivery to the company's researchers, and take a quiz to see how familiar you really are with data virtualization technology.