CIOs have some heavy lifting to do.
Machine learning, in which algorithms learn patterns from large volumes of data rather than following explicitly programmed rules, offers IT chiefs a wealth of new opportunities, said Ed Featherston, vice president and principal architect at Cloud Technology Partners, a cloud computing consulting outfit in Boston.
“What machine learning does is help them identify patterns that they may not have seen or found before and find potential new business opportunities or new ways to change things in the business,” Featherston said in a video interview published last week. He spoke to SearchCIO at Cloud Expo in New York in June.
Many vendors offer machine learning capabilities. IBM's Watson is among the most famous; Amazon, Microsoft and Google all have their own services, and they're readily available to CIOs. To work their analytics magic, though, these services require vast pools of data, and that presents a significant challenge, Featherston said: getting the data to where the algorithm runs.
“If I’m using IBM Watson, for example, and I have 50 PB of data,” he said in the video, “sending that out over the internet: probably not going to be an optimum solution.”
Think for a moment about how big just one petabyte (PB) is. Tech explainer site Lifewire equates it to "over 4,000 digital photos per day, over your entire life." Sending even one petabyte over a typical corporate internet link could take months, and a 50 PB archive could take years.
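To see why shipping that much data over the wire is impractical, consider a rough back-of-the-envelope calculation. The 1 Gbps link speed below is an illustrative assumption, not a figure from the article, and it generously assumes full sustained throughput with no overhead:

```python
# Back-of-the-envelope transfer times for moving data to the cloud.
# The bandwidth figure is an illustrative assumption, not a vendor spec.

SECONDS_PER_DAY = 86_400
PB = 10**15  # one petabyte, in bytes (decimal convention)

def transfer_days(data_bytes: float, link_bits_per_sec: float) -> float:
    """Days needed to move data_bytes over a link at full sustained throughput."""
    return (data_bytes * 8) / link_bits_per_sec / SECONDS_PER_DAY

# One petabyte over a dedicated 1 Gbps line, best case:
days_for_1pb = transfer_days(1 * PB, 1e9)          # roughly 93 days

# Featherston's 50 PB example over the same line:
years_for_50pb = transfer_days(50 * PB, 1e9) / 365  # well over a decade
```

Even under these ideal conditions, a single petabyte ties up a gigabit line for about three months, which is why the physical-shipment options described below exist at all.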
Of course, the same tech giants offering machine learning capabilities are also public cloud providers with the power to process the amount of data needed. And they all have ways for companies to get it to them, Featherston said.
Amazon Web Services (AWS), for example, offers Snowball, an appliance shipped to a company that wants to move data. The company loads up to 80 terabytes (TB) onto the device, then physically ships it to an AWS data center. (A terabyte, while no petabyte, is nothing to sneeze at. Lifewire estimates it would take 1,498 CD-ROM discs to hold 1 TB.)
Still, AWS goes much bigger than that. Last year it rolled out Snowmobile, a literal truck that can hold and ship 100 PB of data.
“They drive up with a tractor trailer full of storage units to your location, tie into your network, load those petabytes of data up onto it, drive the truck back to an Amazon location and load that data up onto the network,” Featherston said.
Other providers are catching up to AWS, the top-selling public cloud service. Google Cloud Platform, a distant third in the market behind Microsoft's Azure, last week released the Google Transfer Appliance in two sizes. The larger holds up to 480 TB of data; once it's loaded, UPS or FedEx picks it up and carts it away to Google.
Delivery methods that make it easy for customers to transfer data to cloud providers make sense, Featherston said, because vendors know that having lots of data on hand is key for enabling machine learning capabilities.
“The more information [machine learning] has to work with, and the feedback information it has to work with, the more it can produce usable results,” he said. “So the volume is critical, but the vendors that are offering these algorithms are also offering you ways to get that data there.”
It’s not bad business, either.