Real-time data analytics is like the giant squid of the technology landscape: There are companies out there doing real-time data analytics, but sightings are rare.
One of those elusive creatures is Mixcloud, an online platform for streaming audio content. The London-based startup, sometimes described as "the YouTube of radio," is not only using real-time analytics to make quick business decisions and build a better product but is also working on a customer-facing real-time data analytics portal.
When the portal is completed, Mixcloud customers will be able to see who's listening to their content -- and when, according to Mat Clayton, chief technology officer and one of four Mixcloud founders.
Mixcloud, which launched in 2008, provides a place for users to create "cloudcasts" -- DJ mixes, podcasts, radio shows, even original content -- which they then upload to the platform for online streaming and general consumption. (The product also comes in a mobile app form and as a player widget that can be embedded on other websites.) Just as writers can self-publish their creations to readers via the Internet, cloudcasters can now air their audio creations via a platform in the cloud.
As an Internet-only company, Mixcloud has a trove of data on unique visitors. Initially, the company used Google Analytics' free tool to provide insight into baseline metrics, but when Clayton and his team wanted to look at data on a more granular level, they ran into a snag. Google Analytics "does a lot of estimation" when drilling into or segmenting large amounts of traffic data. What it doesn't do is provide the "bands on those estimations," said Clayton in a recent webinar, referring to a so-called confidence interval that indicates how reliable the estimates are. As a result, when Clayton and his team added together the estimated values for multiple segments, the total sometimes came to more than 100%.
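To illustrate why the missing "bands" matter, here is a minimal Python sketch of segment shares estimated from samples, each with a 95% confidence interval. It is not how Google Analytics computes its estimates; the segment names, the sample counts and the `share_with_interval` helper are all hypothetical, and the interval uses a simple normal approximation.

```python
import math

def share_with_interval(sample_hits, sample_size, z=1.96):
    """Estimate a share from a sample, with a normal-approximation interval.

    z=1.96 corresponds to a 95% confidence level.
    """
    p = sample_hits / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical data: each segment's share is estimated from its own
# independent sample of 100 visits, so the estimates need not sum to 100%.
segments = {"desktop": (58, 100), "mobile": (36, 100), "tablet": (12, 100)}

estimates = {name: share_with_interval(hits, size)
             for name, (hits, size) in segments.items()}
total = sum(p for p, _, _ in estimates.values())

for name, (p, low, high) in estimates.items():
    print(f"{name}: {p:.0%} (95% interval {low:.0%} to {high:.0%})")
print(f"sum of point estimates: {total:.0%}")
```

With these made-up numbers the point estimates add up to more than 100%, yet each interval is wide enough that the true shares could still sum to exactly 100%. Without the intervals, there is no way to tell how far off any single estimate might be.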
"The variance is quite high, to say the least," Clayton said, adding that the company was uneasy about making product decisions based on the information.
The paid version of Google Analytics solves the problem, he said, but "it's significantly more expensive. I think we're talking six figures the last time I checked." Mixcloud still uses Google Analytics for baseline metrics, but Clayton and his team realized that as the site grew (it now has more than 3 million active monthly listeners and sees, on average, three hours of content uploaded every minute), they needed a more sophisticated analytics tool.
"We decided we needed a system that was accurate and capable of scaling with our platform," he said.
Although the small tech team is heavy on engineers, building an analytics platform isn't a core strength, Clayton said, and he began looking for an external platform to bring in-house. He eventually landed on Acunu Analytics, a fairly fresh vendor face that launched in 2009 with the backing of "some of Europe's top VC [venture capital] funds," according to the company's website. Acunu Analytics relies on Apache Cassandra, an open source NoSQL database, originally developed by Facebook, that was designed to handle large amounts of data -- and handle it quickly.
"Cassandra has proven itself capable of delivering near real-time performance to support interactive, Web-based applications at scale," Jeff Kelly, researcher for the Boston-based Wikibon Project, wrote after last year's Cassandra Summit. "It does this through a combination of its ability to store and access data in columns, its ability to perform extremely fast inserts, its use of distributed counters, and its ability to take advantage of solid-state drives."
On the front end, Acunu provides live dashboards and instant query capability for its users. Those dashboards give Clayton and his team a real-time window into how the servers are performing, as well as how features on the site are functioning, such as a "follow" or "play" button.
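One way a dashboard of this kind could flag a feature that has stopped working is to compare each minute's event count against a rolling baseline. The Python sketch below is an illustration under stated assumptions, not Acunu's API or Mixcloud's implementation; the `DropAlert` class, the five-minute window, the 50% threshold and the play counts are all hypothetical.

```python
from collections import deque

class DropAlert:
    """Flag a sudden drop in a per-minute metric against a rolling baseline."""

    def __init__(self, window=5, threshold=0.5):
        self.history = deque(maxlen=window)  # most recent normal counts
        self.threshold = threshold           # alert below this fraction of baseline

    def observe(self, count):
        """Return True if count is below threshold * mean of the window."""
        alert = False
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            alert = count < self.threshold * baseline
        if not alert:
            # Keep anomalous minutes out of the baseline so that an
            # ongoing outage does not become the new "normal".
            self.history.append(count)
        return alert

# Hypothetical per-minute counts of "play" clicks; the last minute collapses.
plays_per_minute = [100, 104, 98, 101, 97, 12]
monitor = DropAlert()
alerts = [monitor.observe(count) for count in plays_per_minute]
print(alerts)
```

The first five minutes only fill the baseline window, so no alert can fire until the sixth reading, which falls well below half of the recent average.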
"If we break the ability for users to click 'play' -- and this has actually happened … [the dashboard] gives us the ability to identify that those metrics have dropped very, very quickly, and alert us to the problem," he said. "Then we can go and debug and fix it."
Not only does the real-time monitoring ensure high-quality standards are maintained from -- quite literally -- one minute to the next, but it also enables Clayton's team to test where features, such as a recommendation box, have the most success on the page. By moving those features around the page and testing to see what triggers the most engagement, Clayton and his team can tease out user-friendly features and sweet spot locations and then "double down" on their efforts.
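Placement tests like these usually come down to comparing click rates between two variants and checking that the gap is larger than chance would explain. The Python sketch below uses a standard two-proportion z-test; the article does not say which statistical method Mixcloud uses, and the click and view counts are hypothetical.

```python
import math

def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """Return the z-score for the difference in click rate (B minus A)."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the assumption that placement makes no difference
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (rate_b - rate_a) / std_err

# Hypothetical results: a "follow" button shown in placement A vs. placement B
z = two_proportion_z(clicks_a=120, views_a=4000, clicks_b=180, views_b=4000)
significant = abs(z) > 1.96  # roughly the 95% level for a two-sided test
print(f"z = {z:.2f}, significant: {significant}")
```

A z-score beyond about 1.96 in either direction suggests the difference between placements is unlikely to be noise, which is the point at which a team could reasonably "double down" on the better location.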
"We managed a 200% increase in the number of unique people per day clicking 'follow,' and each person who clicks follow is now following 200% more people," Clayton said during the webinar. "And that was just analyzing exactly where we put every single button -- which ones worked, which ones didn't."
Mixcloud also uses real-time Acunu Analytics to roll out new products or site features without taking the site offline. Instead, the team does upgrades "on the fly," rolling out a new product to either a subset of users or all users depending on the risk level, he said.
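A common way to release a feature to only a subset of users is to hash each user into a stable bucket and enable the feature for buckets below a chosen percentage. The Python sketch below shows that general technique; the article does not describe Mixcloud's actual rollout mechanism, and the `in_rollout` function, the feature name and the user IDs are hypothetical.

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hashing feature and user together gives each user a stable bucket
    from 0 to 99 that differs from feature to feature.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# Hypothetical usage: a riskier feature starts at 10% of users, then widens.
users = [f"user-{n}" for n in range(1000)]
early = [u for u in users if in_rollout(u, "new-player", 10)]
wider = [u for u in users if in_rollout(u, "new-player", 50)]
print(len(early), len(wider))
```

Because a user's bucket never changes, everyone who saw the feature at 10% still sees it at 50%, so raising the percentage only adds users and never flips anyone back.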
Mixcloud's use of data to improve the overall product is a big part of the story -- but more chapters on the company's use of analytics are still being written. One is a forward-looking personalization product: a real-time analytics portal where users can monitor how their content is performing on the site, according to Clayton.
"When users launch content," he said, "there's actually only a few short hours when this content goes massively viral or gets a lot of traction. And it would be good to be able to offer those users clear data -- actual insights -- of what's happening right now so the user can fix any problems, push it onto the right networks and focus the attention in the right places."
It's a difficult problem when data balloons by the second, he said, and "how we manage that will be the next challenge."