Semi-structured data is king of LinkedIn's recommendation engine

LinkedIn's semi-structured data shines, CVS responds to social media finger wagging, and Microsoft does the shuffle, The Data Mill reports.

If you have a LinkedIn profile, you've no doubt noticed a section called "Jobs you may be interested in." The feature seems similar to an Amazon or a Netflix recommendation engine, but it's a little more complicated than that, according to Anmol Bhasin, director of engineering for LinkedIn's recommendations and personalization department.

Machine-learning algorithms are used to comb through data extracted from member profiles and find interesting patterns. "But what I wanted to bring to light is that it's not the machine-learning algorithm that's really driving this product," Bhasin said during a recent presentation for Data Driven Business, which organizes an annual series of text analytics conferences.

The real driver is the rich trove of semi-structured data or text that LinkedIn's members freely give away: job titles, geographies, industry information, skill sets. Gathering that data across the more than 225 million LinkedIn profiles can uncover patterns, such as when people generally look for the next step in their careers, work migration trends and which cities are "stickiest," i.e., areas potential employees are less likely to move from. Comparing a member's data against those trends helps to target the right job opening to the right profile.
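To make the "stickiest" idea concrete, here is a minimal sketch in Python. It is not LinkedIn's method, which the presentation did not detail; the records, the function name `stickiness_by_city` and the definition of stickiness (the share of job changes in which a member stayed in the same city) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical, made-up records: (member's city, whether the next job was in another city).
job_changes = [
    ("Austin", False),
    ("Austin", False),
    ("Austin", True),
    ("Boston", True),
    ("Boston", False),
]

def stickiness_by_city(changes):
    """Return, per city, the share of job changes where the member stayed put."""
    stayed = defaultdict(int)
    total = defaultdict(int)
    for city, moved_away in changes:
        total[city] += 1
        if not moved_away:
            stayed[city] += 1
    return {city: stayed[city] / total[city] for city in total}

scores = stickiness_by_city(job_changes)
# The city with the highest share of members who stayed is the "stickiest" in this toy data.
stickiest = max(scores, key=scores.get)
```

A real system would presumably weigh many more signals, such as job title, industry and skills, before matching an opening to a profile.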

"So far this product has been tremendously successful. It accounts for about 50% of our job applications running through the network," Bhasin said.

Social shaming works

It's been a rough summer for drug store chain CVS. Customers created a meme when they took to social media to sound off about the incredible growing length of the CVS receipt. They published photos, sarcastic remarks, digital finger wags and creative uses for the sometimes six-foot-long strips of paper. A parody Twitter profile (@CVS_Receipt) launched with the message, "I am unnecessarily long." Even American talk show host and comedian Ellen DeGeneres got on the bandwagon.

"Has this happened to you?" she asked her audience, holding up a CVS receipt next to her. "It's the size of a 12-year-old child."

The social chatter paid off. CVS took to its Facebook page to report it has "been listening" and plans to cut those receipts by 25% in the coming weeks.

In-memory to the rescue

The problem with most business intelligence is its "stilted interaction model." That's according to Mark Madsen, the president and founder of the consultancy Third Nature Inc. His comments came out of a recent edition of The Briefing Room, where he appeared alongside Chris McPherson from IBM Business Analytics.

Databases that use SQL queries periodically encounter workflow problems. There are gaps of time as the query "goes out to the database, fetches some data and puts it on the screen," Madsen said. Dusting off a 1982 IBM study called The Economic Value of Rapid Response Time, Madsen noted that "task-performance gaps" longer than three seconds can disrupt workers' thought processes and create serious inefficiencies. He argued that's still a problem today: bigger data loads are processed more quickly, but SQL queries are still slow.

That's where in-memory technology can be helpful. Caching and aggregating data and building data marts are techniques that can speed up performance by cutting out the "back and forth with the back-end stores," which is "where our performance and interaction problems come from," Madsen said.
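As a rough illustration of the caching idea, the Python sketch below keeps an aggregate in memory after the first request so that repeat requests skip the trip to the database. The `sales` table, its rows and the `total_for` function are hypothetical, and an in-memory SQLite database stands in for the back-end store; this is a sketch of the general technique, not of any product discussed in the briefing.

```python
import sqlite3
from functools import lru_cache

# Hypothetical table standing in for a back-end store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("east", 50), ("west", 70)],
)

@lru_cache(maxsize=None)
def total_for(region: str) -> int:
    """Run the aggregate query once per region; repeats are served from memory."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0]

first = total_for("east")   # goes to the database
second = total_for("east")  # answered from the in-memory cache
cache_stats = total_for.cache_info()  # records one miss and one hit so far
```

The trade-off is freshness: a cached total goes stale when the underlying data changes, so a cache like this would need to be cleared (here, with `total_for.cache_clear()`) after each data load.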

BYOD fast fact

A Gartner Inc. report, Bring Your Own Device: The Facts and the Future, offered up evidence that bring your own device (BYOD) is where it's at. Survey evidence suggested 38% of companies expect to stop providing devices to workers by 2016. Gartner estimated only 15% of companies will never provide any BYOD option.

The corporate shuffle

Soon-to-depart CEO Steve Ballmer was not the only Microsoft exec to make news last week. Days after Ballmer announced plans to retire within the year, VMware announced it was hiring Tony Scott as its new CIO. Scott was Microsoft's CIO up until this past June when he left to pursue "personal projects," tipping the media off to his departure with a LinkedIn profile update. Scott's sudden resurfacing was not exactly unexpected: When he left Microsoft, Scott said he would be headed "back to 'work' (in some form) in a few months." Meanwhile, no word yet on who will permanently replace Scott as Microsoft's CIO.

Welcome to The Data Mill, a weekly column devoted to all things data. Heard something newsy (or gossipy)? Email me or find me on Twitter at @TT_Nicole.
