By the numbers
- Facebook sees 300 million photos uploaded to the site daily.
- On New Year's Eve, that number jumped to 750 million photographs uploaded to the site -- in a single night.
- Every day, Facebook records 3.2 billion "likes" or comments.
- The social networking site has the largest Hadoop cluster in the world, about 100 PB, which ingests 500 TB of new data every day.
Facebook Inc. tracks more than 1 billion active users on its site every month -- up from 600 million at the end of 2008. More users mean more data means more storage space. And for the Menlo Park, Calif.-based company, which started life in 2004, that growth also represents an ever-expanding data trove to mine for business and customer insights.
Up until a year ago, the world's largest social networking platform relied on a homegrown business intelligence (BI) reporting tool, as well as tools from MicroStrategy, to dig through data. Both, however, required technical expertise, causing a bottleneck between BI specialists and would-be analysts. "MicroStrategy is a good tool, but [it created] a dependency on the developers," said Namit RaiSurana, BI engineer for Facebook, at last month's Gartner BI Summit.
RaiSurana and his team searched for a solution that would close the gap, and landed on what is becoming a common cure for analysis paralysis: data discovery tools. By transforming data into rich, readily consumable visualizations, these tools allow business users to analyze complex data sets without having to be trained extensively in data science. But the business-friendly tools -- already vying to become the Excel spreadsheet of the big data age -- are a double-edged sword.
"Data discovery tools give business users more flexibility and give them control," said Rita Sallam, research vice president for Stamford, Conn.-based Gartner Inc. "That being said, now there's the possibility of business users creating calculations different from those sanctioned by the enterprise."
The tools tend to be easy to use and put power into the hands of the business user, but without support from IT, they can also create silos, Sallam said. The challenge for IT business leaders is this, she said: "How can you have that balance where you want to meet the business needs, but at the same time, have some sort of mechanism in place to achieve some sort of level of governance?"
For Facebook, the answer was investing in a community where users could ask questions, challenge themselves, build up their skill sets and increasingly become more independent from BI developers. Facebook selected tools from Seattle, Wash.-based Tableau, whose popular software stemmed from research done at Stanford University between 1997 and 2002. What came next for Facebook was a search for the experts who could help implement the tool. That led RaiSurana to several new hires, including Andy Kriebel, author of the blog VizWiz and a co-presenter at the Gartner summit. Before Facebook wooed him away, Kriebel worked for Coca-Cola, where he provided promotion analysis.
"Namit [RaiSurana] talked about all of this gigantic data that we have, but unless we teach people how to understand that data and use that data, it's really meaningless," said Kriebel, who's now in charge of data visualization at Facebook. "So, our goal … is to make everyone an analyst."
The company is on its way to realizing that goal. Today, Kriebel is one of four BI engineers who support an impressive 500 Tableau users per month at Facebook. Here are three practices that led to Facebook's success.
1. Empower the tribe. Fortunately for RaiSurana and Kriebel, their target user group knows a thing or two about networking and collaboration. One of the first things they did was establish a Tableau Facebook Group where users can pose questions. The open platform saves developers from having to address the same question repeatedly on an individual basis, and it helps build a kind of reference manual that evolves organically with the users. The Facebook Group also acts as a platform for collaboration, something that isn't built right into the business model of most companies. That kind of collaboration, according to Kriebel, helps take some of the pressure off the handful of Tableau experts who support hundreds of users.
"The greatest thing about that is we're building this incredible tribe and this incredible repository of knowledge," Kriebel said. "It's awesome to see questions posted and other people responding who you trained a couple of months ago. It makes us feel really good that we're contributing by not having to contribute."
Users can also share their work by publishing their initial findings or projects to an uncertified Facebook site, accessible by any analyst.
"There's no really holding back, because you can publish stuff on uncertified and start sharing with other people," RaiSurana said. "There's no lag or dependency on the BI team itself."
The process, however, has built-in controls. Before any project can move from the uncertified to the certified site, it undergoes a BI review process.
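The two-tier publishing flow described above can be pictured as a small state machine: anything may land on the uncertified site right away, but promotion to the certified site is gated on a BI review. The sketch below is only an illustration of that idea. Every name in it (`Dashboard`, `publish`, `record_review`, `certify`) is hypothetical; it is not Facebook's system, and it does not use any Tableau API.

```python
from dataclasses import dataclass
from enum import Enum


class Site(Enum):
    """The two publishing tiers named in the article."""
    UNCERTIFIED = "uncertified"
    CERTIFIED = "certified"


@dataclass
class Dashboard:
    """Hypothetical record for a published piece of analysis."""
    title: str
    author: str
    site: Site = Site.UNCERTIFIED   # new work always starts uncertified
    review_passed: bool = False     # set only by the BI review step


def publish(title: str, author: str) -> Dashboard:
    """Any analyst can publish immediately; no BI team involvement."""
    return Dashboard(title=title, author=author)


def record_review(dashboard: Dashboard, approved: bool) -> None:
    """Store the outcome of the BI review (assumed to be a yes/no decision)."""
    dashboard.review_passed = approved


def certify(dashboard: Dashboard) -> Dashboard:
    """Promote to the certified site, but only after an approved review."""
    if not dashboard.review_passed:
        raise PermissionError(
            f"'{dashboard.title}' needs an approved BI review before certification"
        )
    dashboard.site = Site.CERTIFIED
    return dashboard


if __name__ == "__main__":
    draft = publish("Weekly signups", "analyst_a")
    print(draft.site.value)      # shared right away on the uncertified site
    record_review(draft, approved=True)
    certify(draft)
    print(draft.site.value)      # promoted once the review is approved
```

The point of the design is that the gate sits on promotion, not on publication, so sharing stays fast while the certified site carries only reviewed work.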
2. Provide beginning and advanced training. Every other week, Tableau users or would-be users can attend either a beginner's or an advanced training session. Beginner's sessions focus on getting a feel for the data discovery tools, learning about different chart types, and even building and publishing a dashboard. Advanced sessions delve into more complex chart types, such as scatterplots and lollipop charts.
As a member of the elite team that supports Tableau at Facebook, Kriebel finds inspiration for new lessons every time someone gets stuck and needs help.
Gartner's Sallam points to this as a best practice for any organization: "Training and change management become critically important," she said. "It has to be an ongoing program -- monthly or weekly -- to get the most out of the tool's capabilities."
3. Rev up competitive juices -- or gamify! Kriebel's enthusiasm for data visualization is undeniable. One way he shares that enthusiasm with others is through a "data visualization of the week" contest, where he finds and rewards (with swag or other prizes) a compelling example from the user group. "I think people appreciate it more for the recognition," he said.
Sharing examples of good work can boost morale and can help build interest in the tool. According to Sallam, data discovery tools "grow sideways" by word of mouth rather than by a top-down initiative.
Working with the tool can trigger recognition for an employee, but so can building onto the tool. With Mark Zuckerberg at its helm, it might not be surprising that Facebook embraces hacking, also known as "Facebook pushing the envelope." That includes hosting hackathons. Here's the innovative part: In order to participate, employees are required to tackle non-work projects. Hackathons can focus on general, out-of-the-box exploits or on targeted projects, such as improving the data discovery tool's functionality. Two examples: better email functionality and a richer metadata repository.
"Tableau is great the way we bought it, but there are some things we wanted to have to make it a better fit," RaiSurana said.