Strata + Hadoop World 2016: Hadoop and Spark in spotlight
In the story "The Red-Headed League," Sherlock Holmes and Dr. Watson meet a red-headed man who tells them he's been hired by an agency just for the color of his hair, and that he gets paid a lot of money to sit in a room and do nothing all day. He asks Holmes and Watson to investigate.
After the man leaves, Watson wants to get started on the investigation right away, but Holmes says this is nothing more than a three-pipe problem. "He sits in his chair, and what does he do? He smokes three pipes. At the end of the three pipes, he has the solution to the Red-Headed League," Maria Konnikova, contributing writer to The New Yorker and author of Mastermind: How to Think Like Sherlock Holmes, said during her talk at the recent Strata + Hadoop World conference in New York.
Sitting quietly with one's thoughts shouldn't be left to fictional detectives; stepping back from the daily bombardment of alerts, alarms and notifications and embracing quiet should become a regular practice in any IT shop, Konnikova argued.
Studies show that disconnecting from the chaos leads to creative thinking and creative problem solving, Konnikova said, pointing to the work of Jonathan Schooler, a psychologist at the University of California, Santa Barbara, who studies attention. Schooler discovered that "what happens when we're doing nothing is this default mode network of the brain becomes incredibly active," she said. "People start thinking about the future, they start planning, they start activating parts of the brain that tells them, 'Hey, these are possible future outcomes;' they start building things, they start creating things in their mind, they start cross-pollinating ideas."
IT folks -- subject to constant demands on their attention -- may find it especially hard to think creatively. According to Konnikova, multitasking, a skill lauded by modern workplaces, may be the enemy of creativity. The heaviest multitaskers turn out to have a hard time concentrating on a problem. "That's what happens when we don't know how to sit alone," she said. "We lose the ability to concentrate and we become even worse at the thing we're supposed to be good at."
Konnikova isn't suggesting CIOs and IT leaders become three-pipe problem solvers, but she does suggest taking 10 minutes to do mindfulness meditation. "I hate the term. You think Buddhism, you think new world, but you may not know that you can do this absolutely everywhere," she said.
Ten minutes a day can help develop the muscle to pay attention, Konnikova said. Action-oriented alpha types may at first find themselves bored by meditation, but that initial boredom will bring benefits, she said. Indeed, meditation is an increasingly common practice among hedge fund managers and CEOs. It's a way to "make boredom productive so that, afterwards, you can become more creative and insightful; you are able to ask big questions, to see the big picture," she said.
Joseph Sirosh, corporate vice president for machine learning at Microsoft, gave Strata + Hadoop World attendees some insight into both machine learning technology and how to handle a viral hit. Sirosh was there to talk about #HowOldRobot, a Microsoft invention that uses facial recognition and analysis technology to guess the ages of people in photos. Originally built to show off machine learning technology at a Microsoft conference, #HowOldRobot turned out to have mass appeal.
When it officially launched on April 30, how-old.net attracted 50 million users in one week. At its peak, the site was juggling 1.2 million users per hour.
"We learned a few things about that spark of virality," Sirosh said. They are as follows:
- Consider the cloud. "It's easy, it's cheap, it's agile," he said. "It took us three weeks and one developer to build #HowOldRobot and all of the analytics behind it."
- Elasticity was a key to #HowOldRobot's success. "At its peak, the cloud upped its scale to 1,600 cores. And when the traffic decreased, it scaled back," he said. "With the economics of the cloud, it cost us a few thousand dollars."
- Live metrics and monitoring are vital. Real-time analytics tools helped Sirosh's team keep tabs on social media feeds, providing the flexibility to react when necessary.
- Experimenting is the new normal for app development. "Remember, it doesn't have to be perfect to be successful," he said.
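The elasticity Sirosh describes can be illustrated with a minimal sketch of a scaling rule: size the core count to the current load, within a floor and a ceiling. This is not Microsoft's implementation. The function name, the per-core capacity of 750 users per hour and the two-core floor are hypothetical values chosen only so the arithmetic lines up with the figures quoted above (1.2 million users per hour, 1,600 cores at peak).

```python
import math


def desired_cores(users_per_hour, users_per_core_hour, min_cores=2, max_cores=1600):
    """Return how many cores to run for the current load.

    users_per_core_hour is an assumed capacity figure, not a number
    from the talk. max_cores reuses the peak core count quoted in the
    article purely as an illustrative ceiling.
    """
    # Round up so a partially loaded core still gets provisioned.
    needed = math.ceil(users_per_hour / users_per_core_hour)
    # Clamp to the allowed range: scale up under load, back down when idle.
    return max(min_cores, min(max_cores, needed))


# Peak traffic from the article, with the hypothetical per-core capacity.
peak = desired_cores(1_200_000, 750)
# Quiet period: the rule falls back to the floor.
quiet = desired_cores(300, 750)
```

Under these assumptions, the peak load maps to the 1,600-core ceiling and a quiet hour drops back to the two-core floor, which is the "scaled back" behavior Sirosh credits for keeping the bill to a few thousand dollars.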
IoT and the Tour de France
This year, the Tour de France, a multistage bike race that's been around since 1903, was introduced to the Internet of Things. All bicycles were outfitted with a small GPS device attached just below the seat. Not only did commentators and broadcasters have access to data such as the distance between a breakaway group of riders and the rest of the pack, but fans could access rider statistics through a beta site.
This pairing of old-world sport and new-world technology was powered by Dimension Data, an IT services company. "Each of these devices on the bikes talks to other bikes, as well as sensors in the team cars," Jim McHugh, vice president of marketing at Cisco, said during his Strata + Hadoop World talk.
That data gets relayed to a race helicopter, and then to a media truck at the end of the stage, and then to a Dimension Data truck. "Inside that truck, an analytics platform is doing the analysis and computation, which are passed on to broadcasters and commentators and digital platforms," he said.
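A statistic such as the gap between a breakaway and the pack can be approximated from two GPS fixes. The sketch below uses the standard haversine formula for the straight-line distance between two latitude/longitude points. It is an illustration only, not a description of Dimension Data's analytics platform; the function names are hypothetical, and a real race gap would be measured along the route rather than as the crow flies.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def group_gap_m(breakaway_fix, pack_fix):
    """Straight-line gap between two (lat, lon) fixes, one per group.

    Each fix is assumed to be a single representative position for the
    group, for example its lead rider.
    """
    return haversine_m(*breakaway_fix, *pack_fix)
```

Calling `group_gap_m((45.0, 6.0), (45.0, 6.01))` returns the distance in metres between two fixes a hundredth of a degree of longitude apart.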
A new role for data scientists
Data scientists aren't just digging through data lakes at the biggest retail or financial institutions. They're also helping with disaster relief efforts. "The modern day first responder is a data scientist," DJ Patil, chief data scientist of the United States Office of Science and Technology Policy, said at Strata.
When earthquakes struck Nepal earlier this year, data scientists attempted to gather information through GPS and imaging technology as quickly as possible to predict aftershocks and landslides, according to an article in Wired Magazine. The process is still being fine-tuned, but information like this can help disaster responders "prioritize how to use their resources," Patil said.
"We think a technology is neither radical nor revolutionary unless it benefits all of America." -- DJ Patil, chief data scientist, United States Office of Science and Technology Policy
"The cloud turns hardware into software, it turns software into services and it can turn data into intelligence." -- Joseph Sirosh, corporate vice president for machine learning, Microsoft
"Your product is only as good as your team." -- AnnMarie Thomas, associate professor, School of Engineering, Schulze School of Entrepreneurship and the Opus College of Business, University of St. Thomas in St. Paul, Minn.
"I don't know if you've noticed how strangely bucolic the whole world of big data is. You have data streams that flow into the data lake. And then there's the data silo that stands by the ole data warehouse, where grandpa used to sling his bits. … Today, I'd like to challenge this picture and have you think about data as this evil, radioactive waste pile of sludge that none of us actually knows how to safely contain. In particular, I would like to draw a parallel between our industry and the troubled industry of nuclear energy, which is another one of these technologies that never figured out how to separate the beneficial uses from the malevolent." -- Maciej Ceglowski, founder, Pinboard.in