It's official: We now live in the big data era. So says the Federal Trade Commission in a 50-page report issued last week. While the FTC report acknowledges the many advantages of big data, it also raises concerns about how certain big data practices negatively impact consumers. The agency's message to businesses is clear: We're watching.
"Big Data: A Tool for Inclusion or Exclusion?" provides examples of how the commercial use of big data can "more effectively match products and services to consumers," and create employment opportunities in fields like education and healthcare. But the FTC report also underscores the inaccuracies and biases in big data practices that can be harmful to underserved communities -- companies using big data to exclude low-income individuals from getting credit, for example.
"The clear implication of the report is that the FTC is looking to investigate unfair and unethical big data practices -- and frankly bring enforcement actions in that area," said Brenda Sharton, head of law firm Goodwin Procter LLP's business litigation group.
For now, the FTC will rely on the consumer protection and equal opportunity laws already on the books -- the Fair Credit Reporting Act (FCRA), the Equal Credit Opportunity Act (ECOA) and the agency's own FTC Act -- to bring those enforcements.
"The FTC is functionally arguing that it has all the authority it needs to regulate big data practices in the U.S.," said Sharton, who is also co-chair of Goodwin Procter's privacy and cybersecurity group. "It is letting companies and businesses know how they can avoid becoming a target of enforcement" as they use big data for commercial purposes, she added.
Citing recent big data research that demonstrates how errors and biases can slip into every stage of a company's big data analytics practices and lead to discriminatory effects, the FTC report put forward four questions companies should consider to minimize these errors:
- "How representative is your data set?" Are your data sets missing information from certain population subsets? The types of excluded data that can be problematic include zip codes, social media usage or membership, and shopping habits; this data can be a proxy for poor or minority populations, Sharton said. "Also, be careful about limiting credit offers or the provision of goods and services based on irrelevant or protected characteristics, like race, gender, marital status."
- "Does your data model account for biases?" Look at whether there are biases in the collection and analytics stages of your big data life cycle, and figure out a strategy to overcome them, the report advises. The example the FTC report gives in the area of employment is a company whose big data algorithm analyzes only the data of applicants from top-tier colleges to help it make hiring decisions. "They may be incorporating previous biases in college admission decisions," said the report's authors.
- "How accurate are your predictions based on big data?" Not all correlations are meaningful, no matter how good big data analytics is at finding them, warns the FTC. Take Google Flu Trends, for example. While initially accurate, the machine learning algorithm produced highly inaccurate estimates over time, failing to take into consideration such factors as people being more likely to search for flu-related terms if their local newspaper ran a story on a flu outbreak -- even if the outbreak occurred nowhere near them.
- "Does your reliance on big data raise ethical or fairness concerns?" Look at your analytics model through a lens of fairness. If, for instance, your hiring algorithm excludes candidates who live farther away from your company, investigate the demographics of nearby and far-away neighborhoods to confirm that sorting by distance does not discriminate based on race or income.
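One common way to put the last two questions into practice is to compare selection rates across population subgroups. The sketch below is not from the FTC report; it uses hypothetical data and applies the "four-fifths" disparate-impact guideline from the EEOC's employee-selection rules, assuming a simple list of (group, selected) records:

```python
from collections import Counter

def selection_rates(records):
    """Compute per-group selection rates from (group, selected) pairs."""
    totals, selected = Counter(), Counter()
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {g: selected[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Ratio of the lowest group selection rate to the highest."""
    return min(rates.values()) / max(rates.values())

# Hypothetical hiring outcomes: (group label, was the candidate selected?)
records = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = selection_rates(records)
ratio = disparate_impact_ratio(rates)
print(rates)            # {'A': 0.75, 'B': 0.25}
print(round(ratio, 2))  # 0.33 -- well below the 0.8 "four-fifths" threshold
```

A ratio below 0.8 doesn't prove discrimination, but it is exactly the kind of red flag the report suggests companies look for before relying on an algorithm's output.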
The FTC report touches on another consumer protection issue: the need for sophisticated and nuanced information security when analyzing big data.
"Companies that maintain big data on consumers should take care to reasonably secure that data commensurate with the amount and sensitivity of the data at issue, the size and complexity of the company's operations, and the cost of available security measures," wrote the report's authors. Companies that manage Social Security or medical information, for instance, should have "particularly robust" security measures as compared with those that maintain only consumer names.
Companies should also be wary of the promises they make to their customers about how their data will or won't be analyzed.
"If you're going to be using [the consumer data] for any statistical analysis, and it's either you or your third-party vendor, you want to make sure you don't promise consumers that you won't do that," Sharton said.
She advises companies that practice big data analytics to make sure they have counsel who can determine whether they are complying with laws like FCRA and ECOA -- "someone who is able to spot issues, someone who has this on their radar screen."
And there shouldn't be a disconnect between those in the company deciding how data will be used and those collecting the data or setting up those systems -- often the CIO.
"Whoever is determining how the data is going to be used needs to pay attention to these issues, so they don't find themselves the subject of an FTC enforcement action," she said.
CIO news roundup for week of Jan. 11
More technology headlines from the week:
- Uber is calling all developers to help launch a new service for its riders. The ride-sharing company is promising to create "trip experiences" for users -- providing them with customized information and entertainment, like a news brief or a music playlist, to make their rides more enjoyable and productive.
- Google is facing off with Facebook and Microsoft in the virtual reality realm. According to re/code, the company is creating its own virtual reality computing group. Google's VP of product management, Clay Bavor, will head the division, a move that re/code speculates signals the search giant's intent to offer viable virtual reality products for enterprises.
- In less optimistic news, it looks like driverless cars -- at least Google's -- are not yet ready for the real world. The company released a report stating that in California, between September 2014 and November 2015, its self-driving cars "disengaged" 341 times. Translation: Test drivers had to take control from the computer. While most of the disengagements were relatively minor, 69 of them were events where "safe operation of the vehicle required control by the driver."
- Attention, Internet Explorer holdouts: It's time to say goodbye. Microsoft officially ended support on Tuesday for all versions of IE, except for version 11. This means no more security updates for those browsers. For enterprises having trouble sunsetting the older browsers because of custom apps, Microsoft has a temporary workaround.