
Mechanical Turk supplies Gilt with 'artificial artificial intelligence'

In this Q&A, a data scientist explains why he combines objective data with human critiques to assess risk in Gilt's pre-emptive shipping program.

This article can also be found in the Premium Editorial Download: CIO Decisions: Artificial intelligence in business: The future is now

Igor Elbert, a data scientist for Gilt Groupe, uses predictor attributes such as metadata and even day-of-sale data for pre-emptive shipping. Also referred to as anticipatory shipping, pre-emptive shipping aims to get online purchases to customers faster by using data analytics to predict what they might be buying at a given time. Objective data on customer buying habits, however, is not enough to predict which designer dress or duvet cover customers will buy before they've even seen the product. Elbert also needs to gather subjective factors to combine with the hard data -- namely, a human critique of the fashion du jour -- in order to assess the risk of preshipping the product.

As he explains in this second part of a two-part SearchCIO Q&A, that's where Amazon Mechanical Turk comes into play. The platform crowdsources hard-to-automate tasks, a practice Amazon refers to as artificial artificial intelligence. Here, Elbert describes how he surveys "Turkers" for their impressions of a product to help build the models that guide Gilt's pre-emptive shipping program.

How do you make predictions about what customers want before they even know it exists?

Igor Elbert

Igor Elbert: One of the things I'm doing is looking at sets of attributes, which basically breaks down into three groups:

The first is objective parameters like product metadata. Let's say we're talking about a black dress. We haven't sold this particular product, but we've sold black dresses and we've sold black dresses from well-known brands. So, I look at the season, prices, ratio of our sale prices to original prices; I look at how much markup we're putting into it. I look at the human factor: who the product buyer is -- who put the item on sale. I build it all into my models.

The second group of parameters relates to the sale day: How will this particular day shape up? I look at the number and type of sales, inventory depth and breadth, and the brands that are offered. It's important because strong offerings have a 'halo' effect and help other products that are being sold. For Gilt, these parameters vary significantly from day to day because our offering changes daily.

The third group is purely subjective: I show an image of the product through Mechanical Turk to a random group of people and I ask them to tell me about the product.
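The three attribute groups could be assembled into one feature record per product, roughly as follows. This is a minimal sketch, not Gilt's actual schema: every field name below is hypothetical and chosen only to mirror the factors Elbert lists.

```python
from dataclasses import dataclass, asdict


@dataclass
class ProductFeatures:
    # Group 1: objective product metadata (all field names are hypothetical)
    category: str
    color: str
    season: str
    sale_price: float
    original_price: float
    buyer_id: str  # the Gilt merchandise buyer who put the item on sale
    # Group 2: sale-day context
    sales_that_day: int
    inventory_depth: int
    # Group 3: subjective crowd input, e.g. share of raters calling it "unusual"
    share_unusual: float

    @property
    def price_ratio(self) -> float:
        """Ratio of the sale price to the original price."""
        return self.sale_price / self.original_price


def to_model_row(product: ProductFeatures) -> dict:
    """Flatten a product into a plain dict that a modeling pipeline could consume."""
    row = asdict(product)
    row["price_ratio"] = product.price_ratio
    return row
```

A black dress offered at $200 against an original price of $400 would yield a row whose price_ratio is 0.5, alongside the sale-day and crowd-rating fields.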

What is Mechanical Turk?

Elbert: There is an Amazon platform called 'Mechanical Turk,' which basically crowdsources tasks. So, I create a template, and I include all of the dresses we plan to put on sale. And then I'll say I want an opinion about every dress from three different people. Amazon calls them 'Turkers.' I'll also provide some characteristics, so I'll say they should live in the United States; for this particular set, I'm interested in women's opinions. And I say I'm willing to pay 10 cents per dress. Amazon basically shows this dress to three different people, they answer my questions and then Amazon sends me results. It's very similar to a survey.

Amazon calls it 'artificial artificial intelligence' because it's for tasks that are hard to automate. It would be very difficult for anyone to write an algorithm that looks at an image and can tell if it's plain or sexy. But humans spot it right away.
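The arithmetic behind such a survey is simple. The sketch below uses the figures Elbert quotes (three opinions per dress, 10 cents per opinion); the function name and the 200-dress example are assumptions, and any fees the platform adds on top are not modeled.

```python
def survey_budget(num_items: int, raters_per_item: int = 3, reward_cents: int = 10):
    """Return (number of judgments, total reward in dollars) for a survey batch.

    Defaults follow the interview: three raters per dress, 10 cents each.
    Platform fees are deliberately left out.
    """
    judgments = num_items * raters_per_item
    total_dollars = judgments * reward_cents / 100
    return judgments, total_dollars


# Hypothetical batch of 200 dresses
judgments, dollars = survey_budget(200)
print(f"{judgments} judgments, ${dollars:.2f} in rewards")
```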

What kinds of questions specifically do you ask Turkers?

Elbert: I'll ask them to assign adjectives to the product. So, let's say it's a dress. I'll ask, is it pretty, weird, unusual, sexy, formal -- like 15 different adjectives. I'll ask them what the dress is good for -- a dinner party, a prom or something else. I'll ask if they think the dress will sell well. I'll ask if the dress looks expensive, moderately expensive, inexpensive.

I'll show just a picture of the product and tell them the material it's made of, but I don't tell them the brand or anything else. I'm doing it to pick out, essentially, risky items. If many people say the dress is weird or unusual, it doesn't mean it won't sell well. It just means there is some risk for this particular item, which may be higher than [for] items that a lot of people don't call unusual. And I factor it into my model.

I supplement my objective predictors with subjective opinion to get an idea about the risk. For a particular dress, let's say three out of three people call it unusual, then maybe I should not be taking any additional risk by moving it closer to the customer. Because basically I'm placing bets: I'm betting that this dress will be sold in this region. If this dress is risky, then I would not be placing this bet -- it would be even riskier for me to start moving it before someone bought it.
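One way to turn the adjective answers into a risk signal is sketched below. It is an illustration only: Elbert says he feeds the responses into a model, whereas this example applies a hard cutoff for simplicity, and the adjective set, function names and threshold are all assumptions.

```python
from typing import Sequence, Set

# Adjectives treated as risk signals (an assumption based on the interview's examples)
RISK_ADJECTIVES = {"weird", "unusual"}


def risk_share(responses: Sequence[Set[str]]) -> float:
    """Fraction of raters who picked at least one risk adjective for the item."""
    if not responses:
        raise ValueError("need at least one response")
    flagged = sum(1 for adjectives in responses if adjectives & RISK_ADJECTIVES)
    return flagged / len(responses)


def should_preship(responses: Sequence[Set[str]], block_at: float = 1.0) -> bool:
    """Hold the item back when the share of flagging raters reaches block_at.

    With the default of 1.0, an item is held back only when every rater
    flags it, mirroring the "three out of three" example.
    """
    return risk_share(responses) < block_at
```

With three raters, a dress that all three call unusual is held back, while a dress flagged by only one of them remains a candidate for moving closer to the customer.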

Are you using Turkers for every product Gilt sells?

Elbert: Right now we're doing it for a subset of products, but we are moving toward doing it for all of the products. Besides pre-emptive shipping, this gives us a lot of information about our products, and it has many uses.

How do you know to trust the Turkers' judgment?

Elbert: With Turkers, there is a concept of prequalification. So, you can run them through a series of tasks. And if you do [a task] over and over again, you can pick Turkers who perform well. One of the things I asked them is to give me a price range of the product just by looking at the image. How much do you think this dress costs? Some were spot-on. Some were predicting the cost of the dress within $1. For other dresses, some said the dress looks like it should cost $200, but we're trying to sell it for $400. It doesn't mean they're wrong; maybe we overpriced it. They're given limited information. They didn't know this is a super-fancy brand. But when there is a big disconnect between the perceived price and the asking price, it adds to risk.

So, if you forget the fashion aspect, what I'm trying to do is quantify risk for the fashion items we sell. I'm using different ways to determine that, combining objective parameters with subjective parameters, and I'm trying to guess which items have a higher propensity to be sold in certain areas.
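The disconnect between perceived and asking price can be expressed as a single number. The sketch below assumes the Turkers' guesses are summarized by their median and the gap is measured relative to the asking price; both choices, and the function name, are illustrative rather than Elbert's stated method.

```python
from statistics import median
from typing import Sequence


def price_gap(perceived_prices: Sequence[float], asking_price: float) -> float:
    """Relative gap between the asking price and the median perceived price.

    0.0 means raters guessed the asking price; larger values mean a bigger
    disconnect, which the interview describes as adding to risk.
    """
    if asking_price <= 0:
        raise ValueError("asking_price must be positive")
    perceived = median(perceived_prices)
    return abs(asking_price - perceived) / asking_price


# Raters think the dress looks like $200; it is listed at $400
print(price_gap([200, 180, 220], 400))
```

A large gap does not say who is right. As Elbert notes, the raters see limited information, so the value is better read as one more risk input than as a pricing verdict.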

In the description of the talk you're giving on pre-emptive shipping at the upcoming Strata Conference + Hadoop World, you call this 'the human element.' Are there other human elements you have to consider?

Elbert: There are two parts: One, what I get from Turkers. Two, I look at who from Gilt bought this product for our site -- which person -- because the buyer turns out to be highly predictable. So, some people on our site seem to buy merchandise that the West Coast likes more. I'm not sure they're doing it consciously. They just pick items, clothes or tastes that align with the 'West Coast.' So, I'll also look at who put the sale together.

In the future, I'll also look at who shot the product -- because we photograph most of the products we sell. I'm not doing it right now because the data is not available, but I plan to do it. And if I kept this data, I would also include which model we used for a particular item -- just to give my algorithm a chance to detect if there is some correlation between certain people and certain demographics buying the product.
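The buyer effect described above could be checked with a simple tabulation of past sales by buyer and region. This is a hedged sketch: the input layout, the identifiers and the function name are hypothetical, and a real model would take the buyer as one feature among many rather than reading these shares directly.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def regional_share_by_buyer(
    sales: Iterable[Tuple[str, str, int]]
) -> Dict[str, Dict[str, float]]:
    """Compute, for each buyer, the share of units sold in each region.

    sales: (buyer_id, region, units_sold) records; all identifiers are hypothetical.
    """
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for buyer, region, units in sales:
        totals[buyer][region] += units

    shares: Dict[str, Dict[str, float]] = {}
    for buyer, by_region in totals.items():
        total = sum(by_region.values())
        if total == 0:
            continue  # no units sold, so no meaningful share
        shares[buyer] = {region: units / total for region, units in by_region.items()}
    return shares
```

A buyer whose items sold 30 units on the West Coast and 10 on the East Coast would show a West Coast share of 0.75, the kind of skew Elbert describes.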

Let us know what you think of the story; email Nicole Laskowski, senior news writer, or find her on Twitter @TT_Nicole.


Join the conversation



How are you leveraging crowdsourcing at your company?
We're not, but I think crowdsourcing is a great idea. I've tried to sell it before. No luck!
This sounds like one of the more interesting tasks on Mechanical Turk, and one that feels like it has some real-world implications. Some of the tasks I've encountered have appeared mindless and their value seemed limited (for me and for the requester).
Interesting, I had no idea that such marketing activities were being carried out by Amazon's mechanical turks. I'll tell my cat (who can advise on a number of topics):