Who grows up dreaming they'll be in charge of user-generated content (UGC) for a large retailer? Not Matt Fisher, with an undergraduate degree in marketing and an interest in law. But the program manager for product reviews and ratings at Nordstrom Inc. recalls "falling in love with data" while working for Microsoft and Amazon. A year ago, he joined Nordstrom's product management and development organization, and is currently in charge of user-generated content created on Nordstrom websites.
"Anywhere our customers can leave a comment, upload a photo, ask a question, or find an answer to an already asked question is tagged as user-generated content," Fisher said.
The role isn't new, exactly -- it's been around since 2009 -- but in the last year, it's gone from being part of Nordstrom's "voice of the customer" team to a separate entity, said Fisher, who is slated to give a presentation at the Text Analytics Summit West this fall.
In this SearchCIO Business POV interview, Senior News Writer Nicole Laskowski sat down with Fisher to discuss the technology he's using -- including Hadoop -- to parse through tens of thousands of customer reviews. One lesson learned from mining user-generated content? Don't take five-star customer product reviews at face value.
You're working with a lot of text or unstructured data. What's the biggest hurdle for you?
Matt Fisher: We have millions and millions of rows of unstructured data in the form of free-form text that's coming in from different areas: It could be customer reviews; it could be a question; it could be through our ForeSee surveys online -- we also use OpinionLab to aggregate content on the site -- or it could be through our comment cards. We're trying to tie all of this together to get a better understanding of what our customers are saying -- what are the themes and trends in the areas that are working great and in the areas that aren't working great.
One problem: We don't have an enterprise-wide solution to do this. So trying to create a consistent and pragmatic process across the board has been one barrier. It's a big org, and we have a lot of people sitting in different teams, so creating consistency in how things are measured has been a problem.
What's an example of an area that isn't working great?
Fisher: One specific project I'm working on is to better identify shipping issues and holes in our fulfillment process. We have fulfillment centers in the United States with one primary center in the Midwest. What I've been doing is looking at every piece of [customer] review content and every piece of question content that comes in through our platform. We're analyzing to identify whether it's [related to] shipping or fulfillment or a product. We have specific flags we apply.
Then we do sentiment analysis to determine if these customers are happy or upset. We found that, although some of our customers are unhappy, they're typically giving us high ratings on the product. But within that product review, they're also giving us feedback on our shipping.
So we're stripping the product review out and we're capturing the fulfillment data, and then we're able to share that with our partners on our fulfillment team in the Midwest and here in Seattle. We can say, 'There seems to be a trend with this zip code, and this zip code [is] getting ripped boxes.' Or with this division, which is at the higher end of the price spectrum, the customers are expecting a nicer box than what they're getting.
We're starting to uncover a lot of these trends that, without text analytics, we would never get because these are five-star reviews. There's nothing on the surface that would seem to call out that there's an issue or a problem. But as we started to break it apart, we identified that negative sentiment can be included within those reviews.
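The trend-spotting Fisher describes -- surfacing recurring issues by zip code or division from flagged fulfillment feedback -- can be sketched as a simple aggregation. The record fields and sample data below are hypothetical, purely for illustration of the grouping step, not Nordstrom's actual schema:

```python
from collections import Counter

# Hypothetical flagged feedback records: reviews with high star ratings
# whose text nonetheless contained a shipping complaint (illustrative data).
flagged_feedback = [
    {"zip": "98101", "rating": 5, "issue": "ripped box"},
    {"zip": "98101", "rating": 4, "issue": "ripped box"},
    {"zip": "60601", "rating": 5, "issue": "late delivery"},
    {"zip": "98101", "rating": 5, "issue": "ripped box"},
]

def issue_trends(records, min_count=2):
    """Count (zip, issue) pairs and keep only those that recur,
    i.e. the 'this zip code is getting ripped boxes' signal."""
    counts = Counter((r["zip"], r["issue"]) for r in records)
    return {pair: n for pair, n in counts.items() if n >= min_count}

print(issue_trends(flagged_feedback))
# {('98101', 'ripped box'): 3}
```

The key point the sketch captures is that every record here carries a four- or five-star rating -- the complaint only becomes visible once the shipping text is separated from the product score.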
How are you flagging content? Is that done manually?
Fisher: Initially, we did. We were going review by review, and it was myself and another colleague on my team. Now we've set up an automated process where basically we can ingest a large set of data into a Hadoop environment -- all unstructured data -- and we can programmatically run the queries we've built that will look for the sentiment in each cluster.
We utilize the Stanford Natural Language Processing [NLP] libraries. It's a free toolset for NLP and sentiment analysis. So let's say it's a single review with 10 sentences: We can break it apart and identify if there are negative components and where they are within those sentences, and then we can figure out if it's shipping or non-shipping related. Then we can pass it along to the next step where we'll take a manual pass at the 2% or 3% that actually come out of that.
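The per-sentence pipeline Fisher describes -- break a review apart, flag negative sentences, then decide whether they are shipping-related -- can be sketched as follows. The keyword lexicons below are toy stand-ins for the Stanford NLP sentiment model and for Nordstrom's shipping classifier; they are purely illustrative assumptions, not the actual implementation:

```python
import re

# Toy stand-ins (assumed for illustration) for the sentiment model and
# the shipping/non-shipping classifier mentioned in the interview.
NEGATIVE_WORDS = {"ripped", "broken", "late", "damaged", "wrong"}
SHIPPING_WORDS = {"box", "shipping", "delivery", "package", "arrived"}

def classify_review(text):
    """Split a review into sentences and flag each one as negative
    and/or shipping-related, mirroring the per-sentence approach."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    results = []
    for s in sentences:
        words = set(re.findall(r"[a-z]+", s.lower()))
        results.append({
            "sentence": s,
            "negative": bool(words & NEGATIVE_WORDS),
            "shipping": bool(words & SHIPPING_WORDS),
        })
    return results

review = "Love these shoes! The box arrived ripped and delivery was late."
flags = classify_review(review)
shipping_complaints = [r for r in flags if r["negative"] and r["shipping"]]
```

In production, as Fisher notes, only the small fraction of sentences flagged this way (2% to 3%) would go on to a manual review pass.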
How did you figure out that reviews contain useful information about shipping?
Fisher: My team and I started to look at [customer comment] moderation practices. We use a third-party vendor for the face of our review system, and they host moderation for us. So there's an algorithmic moderation set and there's a manual moderation set. We were trying to look at what instances were being flagged most frequently by the algorithm the third-party vendor was using. … And it turns out we saw a significant amount of moderated review content being flagged for shipping. We identified that in Q4 of last year there was a spike, and we knew that was due to weather. That was across the board for everybody.
But, beyond that time frame, we started to see an uptick in shipping complaints. Some were isolated to a specific area, being a state, or some [were more general] on the east coast or the west coast. We jumped in and asked, 'What could be driving this?' There is a fine line between what we approve and deny to post on our site based on our content guidelines. As we were going through and reviewing which ones met our guidelines and which ones didn't, we came across brands and vendors that had a lot of issues related to shipping. It was surprising. Some were done by our fulfillment center; some were drop shipped. And so I went out to our fulfillment center in the Midwest. While I was out there, I spoke to our Nordstrom VP of fulfillment and mentioned this to him. He thought it was a fantastic idea to dig in.
I partnered with his team to build a mini scorecard for them. In the past, they were going off of other metrics and QA [quality assurance] checks, but they didn't have direct one-to-one feedback from customers, which is essentially what a review is or a question is.
Were you able to resolve any of the shipping issues you were seeing?
Fisher: A specific drop ship vendor had been a repeat offender, in the sense that reviews over a long period of time -- being greater than 12 months -- were showing significant negative trends. Items were delivered in boxes that were ripped, or the item itself was ripped, or the sizing was wrong, or a left and right shoe weren't the same, and it was occurring commonly. It wasn't in one geographic area; it was across the board. Our fulfillment team, which manages those relationships, was able to dig in and work with that vendor to try and understand what was happening and how to resolve it.
The last update I did, I looked at that drop ship vendor, and the number of negative comments and negative instances around receiving broken items or shipped boxes with wrong materials had significantly decreased.
How do you work with IT?
Fisher: Anything that needs hard coding or development resources, we work closely with our partners in IT. And we use them multiple times a day. They're in a building two blocks from us. And we kind of have a giant shared backlog across the teams that all share the same resources, and we work together to prioritize.
Teams share the same IT resources?
Fisher: There are different functional teams. My work shares development resources with two other product managers. We work together with the same IT resources -- developers, architects, QA -- and we'll prioritize together. So we share resources between a few of us.