Do you know why you’re hiring data scientists? Many companies feel pressure to hire one, but plenty of them aren’t ready for data scientists or don’t know what to do with them, said Stephen Gatchell, head of data governance at Bose Corporation, during a closing keynote panel at the recent Global Artificial Intelligence Conference in Boston.
Before hiring a data scientist, Gatchell suggested, IT departments first need to determine the business case for hiring one and then create a solid data foundation from which to build. That foundation doesn’t necessarily need the highest-quality data, Gatchell argued, just enough of it to get data scientists started and to train the AI models. Unfortunately, getting enough data for AI can also be a challenge. Gatchell suggested turning to your competition to build your data stores.
Editor’s note: The following transcript has been edited for clarity and length.
What are some of the key challenges around managing data for AI and machine learning?
Gatchell: The first challenge stems from hiring data scientists before a company is even ready to hire a data scientist — just because the industry says it should be hiring them. [You need a] concept of the business use cases and what problems you’re trying to solve. It seems like a lot of companies are hiring data scientists but they don’t understand why they’re hiring them. Then, once they do hire them, there isn’t enough data to train those data scientists and for them to have the proper material to get the company’s expected results.
In terms of the [challenge of having] quality of data, I think data quality is overrated. You can never wait until you have enough good quality data if you want to do machine learning and AI because you’re going to wait forever for it.
I also don’t think there are enough industry groups that come together — even though they may be in competition — to consolidate some of their information. I think there’s plenty of data out there, it’s just a matter of collecting the data together and utilizing it effectively. If you’re in the consumer electronics industry, for example, and you go to your competitors and you all have the same data, your differentiation is around the data science itself. So, I think it takes a little more maturing of the industry to get together and realize that. Competitors join forces all the time to solve problems — especially in the healthcare industry.
The point is to make sure you understand what the business use case is and don’t just go hiring data scientists off the street because you think that’s the right thing to do. Secondly, we should figure out how to mature the industries enough so that we can get the information and the data sets we need [by working with others in our industry] versus just hiring marketing companies and pulling in data that they’re selling. Groups can get together and have enough data to do the training, models, et cetera.