Training data for algorithms must be right, not just plentiful

Machine learning algorithms require training data, and a lot of it, to get the models working correctly. But more training data alone doesn’t necessarily make for smarter algorithms, according to Tolga Kurtoglu, CEO at PARC, a research and development company spun out of Xerox in 2002.

Companies that want to tune their models correctly need to gather the right training data, Kurtoglu said. He illustrated the point with a vivid example at the recent AI World event in Boston.

When PARC engineers looked at how to get more performance out of lithium ion batteries for electric cars, they realized the data they needed didn’t exist. “The current battery cells are manufactured in a way that have sensors outside of the cell structure,” Kurtoglu explained. The sensor data included temperature and voltage, and it was used as a proxy for what was happening inside the cell structure.

But proxy data wasn’t good enough. PARC engineers wanted internal monitoring capabilities to measure the battery’s health, gathering data points on the cell’s pH balance and its chemical decomposition, for example. So they built a fiber optic sensor and fitted it into the battery cell to provide a window into the unseen space.

The data led to critical insights for improving a lithium ion battery’s life, enabling engineers to “build analytics capabilities in a very different way” by combining traditional data sources with the new data generated by the fiber optic sensor, Kurtoglu said.