How To Discover If Your BI Data Is Making Fake News

“Garbage in, garbage out.” Anyone who has worked on datasets knows the truth of this saying. Bad data has always been bad news for decision makers, often with millions of dollars and jobs at stake when the data is corrupt, not cleaned or not validated. How do you discover if your business intelligence (BI) data is making ‘fake news’?

Bad Data and Predictive Modeling

To properly train a predictive model, historical data must meet exceptionally high standards for both quality and breadth. First, the data must be right: it must be correct, properly labeled, deduplicated, and so forth. But you must also have the right data: lots of unbiased data, covering the entire range of inputs for which you are developing the predictive model.
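As a rough illustration of the “data must be right” checks above, the sketch below counts exact duplicates, missing labels and labels outside an allowed set. The records, the field names (id, region, label) and the allowed label set are all hypothetical, invented for this example; real checks depend on your own schema.

```python
from collections import Counter

# Hypothetical records; the field names and values are illustrative only.
records = [
    {"id": 1, "region": "Midwest", "label": "claim"},
    {"id": 2, "region": "Midwest", "label": "no_claim"},
    {"id": 2, "region": "Midwest", "label": "no_claim"},  # exact duplicate
    {"id": 3, "region": "Midwest", "label": None},        # missing label
    {"id": 4, "region": "Midwest", "label": "clam"},      # typo in label
]
ALLOWED_LABELS = {"claim", "no_claim"}  # assumed label set


def quality_report(rows, allowed_labels):
    """Count exact duplicate rows, missing labels and invalid labels."""
    # Turn each row into a hashable, order-independent key so that
    # identical rows can be counted.
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    duplicates = sum(n - 1 for n in counts.values())
    missing = sum(1 for row in rows if row["label"] is None)
    invalid = sum(
        1
        for row in rows
        if row["label"] is not None and row["label"] not in allowed_labels
    )
    return {
        "rows": len(rows),
        "duplicates": duplicates,
        "missing_labels": missing,
        "invalid_labels": invalid,
    }


report = quality_report(records, ALLOWED_LABELS)
print(report)
```

A report like this only flags problems it was told to look for. It will not catch a label that is valid but wrong, which is one reason cleaning never removes every error.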

Right Data is Correct Data

According to a Harvard Business Review article, most data fails to meet “data are right” standards. The reasons range from data creators not understanding the scope, to poorly calibrated measurement tools, overly complex processes, and human error.

Data scientists clean the data before training the predictive model, but this is time-consuming work that does not correct every error. Often, even after all this effort, the data still does not meet “the right data” standards.

Right Data Can Still Be Biased

“Even perfectly accurate data could be problematically biased,” says Anand Rao, partner and global AI leader at PricewaterhouseCoopers. “If an insurance company based in the Midwest used its historical data to train its AI systems, then expanded to Florida, the system would not be useful for predicting the risk of hurricanes,” explains Rao.
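One simple way to spot the kind of gap Rao describes is to compare what the model saw in training with what it is now being asked to score. The sketch below is a minimal coverage check, not a bias measurement. The field names (state, wind_mph) and the sample rows are assumptions made up for this example.

```python
# Hypothetical training data from a Midwest-only book of business.
training_rows = [
    {"state": "IA", "wind_mph": 20},
    {"state": "OH", "wind_mph": 45},
]

# Hypothetical new data after expanding to Florida.
new_rows = [
    {"state": "FL", "wind_mph": 120},
    {"state": "OH", "wind_mph": 30},
]


def unseen_categories(train, new, field):
    """Return category values in the new data that never appear in training."""
    seen = {row[field] for row in train}
    return sorted({row[field] for row in new} - seen)


def out_of_range(train, new, field):
    """Return new rows whose numeric value falls outside the training range."""
    low = min(row[field] for row in train)
    high = max(row[field] for row in train)
    return [row for row in new if not low <= row[field] <= high]


new_states = unseen_categories(training_rows, new_rows, "state")
extreme_rows = out_of_range(training_rows, new_rows, "wind_mph")
print(new_states)
print(extreme_rows)
```

Any value flagged here is one where the model is extrapolating beyond its historical data, so its predictions deserve extra scrutiny.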

Triangulate Your Data Sources

Reports of data being manipulated for financial, political or other gain are common these days. This is why skilled data scientists and data engineers know they need to go beyond mathematics and patterns and understand their individual data sources. If a company has data coming in from multiple sources, it’s important to check the data from one source against another before applying any machine learning.
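Checking one source against another can start as a simple reconciliation. The sketch below assumes two hypothetical sources (a CRM export and a finance ledger) that both report monthly revenue keyed by month. The source names, figures and tolerance are invented for illustration.

```python
# Hypothetical monthly revenue as reported by two independent sources.
crm_revenue = {"2024-01": 1000.0, "2024-02": 1200.0, "2024-03": 900.0}
ledger_revenue = {"2024-01": 1000.0, "2024-02": 1250.0, "2024-04": 700.0}


def reconcile(source_a, source_b, tolerance=0.01):
    """Compare two keyed sources and report where they disagree.

    Returns keys found only in source_a, keys found only in source_b,
    and shared keys whose values differ by more than the tolerance.
    """
    only_a = sorted(source_a.keys() - source_b.keys())
    only_b = sorted(source_b.keys() - source_a.keys())
    mismatched = sorted(
        key
        for key in source_a.keys() & source_b.keys()
        if abs(source_a[key] - source_b[key]) > tolerance
    )
    return only_a, only_b, mismatched


only_crm, only_ledger, mismatched = reconcile(crm_revenue, ledger_revenue)
print("Only in CRM:", only_crm)
print("Only in ledger:", only_ledger)
print("Values disagree:", mismatched)
```

A disagreement does not say which source is wrong. It tells you where to look before the data reaches a model.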

Is Your Company Ready to Embark on Business Intelligence?

