Business intelligence initiatives help organizations make sense of myriad data points, turning a mass of database fields into actionable information used to advance the business. However, not every BI strategy is as successful as it could be -- often because it doesn't start with high-quality data.
Inconsistent data can come in many forms. You may have, as is the case for some of our data at Westminster College, data strata. When you peel back the layers, you discover that it’s possible to identify key moments in the organization’s history based solely on how aspects of the information change over time. In analyzing our fundraising data, we went back 13 years and discovered about four different layers of data. The data in each layer was vastly different from the others, changing as vice presidents came and went.
For ongoing reporting and intelligence needs, this is a far-from-ideal situation that has the potential to damage decision making. Here are some solutions.
Create a translation matrix
Fortunately, even though there were four different sets of codes and data layers, all data within each layer was consistent. Therefore, a translation matrix could be created that would map data from each individual layer so that reporting and intelligence systems used a common, understood data set. Some BI systems that integrate with data warehouses can do this as a part of the configuration.
Of course, there are potential pitfalls with this method: You may run into inconsistencies within a layer. You can probably create additional translation matrix entries to handle them, or you can recode the inconsistent values within that layer.
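As a rough illustration of the idea, a translation matrix can be as simple as a lookup keyed on the data layer and the legacy code. The sketch below is not Westminster's actual implementation; the layer names, fund codes and record fields are all invented for the example. It also sets aside any record that has no mapping, which is one way to surface the within-layer inconsistencies mentioned above.

```python
# Hypothetical translation matrix: (data layer, legacy code) -> common code.
# Layer names and codes are invented for illustration only.
TRANSLATION_MATRIX = {
    ("layer_1", "AF"): "ANNUAL_FUND",
    ("layer_2", "ANNFD"): "ANNUAL_FUND",
    ("layer_3", "A-FUND"): "ANNUAL_FUND",
    ("layer_4", "ANNUAL"): "ANNUAL_FUND",
    ("layer_1", "CC"): "CAPITAL_CAMPAIGN",
    ("layer_2", "CAPCM"): "CAPITAL_CAMPAIGN",
}


def translate(layer, legacy_code):
    """Return the common code for a legacy code, or None if unmapped."""
    # Trim whitespace and ignore case so trivial variations still match.
    return TRANSLATION_MATRIX.get((layer, legacy_code.strip().upper()))


def normalize(records):
    """Split records into translated rows and rows with no matrix entry."""
    translated, unmapped = [], []
    for rec in records:
        common = translate(rec["layer"], rec["code"])
        if common is None:
            # Candidates for a new matrix entry or for manual recoding.
            unmapped.append(rec)
        else:
            translated.append({**rec, "common_code": common})
    return translated, unmapped


if __name__ == "__main__":
    gifts = [
        {"layer": "layer_1", "code": "af", "amount": 100},
        {"layer": "layer_2", "code": "XYZ", "amount": 5},
    ]
    ok, needs_review = normalize(gifts)
    print(len(ok), "translated;", len(needs_review), "need review")
```

Keeping the unmapped records in a separate list, rather than dropping them or guessing, means each one can be resolved deliberately with a new matrix entry or a recode.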
Historical data scrubbing
A potentially cleaner -- but much more difficult -- way to handle inconsistent data is to go back through the historical data and scrub it so that it’s consistent from start to finish. It’s a one-time effort that can streamline ongoing reporting and intelligence efforts, but it must be done with extreme care. As a part of this effort, it’s more than likely that you will need to create a translation matrix somewhere along the way.
At Westminster, we’ve combined these approaches. Due to serious data-quality issues, ongoing reporting difficulties and business process challenges, we moved to a new fundraising system with an end goal to recode and make consistent all of our historical fundraising information. Our BI strategy involved creating translation matrices and has normalized our fundraising data. We now enjoy much better integration opportunities with other areas of campus, leading to improved and more efficient business processes and reporting outcomes.
Lack of high-quality data
Many BI efforts have multiple purposes. You generally want to provide at-a-glance insight into the current environment, potentially baselining current operations against historical performance. You also want to build predictive models that can help to better gauge future efforts based on actual historical performance. Each goal requires both historical and current data. Current data is pretty easy to gather -- as long as it’s being captured somewhere. If it’s not, you'll need to identify a data location or create a field, develop the process by which that information will be captured, and then go out and capture it. Easy enough.
However, lack of historical data is a serious problem for your BI strategy.
There are really only two ways you can fix missing historical data: enter it after the fact, or wait until enough of it accumulates. For most statistical models, at least three consistent data points of sufficient quality -- for example, three years’ worth of information -- are required before any kind of reasonable inferences can be drawn. If the necessary information exists somewhere but has not been made a part of your operational database, do an audit -- go back and enter that information into the appropriate place.
At Westminster, we now have a BI tool that can help us predict student success -- retention -- based on any factors we choose. We’d like to consider participation in activities, as we have found there could be a positive correlation between certain activities and a student’s success. We have paper records for some student activities, but these activities have not been made a part of the students’ electronic records. In order to use this as a metric, we need to go back through three years’ worth of paper files, enter that information into the database, and then create the data points for the BI strategy tool.
The BI strategy waiting game
Alternatively, if it’s not possible to create the historical data, we can simply wait it out until we have enough data points to begin drawing conclusions based on the information. It's a low-tech solution that will greatly depend on what time frame is needed to gather a reliable sample data set.
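Whether you backfill or wait, it helps to know how close a given metric is to the threshold. The following is a minimal sketch that counts how many distinct years have a usable value for a metric and compares that to the three-data-point rule of thumb mentioned earlier. The record layout and field names (year, activities) are assumptions made for the example, not the structure of any particular BI tool.

```python
# Rule of thumb from the text: at least three consistent data points.
MIN_PERIODS = 3


def periods_available(records, metric):
    """Return the sorted distinct years that have a non-missing value."""
    return sorted({r["year"] for r in records if r.get(metric) is not None})


def ready_for_modeling(records, metric, min_periods=MIN_PERIODS):
    """True once enough distinct years of data exist for the metric."""
    return len(periods_available(records, metric)) >= min_periods


if __name__ == "__main__":
    # Hypothetical student records; None marks a year never captured.
    student_records = [
        {"year": 2021, "activities": 2},
        {"year": 2022, "activities": None},
        {"year": 2023, "activities": 1},
    ]
    years = periods_available(student_records, "activities")
    print("Years with data:", years)
    print("Ready:", ready_for_modeling(student_records, "activities"))
```

A count like this says nothing about data quality or consistency within each year; it only answers how much longer the wait is likely to be.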
While this is not an exhaustive list of everything that can go wrong without high-quality data in BI efforts, remember: Many data-quality issues can be rectified, whether through a lot of manual effort or by simply waiting until high-quality data is gathered to your standards.