The word wrangling has two meanings: (1) engaging in complicated disputes and (2) herding livestock. Both meanings apply to the data any organization deals with. First, disputes over the validity and integrity of data must be resolved. Second, the data has to be herded to a single source so it can be curated, accessed, and sent to where it can be used to make the best business decisions.
So, wrangling data requires a Data Governance structure. As part of a Data Governance structure, you must deploy consistent processes and business rules for integrating and managing data across all sources and platforms.
There are many methodologies for consolidating multiple data sources: data warehousing, data marts, data lakes, operational data stores, or, more recently, data virtualization and integration capabilities within the analytics software. All require a detailed understanding of the lineage of the data sources and the validity (business rules) of the data they contain.
Those multiple data sources help you make informed business decisions consistently across the organization. In this blog, we’ll investigate the following:
what is meant by a single source of truth (SSOT)
why it is essential to have the SSOT
how to make the SSOT work for you by:
creating a data warehouse or some kind of central information store
creating the SSOT through a BI platform
incorporating data virtualization through integration (rather than storage)
What is Meant by a Single Source of Truth (SSOT)?
Generally speaking, a single source of truth is a data structure that provides a single location where anyone can access information and obtain a consistent and correct answer. It sounds obvious, but try this: Google the phrase “single source of the truth.” In less than 0.7 seconds, you’ll get nearly 650 million hits.
That result illustrates the problem: while a Google search might give you a consistent definition and some examples, it gets you nowhere near a single source. Moreover, some of those sources might be questionable or biased, which raises the question: who gets to say what the truth is?
For the purposes of this discussion, we’ll parse the term single source of truth into two parts: (1) single source and (2) truth.
First, Let’s Talk About Truth
We touched on Data Governance in the context of knowing the lineage and validity of the data your business uses. Your data needs to be accurate. Likewise, business rules that govern and generate the data for the best decisions must be well-defined and consistent before deployment.
For example, in our blog on creating a data-driven culture within your organization, we emphasized the importance of clearly defining calculations and metrics across the organization at the beginning.
It is not uncommon, for instance, for different departments within a business to calculate gross margins differently. Is gross margin strictly your net sales minus the cost of goods sold? Or do you include some “fudge factor” through creative accounting that affects your profit and loss reports? Your auditor might disagree, and data wrangling might ensue.
Those and other inconsistencies get in the way of uncovering meaningful insights into your business's actual health. They hamstring business leaders in their responsibility to make timely decisions. Transparency suffers. The data becomes distorted. The data-driven culture wanes.
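The gross margin disagreement described above can be made concrete with a small Python sketch. The figures, the function names, and the “adjustment” used by the second department are all hypothetical; the point is only that two reasonable-looking definitions produce two different answers from the same inputs.

```python
from decimal import Decimal


def gross_margin_standard(net_sales: Decimal, cogs: Decimal) -> Decimal:
    """Gross margin as net sales minus cost of goods sold."""
    return net_sales - cogs


def gross_margin_adjusted(net_sales: Decimal, cogs: Decimal,
                          adjustment: Decimal) -> Decimal:
    """A hypothetical departmental variant that folds in an extra adjustment."""
    return net_sales - cogs + adjustment


# Made-up inputs shared by both departments.
net_sales = Decimal("1000000")
cogs = Decimal("600000")

finance_margin = gross_margin_standard(net_sales, cogs)
sales_margin = gross_margin_adjusted(net_sales, cogs, Decimal("50000"))

# The same business, the same period, two different "truths".
discrepancy = sales_margin - finance_margin
print(finance_margin, sales_margin, discrepancy)
```

Until the organization agrees on one of these definitions, any report built on either number can be disputed.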
Why It Is Important to Establish a Single Source of Truth
Once the “truth” is established, go for the gusto: a single source. The rule of thumb is that no matter what department you’re in, everyone in the business should access the same system for information. That same system is the key to making cohesive decisions with the rest of the organization.
The sure indicators that your organization requires a single source of truth are:
It is challenging to find a starting point when looking for information.
When someone wants information, it’s unclear whether it is accurate and trustworthy.
The lack of a single source of information interrupts workflow as information seekers must rely on colleagues to answer questions.
Finally, that single source is the basis for meaningful insights into the company's direction. It rests on the principle that data-driven decisions must always draw on the same source.
How to Establish the Single Source of the Truth
The first step is to establish a centralized analytics delivery organization. We talked about that in the blog linked above. That organization will most likely be your IT department, which delivers analytics and sets data standards.
Once that organization is in place, there are three main methods for creating an SSOT. Those methods vary depending on the organization, the data sources, and the data itself.
In each case, the goal is to reshape the data and apply business rules to it by:
Creating a Data Warehouse: The data warehouse integrates disparate data from multiple sources and stores the new integrated data in one place. (Watch this space for our upcoming blog on improving your data warehouse systems to make better business decisions.)
Creating an SSOT Through a BI Platform: A BI platform pulls data from disparate sources into one location. Sometimes the BI platform stores the data; sometimes it shuttles the data in real time to where it is needed.
Employing Data Virtualization: This method integrates the data but does not store it. Data virtualization shapes the data in real time. So instead of a “bucket of data,” you have a “bucket of rules.”
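The difference between a “bucket of data” and a “bucket of rules” can be sketched in a few lines of Python. This is a toy illustration only, not how any particular warehouse or virtualization product works: the two source systems, their field names, and the functions below are all assumptions made up for the example.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Two hypothetical source systems that describe orders in different shapes.
crm_orders: List[Row] = [{"order_id": 1, "amount_usd": 120.0}]
erp_orders: List[Row] = [{"id": 2, "total_cents": 4550}]


def from_crm(row: Row) -> Row:
    """Rule that maps a CRM row to the shared order shape."""
    return {"order_id": row["order_id"], "amount": float(row["amount_usd"])}


def from_erp(row: Row) -> Row:
    """Rule that maps an ERP row to the shared order shape (cents to dollars)."""
    return {"order_id": row["id"], "amount": row["total_cents"] / 100}


def load_warehouse() -> List[Row]:
    """Warehouse style: apply the rules once and store the integrated copy."""
    return [from_crm(r) for r in crm_orders] + [from_erp(r) for r in erp_orders]


def virtual_orders() -> List[Row]:
    """Virtualization style: keep only the rules and apply them at query time."""
    rules = [(crm_orders, from_crm), (erp_orders, from_erp)]
    return [rule(r) for source, rule in rules for r in source]


warehouse = load_warehouse()                        # stored "bucket of data"
erp_orders.append({"id": 3, "total_cents": 1000})   # a source changes later

stored_count = len(warehouse)        # unchanged until the next load
live_count = len(virtual_orders())   # reflects the new source row immediately
print(stored_count, live_count)
```

In this sketch the stored copy stays as it was until the next load, while the virtual view picks up the new row on the next query. Both approaches apply the same mapping rules; they differ in when the rules run and whether the result is kept.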
Each of these three methods raises three critical questions:
Where is the business rule defined? For example, if you have a data warehouse, you define what “gross margin” is in a single place, the warehouse, so it is calculated consistently.
Where is the data governance applied? An example of effective data governance is a business that wants to share its financial information internally but needs to withhold salary information.
Where is the data integrated? Data flows in from various sources, including relational databases. In a data warehouse, the integrated data is stored within the warehouse's database tables. In a BI platform, the data is integrated inside the platform, which may store it or pass it along in real time. With data virtualization, the data is integrated through virtual tables rather than stored copies.
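The first two questions above can be illustrated with a short Python sketch: one function holds the only definition of gross margin, and one policy decides which columns are shared. The column names, the `RESTRICTED_COLUMNS` policy, and the `governed_view` helper are hypothetical names invented for this example, not features of any specific product.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical governance policy: columns that most internal users may not see.
RESTRICTED_COLUMNS = {"salary"}


def gross_margin(row: Row) -> float:
    """The single definition of the rule: net sales minus cost of goods sold."""
    return row["net_sales"] - row["cogs"]


def governed_view(rows: List[Row], can_see_restricted: bool = False) -> List[Row]:
    """Return rows with the shared metric added and restricted columns withheld."""
    result = []
    for row in rows:
        visible = {
            key: value
            for key, value in row.items()
            if can_see_restricted or key not in RESTRICTED_COLUMNS
        }
        # Every consumer gets the metric from the same function.
        visible["gross_margin"] = gross_margin(row)
        result.append(visible)
    return result


# Made-up record mixing financial and salary information.
records = [
    {"department": "Retail", "net_sales": 500.0, "cogs": 300.0, "salary": 80.0},
]

shared = governed_view(records)
print(shared)
```

Because both the rule and the policy live in one place, every department that reads from this view sees the same gross margin, and salary data is withheld by default rather than by each report author remembering to remove it.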
The Bottom Line
A successfully implemented SSOT gives a business accurate and consistent information. As a result, it lets you manage your company better and, in turn, helps you build a data-driven culture.