Companies are transforming from data-generating into data-driven organisations. More and more data is available, and the technology to convert it into useful insights is widely accessible. As data grows in importance, so does its quality. Data offers your organisation enormous opportunities, provided that the decisions you base on it rest on reliable information. But you have to work for your data!

This article describes how to assess the quality of data and how to structurally improve data quality.

Usable data

In his standard work ‘Data Driven’, Thomas C. Redman describes reliable and correct data as one of a company's most important resources. The data that an organisation has about its customers and products is unique and allows it to formulate appropriate policies and take decisions that improve the company's performance. However, that data must be correct and must be converted correctly into information. Knowledge can only be accrued and smart decisions can only be made on the basis of correct data (see the DIKW pyramid below).

DIKW pyramid

Data quality depends on demand
According to Redman, the quality of data depends entirely on the question the user is asking. A distance paced out on foot is accurate enough for playing a game, but not for mounting a kitchen cupboard.
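
Redman's point can be sketched in a few lines of Python. This is only an illustration: the class, the function name and the error and tolerance figures are assumptions chosen for the example and do not come from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    value_m: float      # measured distance in metres
    max_error_m: float  # assumed worst-case error of the measuring method


def is_fit_for_use(measurement: Measurement, tolerance_m: float) -> bool:
    """True when the method's worst-case error stays within the tolerance
    that the user's question allows."""
    return measurement.max_error_m <= tolerance_m


# A distance paced out on foot; the error of half a metre is an assumption.
paced = Measurement(value_m=10.0, max_error_m=0.5)

game_ok = is_fit_for_use(paced, tolerance_m=1.0)        # playing a game
cupboard_ok = is_fit_for_use(paced, tolerance_m=0.002)  # mounting a cupboard
```

The same measurement passes for one question and fails for the other, which is why quality can only be judged against a stated requirement.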

URDA and Data Quality

The Usability of data is not determined by the precision (accuracy) of a value alone. Three partial aspects play a role in whether data is suitable for answering your business question: Reliability, Durability and Availability.

The Reliability of a data item is the degree to which its value reflects reality. If a person's gender is stated as ‘Male’, it may be correct, but the value ‘Guy’ is questionable.

This brings us to the partial aspect of Durability. If an organisation has agreed what the possible values of Gender are (Male/Female/Unknown), and has recorded which data owner registered that information and through which collection process, we are on the right track: we can test the information and make enquiries.
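
Once the permitted values have been agreed, testing a registered value against them is straightforward. The following is a minimal sketch; the function name and the sample values are hypothetical, and only the Male/Female/Unknown domain comes from the text above.

```python
# Agreed domain values for Gender, as in the example above.
ALLOWED_GENDER_VALUES = {"Male", "Female", "Unknown"}


def find_invalid_values(values, allowed):
    """Return the distinct values that fall outside the agreed domain, sorted."""
    return sorted({value for value in values if value not in allowed})


# Hypothetical registrations; the check is deliberately case-sensitive.
registered = ["Male", "Guy", "Female", "Unknown", "male"]
invalid = find_invalid_values(registered, ALLOWED_GENDER_VALUES)
print(invalid)
```

Whether ‘male’ should be rejected or normalised to ‘Male’ is itself an agreement for the data owner to make; the sketch simply reports anything that does not match exactly.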

Finally, there is the partial aspect of Availability, which determines whether a piece of data can be retrieved by the user and whether they can and may use it.

If one of these partial aspects is below par, Usability suffers. Therefore, give all three partial aspects the attention they need.

URDA

Durable

Data is durable when the organisation has agreed on and recorded what it means: the definitions and permitted values, the data owner who registered it and the collection process through which it was obtained. These agreements make it possible to test the data and to make enquiries about its origin.

Reliable

Recording the characteristics of an object correctly in the systems often still requires a human. The agreements made under the heading Durable are very useful in validating the registered data. The ACCU test can be performed on the basis of these definitions and requirements. It checks data for the following aspects:

  • Actuality (topicality): has the data been updated properly and does it reflect the current situation?
  • Correctness: does the data meet the requirements, such as domain values, format and business rules?
  • Completeness: has all data been entered and are valid references to reference values present?
  • Uniqueness: is each item registered only once (within and across applications)?
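
The four checks above can be sketched as a small Python routine. This is an illustration under stated assumptions, not a description of any particular tool: the record layout, the field names, the one-year freshness threshold and the function name are all hypothetical, and completeness is reduced here to required fields being filled in.

```python
from datetime import date, timedelta

ALLOWED_GENDERS = {"Male", "Female", "Unknown"}   # agreed domain values
REQUIRED_FIELDS = ("customer_id", "name", "gender", "last_updated")
MAX_AGE = timedelta(days=365)                     # assumed freshness threshold


def accu_issues(records, today):
    """Return (customer_id, aspect, message) tuples for every failed check."""
    issues = []
    seen_ids = set()
    for rec in records:
        rec_id = rec.get("customer_id")

        # Completeness: every required field must be filled in.
        missing = [f for f in REQUIRED_FIELDS if rec.get(f) in (None, "")]
        if missing:
            issues.append((rec_id, "completeness", "missing: " + ", ".join(missing)))

        # Correctness: the value must belong to the agreed domain.
        gender = rec.get("gender")
        if gender not in (None, "") and gender not in ALLOWED_GENDERS:
            issues.append((rec_id, "correctness", f"gender '{gender}' not allowed"))

        # Actuality: the record must have been updated recently enough.
        updated = rec.get("last_updated")
        if updated is not None and today - updated > MAX_AGE:
            issues.append((rec_id, "actuality", f"last updated {updated}"))

        # Uniqueness: the same identifier may be registered only once.
        if rec_id is not None:
            if rec_id in seen_ids:
                issues.append((rec_id, "uniqueness", "duplicate customer_id"))
            seen_ids.add(rec_id)
    return issues


# Hypothetical sample data.
records = [
    {"customer_id": 1, "name": "Ann", "gender": "Female", "last_updated": date(2024, 5, 1)},
    {"customer_id": 2, "name": "Bob", "gender": "Guy", "last_updated": date(2024, 5, 1)},
    {"customer_id": 3, "name": "", "gender": "Male", "last_updated": date(2022, 1, 1)},
    {"customer_id": 1, "name": "Ann", "gender": "Female", "last_updated": date(2024, 5, 1)},
]
issues = accu_issues(records, today=date(2024, 6, 1))
for issue in issues:
    print(issue)
```

A real implementation would also validate formats, business rules and references to reference data, and would compare records across applications rather than within a single list.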

Various models describe the characteristics of data quality, and their level of detail varies widely. For example, the ISO/IEC 25012 standard distinguishes no fewer than 15 different characteristics (see the overview below).

15 characteristics of ISO/IEC 25012

Available

You may have your data in order, but if it is not accessible or cannot be used in time, the ultimate objective of improving business performance will not be achieved. The IT department supports users by making data available and offering them the right tools to convert data into information.

The user usually asks: ‘Do we have that information and, if so, where can I find it?’ A business data model and data stewards provide the answer. The user can then turn to a Data Warehouse or Data Lake to obtain the data and start working with it.

Data migration as an impetus for data quality

Data quality plays an important role in the data migrations that Data eXcellence performs. The implementation of a new system and the associated data migration is an excellent time to lay down the data policy and pay extra attention to improving data quality.
Data eXcellence advises on and supports the structural improvement of data quality. Higher data quality lets you generate more value from your data.
