Data Quality Management

From apppm
Revision as of 13:06, 18 February 2018

Abstract

Data quality management (DQM) serves the objective of continuously improving the quality of data relevant to an organisation, program or project[1]. The ultimate goal of DQM is not to improve data quality for its own sake, but to achieve the desired outcomes that rely on high-quality data[2]. DQM is the management of people, processes, technology and data through coordinated activities aimed at directing and controlling an organisation in terms of data quality[3].

Data quality has a significant impact on both the efficiency and effectiveness of programs and projects[4]. As part of the digital transformation, data has become more readily available and more important than ever before. Organisations perform data analytics to leverage key resources and optimise processes in order to gain a competitive advantage. As such, data is becoming increasingly valuable to program and project managers, who base their decisions on insights drawn from data. However, if the data quality is poor, managers risk making misguided decisions based on unreliable data. It is therefore imperative that a proper data quality management system is in place to ensure that decisions are based on high-quality data. This article explores the fundamentals behind DQM using references to industry best practices and ISO 8000-1 and ISO 9000-1 guidelines.

Overview

Data Quality

As per ISO 8000-2 guidelines, data is defined as "reinterpretable representation of information in a formalised manner suitable for communication, interpretation, or processing" while data quality is defined as the "degree to which a set of inherent characteristics of data fulfils requirements"[5]. Data quality is a multifaceted concept which considers various dimensions for measuring quality[6]. The dimensions considered in the literature include accuracy, completeness, consistency, integrity, representation, timeliness, uniqueness and validity. The ISO 8000-8 guidelines divide data quality into three semiotic categories: syntactic quality, semantic quality, and pragmatic quality[7]. Semiotic theory involves the usage of symbols such as letters and numbers to communicate information[8].
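To make the idea of measurable dimensions more concrete, the following minimal Python sketch scores a small data set on four of the dimensions listed above. It is illustrative only: the sample records, the field names, the e-mail pattern and the 90-day freshness threshold are all assumptions made for this example, and the ratios used are one simple way of expressing each dimension, not definitions taken from ISO 8000.

```python
import re
from datetime import date

# Hypothetical sample records; field names and values are illustrative only.
RECORDS = [
    {"id": 1, "email": "ana@example.com", "updated": date(2018, 2, 1)},
    {"id": 2, "email": None, "updated": date(2018, 1, 15)},
    {"id": 2, "email": "bo@example", "updated": date(2017, 6, 30)},
    {"id": 4, "email": "cy@example.com", "updated": date(2018, 2, 10)},
]

# Assumed format requirement for the e-mail field (deliberately simple).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(records, field):
    """Share of records in which the field has a non-empty value."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def uniqueness(records, field):
    """Share of distinct values among all values of the field."""
    values = [r[field] for r in records]
    return len(set(values)) / len(values)


def validity(records, field, pattern):
    """Share of non-empty values that match the expected format."""
    values = [r[field] for r in records if r.get(field)]
    valid = sum(1 for v in values if pattern.match(v))
    return valid / len(values)


def timeliness(records, field, as_of, max_age_days):
    """Share of records updated no more than max_age_days before as_of."""
    fresh = sum(1 for r in records if (as_of - r[field]).days <= max_age_days)
    return fresh / len(records)


if __name__ == "__main__":
    as_of = date(2018, 2, 18)  # assumed reference date for the freshness check
    print("completeness(email):", completeness(RECORDS, "email"))
    print("uniqueness(id):     ", uniqueness(RECORDS, "id"))
    print("validity(email):    ", validity(RECORDS, "email", EMAIL_PATTERN))
    print("timeliness(updated):", timeliness(RECORDS, "updated", as_of, 90))
```

Each function returns a ratio between 0 and 1, so a requirement in the sense of the ISO definition could be stated as a minimum acceptable score per dimension. Accuracy and consistency are left out because they require a trusted reference source or cross-field rules that this small sample does not have.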

Syntactic Quality

Semantic Quality

Pragmatic Quality

Define which level these 'dimensions' fall under: completeness, accuracy, validity, consistency, integrity, timeliness.

Framework: Data Quality Life Cycle

Insert Diagram!

Quality Management

ISO 9001, reasons and benefits of implementing a quality management system

Fundamental Principles of a Data Quality Management Program

Figure 3.A: Functions of a Data Governance Program, inspired by Knowledgent: Building a Successful DQM Program

Before investigating the principles that make up a DQM program, it is important to recognize that DQM often functions as one of many building blocks of a larger data governance program[9]. Figure 3.A highlights the functions that make up a data governance program: DQM, data architecture, metadata management, master data management, data distribution, data security, and information lifecycle management. DQM itself does not cover the other building blocks of data governance; however, there is often a strong interplay between the different functions. The ISO 8000 guidelines define the three pillars of a DQM program as People, Processes, and Improvement.

Three Pillars of DQM Program

People
Process
Improvement

ISO 8000 Framework for DQM

Structure and Components of the DQM Framework

Glossary

DQM: Data Quality Management
ISO: International Organization for Standardization

References

  1. Pg. 3, 2014 ed. Building a Successful Data Quality Management Program, Knowledgent
  2. Pg. 3, 2014 ed. Building a Successful Data Quality Management Program, Knowledgent
  3. 2017 ed. ISO 8000-2:2015 Data Quality - Part 2: Vocabulary, ISO
  4. Pg. 2, 2006 ed. Data Quality: Concepts, Methodologies and Techniques, Carlo Batini & Monica Scannapieco
  5. 2017 ed. ISO 8000-2:2015 Data Quality - Part 2: Vocabulary, ISO
  6. Pg. 6, 2006 ed. Data Quality: Concepts, Methodologies and Techniques, Carlo Batini & Monica Scannapieco
  7. 2015 ed. ISO 8000-8:2015 Data Quality - Part 8: Information and data quality: Concepts and measuring, ISO
  8. 1998. Understanding Data Quality in a Data Warehouse: A Semiotic Approach, Shanks, G and Darke, P.
  9. Pg. 3, 2014 ed. Building a Successful Data Quality Management Program, Knowledgent

Bibliography

Batini, C. and Scannapieco, M. (2006): Data Quality: Concepts, Methodologies and Techniques. Berlin: Springer. This book explores various concepts, methodologies and techniques involving data quality processes. It provides a solid introduction to the topic of data quality.

Knowledgent (2014): Building a Successful DQM Program. Knowledgent White Paper Series. This paper provides an introduction to DQM within enterprise information management, explaining the basic concepts behind DQM and also explaining the data quality cycle framework.
