Simplified

What is data quality?

Data quality is the extent to which data meets stated requirements and expectations and is suitable for use

10/01/2023

Data quality is the degree to which data meets set requirements and expectations and is suitable for use. High data quality is essential for using data effectively within an organization: low-quality data is unreliable, can lead to erroneous or inaccurate insights, and damages trust in the data analytics environment.

Data quality can be affected by factors such as erroneous or incomplete data, duplicates, inconsistency and unexpected changes in the data.

If you want to increase your data quality, you will need to address the following areas:

Complete data
This means that the data contains no missing values, for example, no empty mandatory fields such as a date of birth.
Correct data
This means that the data meets expectations and specifications, such as the correct format and the expected values. A date of birth, for example, should always lie in the past and be parseable in the format dd-mm-yyyy.
Consistent data
This means that the data is stored consistently across systems. For example, the customer or citizen number, name and address should be the same across all connected systems.
Unique data
This means that each record appears only once and the data contains no unintended duplicates, such as duplicate customers or orders.
Current data
This means that the data is updated at the expected interval. An outdated data set on which monthly billing is based can have a serious financial impact.
Accurate data
This means that the data reflects reality as closely as possible, with outdated data and misspelled names kept to a minimum.
Authentic data
This means that the data comes from a reliable source and has not been falsified. This is an increasingly topical concern now that AI applications generate more and more unvalidated information.
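
Several of the areas above can be expressed as simple automated checks. The following is a minimal sketch in Python, not a definitive implementation: the sample records, the field names (`id`, `name`, `date_of_birth`, `updated`, `address`), the 31-day freshness threshold and all function names are assumptions made purely for illustration. Accuracy and authenticity are left out because they usually require comparison against an external reference or source.

```python
from datetime import date, datetime, timedelta

# Hypothetical sample records; all field names and values are illustrative.
customers = [
    {"id": 1, "name": "Ann", "date_of_birth": "14-03-1985", "updated": date(2023, 9, 28)},
    {"id": 2, "name": "Bob", "date_of_birth": None, "updated": date(2023, 9, 30)},
    {"id": 2, "name": "Bob", "date_of_birth": "01-01-2090", "updated": date(2023, 6, 1)},
]


def is_complete(record, mandatory=("id", "name", "date_of_birth")):
    """Complete: every mandatory field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in mandatory)


def has_valid_birth_date(record, today):
    """Correct: date of birth parses as dd-mm-yyyy and lies in the past."""
    value = record.get("date_of_birth")
    if not value:
        return False
    try:
        parsed = datetime.strptime(value, "%d-%m-%Y").date()
    except ValueError:
        return False
    return parsed < today


def duplicate_ids(records):
    """Unique: return the ids that occur more than once."""
    seen, duplicates = set(), set()
    for record in records:
        if record["id"] in seen:
            duplicates.add(record["id"])
        seen.add(record["id"])
    return duplicates


def is_current(record, today, max_age_days=31):
    """Current: the record was updated within the expected interval."""
    return today - record["updated"] <= timedelta(days=max_age_days)


def inconsistent_ids(system_a, system_b, field):
    """Consistent: ids known to both systems whose `field` values differ."""
    shared = system_a.keys() & system_b.keys()
    return {key for key in shared if system_a[key][field] != system_b[key][field]}


# A fixed reference date keeps the example reproducible.
today = date(2023, 10, 1)

for record in customers:
    print(
        record["id"],
        "complete:", is_complete(record),
        "valid birth date:", has_valid_birth_date(record, today),
        "current:", is_current(record, today),
    )
print("duplicate ids:", duplicate_ids(customers))

# Two hypothetical systems that should hold the same address per customer.
crm = {1: {"address": "Main St 1"}, 2: {"address": "Oak Ave 5"}}
billing = {1: {"address": "Main St 1"}, 2: {"address": "Oak Avenue 5"}}
print("inconsistent ids:", inconsistent_ids(crm, billing, "address"))
```

In practice, checks like these would typically run on a schedule against the real data sets, with the thresholds and mandatory fields agreed with the people who own and use the data.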
