Data Quality: Why 60% of Projects Fail - and How to Avoid It
Data is everywhere. But if it's not reliable, it might as well be nowhere. This guide explains the real definition of quality, the 8 dimensions to measure, and where to start - no jargon.
Oleg Chitic
· 12 min read
The number that hurts
$3,300. That's the ballpark figure that poor-quality data can cost an organization - per employee, per year - in wasted time, rework, and missed opportunities.
At 200 employees, this is no longer a detail. It's a silent leak.
And the damage doesn't stop there:
- 30% of employee time is spent searching for, verifying, or correcting erroneous data (Harvard Business Review)
- 68% of customers leave a company due to poor experiences linked to incorrect data (Experian Data Quality)
- 15 to 25% of revenue is impacted by poor-quality data in the most affected organizations (Gartner Research)
"Data quality isn't optional - it's a strategic necessity that determines our ability to innovate and stay competitive."
Yet, despite these alarming figures, most organizations don't treat quality at the root. They clean up. They don't govern.
Cleaning ≠ Quality: the big misunderstanding
When people say "data quality," here's what most teams picture:
- Fixing capitalization and accents
- Standardizing date formats
- Removing the most obvious duplicates
- Filling in empty fields
That's not quality. That's surface cleaning.
It's like repainting the façade of a building with cracked foundations. From the outside, everything looks clean. But the structure is fragile - and the next shock will bring it down.
🧹 Surface cleaning
- × Fixing formats after the fact
- × De-duplicating once a quarter
- × Manually filling empty fields
- × No prevention rules at the source
🏗️ True quality
- ✓ 8 dimensions measured continuously
- ✓ Validation rules at the point of entry
- ✓ Automated controls in the systems
- ✓ Named owners, tracked indicators
Analogy:
Imagine a library where someone comes in every month to put books back in alphabetical order - but with no classification system, no catalog, and no librarian. A month later, it's chaos again. Cleaning never replaces organization.
True quality is the strength of the foundations: data that is accurate, complete, consistent, unique, valid, current, well-connected, and plausible.
These are the 8 dimensions of the global DAMA-DMBOK standard.
The 8 dimensions of data quality (DAMA-DMBOK)
The DAMA-DMBOK (Data Management Body of Knowledge) is the global standard for data management. It defines 8 dimensions to measure quality. Each dimension answers a simple business question - and translates into technical rules that can be automated in your data processing pipelines (ETL).
Think of these dimensions as the 8 pillars of a building. If even one is missing, the entire structure is weakened.
Accuracy
"Does the data reflect reality?"
The data matches the real world and aligns with a trusted source.
Example:
The price in your enterprise resource planning system (ERP) matches the actual price → accurate billing, reliable margins.
Completeness
"Is any information missing?"
All required fields are filled. The record is complete for its intended use.
Example:
Customer record with shipping address and primary contact filled in → delivery guaranteed.
Consistency
"Does the data contradict itself?"
Data is uniform and non-contradictory across systems. Logical consistency is maintained (e.g., age vs. date of birth).
Example:
Order status is identical in sales and logistics → seamless process.
Uniqueness
"Are there duplicates?"
One single record per entity. Redundancy is controlled and intentional.
Example:
1 customer = 1 master record → 360° view, no duplicate emails or billing errors.
Validity
"Is the format correct?"
Data follows defined formats, types, and value ranges. Controlled by reference lists.
Example:
Order date in DD/MM/YYYY format → successful system import, no rejection.
Timeliness
"Is the data up to date?"
Data is available when expected and current enough for its intended use.
Example:
System inventory matches actual inventory → reliable delivery promises.
Integrity
"Are the relationships between data valid?"
Referential integrity is respected - foreign keys point to existing records. No orphans.
Example:
Every order is linked to an existing customer → no "ghost" orders without an owner.
Reasonableness
"Does the data make sense?"
Values are credible within the business context. Would domain experts validate this data?
Example:
An annual salary of $10,000,000 → automatic alert. Something is wrong.
Key takeaway:
These 8 dimensions are not independent. Data can be accurate but outdated. Complete but inconsistent. Unique but invalid. Each missing pillar weakens the whole - and directly impacts your results.
Data quality is about respecting all 8 pillars together. Not just the capitalization.
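Several of these dimensions translate directly into checks you can run today. Here is a minimal sketch, using toy records and hypothetical field names, of how completeness, uniqueness, validity, integrity, and reasonableness each become a small, automatable function:

```python
from datetime import datetime

# Toy customer and order records (hypothetical field names).
customers = [
    {"id": 1, "email": "a@example.com", "address": "12 Main St", "birth_date": "01/04/1990"},
    {"id": 2, "email": "b@example.com", "address": "",           "birth_date": "31/12/1985"},
    {"id": 3, "email": "a@example.com", "address": "9 Oak Ave",  "birth_date": "99/99/2020"},
]
orders = [
    {"id": 100, "customer_id": 1, "total": 59.90},
    {"id": 101, "customer_id": 7, "total": 12.50},  # orphan: customer 7 does not exist
]
salaries = [
    {"employee": "X", "annual_salary": 55_000},
    {"employee": "Y", "annual_salary": 10_000_000},  # implausible value
]

def completeness(records, field):
    """Completeness: share of records where the field is filled."""
    return sum(1 for r in records if r.get(field)) / len(records)

def duplicate_rate(records, field):
    """Uniqueness: share of records whose value appears more than once."""
    values = [r[field] for r in records]
    return sum(1 for v in values if values.count(v) > 1) / len(values)

def is_valid_date(value, fmt="%d/%m/%Y"):
    """Validity: does the value parse under the expected DD/MM/YYYY format?"""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

def orphan_orders(orders, customers):
    """Integrity: orders whose foreign key points to no existing customer."""
    known = {c["id"] for c in customers}
    return [o for o in orders if o["customer_id"] not in known]

def implausible(records, field, low, high):
    """Reasonableness: values outside a credible business range."""
    return [r for r in records if not (low <= r[field] <= high)]

print(f"Address completeness: {completeness(customers, 'address'):.0%}")   # 67%
print(f"Email duplicate rate: {duplicate_rate(customers, 'email'):.0%}")   # 67%
print(f"Invalid birth dates:  {sum(1 for c in customers if not is_valid_date(c['birth_date']))}")  # 1
print(f"Orphan orders:        {len(orphan_orders(orders, customers))}")    # 1
print(f"Implausible salaries: {len(implausible(salaries, 'annual_salary', 15_000, 500_000))}")  # 1
```

Accuracy, consistency, and timeliness need a trusted reference source or a second system to compare against, which is why they tend to come later in a quality program.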
The golden rule: 1-10-100
If you remember only one thing from this article, let it be this rule. It explains why prevention is always cheaper than correction:
Prevent - $1
Validate the data at the point of entry. A required field, a dropdown list, an enforced format. Simple, fast, and virtually free.
→ "The address is validated by Canada Post before saving."
Correct - $10
Detect and fix the error in the system after entry. You need to find it, understand it, correct it, and verify it hasn't propagated other errors.
→ "The address is invalid. The data team corrects it manually."
Suffer - $100
The error went undetected. It causes a failure: returned package, wrong invoice, false report sent to management, regulatory fine, lost customer.
→ "The package comes back. The customer calls, furious. They never return."
"Every accurate entry today saves time and money tomorrow."
The 1-10-100 rule is universal. It works for a customer address, a product price, an order status, or an employee's social insurance number. Prevention always costs 100× less than suffering the consequences.
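Prevention at the point of entry can be as small as one function that refuses to save a bad record. A minimal sketch, assuming hypothetical field names and a sample reference list of provinces:

```python
import re

PROVINCES = {"QC", "ON", "BC", "AB"}  # the "dropdown list": a closed reference list (sample values)
POSTAL_RE = re.compile(r"^[A-Z]\d[A-Z] ?\d[A-Z]\d$")  # Canadian postal code shape

def validate_address(record):
    """Block bad data at entry instead of correcting it downstream."""
    errors = []
    if not record.get("street"):
        errors.append("street is required")
    if record.get("province") not in PROVINCES:
        errors.append("province must come from the reference list")
    if not POSTAL_RE.match(record.get("postal_code", "")):
        errors.append("postal code format is invalid")
    return errors  # empty list = safe to save

print(validate_address({"street": "12 Main St", "province": "QC", "postal_code": "H2X 1Y4"}))  # []
print(validate_address({"street": "", "province": "ZZ", "postal_code": "123"}))  # 3 errors
```

Three checks, a few lines each: that is the "$1" side of the rule. Everything these checks block never has to be found, corrected, or suffered.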
CQSEV application:
In the CQSEV framework, the 1-10-100 rule maps directly to the 3 pillars. Govern = write the validation rule. Manage = verify the rule is applied every week. Transform = program the validation into the automated pipeline so the error is blocked before it enters the system.
4 myths that cost a fortune
After 15+ years in the field, I encounter the same misconceptions in almost every organization. Here are the 4 most dangerous myths - and the reality behind each one.
"Data quality is a technical problem."
It's an organizational and human issue.
78% of quality problems come from processes and human practices. Only 22% are technical in origin. The people who enter data, the processes that are missing, the rules that don't exist - those are the real causes.
"Automated tools fix everything."
Technology + People + Process = Success.
The real breakdown: 30% technology, 70% organizational. A tool without process and business expertise just moves the problem around. The tool is the last link, not the first.
"Our data is good by default."
Degradation is continuous and natural.
Data degrades by 2 to 3% per month naturally - people move, change jobs, products evolve. And 80% of executives overestimate their data quality. Blind trust is the worst enemy.
"Data quality is a one-time project."
It's a mandatory ongoing process.
Once cleaned, data doesn't stay clean. Natural degradation runs at 15 to 25% per year. It's like building maintenance: you don't clean once and call it done for 10 years. Maintenance is permanent.
Key takeaway:
Data quality is a corporate culture to be built collectively. Organizations that adopt it improve their operational efficiency by 35% on average.
The root causes
If 60% of projects fail because of quality, it's not an accident. The same root causes appear in nearly every organization I've worked with:
- Manual processes - Human entry without validation is the number one source of errors. No dropdown, no required field, no control at the source
- Silos between departments - Marketing has its version of the customer. Finance has its own. Logistics too. Three systems, three truths, zero consistency
- No clear rules - Nobody has defined what an "active customer," an "in-stock product," or a "valid address" means. Everyone interprets their own way
- Scattered local files - Excel files on everyone's workstation. No single source of truth. Impossible to know which version is correct
- Lack of user training - People don't know that how they enter data impacts everything downstream. Nobody explained the 1-10-100 rule to them
Notice a pattern: none of these causes are technological. They are all human and organizational. That's myth #1 in action.
"Data is everywhere… but if it's not well organized, it's as if it were nowhere."
CQSEV and quality: Q axis × 3 pillars
In the CQSEV framework, Quality is one of the 5 axes - and it's evaluated across 3 pillars simultaneously. That's what makes the difference between a theoretical audit and an actionable diagnostic.
Govern - Does the rule exist?
Has "quality" been defined for each domain? Are the 8 dimensions documented? Is an owner named? Are measurable indicators (KPIs) set?
Example: "Customer address completeness must exceed 97%. The Data Owner for the Customer domain is Jean Tremblay."
Manage - Is someone checking?
Does the day-to-day data guardian (Data Steward) measure quality every week? Are anomalies fixed quickly? Are results visible to the teams?
Example: "Every Monday, Marie checks the duplicate rate in the CRM. If it's above 3%, she launches a cleanup."
Transform - Do the systems enforce it?
Do the automated data processing pipelines (ETL) check the 8 dimensions? Is invalid data automatically rejected? Are alerts sent?
Example: "The pipeline validates every address through Canada Post. If invalid → automatic rejection + alert to the Data Steward."
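The Transform pillar can be sketched as a pipeline step that splits a batch into accepted and rejected records, alerting the steward on each rejection. This is a minimal illustration: `validate_with_canada_post` and `alert_data_steward` are hypothetical stand-ins for a real address-validation service and a real alerting channel:

```python
def validate_with_canada_post(address):
    """Stand-in for a real Canada Post validation call (hypothetical).

    Here we only check that the fields are present; a real integration
    would verify the address against the postal reference database.
    """
    return bool(address.get("street")) and bool(address.get("postal_code"))

def alert_data_steward(record, reason):
    """Stand-in for the alerting channel (email, chat, ticket...)."""
    print(f"ALERT: record {record['id']} rejected - {reason}")

def pipeline_step(records):
    """Load only valid records; reject and alert on the rest."""
    accepted, rejected = [], []
    for r in records:
        if validate_with_canada_post(r):
            accepted.append(r)
        else:
            rejected.append(r)
            alert_data_steward(r, "address failed validation")
    return accepted, rejected

batch = [
    {"id": 1, "street": "12 Main St", "postal_code": "H2X 1Y4"},
    {"id": 2, "street": "", "postal_code": ""},
]
accepted, rejected = pipeline_step(batch)
print(len(accepted), len(rejected))  # 1 1
```

The key design choice is that rejection is the default path, not an afterthought: invalid data never reaches the target system, and a human is notified every time.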
✅ The CQSEV quality test
If you can answer "yes" to all 3 questions above for each of the 8 dimensions, your quality is solid. The boxes where you answer "no" or "I don't know" are your immediate action priorities.
→ See the full CQSEV matrix (5 axes × 3 pillars)
Where to start?
Don't try to measure all 8 dimensions across all your data at once. Start small. Prove the value. Expand.
Action 1 - Pick ONE critical domain
Customer data or financial data are almost always the best starting point. They're the most visible, the most used, and the ones whose problems cost the most.
Action 2 - Measure 3 dimensions this week
Open your main database. Measure:
- Completeness - What percentage of customer records have a filled address? An email? A phone number?
- Uniqueness - How many duplicates are in the database? 1%, 5%, 15%?
- Timeliness - When was the last update? Are there records that haven't been touched in over a year?
These 3 numbers are your baseline. They'll let you demonstrate progress in 3 months.
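The three baseline numbers above can be produced with one small script. A sketch, assuming a list of customer records with hypothetical `email`, `address`, and `updated` fields (the "today" date is fixed so the example is reproducible):

```python
from datetime import datetime, timedelta

today = datetime(2026, 1, 15)  # fixed reference date for reproducibility
records = [
    {"email": "a@x.com", "address": "1 Rue A", "updated": datetime(2025, 12, 1)},
    {"email": "",        "address": "2 Rue B", "updated": datetime(2023, 6, 1)},
    {"email": "a@x.com", "address": "",        "updated": datetime(2025, 11, 3)},
    {"email": "c@x.com", "address": "4 Rue D", "updated": datetime(2025, 6, 1)},
]

def baseline(records, today, stale_after=timedelta(days=365)):
    """Compute the 3 starting measurements for a domain."""
    n = len(records)
    filled = sum(1 for r in records if r["email"] and r["address"])
    emails = [r["email"] for r in records if r["email"]]
    dupes = sum(1 for e in emails if emails.count(e) > 1)
    stale = sum(1 for r in records if today - r["updated"] > stale_after)
    return {
        "completeness": filled / n,    # all required fields filled
        "duplicate_rate": dupes / n,   # email appears more than once
        "stale_rate": stale / n,       # untouched for over a year
    }

print(baseline(records, today))
# {'completeness': 0.5, 'duplicate_rate': 0.5, 'stale_rate': 0.25}
```

Run it once now, save the output, and run it again in three months: the delta between the two is your progress report.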
Action 3 - Name an owner and a steward
For your chosen domain, name a Data Owner who decides the rules, and a Data Steward who checks every week. These aren't new positions to create - they're often people who are already doing this work without knowing it.
The secret:
Don't start by buying a tool. Start by measuring. If you don't know your duplicate rate today, no tool will magically solve the problem. The tool comes after the process. Always.
Further reading
If you want to go deeper, three complementary articles:
- What Is Data Governance? The Library Analogy - the fundamentals and the 4 essential registries
- Data Governance: Complete Guide 2026 - key roles, 7 mistakes to avoid, and the 5-step method
- Govern, Manage, Transform: Why All Three - how to connect rules, fieldwork, and automated systems
"Data quality isn't a one-time project. It's a corporate culture built one dimension at a time."
Assess your data quality in 30 minutes
The CQSEV diagnostic grid helps you identify your empty boxes across all 5 axes - including Quality. Free, concrete, no strings attached.
Download the CQSEV grid
FAQ - Data Quality
What is data quality exactly?
Data quality measures how reliable, complete, and usable your data is for decision-making. The global DAMA-DMBOK standard defines it through 8 dimensions: accuracy, completeness, consistency, uniqueness, validity, timeliness, integrity, and reasonableness. Quality data is data you can trust - from the front lines to the boardroom.
How much does poor data quality cost?
The ballpark is roughly $3,300 per employee per year in wasted time, rework, and missed opportunities. For a 200-person organization, that's about $660,000 per year. According to Gartner, 15 to 25% of revenue can be impacted in the worst cases. The 1-10-100 rule shows that preventing an error costs $1, correcting it $10, and suffering the consequences $100 or more.
What's the difference between data cleaning and data quality?
Cleaning is a one-time, surface-level action: fixing capitalization, removing duplicates, filling empty fields. Quality is a structural, ongoing approach: measuring 8 dimensions, defining validation rules, naming responsible owners, and automating controls in systems. Cleaning treats symptoms. Quality treats causes.
How do I measure data quality concretely?
Start with 3 simple measurements on your most critical domain (customers or finances): the completeness rate (percentage of required fields filled), the duplicate rate, and the date of last update. These 3 numbers are your baseline. Then gradually expand to all 8 DAMA-DMBOK dimensions. The CQSEV framework helps you structure this assessment across 3 pillars: rules, field verification, and automation.
What's the link between data quality and governance?
Quality is one of the pillars of data governance. Without governance - meaning without rules, owners, and processes - quality naturally degrades by 2 to 3% per month. In the CQSEV framework, Quality (Q) is one of 5 axes, evaluated across 3 pillars: Govern (define the rules), Manage (verify in the field), and Transform (automate in the systems). Governance provides the framework. Quality is its measurable result.
Where should I start to improve data quality?
Three concrete actions this week: (1) pick a critical domain - customers or finances; (2) measure completeness, uniqueness, and timeliness for that domain; (3) name a Data Owner who decides the rules and a Data Steward who checks every week. Don't start by buying a tool - start by measuring. The tool comes after the process. Always.