Quality of data for AI systems
ISO/IEC 42001 Annex A Control A.7.4 requires you to define and implement data quality requirements for each AI system, then prove those requirements are enforced across sourcing, preparation, training, evaluation, and operations. Operationalize it by setting measurable quality dimensions (accuracy, completeness, representativeness, timeliness, relevance), adding gates to your data pipeline, and keeping auditable evidence. 1
Key takeaways:
- Write AI-system-specific data quality requirements (not a generic “good data” policy) and map them to the system’s intended purpose and risk.
- Implement pipeline controls (validation checks, approval gates, monitoring) that prevent low-quality data from entering training or production.
- Retain artifacts that show requirements, test results, exceptions, and remediation across the full data lifecycle. 1
“Quality of data for AI systems” becomes real only when you translate it into enforceable rules that engineers and data owners must follow. ISO/IEC 42001 Annex A Control A.7.4 sets a clear expectation: define data quality requirements and implement them for AI systems. 1 For a Compliance Officer, CCO, or GRC lead, the fastest path is to treat “data quality” as a set of testable acceptance criteria tied to the AI system’s intended use, plus operational gates that block or quarantine data that fails.
This requirement matters because data quality drives model performance, bias, drift, explainability, and incident rates. If you cannot show how you ensure accuracy, completeness, representativeness, timeliness, and relevance of the data, you will struggle to defend outcomes, investigate issues, or satisfy customers and auditors. 1 The goal is not perfection; it is defined standards, consistent execution, documented exceptions, and evidence that your controls run.
Regulatory text
Requirement (verbatim): “The organization shall define and implement data quality requirements for AI systems.” 1
Operator interpretation: You need (1) written data quality requirements for each AI system (or class of systems), and (2) implemented controls that enforce those requirements in practice. “Implemented” means the checks exist in the workflow and produce records: you can show what was tested, when, by whom/what, results, exceptions, and fixes. 1
Plain-English interpretation (what this control really asks for)
For every AI system you build or use, define what “acceptable data” means and ensure your pipelines reject, quarantine, or explicitly approve deviations. Your quality definition must cover, at minimum:
- Accuracy: values are correct versus a trusted reference or sampling validation.
- Completeness: required fields and records exist; missingness is understood and bounded.
- Representativeness: data reflects the population and conditions where the model will be used.
- Timeliness: data is current enough for the intended decision; stale data is controlled.
- Relevance: features and records relate to the intended purpose; no “convenient but unrelated” inputs. 1
Who it applies to
Entity scope: This applies to organizations that provide AI systems, use AI systems, or otherwise deploy AI in operations. 1
Operational scope (where you feel it):
- Data ingestion from internal systems (ERP/CRM/HRIS), logs, IoT, or user-generated content
- Data purchased or received from third parties (data brokers, partners, consultants)
- Labeling/annotation operations (internal or outsourced)
- Feature engineering, training, evaluation, and post-deployment monitoring
- RAG/LLM applications that pull from enterprise knowledge bases (your “data” is also documents) 1
If you are only an “AI user” buying a model or SaaS product, you still own the quality of the data you supply (prompts, retrieved documents, customer records) and the quality expectations you set contractually and operationally for the provider’s inputs/outputs.
What you actually need to do (step-by-step)
Step 1: Define the AI system boundary and data inventory
Create a scoped inventory for the AI system:
- Data sources (systems, tables, document repositories, streams)
- Data types (structured records, images, text, audio)
- Data flows (collection → storage → processing → training/eval → production)
- Owners (data owner, model owner, system owner)
- Third parties involved (data suppliers, labelers, cloud platforms)
Output: an “AI System Data Map” that an auditor can trace end-to-end.
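One way to keep the data map traceable is to store each source as a structured record rather than a paragraph. A minimal sketch, assuming a simple internal schema (all field names and example values are illustrative, not prescribed by ISO/IEC 42001):

```python
from dataclasses import dataclass, field

@dataclass
class DataMapEntry:
    """One row of the AI System Data Map: a single data source, traceable end-to-end."""
    source: str                 # system/table/repository, e.g. "crm.accounts"
    data_type: str              # structured, text, image, audio
    flow: list                  # ordered stages the data passes through
    data_owner: str             # accountable person for quality thresholds
    third_parties: list = field(default_factory=list)

# Hypothetical entry for a structured CRM source used in training
entry = DataMapEntry(
    source="crm.accounts",
    data_type="structured",
    flow=["collection", "storage", "processing", "training"],
    data_owner="jane.doe",
    third_parties=["labeling-vendor-x"],
)
```

A record like this is easy to export as the audit artifact and diff when sources change.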
Step 2: Write measurable data quality requirements per dataset (or dataset class)
For each dataset used in training, evaluation, and production inputs, define requirements across the five dimensions:
- Accuracy requirements: how you validate correctness (sampling plan, cross-check against authoritative systems, label QA).
- Completeness requirements: mandatory fields, allowable missingness patterns, handling rules.
- Representativeness requirements: inclusion/exclusion rules, coverage across key segments relevant to intended use.
- Timeliness requirements: refresh expectations, max data age, latency constraints, backfill approach.
- Relevance requirements: feature inclusion rationale, prohibited proxies, and “allowed use” statement tied to the AI system’s purpose. 1
Practical tip: Put these requirements into a one-page “Data Quality Spec” per dataset. Avoid purely narrative language. Use testable criteria and link each criterion to a control check.
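To make “testable criteria” concrete, the five dimensions can be encoded as machine-checkable thresholds. This is a minimal sketch of a Data Quality Spec; the check names and threshold values are illustrative assumptions, not values taken from the standard:

```python
# Hypothetical Data Quality Spec: one testable criterion per dimension.
# Thresholds are illustrative only; set yours from the system's intended use and risk.
SPEC = {
    "accuracy":           {"check": "sample_vs_source_match_rate", "threshold": 0.98},
    "completeness":       {"check": "required_field_fill_rate",    "threshold": 0.99},
    "representativeness": {"check": "min_share_per_segment",       "threshold": 0.05},
    "timeliness":         {"check": "max_record_age_days",         "threshold": 30},
    "relevance":          {"check": "approved_feature_ratio",      "threshold": 1.0},
}

def evaluate(measurements: dict) -> dict:
    """Compare measured values against the spec; return pass/fail per dimension."""
    results = {}
    for dim, rule in SPEC.items():
        value = measurements[dim]
        if rule["check"].startswith("max_"):
            results[dim] = value <= rule["threshold"]   # lower is better for age-style checks
        else:
            results[dim] = value >= rule["threshold"]   # higher is better for rate-style checks
    return results
```

Each criterion can then be linked to the pipeline control that measures it, which is exactly the requirement-to-check traceability an auditor will ask for.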
Step 3: Implement enforcement points (quality gates) in the pipeline
You need controls that stop bad data from silently flowing downstream. Common enforcement points:
- Ingestion gate: schema checks, format checks, required field presence, basic anomaly detection.
- Pre-training gate: label quality checks, duplication checks, outlier checks, PII/secret scanning where relevant to your environment.
- Pre-release gate: dataset version approval, sign-off that requirements were met, exception log reviewed.
- Production monitoring: detect drift, missingness changes, staleness, and document retrieval quality issues for RAG systems. 1
If you cannot block data automatically, you can still comply by implementing manual approvals, but you must show consistent execution and documented decisions.
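An ingestion gate of the kind described above can be very small and still produce the audit record. A sketch under simple assumptions (flat dict records, a single missingness threshold; both are illustrative):

```python
def ingestion_gate(records, required_fields, max_missing_rate=0.01):
    """Minimal ingestion gate: quarantine the batch if required-field missingness
    exceeds the allowed rate; otherwise pass it downstream. Returns the decision
    plus an evidence record for the audit trail."""
    total = len(records)
    missing = sum(
        1 for r in records
        if any(r.get(f) in (None, "") for f in required_fields)
    )
    rate = missing / total if total else 1.0   # empty batch fails closed
    decision = "pass" if rate <= max_missing_rate else "quarantine"
    # Every run emits this record: it is the "implemented" evidence A.7.4 expects.
    evidence = {"checked": total, "missing": missing, "missing_rate": rate, "decision": decision}
    return decision, evidence
```

The key design choice is failing closed: an empty or mostly-missing batch is quarantined by default, and any override goes through the exception process.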
Step 4: Define roles, escalation, and exception handling
Data quality controls fail most often at the handoff. Make it explicit:
- Data owner sets acceptable thresholds/criteria and approves exceptions.
- AI/model owner confirms the dataset is fit for intended use.
- Engineering implements checks and logging.
- Compliance/GRC samples evidence, verifies exceptions are justified, and ensures changes are controlled.
Set an exception process with:
- documented rationale,
- compensating controls,
- expiry/review trigger,
- approval record.
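The four elements of the exception process map directly onto a record structure. A sketch with illustrative field names (the 30-day default validity is an assumption, not a requirement from the standard):

```python
from datetime import date, timedelta

def open_exception(dataset, criterion, rationale, approver, days_valid=30):
    """Time-bounded exception record: the expiry date forces a review trigger
    instead of letting the exception live forever."""
    opened = date.today()
    return {
        "dataset": dataset,
        "criterion": criterion,        # the failed quality criterion
        "rationale": rationale,        # documented justification
        "approved_by": approver,       # approval record
        "opened": opened.isoformat(),
        "expires": (opened + timedelta(days=days_valid)).isoformat(),
        "status": "open",
    }
```

Filtering this register for records past their `expires` date gives the review queue; closing a record should reference the remediation evidence.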
Step 5: Add dataset and model change control
Data changes can break representativeness and relevance without anyone noticing. Implement:
- dataset versioning (what changed, when, why),
- impact assessment for major shifts (new source, new labeling vendor, new extraction method),
- re-validation triggers when upstream systems change. 1
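Dataset versioning can be content-addressed, so that any change to the data produces a new version automatically. A minimal sketch assuming JSON-serializable records (hash truncation is for readability only):

```python
import hashlib
import json

def dataset_version(records, reason, approved_by):
    """Content-addressed dataset version: the hash changes whenever the data does,
    which makes silent upstream changes visible in the version history."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return {
        "content_hash": hashlib.sha256(payload).hexdigest()[:12],
        "reason": reason,            # what changed and why
        "approved_by": approved_by,  # the sign-off record
    }
```

Comparing the hash of today's snapshot against the last approved version is a cheap re-validation trigger: if they differ and no change record exists, something changed without going through change control.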
Step 6: Manage third-party data quality explicitly
If a third party supplies data or labeling:
- contractually require data quality dimensions and reporting aligned to your requirements,
- require provenance and collection/labeling methodology disclosures appropriate to your risk,
- audit or sample deliverables and record acceptance results,
- define reject/rework procedures.
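Incoming acceptance testing of a third-party delivery can follow a simple sampling pattern. A sketch with illustrative parameters (the 2% defect tolerance and sample size are assumptions you would set contractually):

```python
import random

def acceptance_sample(delivery, sample_size, check, max_defect_rate=0.02, seed=0):
    """Sample an incoming third-party delivery, run the agreed check on each
    sampled item, and accept or reject the whole delivery with evidence."""
    rng = random.Random(seed)   # fixed seed so the sample is reproducible evidence
    sample = rng.sample(delivery, min(sample_size, len(delivery)))
    defects = sum(1 for item in sample if not check(item))
    rate = defects / len(sample)
    decision = "accept" if rate <= max_defect_rate else "reject"
    return decision, {"sampled": len(sample), "defects": defects, "defect_rate": rate}
```

Recording the seed, sample, and per-item results alongside the decision is what turns a spot check into an acceptance record you can defend later.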
Where Daydream fits naturally: many teams track third-party data sources, labeling providers, and dataset approvals across scattered spreadsheets. Daydream can centralize third-party due diligence artifacts, dataset acceptance evidence, and exception workflows so you can produce an audit-ready trail without rebuilding your GRC process around the ML pipeline.
Required evidence and artifacts to retain
Keep artifacts that prove both definition and implementation:
Definition artifacts
- Data Quality Policy/Standard for AI (organization-level)
- AI System Data Map (sources, flows, owners)
- Dataset-level Data Quality Specs (accuracy, completeness, representativeness, timeliness, relevance)
- Data acceptance criteria and exception procedure 1
Implementation artifacts
- Automated check logs (schema validation, missingness reports, anomaly outputs)
- Label QA reports (sampling results, reviewer notes)
- Dataset version history (change logs, approvals)
- Monitoring dashboards and alerts (staleness, drift proxies, retrieval quality checks for RAG)
- Exception register with approvals, compensating controls, closure evidence
- Third-party deliverable acceptance records and due diligence documentation 1
Common exam/audit questions and hangups
Auditors tend to probe for operational proof, not policy statements:
- “Show me the data quality requirements for this AI system and where they are enforced.” 1
- “What happens when data fails checks? Who can override and where is it recorded?”
- “How do you know training data reflects the production environment?”
- “How do you manage dataset changes and retraining triggers?”
- “Which third parties supply data/labels, and how do you validate their outputs?”
Hangup to expect: teams can describe controls verbally but cannot produce logs, approvals, and exception records tied to specific dataset versions.
Frequent implementation mistakes (and how to avoid them)
- Writing generic requirements that cannot be tested.
  Fix: require measurable acceptance criteria and specify the test method and owner.
- Treating training data quality as the only scope.
  Fix: include production inputs, retrieval corpora (for RAG), and feedback data used for tuning. 1
- No representativeness definition.
  Fix: define the operational population and conditions; document known gaps and mitigation (reweighting, targeted collection, constrained use).
- Exception sprawl.
  Fix: require time-bounded exceptions with review triggers and a closure workflow.
- Third-party data accepted “as delivered.”
  Fix: implement incoming acceptance tests and keep evidence; enforce contractual reporting and audit rights where feasible.
Enforcement context and risk implications
No public enforcement cases were provided in the source catalog for this specific control. Practically, weak data quality controls raise the likelihood of:
- incorrect or inconsistent outputs,
- bias claims tied to unrepresentative data,
- inability to investigate incidents due to missing lineage and versioning,
- customer audit failures and contractual disputes with enterprise buyers. 1
Practical 30/60/90-day execution plan
First 30 days: Define and scope
- Select priority AI systems (start with customer-impacting or high-risk decisions).
- Build the AI System Data Map for each.
- Draft a standard “Data Quality Spec” template covering the five dimensions.
- Assign data owners and set an exceptions process. 1
By 60 days: Implement gates and logging
- Implement ingestion and pre-training validation checks for priority datasets.
- Establish dataset versioning and approval workflow (even if manual initially).
- Stand up an exception register and require approvals for overrides.
- Begin third-party data/labeling intake testing and acceptance records. 1
By 90 days: Prove operating effectiveness
- Add production monitoring for timeliness and completeness; add representativeness checks where feasible.
- Run a mock audit: pick one AI system and produce an evidence pack (requirements, logs, exceptions, approvals).
- Close or time-bound open exceptions; document remediation outcomes.
- If evidence is fragmented, consolidate in a system of record (Daydream or your existing GRC tool) to support repeatable audits. 1
Frequently Asked Questions
Do I need separate data quality requirements for every AI model?
You need requirements for each AI system’s datasets and data flows, but you can reuse a standard template. The key is that the final requirements reflect the system’s intended purpose and operational context. 1
What counts as “implement” for ISO 42001 A.7.4?
“Implement” means the requirements are enforced through workflow gates, checks, approvals, monitoring, and exception handling with retained records. A policy alone will not satisfy an audit. 1
How do we handle representativeness if we lack demographic attributes?
Define representativeness using available operational segments (geography, device type, channel, product line, time windows) and document known blind spots. Treat missing sensitive attributes as a risk to be managed through testing and constrained use rather than ignoring representativeness. 1
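Segment-based representativeness checks of this kind are straightforward to automate. A sketch assuming flat records with a segment field (the 30% floor below is purely illustrative):

```python
from collections import Counter

def segment_coverage(records, segment_key, min_share):
    """Check that every observed operational segment holds at least a minimum
    share of the dataset; segments below the floor are flagged False."""
    counts = Counter(r[segment_key] for r in records)
    total = sum(counts.values())
    return {seg: n / total >= min_share for seg, n in counts.items()}
```

Running this per operational segment (geography, channel, device type) and logging the flagged gaps gives you the documented blind spots the answer above calls for.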
Does this apply to GenAI and RAG systems where “data” is documents?
Yes. Your corpus quality (accuracy, freshness, coverage, relevance) becomes the primary driver of output quality. Apply the same quality dimensions to document sources, retrieval indexes, and update processes. 1
What evidence do auditors ask for most often?
They typically ask for dataset-level requirements, proof the checks ran (logs or reports), version/approval records, and a clean exception trail showing who approved deviations and why. 1
We buy third-party data. How far does our responsibility go?
You remain responsible for defining acceptance criteria, testing incoming data against those criteria, and documenting exceptions and remediation. Contractual requirements help, but you still need your own intake controls and evidence. 1
Footnotes
1. ISO/IEC 42001:2023, Artificial intelligence — Management system.
Authoritative Sources
- ISO/IEC 42001:2023, Artificial intelligence — Management system (Annex A, Control A.7.4)
Operationalize this requirement
Map requirement text to controls, owners, evidence, and review workflows inside Daydream.
See Daydream