GOVERN-6.2: Contingency processes are in place to handle failures or incidents in third-party data or AI systems deemed to be high-risk.
To meet GOVERN-6.2, you need documented, tested contingency processes for failures or incidents in high-risk third-party data or AI systems, so you can keep operations safe while you contain the incident, switch providers, roll back models, or run in degraded mode. Make it real: define triggers, assign owners, pre-approve fallback paths, and retain recurring evidence.
Key takeaways:
- Scope “high-risk third-party data/AI” first, then build incident and continuity playbooks per dependency.
- Contingency must include technical fallbacks (kill switch, rollback, alternate data feeds) and governance (notifications, approvals, contractual rights).
- Audit readiness depends on evidence of testing, drills, and executed tabletop outcomes, not policy language.
GOVERN-6.2 is a governance requirement with an operational spine: if a third party’s data pipeline, model, API, hosted AI service, or tooling fails, you must be able to respond quickly and predictably for the AI systems you classify as high-risk. That means you cannot depend on “the vendor will fix it” as your plan. You need your own contingency processes that cover detection, decisioning, safe degradation, incident handling, communications, and recovery paths.
For a Compliance Officer, CCO, or GRC lead, the fastest way to operationalize this requirement is to treat third-party AI/data dependencies like critical services: map dependencies, define failure modes, pre-authorize response actions, and test them. Then bake those requirements into third-party due diligence and contracting so you can execute the contingency plan under real pressure.
This page gives requirement-level guidance you can implement without waiting for a full AI governance rebuild. It focuses on what to stand up, who owns it, what evidence to retain, and what examiners or internal audit teams typically ask for when they see “contingency processes” tied to high-risk AI use cases.
Requirement text
Requirement (excerpt): “Contingency processes are in place to handle failures or incidents in third-party data or AI systems deemed to be high-risk.”
What the operator must do:
You must (1) identify which third-party data and AI services are in scope for “high-risk” AI use, (2) define and implement contingency processes for failures/incidents affecting those dependencies, (3) ensure the processes are executable (owners, triggers, pre-approved actions), and (4) retain evidence that the processes operate in practice (tests, drills, tickets, post-incident reports).
Plain-English interpretation
If your high-risk AI system depends on a third party, you need a plan for what happens when that third party has an outage, a security incident, a data quality issue, a model behavior regression, or a contractual/access disruption. Your plan must keep people safe and business operations controlled, even if accuracy drops or automation must be paused. “Contingency” here is broader than disaster recovery; it includes incident response, operational continuity, and AI-specific controls like rollbacks and human override.
Who it applies to
Entities: Organizations developing or deploying AI systems, especially where third parties provide models, training data, inference APIs, labeling, monitoring, hosting, or critical components of the AI lifecycle.
Operational contexts that typically become “high-risk” in practice (non-exhaustive):
- AI used in decisions affecting customers, employees, patients, students, or citizens (eligibility, pricing, safety, access).
- AI that can cause material operational disruption if it fails (fraud controls, identity verification, security automation).
- AI that drives regulated processes or sensitive data handling (privacy, security, financial reporting, safety).
Third-party dependency types in scope:
- External model providers (hosted LLM APIs; managed ML platforms).
- Third-party data sources (identity, credit, geolocation, device intelligence, sanctions lists).
- Data processors (ETL, enrichment, labeling, synthetic data, feature stores).
- MLOps tooling providers (monitoring, evaluation, prompt management, guardrails).
- Cloud infrastructure and managed services used by the AI workload.
What you actually need to do (step-by-step)
1) Set the scope: “high-risk” + “third-party dependency”
Create (or update) an inventory that links each high-risk AI system to:
- Third-party services it depends on (data feeds, models, hosting, tooling)
- Dependency criticality (what breaks if it fails)
- Acceptable operating modes (full, degraded, paused/manual)
Practical tip: Start with the top few dependencies that can stop decisions or create harm quickly (identity/fraud, eligibility, safety controls). If your inventory is immature, begin with a single “high-risk register” worksheet and iterate.
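The register described above can start as something as simple as a structured worksheet. A minimal sketch in Python, where every system, provider, and fallback name is illustrative, not a real product or API:

```python
# Minimal high-risk dependency register. All names are illustrative.
from dataclasses import dataclass, field


@dataclass
class Dependency:
    provider: str                  # e.g. a hosted model API or data feed
    service: str                   # what it supplies to the AI system
    criticality: str               # "blocking", "degrading", or "cosmetic"
    fallback_modes: list[str] = field(default_factory=list)


@dataclass
class HighRiskSystem:
    name: str
    dependencies: list[Dependency]


register = [
    HighRiskSystem(
        name="identity-verification",
        dependencies=[
            Dependency("ExampleIDProvider", "document checks",
                       criticality="blocking",
                       fallback_modes=["manual review", "pause decisions"]),
            Dependency("ExampleDeviceIntel", "device risk score",
                       criticality="degrading",
                       fallback_modes=["stricter thresholds"]),
        ],
    ),
]

# Quick completeness check: every blocking dependency needs a documented fallback.
gaps = [
    (s.name, d.provider)
    for s in register
    for d in s.dependencies
    if d.criticality == "blocking" and not d.fallback_modes
]
print(gaps)  # [] means every blocking dependency has a fallback on record
```

Even a check this crude gives you a repeatable answer to “how do you know the list is complete,” which is the question auditors tend to start with.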
2) Define failure and incident scenarios per dependency
For each in-scope third party, document scenario playbooks across at least these buckets:
- Availability failure: API down, latency spikes, quota exhaustion.
- Integrity failure: wrong data, schema drift, poisoning indicators, duplicated records.
- Model failure: sudden performance regression, unsafe outputs, prompt-injection exposure, policy changes by provider.
- Security/privacy incident: breach at provider, credential compromise, data exfiltration.
- Access/legal disruption: contract termination, region restrictions, sanctions, subprocessor changes.
Output: a “Failure Modes & Effects” table (simple is fine) that maps scenario → detection → action → owner → required communications.
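That mapping can live as a spreadsheet, but keeping it machine-readable lets you lint it for completeness. A hedged sketch, with invented scenario rows:

```python
# Illustrative failure-modes map: scenario -> detection, action, owner, comms.
# Scenario names and responders are examples, not prescribed roles.
failure_modes = {
    "api_down": {
        "detection": "health-check alarm, error-rate threshold",
        "action": "route to fallback provider; pause automation if none",
        "owner": "on-call SRE",
        "comms": "incident channel, product owner",
    },
    "schema_drift": {
        "detection": "contract tests on inbound data",
        "action": "quarantine feed; switch to cached data",
        "owner": "data engineering lead",
        "comms": "risk owner, vendor contact",
    },
}

REQUIRED = {"detection", "action", "owner", "comms"}
incomplete = [name for name, row in failure_modes.items()
              if not REQUIRED <= row.keys()]
print(incomplete)  # [] means every scenario row is fully specified
```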
3) Establish triggers, decision rights, and safe states
Your contingency process needs clear triggers and pre-assigned authority. Define:
- Trigger thresholds (example: repeated failed calls, monitoring alarms, integrity checks failing, provider incident notice)
- Decision maker for switching modes (SRE/on-call, product owner, risk owner)
- “Safe state” options:
- Kill switch (disable automation)
- Rollback to last approved model/config
- Fallback provider (secondary model/data feed)
- Degraded mode (reduced features, stricter thresholds)
- Human-in-the-loop review until resolved
Examiner lens: If your plan requires an ad hoc meeting to decide who can shut it off, you will not meet the intent of “contingency processes are in place.”
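Pre-approved triggers are easiest to defend when they are encoded, not just written down. A minimal circuit-breaker sketch, assuming illustrative thresholds and state names:

```python
# Sketch of pre-approved trigger logic: consecutive failures trip a safe
# state automatically, without convening a meeting. Thresholds are examples.
class ContingencyTrigger:
    def __init__(self, failure_threshold: int = 4):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.state = "full"  # "full" -> "degraded" -> "paused"

    def record_call(self, succeeded: bool) -> str:
        if succeeded:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

        if self.consecutive_failures >= self.failure_threshold:
            self.state = "paused"    # kill switch: automation off, humans in loop
        elif self.consecutive_failures >= self.failure_threshold // 2:
            self.state = "degraded"  # stricter thresholds, reduced features
        return self.state


trigger = ContingencyTrigger(failure_threshold=4)
for ok in [True, False, False, False, False]:
    state = trigger.record_call(ok)
print(state)  # "paused" after four consecutive failures
```

The point is that the decision rule is written down in advance; the on-call responder executes it, and the decision maker you named is notified rather than consulted mid-incident.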
4) Build the technical and operational runbooks
Create runbooks that an on-call engineer and an accountable business owner can execute:
- Detection & triage: monitoring sources, where alerts land, severity classification
- Containment: isolate affected pipelines, revoke tokens, block outputs, quarantine data
- Continuity: switch to fallback provider, cached data, manual process, or “pause decisions”
- Recovery: restore normal operations, validate outputs, re-enable automation with sign-off
- Comms: internal stakeholders, customer support scripts, regulator-ready facts where applicable
- Evidence capture: ticket templates, timeline, artifacts to attach (logs, screenshots)
Keep runbooks system-specific. A single generic IR document rarely answers “what do we do when this provider’s embedding endpoint changes behavior?”
5) Embed requirements into third-party due diligence and contracts
Contingency is constrained by what your third party will support. Add or confirm:
- Notification obligations for incidents, outages, material model/data changes
- Audit/reporting rights (SOC reports, incident summaries)
- Subprocessor transparency for data and AI services
- Data portability and exit assistance
- Service continuity commitments where business-critical
- Right to suspend use without penalty when risk controls trip
If Legal pushes back on new terms, document compensating controls (e.g., stronger internal monitoring, strict rate limits, easier provider swap architecture).
6) Test the contingency process and keep proof
Run tabletop exercises for the highest-risk dependency scenarios and record:
- What triggered the drill
- Who responded and when
- Decisions made (and who approved them)
- Gaps found and remediation tickets
- Updated runbooks and next test date
Also test technical mechanisms (kill switch, rollback, fallback routing) in a controlled environment. A contingency plan that cannot be executed on demand is not a control.
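An executable fallback test is the simplest proof that the mechanism works on demand. A hedged sketch with invented provider functions, showing the shape of a fallback-routing drill:

```python
# Sketch of an executable contingency test: prove the fallback path returns a
# usable decision when the primary provider fails. Provider names are invented.
def primary_provider(request: dict) -> dict:
    raise ConnectionError("simulated provider outage")


def fallback_provider(request: dict) -> dict:
    return {"decision": "manual_review", "source": "fallback"}


def decide(request: dict, providers: list) -> dict:
    """Try providers in order; land in the safe state if all fail."""
    for provider in providers:
        try:
            return provider(request)
        except ConnectionError:
            continue
    return {"decision": "pause_automation", "source": "safe_state"}


# Drill evidence: the plan executes on demand, not just on paper.
result = decide({"id": "req-1"}, [primary_provider, fallback_provider])
print(result["source"])  # "fallback"
```

Capturing the output of a test like this (with a timestamp and sign-off) is exactly the kind of artifact step 7 asks you to retain.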
7) Operationalize recurring evidence collection (audit-ready)
Assign a control owner and define recurring evidence:
- Inventory changes (new third parties, new model versions)
- Test/drill completion
- Incident postmortems tied to third-party dependencies
- Contract reviews and renewal checks for required clauses
Daydream (or any GRC system you use) becomes valuable here by mapping GOVERN-6.2 to the policy, procedure, named owners, and a recurring evidence cadence so you can answer audits without a scramble.
Required evidence and artifacts to retain
Keep artifacts in a single, reviewable package per high-risk system:
- High-risk AI system register with linked third-party dependencies and criticality.
- Contingency playbooks/runbooks per key third party, with triggers, safe states, and comms paths.
- RACI / decision-rights matrix for shutdown, rollback, and provider switching.
- Monitoring and detection evidence: alert routing, dashboards, integrity checks, model monitoring rules.
- Test evidence: tabletop minutes, technical test results, sign-offs, remediation tickets.
- Incident records: tickets, timelines, root cause analyses, corrective actions when third-party incidents occur.
- Third-party due diligence and contracts: security reviews, incident notification terms, exit/portability language, material change clauses.
Common exam/audit questions and hangups
Auditors and regulators tend to probe execution details:
- “Which third parties support your high-risk AI systems, and how do you know the list is complete?”
- “Show me the last time you tested failover or rollback for a third-party AI dependency.”
- “Who can authorize disabling automated decisions, and what’s the escalation path after hours?”
- “How do you detect silent failure, like data drift or model output degradation, not just downtime?”
- “Do your contracts require the provider to notify you of incidents and material changes?”
Hangup: teams produce a general incident response policy but cannot show AI- and provider-specific contingency steps.
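The “silent failure” question above is easiest to answer with a concrete check. A minimal sketch, assuming scored model outputs and an agreed baseline window; the threshold and sample values are illustrative:

```python
# Sketch of a silent-failure check: compare the current score distribution
# against a baseline using a simple mean-shift test. Threshold is illustrative.
import statistics


def drifted(baseline: list[float], current: list[float],
            max_shift_in_stds: float = 2.0) -> bool:
    base_mean = statistics.mean(baseline)
    base_std = statistics.stdev(baseline)
    shift = abs(statistics.mean(current) - base_mean)
    return shift > max_shift_in_stds * base_std


baseline = [0.70, 0.72, 0.68, 0.71, 0.69, 0.73]
stable   = [0.70, 0.71, 0.69]
shifted  = [0.40, 0.38, 0.41]  # silent degradation: API is up, outputs moved

print(drifted(baseline, stable), drifted(baseline, shifted))  # False True
```

Real deployments typically use richer statistics (population stability index, KS tests), but even a check this simple turns “how do you detect silent failure?” from a shrug into an artifact.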
Frequent implementation mistakes (and how to avoid them)
- Mistake: Treating “contingency” as only disaster recovery.
  Fix: Include AI-specific failure modes (output safety regressions, model/provider updates, data integrity failures).
- Mistake: No pre-approved safe state.
  Fix: Define “pause automation” criteria and assign authority to execute it without convening a committee.
- Mistake: Assuming the third party’s SLA equals your contingency plan.
  Fix: Build internal fallback paths and document what you do while the provider restores service.
- Mistake: No evidence of testing.
  Fix: Schedule tabletop drills and technical tests; store the artifacts with the system record.
- Mistake: Procurement and engineering run separate tracks.
  Fix: Make contingency requirements a gating item in third-party onboarding for high-risk AI systems.
Enforcement context and risk implications
NIST AI RMF is a voluntary framework, not a regulation, and this requirement carries no standalone statutory penalty. The risk is practical: third-party AI/data failures can create safety issues, unfair outcomes, or operational outages. If you cannot demonstrate tested contingency processes, you will struggle to defend your governance posture to internal audit, customers, and regulators who evaluate operational resilience, third-party risk management, and incident handling.
A practical 30/60/90-day execution plan
First 30 days (Immediate)
- Name an owner for GOVERN-6.2 and define the in-scope high-risk AI systems list.
- Build the dependency map: for each system, list third-party data and AI services and rank them by criticality.
- Draft a minimum viable runbook for the top dependency: triggers, safe states, comms, and evidence capture.
Days 31–60 (Near-term)
- Expand runbooks to remaining critical dependencies; add detection rules for integrity and behavior failures.
- Add contract/due diligence requirements for incident notification and material change disclosure for new and renewing third parties.
- Run at least one tabletop exercise; file remediation items with owners and due dates.
Days 61–90 (Operationalize)
- Test technical controls: kill switch, rollback procedure, and fallback routing in a controlled environment.
- Formalize recurring evidence collection (inventory refresh, drill schedule, renewal review checklist).
- Centralize artifacts in your GRC repository (Daydream or equivalent) with a control-to-evidence map tied to GOVERN-6.2.
Frequently Asked Questions
Does GOVERN-6.2 require a secondary provider for every high-risk third-party AI service?
The requirement calls for contingency processes, not a specific architecture. For some dependencies, a safe “pause and manual review” mode can be the contingency, as long as it is documented and tested.
What counts as an “incident” for a third-party AI system?
Treat outages, security events, integrity failures, and unsafe or materially degraded model behavior as incidents when they affect a high-risk AI use case. Document your classification criteria and escalation thresholds in the runbook.
How do we prove we can execute the contingency plan?
Keep drill records, technical test results for rollback/kill switch, and ticket evidence from real events. Auditors want proof that people can follow the process and that the mechanisms work.
Our third party won’t agree to strong incident-notification terms. Can we still comply?
You can mitigate with compensating controls such as stronger internal monitoring, tighter rate limiting, and faster provider swap capability, and document the residual risk acceptance. Keep the negotiation record and your risk decision.
Does this apply if we only use third-party data, not third-party models?
Yes. The text explicitly covers third-party data systems, and data failures can be as harmful as model failures in high-risk AI decisions. Build contingencies for data integrity, availability, and change management.
Where should these artifacts live: security, GRC, or engineering?
Engineering should own the runbooks and technical procedures; GRC should own the control mapping, evidence retention, and audit response package. Store them in a shared system of record with clear ownership and access controls.
Operationalize this requirement
Map requirement text to controls, owners, evidence, and review workflows inside Daydream.
See Daydream