MEASURE-3.1: Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts.

MEASURE-3.1 requires you to stand up an operating process (with named owners and documented methods) that regularly discovers, logs, and tracks AI risks that appear in real-world use, including risks you did not predict during design. To operationalize it, implement continuous monitoring tied to intended use, actual performance in production, incident intake, and a risk register with clear escalation and remediation workflows. 1

Key takeaways:

  • Assign accountable roles and a recurring cadence to identify and track AI risks post-deployment, not just at launch. 1
  • Track “unanticipated” and “emergent” risks through monitoring, user feedback, and incident/problem management, then route them into a controlled risk workflow. 1
  • Keep audit-ready documentation: monitoring plans, risk logs, decision records, and evidence that you acted on what you found. 1

MEASURE-3.1 is an operational requirement: you need an ongoing way to find AI risk in the wild, assign people to manage it, and prove you did. Many AI governance programs overinvest in pre-deployment approvals and underinvest in post-deployment detection, triage, and tracking. This requirement closes that gap by forcing a regular, repeatable cycle that compares intended performance to actual performance in deployed contexts, and that captures risks you did not foresee. 1

For a Compliance Officer, CCO, or GRC lead, the fastest path is to treat MEASURE-3.1 like a control family that spans model operations (monitoring and evaluation), enterprise risk management (risk register and treatment plans), and incident management (intake, escalation, and corrective actions). You are building a “risk discovery and tracking system” for AI: who watches what, how often, what triggers escalation, where evidence is stored, and how leadership is informed. 1

This page gives you requirement-level implementation guidance you can hand to engineering, product, and operations teams, then test in audit-style terms.

Regulatory text

Excerpt: “Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts.” 1

What an operator must do:

  1. Define the approaches you will use to detect AI risks in production (monitoring, evaluations, feedback loops, incident intake).
  2. Assign personnel with clear accountability for running those approaches and escalating issues.
  3. Maintain documentation that shows the process runs regularly and that identified risks are logged, tracked, and managed through to resolution or acceptance.
  4. Ensure the process explicitly covers existing (known) risks, unanticipated risks discovered after deployment, and emergent risks that arise as contexts, users, threats, or dependencies change. 1

Plain-English interpretation of the requirement

You need a living process that answers four questions for every deployed AI system:

  • What can go wrong (including new issues we didn’t predict)?
  • How will we detect it in real use?
  • Who owns the response and the decision to change the system (or accept risk)?
  • Where is the evidence that we detected, tracked, and acted? 1

“Regularly” is the core exam theme here. Auditors typically won’t accept a one-time risk assessment, a launch checklist, or an informal Slack channel as “regular identification and tracking.” They will look for a defined cadence, triggers, and a log of activity over time.

Who it applies to (entity and operational context)

Applies to: Organizations that develop, deploy, or operate AI systems, including AI-enabled business processes and AI features embedded in products. 1

Operational contexts where it matters most:

  • Customer-impacting decisions (eligibility, pricing, access, moderation, ranking)
  • Safety- or mission-critical workflows (health, industrial, critical operations)
  • Employee-impacting decisions (screening, performance, scheduling)
  • Any setting with changing inputs or behavior (fraud, abuse, adversarial use)
  • Third-party AI dependencies (hosted models, APIs, scoring services) where you still own outcomes in your environment

What you actually need to do (step-by-step)

Step 1: Define scope and ownership per AI system

Create (or update) an AI system inventory entry for each deployed AI capability and assign:

  • Business owner (owns impact and risk acceptance)
  • Technical owner (owns monitoring, fixes, rollbacks)
  • Risk/compliance owner (owns control operation and reporting)
  • Incident commander on-call path (who responds when thresholds breach)

Minimum practical output: a RACI that is specific to each system, not generic.
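One way to make the per-system RACI concrete and checkable is a small structured record. This is an illustrative sketch, not an RMF-prescribed schema; the field names, system ID, and owner handles are all hypothetical:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AISystemRaci:
    """Per-system ownership record; field names are illustrative."""
    system_id: str
    business_owner: str      # owns impact and risk acceptance
    technical_owner: str     # owns monitoring, fixes, rollbacks
    compliance_owner: str    # owns control operation and reporting
    oncall_path: list = field(default_factory=list)  # ordered escalation contacts

    def is_complete(self) -> bool:
        # A RACI is only useful if every role names someone specific.
        return all([self.business_owner, self.technical_owner,
                    self.compliance_owner, self.oncall_path])

raci = AISystemRaci(
    system_id="claims-triage-model",
    business_owner="j.doe",
    technical_owner="a.smith",
    compliance_owner="r.lee",
    oncall_path=["a.smith", "ml-platform-secondary"],
)
```

Running `is_complete()` across the inventory is a cheap way to surface the "generic RACI" gap before an auditor does.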

Step 2: Document “intended use” and “deployed context” so monitoring has a target

For each AI system, document:

  • Intended use, intended users, and prohibited uses
  • Expected operating environment and key assumptions (data sources, user behavior, constraints)
  • Primary harms to watch for (financial, safety, discrimination, privacy, security, customer deception)

If you skip this, you cannot credibly compare “intended vs actual performance in deployed contexts.” 1

Step 3: Establish your risk identification approaches (detection channels)

Implement multiple channels so you catch both known failure modes and surprises:

A. Performance and quality monitoring

  • Define metrics tied to intended outcomes (accuracy, error rates, latency, stability)
  • Segment metrics by meaningful slices (region, product line, channel, cohort) where feasible, to spot localized failures
  • Detect distribution shift and input anomalies (sudden changes in data patterns)
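Distribution-shift detection can start simple. The sketch below computes a Population Stability Index (PSI) over categorical buckets; the 0.2 alert threshold is a common rule of thumb, not a standard, and the bucket labels are illustrative:

```python
import math
from collections import Counter

def psi(expected, actual, bins=None):
    """Population Stability Index between a baseline and a live sample.

    expected/actual are sequences of bucket labels. A common heuristic
    treats PSI > 0.2 as a material shift worth a ticket; the threshold
    is a policy choice.
    """
    bins = bins or sorted(set(expected) | set(actual))
    e_counts, a_counts = Counter(expected), Counter(actual)
    e_total, a_total = len(expected), len(actual)
    score = 0.0
    for b in bins:
        # Small floor avoids log(0) when a bucket is empty on one side.
        e_pct = max(e_counts[b] / e_total, 1e-6)
        a_pct = max(a_counts[b] / a_total, 1e-6)
        score += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return score

# Identical distributions score ~0; a shifted one scores well above 0.2.
baseline = ["low"] * 70 + ["high"] * 30
shifted  = ["low"] * 30 + ["high"] * 70
```

The same computation applies per segment (region, cohort, channel), which is how localized failures surface before they show in aggregates.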

B. Human feedback and complaint intake

  • Route customer complaints, frontline escalations, and internal user feedback into an AI-specific queue with tags
  • Require structured capture: what happened, where, impact, screenshots/log IDs, and whether the AI output was followed

C. Incident/problem management integration

  • Add AI-specific incident categories (model output error, harmful content, unauthorized use, security abuse, privacy exposure)
  • Define severity levels and escalation paths that reach both technical and compliance stakeholders

D. Change monitoring

  • Track model updates, prompt/template changes, feature flags, policy changes, data pipeline changes, and third-party model version changes
  • Require post-change validation and heightened monitoring for a defined period after material changes (duration is your policy choice)
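The heightened-monitoring window after a material change can be enforced mechanically. A minimal sketch, assuming a 14-day window (the duration is an illustrative policy choice, as noted above):

```python
from datetime import date, timedelta

def in_heightened_monitoring(change_date: date, today: date,
                             window_days: int = 14) -> bool:
    """True while a system is inside its post-change heightened-monitoring
    window. The 14-day default is an illustrative policy choice."""
    return change_date <= today < change_date + timedelta(days=window_days)
```

Wiring this check into dashboards or alert-routing makes "heightened monitoring for a defined period" auditable rather than aspirational.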

Step 4: Create an AI Risk Register workflow that can track “existing, unanticipated, emergent”

Set up a risk register (or dedicated AI tab in ERM tooling) with fields that support audit and operations:

  • Risk statement and category (performance, bias/fairness, privacy, security, safety, legal/regulatory, reputational)
  • Trigger/source (monitoring alert, incident, complaint, red team finding, third-party notice)
  • Context: intended use vs observed context, affected populations or workflows
  • Impact assessment (qualitative plus business impact notes)
  • Owner, status, target date, and treatment plan (mitigate, transfer, avoid, accept)
  • Linkage to evidence (tickets, dashboards, incident reports, meeting minutes)

Critical: “Track” means you can show lifecycle movement, not just a list.
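The lifecycle-movement point can be built directly into the register schema by recording every status transition. This is a minimal sketch with illustrative field names and statuses, not a prescribed data model:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class RiskEntry:
    """Minimal register row; statuses and fields are illustrative."""
    risk_id: str
    statement: str
    category: str          # e.g. performance, bias/fairness, privacy
    source: str            # monitoring alert, incident, complaint, ...
    owner: str
    treatment: str         # mitigate, transfer, avoid, accept
    status: str = "open"
    history: list = field(default_factory=list)  # (date, old, new) transitions

    def move(self, new_status: str, on: date) -> None:
        # Recording every transition is what turns a list into a trackable log.
        self.history.append((on, self.status, new_status))
        self.status = new_status

entry = RiskEntry("R-014", "Drift in claims-triage scores for new region",
                  "performance", "monitoring alert", "a.smith", "mitigate")
entry.move("in_treatment", date(2024, 3, 1))
entry.move("closed", date(2024, 4, 15))
```

The `history` field is the audit trail: it shows when the risk moved, from what state, to what state.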

Step 5: Define thresholds and escalation criteria

Write down what forces action:

  • Metric thresholds (e.g., error rate spike, drift signal, safety filter failures)
  • Volume thresholds (complaint clusters, repeated incident types)
  • Policy violations (use outside intended purpose)
  • Third-party alerts (provider outages, model changes, new known issues)

Tie each trigger to a required response: investigation ticket, incident declaration, temporary rollback, customer notification evaluation, or governance review.
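The trigger-to-response tie can be written down as an explicit mapping so nothing falls through. The trigger names and responses below are illustrative policy choices:

```python
# Map each documented trigger type to its required first response.
RESPONSE_BY_TRIGGER = {
    "metric_threshold":  "open_investigation_ticket",
    "volume_threshold":  "open_investigation_ticket",
    "policy_violation":  "declare_incident",
    "third_party_alert": "governance_review",
}

def route(trigger: str) -> str:
    """Every material trigger maps to a required response; anything
    unrecognized escalates by default rather than being dropped."""
    return RESPONSE_BY_TRIGGER.get(trigger, "escalate_to_risk_owner")
```

The default-escalate behavior matters: an unclassified signal should create work for a named owner, not disappear.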

Step 6: Run a recurring review cadence and record outcomes

Hold a cross-functional AI risk review with a fixed agenda:

  • New risks identified since last review
  • Status of open risks and overdue actions
  • Post-incident learnings and control improvements
  • Upcoming changes requiring heightened monitoring

Record attendance, decisions, and action items. Your documentation is part of the control. 1

Step 7: Test the process with a tabletop “unanticipated risk” scenario

Pick a realistic scenario (unexpected harmful outputs, sudden drift, misuse by a user segment) and walk through:

  • How it is detected
  • Who triages
  • Where it is logged
  • How it is escalated
  • What evidence is produced

If the tabletop cannot produce artifacts, your control will fail an audit-style challenge.

Required evidence and artifacts to retain

Keep these in a single audit-ready folder structure per AI system:

  1. AI monitoring plan (metrics, thresholds, segmentation, owners, tools)
  2. Deployed context + intended use statement (assumptions, prohibited uses)
  3. Risk register entries with lifecycle history and linked tickets
  4. Incident reports / problem records tied to AI categories
  5. Meeting minutes from recurring AI risk reviews (decisions + actions)
  6. Change logs (model versions, prompts, pipelines, third-party version updates)
  7. Post-change validation results and increased-monitoring notes
  8. User feedback/complaint summaries and triage dispositions
  9. Exception approvals / risk acceptances with rationale and expiry/refresh trigger
  10. Evidence of escalation (pages, emails, ticket routing, on-call logs)
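Standing up the per-system folder structure can be scripted so every system's evidence lives in the same shape. The directory names below simply mirror the ten artifact types and are an assumption, not a mandated layout:

```python
from pathlib import Path
import tempfile

# Illustrative folder names mirroring the ten artifact types above.
EVIDENCE_DIRS = [
    "01_monitoring_plan", "02_intended_use", "03_risk_register",
    "04_incident_reports", "05_review_minutes", "06_change_logs",
    "07_post_change_validation", "08_feedback_triage",
    "09_risk_acceptances", "10_escalation_evidence",
]

def scaffold_evidence(root: Path, system_id: str) -> Path:
    """Create one audit-ready folder tree per AI system."""
    base = root / system_id
    for d in EVIDENCE_DIRS:
        (base / d).mkdir(parents=True, exist_ok=True)
    return base

base = scaffold_evidence(Path(tempfile.mkdtemp()), "claims-triage-model")
```

A uniform layout also makes it trivial to spot systems with empty folders, i.e., controls that exist on paper but produce no evidence.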

If you use Daydream to manage third-party and AI governance evidence collection, map MEASURE-3.1 directly to a control owner, a documented procedure, and recurring evidence tasks so the “regularly” requirement is provable on demand. 1

Common exam/audit questions and hangups

Auditors, internal audit, and risk committees tend to press on:

  • “Show me regular operation.” Produce monitoring outputs and review minutes across multiple cycles, not a one-off snapshot.
  • “How do you detect unanticipated or emergent risks?” Show multiple intake channels plus drift/behavior monitoring and post-incident learnings feeding back into controls. 1
  • “Who is accountable to act?” RACI gaps are a frequent finding; name a person, not a team.
  • “What happens when the AI is used outside intended context?” Show detection signals and an escalation path to business owners.
  • “How do third-party models fit?” Show how you monitor performance in your environment and how provider notifications become tracked risks.

Frequent implementation mistakes and how to avoid them

  1. Mistake: Monitoring only model metrics, ignoring real-world harm signals.
    Fix: Add complaint intake, incident categories, and qualitative harm reviews alongside quantitative dashboards.

  2. Mistake: No linkage between alerts and risk tracking.
    Fix: Require every material alert to generate a ticket, and every material ticket to map to a risk register item or a documented “no risk” disposition.

  3. Mistake: “Regularly” is implied, not scheduled.
    Fix: Put the cadence in policy/procedure, calendar the reviews, and retain minutes and dashboards.

  4. Mistake: Ownership sits in engineering only.
    Fix: Make a business owner responsible for risk acceptance and customer impact decisions; compliance validates control operation.

  5. Mistake: Emergent risk is treated as theoretical.
    Fix: Define “emergent” triggers (new user behavior, new data sources, new jurisdictions, third-party model updates) and route them into the same workflow. 1

Enforcement context and risk implications

NIST AI RMF is a framework, not a regulator, and the provided sources do not include public enforcement cases tied to MEASURE-3.1. 2 Treat this requirement as defensibility infrastructure: if you cannot show ongoing identification and tracking, you will struggle to justify that AI risks are managed after deployment, especially when incidents occur.

A practical 30/60/90-day execution plan

First 30 days (stand up the control skeleton)

  • Assign owners (business, technical, compliance) for each in-scope AI system.
  • Publish a short MEASURE-3.1 procedure: detection channels, escalation triggers, risk register fields, and evidence location. 1
  • Implement a single intake path for AI issues (ticket form + required fields).
  • Create initial risk register entries for known risks and open issues.

Days 31–60 (make it operational and repeatable)

  • Build monitoring dashboards aligned to intended use and deployed context.
  • Integrate incident/problem categories and escalation paths.
  • Hold the first recurring AI risk review; record minutes and actions.
  • Run one tabletop exercise and record gaps and fixes.

Days 61–90 (prove “regularly,” tighten thresholds, expand coverage)

  • Run subsequent cycles of monitoring review and risk committee reporting.
  • Add segmentation/slicing where failures could hide in aggregates.
  • Implement post-change validation gates and heightened monitoring after material changes.
  • Start trend reporting: top recurring AI risks, time-to-triage, overdue actions (qualitative trends are acceptable if you lack clean metrics).
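Even without clean metrics tooling, time-to-triage and overdue-action counts can be derived from ticket exports. A sketch over hypothetical records, where each ticket is (opened, triaged, target date, closed-or-None):

```python
from datetime import date

# Illustrative ticket records: (opened, triaged, due, closed-or-None)
tickets = [
    (date(2024, 5, 1),  date(2024, 5, 2),  date(2024, 5, 20), date(2024, 5, 18)),
    (date(2024, 5, 3),  date(2024, 5, 7),  date(2024, 5, 15), None),
    (date(2024, 5, 10), date(2024, 5, 11), date(2024, 6, 1),  None),
]

def avg_time_to_triage(rows) -> float:
    return sum((t - o).days for o, t, _, _ in rows) / len(rows)

def overdue_actions(rows, today: date) -> int:
    # Open tickets past their target date count as overdue.
    return sum(1 for _, _, due, closed in rows if closed is None and due < today)

avg = avg_time_to_triage(tickets)                       # (1 + 4 + 1) / 3
overdue = overdue_actions(tickets, date(2024, 5, 25))   # one missed target
```

Two numbers per cycle, trended over quarters, is enough to show the committee whether the control is getting faster or slower.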

Frequently Asked Questions

Do we need separate MEASURE-3.1 processes for every model?

You need clear ownership, monitoring, and tracking per deployed AI system. You can standardize the workflow and tooling, but the intended use, thresholds, and risks must be specific to each deployed context. 1

What counts as “unanticipated” vs “emergent” risk in practice?

Unanticipated risks are those discovered post-deployment that you did not identify during design or testing. Emergent risks arise as the environment changes, such as new user behaviors, new data sources, or third-party model updates that shift real-world performance. 1

We use a third-party AI API. Are we still on the hook?

Yes for your deployment context. Track provider notices as risk inputs, but also monitor actual performance, incidents, and misuse in your environment and route them into your risk workflow. 1

What documentation is the fastest way to satisfy auditors?

A written monitoring plan, a living risk register with status history, recurring review minutes, and linked tickets/incidents. Auditors want evidence that the loop runs regularly and that findings drive action. 1

How do we implement this without building a full MLOps platform?

Start with what you have: logging, dashboards, and a ticketing system plus a disciplined review cadence and documentation. Add automation later, but don’t wait to define owners, triggers, and a tracking workflow. 1

What should trigger escalation to the risk committee or senior management?

Escalate when an issue materially affects intended performance in deployed contexts, indicates use outside intended purpose, causes customer harm, repeats without a clear fix, or reflects a new class of emergent risk. Document the trigger and the decision. 1

Footnotes

  1. NIST AI RMF Core

  2. NIST AI RMF program page
