AU-5(2): Real-time Alerts

AU-5(2) requires you to generate real-time (or near-real-time) alerts when audit logging fails, and to deliver those alerts to the right responders within a defined time window. To operationalize it fast, define “audit failure events,” set alert thresholds and routing, test end-to-end delivery, and retain evidence that alerts fired and were handled. 1

Key takeaways:

  • Define the exact audit failure events and the maximum time-to-alert your system must meet. 1
  • Implement automated alerting with clear routing, on-call ownership, and measurable delivery/acknowledgement outcomes.
  • Keep recurring evidence: configurations, test results, and alert/incident records that prove the control runs in production.

Footnotes

  1. NIST SP 800-53 Rev. 5 OSCAL JSON

The AU-5(2) Real-time Alerts requirement is a reliability and detection control for your audit logging pipeline. If audit logs stop being generated, forwarded, stored, or protected, you can lose the record needed to investigate incidents, meet federal reporting obligations, or support legal/regulatory inquiries. AU-5(2) closes that gap by forcing you to detect audit logging failures quickly and notify the people who can fix them.

This requirement is easy to “paper comply” with and still fail an assessment. Auditors and assessors usually probe for two things: (1) specificity (what exactly counts as an audit failure event, and how fast do you alert), and (2) operational proof (evidence that alerts trigger in real conditions, route to staffed responders, and result in remediation). Your goal is to make audit failure alerting boring: standardized signals, consistent routing, tested delivery, and clean artifacts.

This page gives requirement-level implementation guidance you can hand to a control owner, SIEM engineer, or platform team, then verify quickly as a CCO/GRC lead.

Regulatory text

Requirement excerpt: “Provide an alert within {{ insert: param, au-05.02_odp.01 }} to {{ insert: param, au-05.02_odp.02 }} when the following audit failure events occur: {{ insert: param, au-05.02_odp.03 }}.” 1

Operator interpretation of the text (what you must implement):

  • You must define a time bound for alerting (“within X to Y”) and meet it in operations, not only in policy. 1
  • You must define the audit failure events that trigger alerts (the control uses organization-defined parameters). Typical events include: log source stoppage, forwarder/agent failure, pipeline backlog beyond threshold, storage capacity exhaustion, integrity check failures, or permissions/config changes that disable auditing.
  • You must send alerts to an identified audience (an organization-defined set of roles/teams) that can act, such as SOC, SRE/on-call, system owners, or an IR distribution list. 1

Plain-English requirement (what it means)

If your audit logging capability breaks, you need to know fast and you need the right humans (or an automated response) to be notified fast. AU-5(2) is about detecting audit blind spots in real time and proving that you can restore logging before you lose investigative continuity.

Who it applies to

Entity scope

  • Federal information systems and contractors handling federal data where NIST SP 800-53 is the governing control baseline. 1

Operational context

  • Systems that generate security-relevant logs (identity, endpoints, servers, network, cloud control plane, applications).
  • Centralized logging/SIEM architectures, log aggregation pipelines, and managed logging services.
  • Environments with third parties operating parts of the stack (for example, a managed SOC, MSP, or SaaS logging provider). AU-5(2) still lands on you: you must ensure the third party’s service has equivalent failure alerting and that you receive the alerts you need.

What you actually need to do (step-by-step)

1) Name the control owner and responders (make ownership testable)

  • Assign a primary control owner (often Security Operations or Platform/SRE).
  • Assign alert recipients by role (SOC queue, on-call rotation, system owner group).
  • Define an escalation path if the first-line queue does not acknowledge.

Deliverable: AU-5(2) control implementation statement with owner, responders, and escalation.

2) Define “audit failure events” with clear triggers

Write an “audit failure event catalog” that includes:

  • Event description (what failed).
  • Where it occurs (log source, agent, forwarder, collector, SIEM, archive).
  • Signal/metric (heartbeat missing, error code, queue depth, ingestion rate drop, storage threshold, permission change).
  • Trigger logic (threshold, pattern match, or state change).
  • Severity mapping (page vs ticket vs email).

Make sure the catalog covers at least:

  • Source-level failure (auditing disabled; audit daemon stopped; agent unhealthy).
  • Transport failure (forwarder cannot reach collector; cert expired; DNS failure).
  • Ingestion failure (SIEM connector broken; parsing failures spike).
  • Storage failure (disk full; index frozen; retention job failing).
  • Integrity/immutability failure (hash checks fail; write-once settings disabled, if used).
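A machine-readable version of the catalog makes trigger logic traceable for assessors. The sketch below is a hypothetical schema (field names and example entries are assumptions, not prescribed by the control), assuming Python-based tooling:

```python
from dataclasses import dataclass

# Hypothetical schema for one audit failure event catalog entry.
@dataclass(frozen=True)
class AuditFailureEvent:
    event_id: str     # stable identifier, e.g. "AFE-001"
    description: str  # what failed
    location: str     # source | transport | ingestion | storage | integrity
    signal: str       # metric or condition that detects the failure
    trigger: str      # threshold, pattern match, or state change
    severity: str     # page | ticket | email

CATALOG = [
    AuditFailureEvent("AFE-001", "Audit daemon stopped on host",
                      "source", "agent heartbeat", "no heartbeat for 10 min", "page"),
    AuditFailureEvent("AFE-002", "Forwarder cannot reach collector",
                      "transport", "connection error count", "errors > 0 for 5 min", "page"),
    AuditFailureEvent("AFE-003", "Retention job failing",
                      "storage", "job exit status", "state change: success -> failure", "ticket"),
]

# Every entry must resolve to a defined routing severity.
assert all(e.severity in {"page", "ticket", "email"} for e in CATALOG)
```

Keeping the catalog in a structured form like this also lets you diff it in change control when log sources are added or retired.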

Deliverable: audit failure event catalog approved by control owner.

3) Define the “within X to Y” time-to-alert requirement and measure it

AU-5(2) requires organization-defined timing. Choose a time-to-alert standard that fits:

  • Your incident response expectations.
  • Your log loss tolerance.
  • Your system criticality.

Then implement measurement:

  • Record event detection timestamp (when failure is detected).
  • Record alert delivery timestamp (when paging/ticketing/email was sent).
  • Record acknowledgement timestamp (optional, but strong evidence for auditors).
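The measurement itself is simple arithmetic on those timestamps. A minimal sketch, assuming a 15-minute requirement (the actual value is your organization-defined parameter):

```python
from datetime import datetime, timedelta

# Example organization-defined parameter; substitute your own value.
MAX_TIME_TO_ALERT = timedelta(minutes=15)

def time_to_alert(detected_at: datetime, delivered_at: datetime) -> timedelta:
    """Elapsed time from failure detection to alert delivery."""
    return delivered_at - detected_at

def meets_requirement(detected_at: datetime, delivered_at: datetime) -> bool:
    """True when the alert was delivered within the defined window."""
    return time_to_alert(detected_at, delivered_at) <= MAX_TIME_TO_ALERT

detected = datetime(2024, 5, 1, 9, 0, 0)
delivered = datetime(2024, 5, 1, 9, 4, 30)
assert meets_requirement(detected, delivered)  # 4.5 min, inside the 15-min window
```

Exporting these per-alert deltas on a schedule gives you recurring evidence that the timing parameter is met in operations, not just stated in policy.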

Deliverable: time-to-alert requirement statement and a method to measure it. 1

4) Implement automated alert generation and routing

Implementation patterns that auditors accept because they are observable:

  • SIEM rules for “no logs received from source in N minutes.”
  • Monitoring alerts (Prometheus/CloudWatch/Azure Monitor) on agent health, queue depth, disk usage, and connector errors.
  • Event-driven alerts on configuration changes that disable audit policies.
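The core of the "no logs received in N minutes" pattern is a last-seen check per source. This is a tool-agnostic sketch (the source names and threshold are illustrative assumptions; in practice the same logic lives in a SIEM query or monitoring rule):

```python
from datetime import datetime, timedelta

# Example silence threshold; tune per source criticality.
SILENCE_THRESHOLD = timedelta(minutes=15)

def silent_sources(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Return log sources whose most recent event is older than the threshold."""
    return sorted(src for src, ts in last_seen.items()
                  if now - ts > SILENCE_THRESHOLD)

now = datetime(2024, 5, 1, 12, 0)
last_seen = {
    "firewall": now - timedelta(minutes=3),               # healthy
    "ad-domain-controller": now - timedelta(minutes=40),  # silent: alert
}
assert silent_sources(last_seen, now) == ["ad-domain-controller"]
```

Note that a source missing from `last_seen` entirely is the worst case (it was never onboarded); pair this check with an inventory reconciliation so silence and absence are both caught.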

Routing requirements:

  • Alerts must go to a staffed destination (SOC platform, on-call paging).
  • Avoid single-recipient designs. Use group routing with escalation.
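Group routing with escalation can be expressed as an ordered chain per severity. A minimal sketch (destination names and chain order are hypothetical):

```python
# Hypothetical routing table: each severity maps to an ordered escalation chain.
ROUTING = {
    "page":   ["soc-oncall", "sre-oncall"],  # staffed rotations, in order
    "ticket": ["soc-queue"],
    "email":  ["system-owners"],
}

def next_destination(severity: str, unacknowledged_hops: int):
    """Destination for this escalation hop, or None when the chain is exhausted."""
    chain = ROUTING.get(severity, [])
    return chain[unacknowledged_hops] if unacknowledged_hops < len(chain) else None

assert next_destination("page", 0) == "soc-oncall"
assert next_destination("page", 1) == "sre-oncall"  # escalated after no ack
assert next_destination("page", 2) is None          # chain exhausted: design gap
```

If `next_destination` can return None for a critical severity, that is the "alert fires but nobody responds" gap assessors probe for; the chain should end at a destination that is always staffed.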

Deliverable: alert rules/config exports; routing policies; on-call schedule reference.

5) Test the alert end-to-end (and repeat on change)

Create test cases for each major failure class:

  • Stop the log agent on a non-production host.
  • Block egress to collector.
  • Fill a test index or simulate storage threshold.
  • Disable an audit policy in a sandbox tenant.

Record:

  • What you did.
  • What alert fired.
  • Who received it.
  • How long it took.
  • How it was resolved.

Deliverable: test plan + executed test evidence.

6) Operationalize response: runbooks and tickets

Alerting without response is where programs fail. For each failure event:

  • Create a runbook with triage steps and ownership handoff.
  • Create an incident/ticket workflow that documents remediation and restoration of logging.
  • Add a step to validate recovery (“logs resumed” and “backlog cleared”).
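The recovery-validation step can be made objective rather than a judgment call. A sketch, assuming you can read minutes-since-last-log and backlog depth from your pipeline (both metric names are assumptions):

```python
# Hypothetical recovery check run before closing the incident:
# the source must be emitting again AND the ingestion backlog must be cleared.
def recovery_validated(minutes_since_last_log: float,
                       backlog_events: int,
                       max_silence_min: float = 5.0) -> bool:
    """True when logs have resumed and no backlog remains."""
    return minutes_since_last_log <= max_silence_min and backlog_events == 0

assert recovery_validated(1.0, 0)        # logs flowing, backlog cleared: close
assert not recovery_validated(1.0, 120)  # logs flowing, backlog remains: keep open
```

Recording the two input values in the closing ticket gives you the "logs resumed" and "backlog cleared" evidence in the same artifact.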

Deliverable: runbooks and sample tickets/incidents linked to alerts.

7) Include third parties in scope (where they can break logging)

If a third party provides:

  • Managed SIEM,
  • MDR/SOC,
  • Logging SaaS,
  • Infrastructure operations,

then contractually and operationally ensure:

  • You receive audit pipeline failure alerts relevant to your environment.
  • You can access evidence (alert history, SLA reports, incident records).
  • Responsibilities are documented in the RACI.

Deliverable: third-party responsibility matrix and evidence access procedure.

Required evidence and artifacts to retain

Keep artifacts that prove the control is designed and operating:

Design evidence

  • AU-5(2) control narrative (owner, scope, tooling, recipients, escalation). 1
  • Audit failure event catalog (triggers, severity, routing). 1
  • Time-to-alert standard (“within X to Y”) and measurement approach. 1

Operational evidence

  • Screenshots/exports of alert rules and routing configuration (SIEM queries, monitoring alarms).
  • Alert history showing real alerts (sanitized) and their timestamps.
  • Test execution records with results.
  • Tickets/incidents demonstrating response and restoration.
  • Change records showing updates when log sources or pipelines change.

Governance evidence

  • Periodic review sign-off of alert coverage for new systems/log sources.
  • Training/on-call documentation for responders.

Common exam/audit questions and hangups

  • “What are your defined audit failure events, and where are they documented?” 1
  • “Show me an example where logging failed and the alert was generated within your requirement window.” 1
  • “Who receives these alerts, and how do you ensure someone is available?”
  • “How do you know you didn’t silently lose logs during a SIEM outage?”
  • “How do you update alerting when onboarding new applications or cloud accounts?”
  • “What testing do you perform after SIEM/parser/agent upgrades?”

Frequent implementation mistakes (and how to avoid them)

  1. Undefined parameters (X/Y and event list).
    Fix: Put the time-to-alert and event catalog in a controlled document and link it to system scope. 1

  2. Alert goes to email only.
    Fix: Route high-severity audit failures to a monitored queue or paging rotation with escalation.

  3. Monitoring only the SIEM, not the sources.
    Fix: Combine “no logs received” with source/agent health checks and transport checks.

  4. No evidence of tests.
    Fix: Schedule recurring failure-injection tests in non-production and retain results.

  5. Third party runs logging; you can’t produce artifacts.
    Fix: Add evidence access requirements to the third-party management process and confirm delivery paths.

Enforcement context and risk implications

No public enforcement cases were provided in the source catalog for this requirement. Practically, AU-5(2) reduces the risk of undetected audit gaps that can:

  • Delay incident detection and containment.
  • Prevent reconstruction of events during investigations.
  • Weaken your assessment posture when assessors request proof of continuous audit capability. 1

A practical execution plan (30/60/90)

To move quickly, use phased execution with concrete outputs.

First 30 days (establish control minimums)

  • Assign owner, responders, and escalation.
  • Publish the audit failure event catalog draft.
  • Set the time-to-alert requirement and measurement method. 1
  • Implement alerting for the highest-impact failure modes (top critical log sources, core collectors, storage capacity).

Next 60 days (expand coverage and prove operations)

  • Extend alerting coverage to remaining log sources and cloud tenants.
  • Build runbooks per failure class.
  • Execute end-to-end tests and retain evidence.
  • Add a lightweight review step to system onboarding so new log sources get failure alerting.

By 90 days (make it durable and assessment-ready)

  • Establish recurring control checks (rule health, alert routing validation, on-call roster verification).
  • Add dashboards for audit pipeline health and missed-heartbeat coverage.
  • Confirm third-party evidence access and document responsibilities.
  • Package evidence for assessment: configurations, test results, and operational alert/ticket samples.

Where Daydream fits (without adding tooling risk)

If you struggle with “proof, not promises,” Daydream can act as the compliance system of record for AU-5(2): map the requirement to a control owner, store the procedure, and schedule recurring evidence requests (rule exports, alert samples, test results) so you can answer assessor questions quickly.

Frequently Asked Questions

What counts as an “audit failure event” for AU-5(2)?

It’s organization-defined, but it should cover failures that stop audit records from being created, transmitted, ingested, stored, or protected. Document the event list and the trigger logic so an assessor can trace each alert to a specific failure mode. 1

Do I need a SIEM to meet the AU-5(2) Real-time Alerts requirement?

No. You need reliable automated alerting and evidence that it works. Many teams meet AU-5(2) with cloud-native monitoring plus a ticketing/on-call workflow, as long as it covers audit logging failures and meets the defined time-to-alert. 1

How do we prove the “within X to Y” timing in an audit?

Retain test records and real alert samples with timestamps showing detection and delivery. If your tooling can export alert creation and notification times, store those exports as recurring evidence. 1

What if alerts trigger but no one responds after hours?

Treat that as a design gap. Route critical audit failure alerts to an on-call rotation with escalation, and retain evidence that the rotation is staffed and tested (for example, a paging test record and an escalation policy).

How should we handle audit failure alerts in a third-party managed environment?

Define responsibilities in a RACI and ensure the third party provides you the alert feed and supporting artifacts (alert history, incident records, configuration excerpts). If you cannot obtain evidence, you will struggle to demonstrate operation during an assessment.

How often should we test AU-5(2) alerting?

Test after meaningful changes (new log sources, SIEM connector changes, agent upgrades, routing changes) and on a recurring schedule that matches your change velocity. Keep the test plan and the last executed results ready for assessors.

Footnotes

  1. NIST SP 800-53 Rev. 5 OSCAL JSON

Operationalize this requirement

Map requirement text to controls, owners, evidence, and review workflows inside Daydream.

See Daydream