SI-11: Error Handling

SI-11: Error Handling requires you to generate error messages that help users and operators correct issues without exposing system details attackers can exploit. Operationalize it by standardizing “safe” error responses, logging detailed diagnostics securely, and testing that applications, APIs, and infrastructure never return sensitive internals to untrusted callers (NIST SP 800-53 Rev. 5 OSCAL JSON).

Key takeaways:

  • Build a two-tier model: minimal external errors, rich internal diagnostics with controlled access (NIST SP 800-53 Rev. 5).
  • Treat error messages as an information disclosure control across UI, API, batch jobs, and third-party integrations.
  • Evidence wins audits: show standards, implementation patterns, test results, and recurring reviews tied to a named control owner.

The SI-11 error handling requirement is a deceptively practical control: it’s less about “pretty messages” and more about preventing information disclosure while still enabling rapid corrective action. In real environments, error handling failures show up as stack traces in browsers, verbose JSON error objects in APIs, misconfigured server banners, and “helpful” logs shipped to places they shouldn’t be. Those leaks shorten an attacker’s path by revealing account validity, internal hostnames, database schemas, library versions, file paths, and authentication logic.

SI-11 asks you to strike a balance. End users, admins, developers, and on-call responders need enough information to recover operations. Untrusted parties should receive only what they need to continue safely (for example, “request could not be processed”), plus a reference ID that your team can use to find the full diagnostic record. That balance becomes enforceable in an assessment because it’s testable: assessors can provoke failures and inspect what the system returns.

This page translates SI-11 into requirement-level implementation guidance you can execute quickly: scope, standards, patterns, tests, and the evidence package that makes the control easy to defend during audits.

Regulatory text

Requirement (excerpt): “Generate error messages that provide information necessary for corrective actions without revealing information that could be exploited; and” (NIST SP 800-53 Rev. 5 OSCAL JSON)

What the operator must do:

  • Ensure error messages exposed to untrusted users or external systems do not disclose sensitive implementation details (for example, stack traces, configuration values, account enumeration hints).
  • Ensure your organization can still take corrective actions by capturing richer diagnostics internally (for example, structured logs, correlation IDs, runbook context) with appropriate access controls (NIST SP 800-53 Rev. 5).

Plain-English interpretation (what “good” looks like)

You need an enterprise error-handling standard that separates:

  1. External response (what the caller sees): minimal, consistent, non-sensitive, actionable enough to proceed safely.
  2. Internal diagnostic record (what your team sees): detailed, searchable, correlated, and protected.

A practical “pass” condition: a tester can intentionally break inputs, authentication, authorization, and dependencies, and your systems never reveal internals to the requester; your team can still troubleshoot quickly using a request ID and centralized logs.

Who it applies to (entity + operational context)

Entity scope: Federal information systems and contractor systems handling federal data, where NIST SP 800-53 Rev. 5 is used as the security control baseline (NIST SP 800-53 Rev. 5).

Operational scope (where SI-11 usually fails):

  • Public-facing web apps and portals (UI error pages, form validation, file upload failures)
  • APIs (REST/GraphQL/gRPC gateways, API management layers, serverless handlers)
  • Identity flows (login, password reset, MFA, account recovery)
  • Batch and ETL jobs (job failure notifications and operator consoles)
  • Infrastructure and platform components (reverse proxies, WAFs, Kubernetes ingress, service meshes)
  • Third-party integrations (webhooks, payment processors, SSO/SAML/OIDC error callbacks)

If you have multiple product teams, this control must be implemented as a platform pattern plus application verification, not a one-off policy.

What you actually need to do (step-by-step)

1) Assign ownership and define “trusted vs untrusted”

  • Name a control owner (often AppSec or Platform Engineering) and an accountable executive (CCO/CISO depending on your governance model).
  • Define “untrusted” as: public internet clients, partners/third parties without privileged access, and any user without authenticated admin rights.
  • Define “trusted” as: authenticated administrators and engineers with approved access to observability tools and logs.

Deliverable: an SI-11 implementation standard that your teams can follow consistently (NIST SP 800-53 Rev. 5).

2) Create an error message standard (your “allow list”)

Document rules that engineers can implement without debate:

  • No stack traces or exception class names in external responses.
  • No internal identifiers: hostnames, IPs, file paths, database/table names, library versions, config keys.
  • No account enumeration: authentication and recovery errors must not confirm whether an account exists.
  • Use a correlation ID in the external message (for example, error_id) that maps to internal logs.
  • Use stable error codes for client handling (for example, INVALID_INPUT, AUTH_FAILED, RATE_LIMITED) that do not disclose internals.

Tip: write sample responses for UI and API. Engineers copy patterns faster than they read policies.
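To make the pattern concrete, here is a minimal sketch of a safe external error envelope. The field names (`error_code`, `message`, `error_id`) and the helper `safe_error` are illustrative assumptions, not names mandated by SI-11; adapt them to your own API schema.

```python
import json
import uuid

# Illustrative allow list: stable codes mapped to generic, non-sensitive text.
SAFE_MESSAGES = {
    "INVALID_INPUT": "The request could not be processed. Check your input and try again.",
    "AUTH_FAILED": "Sign-in failed. Check your credentials and try again.",
    "RATE_LIMITED": "Too many requests. Please wait and retry.",
    "INTERNAL_ERROR": "Something went wrong on our side. Contact support with the reference ID.",
}

def safe_error(code: str) -> dict:
    """Build the external response body: stable code, generic text, correlation ID."""
    error_id = str(uuid.uuid4())  # the same ID is attached to the internal log event
    known = code in SAFE_MESSAGES
    return {
        "error_code": code if known else "INTERNAL_ERROR",
        "message": SAFE_MESSAGES[code] if known else SAFE_MESSAGES["INTERNAL_ERROR"],
        "error_id": error_id,
    }

print(json.dumps(safe_error("AUTH_FAILED"), indent=2))
```

Note that an unknown code falls back to `INTERNAL_ERROR` rather than echoing the caller-supplied string, so the envelope stays generic even when a new failure mode appears.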

3) Implement centralized handling patterns (framework-level)

Require teams to implement error handling using framework hooks/middleware:

  • Web apps: global exception handler that returns generic pages/messages; custom 404/500 pages.
  • APIs: centralized error middleware that maps exceptions to safe HTTP status codes and sanitized bodies.
  • Background jobs: standardized failure events with a safe operator summary plus a link to internal logs.

Make “safe by default” the baseline. The goal is to reduce reliance on developer discipline.
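A framework-agnostic sketch of that “safe by default” pattern, assuming a decorator stands in for real middleware (in practice you would register a Flask error handler, ASGI middleware, or an API gateway policy instead):

```python
import logging
import traceback
import uuid

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("app.errors")

def safe_handler(handler):
    """Wrap a request handler so unexpected exceptions never reach the caller."""
    def wrapper(*args, **kwargs):
        try:
            return 200, handler(*args, **kwargs)
        except Exception:
            error_id = str(uuid.uuid4())
            # Internal record keeps the full stack trace; external body does not.
            log.error("error_id=%s\n%s", error_id, traceback.format_exc())
            return 500, {
                "error_code": "INTERNAL_ERROR",
                "message": "The request could not be processed.",
                "error_id": error_id,
            }
    return wrapper

@safe_handler
def get_user(user_id):
    # Simulated dependency failure with an internal hostname in the message.
    raise RuntimeError("db host db-prod-03.internal unreachable")

status, body = get_user(42)
assert status == 500 and "db-prod-03" not in str(body)
```

The handler name, route, and hostname are hypothetical; the point is that the exception detail lands only in the internal log, keyed by `error_id`.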

4) Split diagnostics: internal logging, access control, and retention

To preserve corrective action capability without external disclosure:

  • Log detailed exceptions internally (stack trace, service name, dependency status, sanitized payload metadata).
  • Protect logs with role-based access and strong authentication in your logging/observability stack.
  • Prevent sensitive data from entering logs where possible (token redaction, PII masking). SI-11 is about error messages, but audits often expand into “what got exposed during failure.”

Evidence focus: show the linkage from error_id in a customer-facing response to the internal log event.
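One way to demonstrate that linkage is to emit a structured internal event that shares its `error_id` with the external response. The field names below are illustrative, not a required schema:

```python
import datetime
import json
import uuid

def internal_event(error_id: str, exc: Exception, service: str, metadata: dict) -> dict:
    """Build the internal diagnostic record that shares error_id with the external response."""
    return {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "error_id": error_id,
        "service": service,
        "exception_type": type(exc).__name__,
        "exception_message": str(exc),
        "metadata": metadata,  # sanitized: no tokens, no raw payloads
    }

error_id = str(uuid.uuid4())
external = {"error_code": "INTERNAL_ERROR", "error_id": error_id}  # what the caller sees
try:
    raise ConnectionError("timeout connecting to orders-db:5432")
except ConnectionError as exc:
    internal = internal_event(error_id, exc, "orders-api", {"route": "/v1/orders"})

print(json.dumps(internal, indent=2))
```

Searching your log platform for the `error_id` from a customer report should return exactly this internal record, which is the evidence trail assessors ask for.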

5) Test it like an assessor would (and keep the test output)

Build a repeatable test suite for common failure modes:

  • Invalid input and schema violations
  • Auth failures (wrong password, unknown user, expired sessions)
  • Authorization failures (access denied)
  • Dependency outages (DB down, third-party API timeouts)
  • Unhandled exceptions (force a server-side crash in a non-prod environment)

Testing methods that work:

  • Automated integration tests that assert response bodies do not contain blocked patterns (e.g., “Exception”, “Traceback”, “/var/”, “SELECT”, “org.springframework”).
  • Manual spot checks in staging for high-risk endpoints: login, password reset, file upload, admin pages.
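The automated check can be a simple pattern scan over response bodies. This sketch reuses the blocked patterns listed above plus a couple of assumed stack-frame patterns; tune the list to your own stack:

```python
import re

# Patterns that should never appear in an external error body.
BLOCKED_PATTERNS = [
    r"Exception",
    r"Traceback",
    r"/var/",
    r"SELECT\s",
    r"org\.springframework",
    r"File \".+\.py\", line \d+",  # Python stack frame (assumed addition)
]

def leaks_internals(body: str) -> list:
    """Return the blocked patterns found in a response body (empty list = pass)."""
    return [p for p in BLOCKED_PATTERNS if re.search(p, body)]

assert leaks_internals('{"error_code": "INVALID_INPUT", "error_id": "abc123"}') == []
assert leaks_internals("Traceback (most recent call last): ...") != []
```

Run this assertion against every provoked failure in your integration suite and archive the results as SI-11 test evidence.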

6) Operationalize reviews and exceptions

  • Add SI-11 checks to secure coding standards and code review checklists.
  • Add a lightweight exception process when a product truly needs to expose more detail (rare). Require security sign-off and compensating controls.

7) Map to your broader control set

SI-11 intersects with:

  • Secure development lifecycle practices (design standards, code review)
  • Logging and monitoring controls (diagnostics, access control)
  • Incident response (triage and RCA)
  • Third-party risk (ensure third-party components you expose don’t leak stack traces through your integration surface)

If you use Daydream to run your control library, treat SI-11 as a control with a single owner, a standard implementation procedure, and recurring evidence artifacts across product lines. That mapping prevents “we did it in one app” from being mistaken as enterprise coverage.

Required evidence and artifacts to retain

Keep artifacts that prove both design and operating effectiveness:

Policy/standard

  • Error handling standard (external message rules, prohibited content list, correlation ID requirement)
  • Secure coding guidelines referencing SI-11

Technical implementation

  • Screenshots or configuration exports showing production error pages/middleware settings (sanitized)
  • Code snippets or architectural decision records showing centralized error handling
  • API error schema documentation (error codes, fields, correlation ID)

Logging and access

  • Sample log events showing detailed internal diagnostics tied to an error_id
  • Access control evidence for log platforms (role definitions, access reviews)

Testing and validation

  • Test cases and most recent results demonstrating sanitized responses
  • Vulnerability scan or DAST results showing no stack traces or verbose error leaks (where applicable)

Governance

  • Control owner assignment, review cadence, and exception approvals (if any)

Common exam/audit questions and hangups

Assessors often probe SI-11 by forcing failures. Prepare for:

  • “Show me what a 500 error returns in production. Does it include a stack trace?”
  • “What happens on login failure? Can I infer whether the username exists?”
  • “How do you correlate a customer-reported error to internal diagnostics?”
  • “Do third-party components (reverse proxies, API gateways) expose verbose errors?”
  • “Where is this standardized, and how do you ensure every team follows it?” (NIST SP 800-53 Rev. 5)

Hangup: teams claim “we log everything,” but they cannot show a controlled path from an external error to internal corrective action without disclosing internals.

Frequent implementation mistakes (and how to avoid them)

  1. Returning raw exception messages to API clients
    Fix: enforce centralized exception mapping; block raw exception serialization.

  2. Verbose errors in non-production that accidentally ship to production
    Fix: configuration-as-code with environment guards; deployment checks that reject debug mode.

  3. Account enumeration via “user not found”
    Fix: standardize auth errors; keep user existence checks internal.

  4. Correlation IDs exist but are not searchable
    Fix: ensure error_id is indexed in logs and appears in runbooks/on-call tooling.

  5. Logs become the new leak
    Fix: control access to logs; redact secrets; set retention aligned to your data governance policies.
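For mistake 5, redaction can run in the logging path before a line leaves the process. The rules below are illustrative assumptions; real deployments tune them to their own token formats and PII fields:

```python
import re

# Illustrative redaction rules: credential-style key=value pairs and
# candidate card numbers. Extend per your secret and PII inventory.
REDACTIONS = [
    (re.compile(r"(?i)(authorization|api[_-]?key|password)\s*[:=]\s*\S+"), r"\1=[REDACTED]"),
    (re.compile(r"\b\d{13,16}\b"), "[REDACTED_PAN]"),
]

def redact(line: str) -> str:
    """Scrub secrets from a log line before it is emitted."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line

print(redact("retry failed: api_key=sk_live_abc123 for card 4111111111111111"))
# → retry failed: api_key=[REDACTED] for card [REDACTED_PAN]
```

In practice this runs as a logging filter or log-shipper processor so every code path gets the same scrubbing, not just disciplined callers.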

Enforcement context and risk implications

No public enforcement case sources were provided for this requirement in the supplied catalog, so this page does not cite specific cases. Practically, SI-11 failures create two concrete risks: (1) information disclosure that accelerates exploitation, and (2) slower incident recovery because teams lack consistent correlation between user-facing failures and internal diagnostics (NIST SP 800-53 Rev. 5).

Practical 30/60/90-day execution plan

First 30 days (Immediate stabilization)

  • Appoint SI-11 control owner and publish the error-handling standard (NIST SP 800-53 Rev. 5).
  • Identify your top external surfaces: primary web app, primary API gateway, auth endpoints, and top integrations.
  • Implement or confirm centralized error handling in those surfaces.
  • Add correlation IDs to responses and logs.

Next 60 days (Coverage expansion + testability)

  • Roll the pattern into remaining services and job runners.
  • Build automated tests to detect verbose error disclosures (pattern matching for stack traces and internals).
  • Lock down log access and document the diagnostic workflow (customer report → error_id → internal record → corrective action).

By 90 days (Sustainment + audit package)

  • Add SI-11 checks to SDLC gates (PR checklist, security review templates).
  • Establish a recurring evidence pull: latest test results, sample sanitized errors, access review proof.
  • If you use Daydream, map SI-11 to its owner, procedure, and recurring artifacts so audits can be answered from a single control record instead of chasing teams.

Frequently Asked Questions

Do we have to hide all error details from authenticated users?

Hide sensitive internals from untrusted users. For trusted administrators, you can expose more context in authenticated admin consoles, but keep it role-restricted and avoid exposing secrets (NIST SP 800-53 Rev. 5).

Are correlation IDs required by SI-11?

SI-11 requires corrective-action capability without exploitable disclosure (NIST SP 800-53 Rev. 5 OSCAL JSON). Correlation IDs are a practical way to meet that outcome because they let you keep external responses minimal while preserving internal diagnosability.

What’s the difference between SI-11 and logging controls?

SI-11 governs what you reveal in error messages to the requester. Logging controls govern what you record internally and how you protect it; SI-11 usually depends on good logging to preserve corrective action capability (NIST SP 800-53 Rev. 5).

How do we handle third-party error messages that pass through our API?

Normalize and sanitize upstream errors at your boundary. Return your own safe error schema externally, store the upstream detail internally, and reference it via an error_id.

Does SI-11 apply to internal-only services?

Yes, if “internal” callers include broad groups or less-trusted networks. Treat service-to-service boundaries as potentially exposed and standardize safe errors across internal APIs that could be reachable from multiple segments (NIST SP 800-53 Rev. 5).

What evidence is easiest for auditors to accept?

A written standard, examples of sanitized production responses, logs showing rich internal diagnostics tied to an error_id, and repeatable test results that prove the behavior across representative systems (NIST SP 800-53 Rev. 5).

Operationalize this requirement

Map requirement text to controls, owners, evidence, and review workflows inside Daydream.

See Daydream