SI-13(2): Time Limit on Process Execution Without Supervision

The SI-13(2) time limit on process execution without supervision requirement means you must set and enforce a maximum runtime for specified unsupervised processes (for example, jobs that run without an operator watching) and stop, fail safe, or escalate when that limit is exceeded. Operationalize it by defining which processes qualify, setting time thresholds, implementing technical controls to enforce them, and retaining evidence that limits are configured, monitored, and acted on. 1

Key takeaways:

  • Define “unsupervised” for your environment and explicitly scope which process types are subject to runtime limits.
  • Enforce time limits technically (scheduler timeouts, watchdogs, job TTLs), not only via policy.
  • Keep assessor-ready evidence: configuration, runbooks, alerting, and records of time-limit events and response.

A common control gap in real programs is letting automated processes run indefinitely because “they’ve always run that way.” SI-13(2) closes that gap by forcing you to put boundaries around unsupervised execution. The intent is operational: limit the blast radius when a job hangs, loops, consumes resources, stops producing expected outputs, or becomes a persistence mechanism after a compromise.

For a CCO or GRC lead, the fastest path is to treat SI-13(2) as an engineering requirement with three deliverables: (1) a scoped inventory of covered unsupervised processes, (2) enforceable runtime limits with defined outcomes when exceeded, and (3) repeatable evidence that those controls work in production.

This page gives you requirement-level steps you can hand to system owners and platform teams. It assumes you may be implementing in a mixed environment (on-prem, cloud, containers, CI/CD, scheduled ETL, batch, integrations). You’ll also see what auditors typically ask for and the artifacts to retain so you can answer quickly without rebuilding the story each assessment cycle. 1

Requirement: SI-13(2), Time Limit on Process Execution Without Supervision

SI-13(2) is an enhancement of SI-13 (Predictable Failure Prevention) in NIST SP 800-53 Rev. 5, focused on placing time bounds on processes that run without direct operator oversight. The practical compliance expectation is simple: you must prevent "run forever" conditions for specified unsupervised processes, and you must be able to prove it. 1

Plain-English interpretation

You need an explicit maximum execution time for defined unsupervised processes. If the process exceeds that time, your system must take a defined action (stop the process, quarantine, fail over, or alert/escalate), and you must monitor and respond consistently.

What this control is not: a generic performance SLA. It’s a security and resilience safeguard against runaway execution and unattended failure modes.

Regulatory text

Excerpt (as provided): “NIST SP 800-53 control SI-13.2.” 2

Operator meaning: implement a control that limits the time a process can execute without supervision, and enforce that limit for in-scope processes. Your implementation has to be demonstrable in configuration and operational records, not only stated in policy. 1

Who it applies to

Entities

  • Federal information systems implementing NIST SP 800-53 Rev. 5. 3
  • Contractor systems handling federal data where NIST SP 800-53 controls are flowed down contractually or used to meet program requirements. 3

Operational contexts (where this shows up in practice)

Scope it to processes that can run without a person actively watching, especially when they can affect availability, integrity, or confidentiality:

  • Batch jobs (ETL, billing runs, reconciliations)
  • Scheduled tasks (cron, Windows Task Scheduler)
  • Workflow orchestrators (Airflow, Argo, Step Functions)
  • CI/CD and build agents
  • Container jobs and serverless tasks
  • Data pipelines and integration middleware
  • Long-running compute (analytics, training runs) when unsupervised

A useful scoping rule: if a process can degrade service, consume shared resources, or create uncontrolled outputs while unattended, it belongs on the candidate list.

What you actually need to do (step-by-step)

Step 1: Define “without supervision” in your environment

Write a short definition your engineers can apply consistently. Include:

  • What counts as “supervised” (for example, an operator actively monitoring a console, or a staffed NOC with defined response playbooks)
  • What counts as “unsupervised” (scheduled jobs, asynchronous queues, background workers without active oversight)
  • Whether “supervision” can be satisfied by automated monitoring and paging, or requires human presence

Keep it practical. Ambiguity causes scope fights during audits.

Step 2: Build the in-scope process inventory

Produce an inventory that ties processes to owners and execution platforms. Minimum fields:

  • Process name and function
  • Platform (OS scheduler, Kubernetes, orchestration tool, SaaS job runner)
  • Owner (team and individual)
  • Trigger type (schedule, event-driven, manual)
  • Criticality (impact if it runs long or hangs)
  • Current observed runtime pattern (normal vs. worst-case; you can start qualitatively if you lack metrics)

This becomes your control population for testing.
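
The inventory above can be kept as structured data rather than a spreadsheet, which makes it queryable and testable. A minimal sketch in Python, with illustrative field names and an example record (the process name, owner, and runtimes are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class UnsupervisedProcess:
    """One row of the SI-13(2) control population. Field names are illustrative,
    mirroring the minimum inventory fields described above."""
    name: str
    function: str
    platform: str               # e.g. "cron", "kubernetes", "airflow"
    owner: str                  # team or individual accountable for the job
    trigger: str                # "schedule", "event", or "manual"
    criticality: str            # impact if the job hangs or runs long
    normal_runtime_min: int     # observed typical runtime, in minutes
    worst_case_runtime_min: int # observed worst-case runtime, in minutes


# Hypothetical example record
inventory = [
    UnsupervisedProcess(
        name="nightly-billing-run",
        function="generate customer invoices",
        platform="kubernetes",
        owner="billing-platform-team",
        trigger="schedule",
        criticality="high",
        normal_runtime_min=45,
        worst_case_runtime_min=120,
    ),
]


def control_population(procs):
    """The set of process names subject to SI-13(2) runtime limits."""
    return [p.name for p in procs]
```

Keeping the inventory as code (or exporting it from your CMDB in this shape) lets you diff it between assessments and feed it directly into control testing.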

Step 3: Set runtime limits (time thresholds) by process category

You need documented limits that are defendable. Common patterns:

  • Hard timeout: kill/terminate after max runtime
  • Soft timeout: alert at threshold; terminate at a later threshold
  • Checkpointing: require periodic heartbeats or checkpoints; fail if absent
  • Budget-based controls: cap CPU time, memory, or concurrency alongside wall-clock time

Write the rationale for each category so assessors understand why limits differ (business cycle, batch window, upstream dependencies). Avoid a single blanket value unless your environment is truly uniform.
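
The soft/hard pattern above can be sketched as a small decision function. This is an illustration only: the category names and threshold values are hypothetical, not prescribed by SI-13(2).

```python
# Illustrative category-based limits: (soft alert, hard kill), both in seconds
# of wall-clock time. Values here are examples, not recommendations.
RUNTIME_LIMITS = {
    "etl-batch":    (2 * 3600, 4 * 3600),
    "ci-pipeline":  (30 * 60, 60 * 60),
    "training-run": (20 * 3600, 24 * 3600),
}


def evaluate_runtime(category: str, elapsed_s: int) -> str:
    """Return the required action for a job that has run elapsed_s seconds."""
    soft, hard = RUNTIME_LIMITS[category]
    if elapsed_s >= hard:
        return "terminate"  # hard timeout: kill and fail safe
    if elapsed_s >= soft:
        return "alert"      # soft timeout: page the owner, keep running
    return "ok"
```

A table like this, with a documented rationale per category, is also a convenient design artifact for the evidence packet.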

Step 4: Implement technical enforcement

Policy without enforcement fails quickly under audit. Choose mechanisms that fit the platform:

Examples of enforcement controls (pick what matches your stack):

  • Job schedulers: set per-job timeouts and failure actions
  • Kubernetes: set activeDeadlineSeconds for Jobs; use liveness probes for long-running services; use resource limits to prevent uncontrolled consumption
  • CI/CD: pipeline job timeouts and cancellation rules
  • Serverless/workflows: maximum state duration, retries with ceilings, and dead-letter queues
  • OS-level: watchdog services, systemd RuntimeMaxSec, task scheduler timeouts

You do not need to standardize on one tool. You do need consistent evidence that a limit exists and is enforced.
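
For platforms without a native timeout, a thin wrapper can enforce a hard wall-clock limit at launch time. A minimal sketch using Python's standard library (in practice, prefer the scheduler-native mechanisms listed above; this illustrates the enforcement behavior, not a production runner):

```python
import subprocess


def run_with_limit(cmd, max_runtime_s):
    """Run cmd with a hard wall-clock limit. subprocess.run kills the child
    process and waits for it before raising TimeoutExpired, so nothing is
    left running after the limit fires."""
    try:
        subprocess.run(cmd, timeout=max_runtime_s, check=True)
        return "completed"
    except subprocess.TimeoutExpired:
        return "terminated"
```

For example, `run_with_limit(["sleep", "30"], max_runtime_s=2)` returns "terminated" after about two seconds instead of letting the job run on.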

Step 5: Define “what happens when time is exceeded”

Document and implement the expected system behavior and operational response:

  • Terminate the process safely (or stop further writes) to prevent corruption
  • Alert the right on-call rotation with context (process name, owner, last checkpoint, impacted dependencies)
  • Create an incident or ticket when applicable
  • Require post-event review for repeat offenders (tuning, code fix, capacity change)

A key design decision: for integrity-sensitive jobs (financial postings, identity sync), default to fail-safe behavior that prevents partial writes or inconsistent state.
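
The "what happens when exceeded" behavior can be captured as a single handler that assembles the alert context and drives the documented response. The `page` and `open_ticket` integrations below are hypothetical placeholders for your paging and ticketing systems:

```python
import datetime


def on_timeout(process_name, owner, last_checkpoint, page, open_ticket):
    """Assemble alert context and drive the documented timeout response.
    `page` and `open_ticket` are injected integrations (hypothetical here:
    in practice, your paging and ticketing system clients)."""
    event = {
        "process": process_name,
        "owner": owner,
        "last_checkpoint": last_checkpoint,
        "detected_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": "terminated",  # hard limit already enforced by the runner
    }
    page(owner, event)                     # alert the on-call with context
    event["ticket"] = open_ticket(event)   # record for post-event review
    return event
```

Returning the event record also gives you the operational artifact assessors ask for: a timestamped example of the response actually firing.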

Step 6: Monitor and test the control

You need proof the limits work:

  • Monitoring: alerts for time-limit warnings, forced terminations, and repeated timeouts
  • Testing: intentionally run a non-production job past the threshold and capture evidence of termination + alert + ticket creation (where feasible)

Make testing part of change management for new job types. Otherwise, timeouts will exist “on paper” but not in real production configurations.
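
A simple monitoring rule worth automating is the "repeat offender" check: processes that keep hitting their limit need tuning or a code fix, not just repeated kills. A sketch over a stream of timeout events (the event shape and threshold are illustrative):

```python
from collections import Counter


def repeat_offenders(timeout_events, threshold=3):
    """Flag processes with repeated forced terminations for post-event
    review. Each event is a dict with at least a "process" key."""
    counts = Counter(e["process"] for e in timeout_events)
    return sorted(p for p, n in counts.items() if n >= threshold)
```

Feeding your alerting history through a check like this gives you the trend review described in the 90-day plan below, and the output doubles as evidence of active monitoring.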

Step 7: Assign ownership and recurring evidence collection

Operationalize it like any other control:

  • Control owner: usually Platform Engineering, SRE, or Security Engineering
  • Process owners: application/data owners for each job
  • GRC owner: tracks scope, evidence, and assessment responses

Daydream (or a similar GRC workflow) fits naturally here as the system of record for: mapping SI-13(2) to owners, attaching implementation procedures, and scheduling recurring evidence pulls so you are not chasing screenshots during an assessment. 2

Required evidence and artifacts to retain

Assessors typically want evidence across design, implementation, and operations. Build an evidence packet that includes:

Design artifacts

  • SI-13(2) control statement (your internal standard for runtime limits)
  • Definition of “unsupervised process”
  • Scope statement and process inventory (with owners)

Implementation artifacts

  • Configuration exports/screenshots showing timeouts set 2
  • Infrastructure-as-code snippets or policy-as-code rules enforcing required timeout fields
  • Runbooks describing actions on timeout (terminate, quarantine, notify, ticket)

Operational artifacts

  • Monitoring rules for timeout alerts
  • Sample alerts and incident/ticket records from real events (redacted as needed)
  • Exception register for any process that cannot be time-limited (include compensating controls and an expiration date)

Tip: store evidence by system boundary and platform. “We do this everywhere” is hard to prove without platform-specific artifacts.

Common exam/audit questions and hangups

Expect these questions and prepare direct, artifact-backed answers:

  1. “Which processes are in scope?”
    Provide the inventory and your scoping definition.

  2. “Show me the configured time limit.”
    Demonstrate configuration in the scheduler/orchestrator plus the enforcement behavior.

  3. “What happens when the limit is exceeded?”
    Show runbook + alert + an example event record.

  4. “How do you prevent teams from creating new jobs without limits?”
    Point to guardrails (templates, CI checks, IaC policies) and onboarding standards.

  5. “Do you grant exceptions?”
    Show a formal exception workflow with approvals and compensating monitoring.

Hangup to avoid: relying on human review (“someone watches the dashboard”). Auditors usually treat that as fragile unless you can show staffing, procedures, and response records.

Frequent implementation mistakes (and how to avoid them)

  • Blanket timeout copied to every job. Why it fails: breaks legitimate long runs or encourages disabling timeouts. Better approach: categorize jobs and justify limits by risk/criticality.
  • Timeout exists but no alert. Why it fails: jobs die silently and business impact is discovered late. Better approach: pair termination with paging and ticketing.
  • Only soft limits (alert only). Why it fails: runaway execution continues. Better approach: use hard termination for high-risk jobs and document exceptions.
  • No ownership per job. Why it fails: incidents bounce between teams. Better approach: tie each process to a named owner and escalation path.
  • Evidence is screenshots only. Why it fails: hard to maintain and doesn't show coverage. Better approach: prefer config exports, IaC, and automated evidence collection.

Enforcement context and risk implications

No public enforcement cases were provided in the source catalog for this requirement, so you should treat enforcement risk as indirect: SI-13(2) is often evaluated during security assessments, authorizations, and contract compliance reviews against NIST SP 800-53 Rev. 5. The operational risk is more concrete: uncontrolled processes can cause outages, resource exhaustion, data corruption, and delayed detection of malicious persistence in unattended workflows. 3

Practical 30/60/90-day execution plan

First 30 days: establish scope and minimum enforceable baseline

  • Publish your “unsupervised process” definition and SI-13(2) standard.
  • Build the initial process inventory for the highest-risk platforms (batch/orchestration/CI).
  • Implement default timeout settings in templates for new jobs.
  • Stand up monitoring for timeout events and route alerts to an on-call.

Next 60 days: expand coverage and add guardrails

  • Extend inventory coverage to remaining platforms and critical applications.
  • Add policy checks in CI/IaC to block deployments missing required timeout parameters.
  • Create an exception workflow with compensating controls and documented approvals.
  • Run at least one tabletop or controlled test to prove alerting and response works.
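
The CI/IaC policy check in the list above can be as simple as a linter over parsed manifests. A sketch for Kubernetes Jobs, which use the real `spec.activeDeadlineSeconds` field (the manifests here are assumed to be already parsed into dicts, e.g. from YAML):

```python
def missing_timeouts(manifests):
    """CI guardrail sketch: flag Kubernetes Job manifests (parsed to dicts)
    that lack spec.activeDeadlineSeconds, so the pipeline can block them."""
    bad = []
    for m in manifests:
        if m.get("kind") == "Job" and "activeDeadlineSeconds" not in m.get("spec", {}):
            bad.append(m.get("metadata", {}).get("name", "<unnamed>"))
    return bad
```

Running this in CI and failing the build when the list is non-empty turns the SI-13(2) standard into a guardrail rather than a review-time request.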

By 90 days: stabilize operations and evidence for assessors

  • Review timeout event trends with system owners and tune limits to reduce noise.
  • Add recurring evidence collection (config exports, alert samples, inventory snapshots).
  • Fold SI-13(2) checks into change management for new processes and major refactors.
  • Prepare an assessor packet: control statement, scope, implementation proofs, and operational records.

(These phases are guidance for sequencing work; tailor them to your environment and assessment calendar.)

Frequently Asked Questions

What counts as a “process” for SI-13(2)?

Treat any scheduled or event-driven execution unit as a process: a cron job, orchestration task, Kubernetes Job, CI pipeline step, or serverless workflow. If it runs without a human actively watching and can cause harm by running too long, include it.

Do we have to terminate the process when it hits the time limit?

The control calls for a time limit; enforcement typically means termination or a fail-safe stop for in-scope processes. If you choose alert-only for a process, document why and add compensating controls (strong monitoring, rapid escalation, and post-event review). 1

How do we set the “right” time limit without historical runtime data?

Start with category-based limits (by job type and criticality) and refine after you collect runtime telemetry and timeout events. Document the rationale and require owners to revisit limits after changes that affect runtime (data volume, code path, dependencies).

What about long-running analytics or training jobs that can legitimately take a long time?

Put them in a distinct category with longer limits plus checkpoints/heartbeats and resource caps. If you truly cannot enforce a wall-clock limit, use an approved exception with compensating monitoring and a defined review date.
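
For these jobs, a heartbeat check is often the workable middle ground: a long hard limit plus a much shorter tolerance for silence. A minimal sketch (the 300-second default is an illustrative value, not a prescribed one):

```python
import time


def heartbeat_stale(last_heartbeat_ts, max_silence_s=300):
    """For long-running jobs where a short wall-clock kill is too blunt:
    fail the job if it stops emitting heartbeats, even while it is still
    under its (long) hard runtime limit."""
    return (time.time() - last_heartbeat_ts) > max_silence_s
```

The job writes a timestamp on each checkpoint; a watchdog polls this check and triggers the same terminate/alert path as a wall-clock timeout when it returns true.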

Can automated monitoring count as “supervision”?

It can, if you define “supervision” to include monitored execution with alerting, staffed response, and documented runbooks. If you claim supervision, be ready to show on-call coverage and response records, not only the existence of a dashboard.

How do we prove compliance during an assessment?

Provide a scoped inventory, platform configuration evidence showing timeouts, and operational records of alerts or forced terminations with response actions. Use a GRC system like Daydream to map SI-13(2) to owners and attach recurring evidence so the packet is always current. 2

Footnotes

  1. NIST SP 800-53 Rev. 5; NIST SP 800-53 Rev. 5 OSCAL JSON

  2. NIST SP 800-53 Rev. 5 OSCAL JSON

  3. NIST SP 800-53 Rev. 5

Operationalize this requirement

Map requirement text to controls, owners, evidence, and review workflows inside Daydream.

See Daydream