SA-11(4): Manual Code Reviews
SA-11(4) requires you to make developers perform manual code reviews over a defined code scope, using defined review methods, and to prove it with repeatable evidence. To operationalize it quickly, set a mandatory review gate in your SDLC (or supplier SDLC), define what code must be reviewed and how, and retain review records tied to each release.
Key takeaways:
- Define the review scope (which code) and techniques (how reviews are done) and make them mandatory.
- Make manual review a release gate with separation of duties, tracking, and remediation SLAs you can defend.
- Evidence must be traceable to commits/builds/releases and show findings, fixes, and approvals.
Compliance teams often treat “code review” as a generic engineering hygiene item. SA-11(4) is narrower and more operational: you must require the developer of the system (or component/service) to conduct manual code reviews for a defined scope and using defined procedures or techniques. That means you need clear decisions on (1) what code is in-scope, (2) what “manual review” means in your environment, (3) who must perform it, (4) when it must happen, and (5) what proof you will retain.
This control shows up as an assessment readiness problem more than a tooling problem. Most teams can point to pull requests, but cannot demonstrate consistent review depth, reviewer qualification, independence where required, or that “high-risk” code paths received the prescribed scrutiny. SA-11(4) also applies to third-party developed components when they are part of your system boundary or delivered as a system service. So you need contract hooks, supplier SDLC requirements, and a way to accept or reject evidence from third parties.
If you want an operator-friendly objective: build a documented manual review program that is enforced by your SDLC and produces release-level evidence an assessor can sample and trace end-to-end.
Regulatory text
NIST SP 800-53 SA-11(4) (Manual Code Reviews): “Require the developer of the system, system component, or system service to perform a manual code review of [organization-defined scope] using the following processes, procedures, and/or techniques: [organization-defined processes/techniques].”
What that means for you as an operator
- You must require manual code reviews (this is a governance obligation, not a suggestion).
- You must define the scope of what gets reviewed (the placeholder parameter is your decision).
- You must define the methods used during review (another explicit decision).
- You must be able to demonstrate the requirement is met for the systems/components/services in scope with auditable records.
Plain-English interpretation (what auditors expect you to mean)
Manual code review under SA-11(4) means a human reviews code changes to find security defects and design/logic issues that automated tools may miss. Passing a static scan alone does not satisfy this enhancement. Your program should prove that:
- every in-scope change received human review,
- reviewers knew what to look for (a defined checklist or technique),
- issues were tracked to closure or explicitly risk-accepted,
- the review is tied to a specific change set and release.
Who it applies to (entity + operational context)
Entities
- Federal information systems and programs implementing NIST SP 800-53 controls.
- Contractors operating systems handling federal data, or delivering system components/services into a federal system boundary (including SaaS where you are “the developer” of the service).
Operational contexts
- Internal application development (product engineering, platform, data engineering).
- Infrastructure-as-code and configuration code that can materially change security posture.
- Third-party developed components where you can and should contractually require evidence (or perform your own manual review for critical code you integrate).
What you actually need to do (step-by-step)
1) Assign ownership and define enforceable scope
- Name a control owner (typically AppSec or Engineering Enablement) and a compliance owner (GRC) responsible for evidence and sampling.
- Define the “organization-defined scope” for manual review. Practical scope categories:
- Tier 1 (always): authN/authZ, crypto, secrets handling, session management, input validation, logging/audit, payment flows, admin functions.
- Tier 2 (conditional): data pipelines, background jobs, integrations, API gateways.
- Tier 3 (as applicable): infrastructure-as-code, CI/CD pipeline definitions, policy-as-code.
- Decide the unit of enforcement: pull request (PR), change request, or release. PR-level gating is easiest to evidence.
Deliverable: SA-11(4) scope statement referenced in your SDLC standard.
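As a sketch of how that scope statement can be made mechanical rather than prose-only, the tiers above can be encoded as path patterns the review gate evaluates per change. The tier names and glob patterns below are illustrative assumptions, not prescribed by SA-11(4):

```python
import fnmatch

# Hypothetical scope statement encoded as data: glob patterns map changed
# file paths to review tiers. Patterns here are examples, not a mandate.
SCOPE_RULES = [
    ("tier1", ["*/auth/*", "*/crypto/*", "*/payments/*", "*/admin/*"]),
    ("tier2", ["*/pipelines/*", "*/integrations/*", "*/api/*"]),
    ("tier3", ["*.tf", "*/ci/*", "*.rego"]),
]

def review_tier(path: str) -> str:
    """Return the first tier whose pattern matches the changed file."""
    for tier, patterns in SCOPE_RULES:
        if any(fnmatch.fnmatch(path, pat) for pat in patterns):
            return tier
    return "default"  # baseline peer review only, per the SDLC standard

def tier_for_change(paths):
    """A pull request inherits the strictest tier of any file it touches."""
    order = {"tier1": 0, "tier2": 1, "tier3": 2, "default": 3}
    return min((review_tier(p) for p in paths), key=order.__getitem__)
```

Keeping the rules as data means the same file can drive both CI enforcement and the published scope statement, so the two cannot drift apart silently.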
2) Define “manual review” techniques that are testable
Pick specific processes/techniques you can train, measure, and audit. Examples you can standardize:
- A required secure code review checklist by language/framework.
- Required review of threat-critical areas (entry points, trust boundaries, authorization checks).
- A rule that reviewers must inspect diffs plus surrounding context (not just line comments).
- For high-risk scope, require a second reviewer with AppSec training or a security champion.
Deliverable: Code review standard + checklists mapped to in-scope code types.
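One way to make checklist use auditable is to embed it as Markdown checkboxes in the PR template and parse completion mechanically. This is a minimal sketch, assuming a conventional `- [x]` checkbox format; the item names are illustrative:

```python
import re

# Matches Markdown checkbox lines such as "- [x] Input validation reviewed".
CHECKBOX = re.compile(r"^- \[(x| )\] (.+)$", re.MULTILINE | re.IGNORECASE)

def checklist_status(pr_body: str) -> dict:
    """Parse '- [x] item' lines from a PR description into {item: done}."""
    return {item.strip(): mark.lower() == "x"
            for mark, item in CHECKBOX.findall(pr_body)}

def checklist_complete(pr_body: str) -> bool:
    """Fail closed: a missing or partially ticked checklist blocks the gate."""
    items = checklist_status(pr_body)
    return bool(items) and all(items.values())
```

A CI job running this check turns “reviewers must use the checklist” from policy text into an enforced, sampled control.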
3) Build the SDLC gate so the requirement is hard to bypass
Operationalize SA-11(4) with controls that fail closed:
- PRs cannot merge without at least one approved review from an authorized reviewer group.
- For Tier 1 scope, require:
- two approvals, or
- one approval plus AppSec approval, or
- a security champion approval.
- Block self-approval and define exceptions (emergency break-glass) with mandatory post-incident review.
This is where compliance gets real: if review is “policy only” and engineers can merge directly, you will fail sampling.
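The fail-closed rules above can be sketched as a small policy function. Field names and the tier label are assumptions for illustration; real enforcement would live in branch protection settings or a merge-queue check, not application code:

```python
from dataclasses import dataclass, field

@dataclass
class PullRequest:
    # Minimal PR record; these field names are assumptions for the sketch.
    author: str
    approvers: list = field(default_factory=list)
    tier: str = "default"
    appsec_approved: bool = False   # AppSec or security-champion sign-off
    break_glass: bool = False       # emergency exception, logged separately

def merge_allowed(pr: PullRequest) -> bool:
    """Fail-closed gate: deny unless the policy is explicitly satisfied."""
    if pr.break_glass:
        return True  # must trigger mandatory post-incident review
    # Block self-approval: only approvals from someone other than the author count.
    independent = [a for a in pr.approvers if a != pr.author]
    if not independent:
        return False
    if pr.tier == "tier1":
        # Tier 1: two independent approvals, or one plus AppSec sign-off.
        return len(independent) >= 2 or pr.appsec_approved
    return True
```

The point of expressing it this way is testability: each rule in the standard maps to one branch you can exercise in CI, which is exactly what an assessor samples for.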
4) Train reviewers and define qualification
Document who is allowed to review what:
- Minimum reviewer criteria (role, training completion, familiarity with secure coding guidelines).
- A short “what to look for” training per stack, plus refresh when major framework changes occur.
- A documented escalation path for suspected vulnerabilities or design concerns.
Deliverable: Reviewer qualification matrix and training records.
5) Make findings actionable: track, fix, or formally accept risk
Manual review should produce outcomes:
- If issues are found, create a ticket with severity, owner, and expected fix path.
- Require linkage between PR, issue, and remediation PR.
- If a finding is deferred, document risk acceptance with approver and rationale.
Deliverable: Issue tracking workflow tied to code review outcomes.
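The closure rules above can be checked mechanically against your ticket data. This is a sketch assuming a simple dict-per-finding export; the key names (`status`, `remediation_pr`, `approver`, `rationale`) are illustrative, not a mandated schema:

```python
def finding_is_closed(finding: dict) -> bool:
    """A review finding passes only if it was fixed or formally risk-accepted."""
    if finding.get("status") == "fixed":
        # Fixes must be traceable back to a remediation change.
        return bool(finding.get("remediation_pr"))
    if finding.get("status") == "risk_accepted":
        # Deferrals need a named approver and a rationale on record.
        return bool(finding.get("approver")) and bool(finding.get("rationale"))
    return False

def audit_findings(findings):
    """Return the findings that would fail evidence sampling."""
    return [f for f in findings if not finding_is_closed(f)]
```

Running a check like this on each release candidate turns “findings disappear into comments” from a sampling risk into a build failure.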
6) Extend to third parties (supplier-developed code/components)
If a third party develops code for you or provides a system service you configure/extend:
- Add contract clauses requiring manual review practices and evidence for agreed scope.
- Require sample artifacts: anonymized PR records, review checklists, or attestations plus independent validation for critical components.
- Where you cannot get evidence, document compensating controls (for example, your own manual review of integration code, stricter change control, or additional testing).
This is the “require the developer” clause in practice for outsourced development.
Required evidence and artifacts to retain (what an assessor will sample)
Keep evidence that is traceable and repeatable:
Core artifacts (expected)
- Code review policy/standard referencing SA-11(4) scope and techniques.
- Checklists/templates used by reviewers.
- PR records showing:
- reviewer identity,
- timestamps,
- approvals,
- comments indicating substantive review,
- link to commits and build/release.
- Exception logs for emergency merges with after-the-fact review.
- Issue/ticket records showing findings, remediation, and closure.
Helpful artifacts (often requested)
- Secure code review training completion logs for reviewers.
- Security champion roster and responsibilities.
- Sampling report (monthly/quarterly) from GRC or AppSec demonstrating oversight.
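The PR-record fields listed above can be verified mechanically during each sampling run. A minimal sketch, assuming a flat export of PR records; the field names are illustrative:

```python
import random

# Fields an assessor typically samples from each PR record; the key names
# here are assumptions for the sketch, not a required schema.
REQUIRED_FIELDS = ["reviewer", "reviewed_at", "approved",
                   "checklist_confirmed", "release_id"]

def record_gaps(record: dict) -> list:
    """Return the evidence fields that are missing or empty for one PR record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

def sample_report(records, k=5, seed=0):
    """Pick k records at random (seeded for repeatability) and map each
    sampled PR to its evidence gaps, as a GRC sampling run would."""
    picked = random.Random(seed).sample(records, min(k, len(records)))
    return {r["pr"]: record_gaps(r) for r in picked}
```

Seeding the sampler makes the monthly report reproducible, which matters when an assessor asks you to re-run the same sample.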
Tip: Daydream works well as the system-of-record to map SA-11(4) to an owner, a procedure, and the exact evidence artifacts you will produce every release cycle, so audits become sampling, not archaeology.
Common exam/audit questions and hangups
Assessors tend to ask variations of these:
- “Show me the scope definition.” Which repos, services, and code types require manual review?
- “How do you know every in-scope change was reviewed?” They will test if your SDLC gate prevents bypass.
- “What makes it manual and security-relevant?” Expect scrutiny of whether the review is substantive or a rubber stamp.
- “Who is qualified to review security-sensitive changes?” They may sample reviewer training and role definitions.
- “How do you handle exceptions?” Break-glass without documentation is a common finding.
- “What about third-party-developed components?” They will look for contractual requirements and evidence intake.
Frequent implementation mistakes (and how to avoid them)
| Mistake | Why it fails | Fix |
|---|---|---|
| Treating “PR approval exists” as sufficient | Approvals can be superficial and inconsistent | Require a checklist and capture it in the PR template or a linked artifact |
| No written scope | You cannot prove what was required | Publish a scope statement and tie it to repo tags or SDLC tiers |
| Review gate can be bypassed | Sampling finds direct-to-main merges | Enforce branch protections and log exceptions |
| No linkage to remediation | Findings disappear into comments | Require tickets for material issues and link them to PRs |
| Third-party code ignored | “Require the developer” includes suppliers | Add contract clauses and evidence review steps |
| No independence for critical code | Self-review or same-author approvals | Enforce separation-of-duties for Tier 1 scope |
Enforcement context and risk implications (practical, not speculative)
There is little documented public enforcement history specific to SA-11(4). Operationally, the risk is straightforward: manual review gaps increase the chance that authorization flaws, insecure business logic, and misuse of cryptography reach production, and those failures tend to create incident response, reporting, and customer impact obligations. From an assessment standpoint, the most common “pain” is failing evidence sampling because review records are incomplete or not tied to releases.
Practical 30/60/90-day execution plan
First 30 days (establish the control and stop the bleeding)
- Appoint control owner(s) and publish the SA-11(4) scope statement.
- Turn on branch protection for in-scope repos to require approvals and block self-approval.
- Publish initial checklists for your top stacks (web API, frontend, IaC).
- Define the exception process (who can approve, how it’s logged, required post-review).
Next 60 days (make it measurable and auditable)
- Roll out reviewer qualification criteria and baseline training for reviewers.
- Add PR templates that require reviewers to confirm checklist completion.
- Implement a tracking mechanism for findings (tickets linked to PRs) and define risk acceptance workflow.
- Start monthly sampling: select PRs, verify checklist use, verify closure of findings, document results.
By 90 days (extend and operationalize across your supply chain)
- Expand scope coverage to remaining repos and “high-impact” pipelines (CI/CD, deployment scripts).
- Add third-party contract language and an evidence intake process for outsourced development.
- Produce an assessment-ready evidence package: policy, scope, techniques, samples, exception log, and sampling reports.
- Put SA-11(4) on a recurring control calendar in Daydream so evidence collection happens continuously, not during audit week.
Frequently Asked Questions
Does SA-11(4) allow automated tools to replace manual code review?
No. Automated testing can supplement your program, but SA-11(4) specifically requires a manual code review for your defined scope and techniques. Keep automated scan results as supporting evidence, not the primary evidence.
What counts as “manual” in a pull request workflow?
A human reviewer inspects the change for security-relevant issues using defined methods (for example, a checklist) and records approval with traceability to the code change. A simple “LGTM” without a defined technique is hard to defend in sampling.
Can the code author be one of the reviewers?
For low-risk changes, you may allow peer review within the same team, but you should block self-approval. For high-risk scope, define independence requirements (for example, security champion or AppSec approval) and enforce them in branch protection rules.
How do we scope SA-11(4) across hundreds of repositories?
Start by tagging repos and services into tiers based on security impact, then enforce review gates for Tier 1 immediately. Expand scope iteratively, but keep the tiering logic written and consistent so you can explain exclusions.
What evidence is strongest for auditors?
PR records with enforced branch protections, a documented checklist, and linked remediation tickets are the most defensible. The key is traceability from requirement → procedure → specific reviewed changes → fixes or risk acceptance.
How do we handle third-party developers or outsourced teams?
Put the manual review requirement into contracts and SOWs, require periodic evidence samples, and define what you will do if evidence is unavailable (for example, increased internal review of integration code). This aligns with “require the developer” language in SA-11(4).
Operationalize this requirement
Map requirement text to controls, owners, evidence, and review workflows inside Daydream.
See Daydream