The entity maintains, monitors, and evaluates current processing capacity
To meet the "maintains, monitors, and evaluates current processing capacity" requirement, you must define what “capacity” means for your in-scope services, continuously monitor the right leading indicators (compute, storage, network, queue depth, database limits), and run a documented, recurring review that results in clear decisions (scale, tune, cap demand, or accept risk) with retained evidence.
Key takeaways:
- Capacity compliance is a control loop: define thresholds, monitor, review, decide, and record.
- Auditors look for repeatable evaluation, not just dashboards; meeting minutes and change records matter.
- Tie capacity monitoring to incident management and change management so scaling decisions are provable.
Availability controls fail most often in the handoff between engineering reality and audit evidence. Teams may have excellent observability, autoscaling, and SRE practices, but still struggle to prove that capacity is maintained and actively evaluated as a control. The SOC 2 Availability criteria expect more than “we have graphs.” They expect you to show that you know your current processing limits, that you monitor signals that predict saturation, and that you routinely decide what to do about emerging constraints.
This requirement sits directly in the operational path of customer impact: performance degradation, timeouts, delayed batch processing, failed jobs, and cascading outages. It also drives uncomfortable exam questions: “How do you know you have enough capacity today?” “Who reviews this?” “What happens when you exceed thresholds?” “Where is the evidence that the review occurred and decisions were executed?”
This page translates TSC-A1.1 into a requirement you can run weekly and defend during a SOC 2 examination. It includes a practical control design, step-by-step operating guidance, an evidence checklist, and a 30/60/90-day plan to get from ad hoc monitoring to an auditable capacity management program.
Regulatory text
SOC 2 Trust Services Criteria (Availability), TSC-A1.1: “The entity maintains, monitors, and evaluates current processing capacity.”
Operator meaning: you must (1) maintain sufficient processing capacity for the system(s) in scope, (2) monitor capacity indicators continuously or at defined intervals, and (3) evaluate the results on a recurring cadence to decide whether action is required. “Evaluate” is the differentiator: auditors want proof that humans (or a documented automated process with oversight) reviewed capacity and produced decisions, not just telemetry.
Plain-English interpretation (what this requires in practice)
You comply when you can answer, with evidence:
- Maintain: What is your current capacity posture for each critical service component (compute, database, storage, network, dependencies), and how do you keep it within acceptable bounds (autoscaling, reservations, quotas, runbooks, architectural limits)?
- Monitor: What signals warn you before customer impact (headroom, saturation, queue depth, throttling, connection pool exhaustion), and how are alerts routed and handled?
- Evaluate: Who reviews trends and forecasts, how often, what decisions are made, and how are decisions tracked to completion?
A practical interpretation: capacity management must be a closed-loop operational control with defined thresholds, accountable owners, documented reviews, and recorded outcomes.
Who this applies to (scope and operational context)
This applies to any service organization pursuing SOC 2 where system availability is in scope, including:
- SaaS platforms and APIs (web, mobile, partner integrations)
- Background processing systems (queues, workers, schedulers)
- Data platforms (ETL, analytics pipelines, customer reporting)
- Shared infrastructure that materially affects uptime or performance (databases, caches, CDNs, identity services)
Operationally, it applies wherever capacity constraints can cause:
- latency and timeouts,
- job backlogs,
- resource exhaustion (CPU, memory, disk, IOPS),
- throttling/limits at cloud providers or third parties,
- single-tenant “noisy neighbor” conditions in multi-tenant systems.
If you rely on third parties (cloud hosting, managed databases, messaging), your control must still cover how you monitor your consumption and provider limits, and how you respond when those limits are approached.
What you actually need to do (step-by-step)
Use the steps below as a control procedure you can hand to an SRE lead and also defend to an auditor.
Step 1: Define “processing capacity” for your in-scope system
Create a simple capacity register for each critical service or tier:
- Service/component name
- Primary function (API, queue workers, database, cache, file processing)
- Capacity unit (requests/sec, jobs/min, concurrent connections, throughput, IOPS)
- Hard limits (cloud quotas, database max connections, partition limits)
- Soft thresholds (alert thresholds that trigger action before failure)
- Owner (role + team)
- Customer impact if saturated (latency, backlog, data delay, outage)
Keep it short. Auditors prefer clarity over exhaustive lists.
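The register above can live as structured data next to your runbooks so it is versioned and easy to export for an auditor. A minimal sketch in Python, using illustrative services, limits, and owners (none of these values are prescribed):

```python
from dataclasses import dataclass

@dataclass
class CapacityEntry:
    """One row of the capacity register described above."""
    component: str            # Service/component name
    function: str             # Primary function (API, database, queue, ...)
    unit: str                 # Capacity unit (requests/sec, connections, ...)
    hard_limit: float         # Hard limit (cloud quota, max connections)
    soft_threshold: float     # Alert threshold that triggers action first
    owner: str                # Role + team
    impact_if_saturated: str  # Customer impact if this component saturates

# Illustrative entries; real values come from your own quotas and tiers.
REGISTER = [
    CapacityEntry("api-gateway", "API", "requests/sec", 5000, 3500,
                  "SRE lead, Platform", "latency, timeouts"),
    CapacityEntry("orders-db", "database", "concurrent connections", 500, 400,
                  "DBA, Data Platform", "connection errors, outage"),
]

def headroom_pct(entry: CapacityEntry, current: float) -> float:
    """Remaining headroom against the hard limit, as a percentage."""
    return max(0.0, 100.0 * (entry.hard_limit - current) / entry.hard_limit)
```

Keeping the register as code (or YAML) also makes the “Maintain” answer concrete: headroom can be computed and trended per component, not estimated from memory.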
Step 2: Instrument leading indicators (not only “CPU is high”)
For each component, pick signals that indicate loss of headroom:
Common leading indicators to monitor
- Compute: CPU saturation, memory pressure, autoscaling events, container restarts
- Databases: connection pool utilization, slow query rate, CPU/IO wait, storage growth rate, replica lag
- Queues/streams: queue depth, consumer lag, retry rate, dead-letter volume
- Storage: disk utilization, IOPS throttling, error rates
- Network/edge: 4xx/5xx, latency percentiles, upstream dependency latency, rate limiting
Write down: metric name, where it lives (tool), threshold, alert route, and response runbook link.
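The monitoring inventory can be captured the same way. A sketch of two inventory rows plus a threshold check; the metric names, tools, routes, and runbook URLs are placeholders to adapt to your own stack:

```python
# One row per metric: where it lives, its soft threshold, the alert
# route, and the runbook link. All values below are illustrative.
INVENTORY = [
    {"metric": "db.connection_pool_utilization", "tool": "Prometheus",
     "threshold": 0.80, "route": "page:oncall-db",
     "runbook": "https://runbooks.example.com/db-pool"},
    {"metric": "queue.consumer_lag_seconds", "tool": "Prometheus",
     "threshold": 120, "route": "ticket:platform",
     "runbook": "https://runbooks.example.com/queue-lag"},
]

def breached(entry: dict, observed: float) -> bool:
    """True when an observed value crosses the soft threshold."""
    return observed >= entry["threshold"]
```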
Step 3: Establish alerting and response expectations
Define operational rules that are testable:
- Which alerts are page-worthy vs ticket-only
- Who is on-call for capacity events
- Required response actions (scale out, reduce batch concurrency, fail over, block abusive traffic, increase quotas, tune queries)
- When a capacity event becomes an incident and triggers post-incident review
Tie this to incident management so capacity alarms are not “best effort.”
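The page-versus-ticket rule is easiest to audit when it is explicit rather than tribal knowledge. A minimal sketch, assuming hypothetical severity labels and route names that you would align with your own incident management process:

```python
def route_capacity_alert(metric: str, severity: str) -> str:
    """
    Decide whether a capacity alert pages on-call or opens a ticket.
    Severity labels and routes are illustrative, not prescribed.
    """
    page_worthy = {"critical", "saturation-imminent"}
    if severity in page_worthy:
        # Page-worthy: immediate response, tied to incident management.
        return f"page:oncall ({metric})"
    # Everything else lands in a tracked backlog, never "best effort".
    return f"ticket:capacity-backlog ({metric})"
```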
Step 4: Run a recurring capacity evaluation (documented)
Set a recurring cadence (weekly, biweekly, or monthly based on volatility) and standardize the output.
Capacity review agenda (minimum)
- Review top services: peak utilization, headroom, and trend
- Review alert history: threshold breaches and near-misses
- Review upcoming changes: product launches, migrations, large customers, marketing events
- Confirm provider limits: quotas, reserved instances/commitments, database tier constraints
- Decide actions: scale, optimize, add caching, adjust limits, accept risk with sign-off
Required output: a capacity review record with date, attendees, systems reviewed, findings, and tracked action items.
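The required output can be standardized as a record type so every review produces the same sampling-friendly fields. A sketch with illustrative field names:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CapacityReviewRecord:
    """The minimum fields an auditor will sample from a capacity review."""
    review_date: date
    attendees: list        # who evaluated (roles/names)
    systems_reviewed: list # components from the capacity register
    findings: list         # e.g. "orders-db at 82% of connection limit"
    action_items: list     # ticket IDs tracked to closure

    def is_complete(self) -> bool:
        # A review with no attendees or no systems reviewed is not
        # evidence of evaluation, only of a calendar invite.
        return bool(self.attendees and self.systems_reviewed)
```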
Step 5: Convert findings into controlled changes
Capacity evaluation must produce execution evidence:
- A ticket/change request for scaling or tuning
- Approval as required by your change management process
- Implementation record (deployment, infra change, quota request)
- Validation evidence (post-change metrics, synthetic tests, load test results if used)
This is where many SOC 2 programs fail: they can show graphs and meetings, but cannot show closure.
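Closure is checkable: every material finding needs a ticket, an implementation record, and validation evidence. A sketch of that check, with illustrative field names for the evidence links:

```python
def closure_gaps(findings: list) -> list:
    """
    Return the IDs of findings that lack end-to-end evidence:
    a ticket, a change record, and post-change validation.
    The field names here are illustrative.
    """
    required = ("ticket", "change_record", "validation")
    return [f["id"] for f in findings
            if not all(f.get(k) for k in required)]
```

Running this against the action tracker before each audit cycle surfaces exactly the items that would fail sampling.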
Step 6: Prove ongoing operation (sampling-friendly evidence)
Auditors will sample. Make evidence easy to retrieve:
- Keep capacity review notes in a consistent location
- Use consistent naming for dashboards and alerts
- Tag capacity-related changes so you can pull a list for the period
Daydream is often used here to keep the control narrative, owners, and evidence requests organized so engineering doesn’t rebuild the same packet each audit cycle.
Required evidence and artifacts to retain
Keep artifacts that prove design and operation:
Control design artifacts
- Capacity Management Policy or Standard (lightweight is fine)
- Capacity register (services, thresholds, owners)
- Monitoring/alerting inventory (dashboards, alert definitions, routes)
- Runbooks for capacity alarms (scale, throttle, failover)
Operating evidence artifacts (what auditors ask for)
- Screenshots/exports showing active alerts and thresholds (dated)
- Capacity review records (calendar invite + minutes/notes)
- Action item tracker (tickets) tied to review findings
- Change records for scaling/optimization work
- Incident records where capacity contributed, plus post-incident actions
Retention should align with your SOC 2 audit period and internal retention policy; auditors typically test within the report period.
Common exam/audit questions and hangups
Auditors often get stuck on these points:
- “What is processing capacity for your system?”
  If you cannot define capacity in measurable terms per critical component, monitoring looks ad hoc.
- “Show me the evaluation.”
  Dashboards are monitoring. Evaluation is a human-reviewed record with decisions and follow-through.
- “How do you know alerts are meaningful?”
  Expect questions about thresholds, tuning, and whether alerts page the right team.
- “How do third-party limits factor in?”
  You must monitor consumption against quotas/limits for cloud services and managed platforms you depend on.
- “How do you handle planned demand spikes?”
  Auditors want to see pre-event planning records (release readiness, launch reviews, load tests where applicable).
Frequent implementation mistakes (and how to avoid them)
| Mistake | Why it fails | Fix |
|---|---|---|
| Only monitoring CPU/memory | Capacity bottlenecks often appear in DB connections, queues, or rate limits first | Monitor service-specific saturation metrics (queue depth, conn pools, throttling) |
| No recurring review cadence | You cannot prove “evaluates” without a routine | Schedule a recurring capacity review with a template and required outputs |
| Actions not tracked to closure | Findings without execution do not reduce risk | Require tickets and change records for every material capacity decision |
| Thresholds not owned | Alerts drift and get ignored | Assign metric owners and require periodic threshold tuning |
| Ignoring third-party constraints | Quotas and throttles cause outages too | Track quotas/limits and alert on approaching limits |
Enforcement context and risk implications
No public enforcement cases were provided for this requirement in the source catalog. Practically, the risk shows up in SOC 2 outcomes and customer trust: poor capacity controls lead to repeated incidents, missed SLAs, and qualified SOC 2 opinions when you cannot show consistent operation of Availability controls. The control is also a dependency for incident response effectiveness; without capacity signals and reviews, root cause analysis tends to end at “traffic spike” instead of a fixable constraint.
Practical 30/60/90-day execution plan
First 30 days: define scope and make monitoring defensible
- Identify in-scope services/components for Availability.
- Build the capacity register for critical components (start with the top tier that can cause an outage).
- Confirm dashboards exist for each component and add missing leading indicators.
- Define initial thresholds and routes; link runbooks.
- Create the capacity review template and schedule recurring meetings.
Exit criteria: you can show an auditor a list of components, the metrics you monitor, alert routes, and a scheduled evaluation process.
Days 31–60: operate the control and generate evidence
- Run capacity reviews on schedule and capture notes consistently.
- Create tickets for findings; tie them to change management.
- Tune noisy alerts; document threshold changes.
- Add monitoring for third-party quotas/limits where missing.
- Run at least one “capacity scenario” tabletop (launch spike, dependency slowdown) and record decisions as capacity actions.
Exit criteria: at least one full cycle of review → actions → implemented changes → post-change validation evidence.
Days 61–90: mature evaluation and forecasting
- Add trend reporting (weekly headroom and saturation trend for key services).
- Establish a simple forecast input process (roadmap events, customer growth assumptions, batch workload changes).
- Define risk acceptance criteria for deferred capacity work and require sign-off.
- Prepare an audit-ready evidence packet: last reviews, key alerts, key changes, and incident linkages.
Exit criteria: repeatable evidence retrieval and a clear story that connects monitoring to decisions and executed improvements.
Frequently Asked Questions
What counts as “processing capacity” in a SaaS environment?
Define it as the measurable limit that matters for each critical component (for example, requests per second at the API tier and concurrent connections at the database). Auditors accept component-level definitions when they are documented, monitored, and tied to review decisions.
Do we need formal capacity planning forecasts to satisfy TSC-A1.1?
You need evaluation, not a complex forecasting model. A lightweight trend review plus discussion of known upcoming demand drivers, recorded in capacity review notes, usually satisfies the “evaluates” expectation when paired with actions.
Are autoscaling and cloud elasticity enough by themselves?
No. Autoscaling helps maintain capacity, but you still must monitor saturation signals and evaluate whether scaling works, hits quotas, or causes cost and stability issues. Keep evidence that autoscaling events and limits are reviewed.
How do we handle capacity for third-party managed services?
Track provider limits (quotas, max throughput, connection caps) and your consumption against them. Set alerts for approaching limits and include those services in your recurring capacity review, even if the infrastructure is not directly operated by you.
What evidence is strongest for auditors: dashboards or meeting notes?
You need both. Dashboards prove monitoring exists; review records, tickets, and change logs prove evaluation and follow-through. If you must prioritize, prioritize evidence that shows decisions and completed remediation.
Our engineering team reviews metrics informally in standups. Can that count?
It can, if you standardize it. Capture a consistent record (agenda, metrics reviewed, decisions, and action items) and store it where you can retrieve it for the audit period.
Related compliance topics
- 2025 SEC Marketing Rule Examination Focus Areas
- Access and identity controls
- Access Control (AC)
- Access control and identity discipline
- Access control management
Operationalize this requirement
Map requirement text to controls, owners, evidence, and review workflows inside Daydream.
See Daydream