Annex A 8.6: Capacity Management

To meet the Annex A 8.6 (Capacity Management) requirement, you must proactively plan, monitor, and adjust compute, storage, network, and people/process capacity so critical services stay available and secure under normal and peak conditions. Operationalize it by defining capacity targets, implementing monitoring and forecasting, setting alert thresholds, and proving recurring review and action.

Key takeaways:

  • Capacity management is a security control because resource exhaustion causes outages, control failures, and incident conditions.
  • Auditors look for an end-to-end loop: targets → telemetry → thresholds → forecasting → change execution → evidence.
  • Evidence quality matters more than tooling; show decisions, approvals, and follow-through, not just dashboards.

Annex A 8.6 in ISO/IEC 27001:2022 expects you to manage capacity in a deliberate, repeatable way so systems can meet business and security needs. Capacity is broader than “server size.” It includes cloud quotas, database throughput, network saturation, storage growth, job queues, SaaS limits, logging pipelines, IAM rate limits, and even the human capacity needed to operate and recover services.

For a Compliance Officer, CCO, or GRC lead, the fastest path is to translate “capacity management” into a small set of artifacts and routines that are easy to sustain: a documented standard, an inventory of capacity-critical services, defined thresholds and SLO-aligned targets, and a recurring review cadence that produces tickets and change records. The practical risk is straightforward: if capacity is not managed, you get brownouts, failed backups, dropped logs, monitoring blind spots, and degraded incident response, which quickly becomes an availability and integrity problem.

This page gives requirement-level implementation guidance you can hand to engineering and operations, then audit without guesswork, aligned to ISO 27001 expectations 1.

Regulatory text

Excerpt (provided): “ISO/IEC 27001:2022 Annex A control 8.6 implementation expectation (Capacity Management).” 1

Operator interpretation: You need an implemented, maintained process to (1) identify capacity-dependent services and components, (2) set capacity requirements/limits, (3) monitor actual consumption, (4) forecast and test for expected demand, and (5) trigger timely action (scaling, optimization, procurement, quota increases, architectural changes). Evidence must show the process runs and results in changes, not just documentation.

Plain-English interpretation (what Annex A 8.6 means in practice)

Capacity management means you prevent “running out of room” events that break availability and security operations. Examples:

  • Your log pipeline hits throughput limits and drops security logs.
  • Backups fail because storage fills up.
  • A DDoS or traffic spike saturates network capacity and takes authentication offline.
  • Your cloud account hits API rate limits and automation fails during an incident.
  • A database reaches connection limits and critical jobs time out.

Annex A 8.6 expects you to treat these as foreseeable risks and manage them with planning, telemetry, thresholds, and change management, as part of operating an ISMS 1.

Who it applies to (entity and operational context)

Applies to:

  • Service organizations delivering services where availability and integrity depend on technology capacity 2.

Operational contexts where auditors expect strong implementation:

  • Cloud-hosted production systems (autoscaling still requires quota and cost guardrails).
  • On-prem infrastructure with procurement lead times.
  • High-ingest security tooling (SIEM, EDR telemetry, logging, metrics).
  • Customer-facing platforms with seasonal or marketing-driven spikes.
  • Regulated workloads where availability is part of customer commitments.

Teams typically involved (you need named ownership):

  • SRE/Operations (monitoring, thresholds, scaling)
  • Infrastructure/Platform (quotas, network, storage, Kubernetes)
  • Application owners (performance constraints, release-driven demand)
  • Security Operations (logging/telemetry capacity, incident needs)
  • Change management / CAB (approval and tracking)
  • Finance/Procurement (reserved capacity, contracts) when relevant

What you actually need to do (step-by-step)

1) Define scope: “capacity-critical services”

Create a short list of services where capacity failure becomes an incident. Start with:

  • Authentication/SSO, API gateway, primary application, databases, message queues
  • Security logging + alerting pipeline (log collectors, SIEM ingestion, retention)
  • Backup/restore systems, vulnerability scanning infrastructure

Output: Capacity-Critical Services Register (one page per service is enough).

2) Set capacity objectives and limits per service

For each capacity-critical service, document:

  • Primary constraints (CPU, memory, disk, IOPS, connections, QPS, bandwidth, queue depth, quotas)
  • Target operating bands (green zone) and hard limits (red zone)
  • Dependency limits (managed database max connections; SaaS API limits; WAF throughput)

Tie targets to internal availability goals or service objectives if you have them. If you do not, set pragmatic thresholds and iterate.

Output: Capacity targets table (service → metric → threshold → owner → response action).
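The targets table can be kept as simple structured data so thresholds are checkable, not just documented. A minimal sketch, assuming hypothetical service names, metrics, and threshold values (none of these numbers are prescriptive):

```python
# Hypothetical sketch: a capacity targets table as data, plus a zone check.
# Service names, metrics, owners, and thresholds are illustrative placeholders.

GREEN, AMBER, RED = "green", "amber", "red"

# service -> metric -> target band, owner, and response action
TARGETS = {
    "auth-sso": {
        "db_connections_pct": {"green_max": 60, "red_max": 85,
                               "owner": "platform-team",
                               "action": "raise pool size / open change request"},
    },
    "log-pipeline": {
        "ingest_qps_pct": {"green_max": 70, "red_max": 90,
                           "owner": "secops",
                           "action": "scale collectors / page on-call"},
    },
}

def zone(service: str, metric: str, value_pct: float) -> str:
    """Classify current utilization against the documented bands."""
    t = TARGETS[service][metric]
    if value_pct <= t["green_max"]:
        return GREEN
    if value_pct <= t["red_max"]:
        return AMBER
    return RED
```

Keeping the table in version control (or exported from it) also gives auditors a change history for threshold decisions.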

3) Implement monitoring and alerting that matches the targets

You need telemetry that answers two questions:

  • “Are we close to a limit right now?”
  • “Are we trending toward a limit before the next review?”

Minimum expectations:

  • Dashboards showing key capacity metrics for each in-scope service
  • Alerts routed to an on-call or ticketing system with defined severity
  • Alert runbooks with “what to do if threshold breached” steps

Output: Monitoring screenshots/export, alert definitions, on-call/ticket integration proof, runbooks.
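The routing step above can be sketched in code: a breach produces an alert object carrying a severity (which drives on-call vs. ticket routing) and a runbook pointer. This is an illustrative sketch; the severity names and runbook URL are hypothetical placeholders:

```python
# Hypothetical sketch: turning a threshold breach into a routed alert with a
# runbook pointer. Severity labels and the runbook URL are placeholders.

from dataclasses import dataclass

@dataclass
class Alert:
    service: str
    metric: str
    zone: str        # "amber" or "red" from the threshold check
    severity: str    # drives on-call vs. ticket routing
    runbook: str     # "what to do if threshold breached" steps

SEVERITY_BY_ZONE = {"amber": "ticket", "red": "page"}

def build_alert(service: str, metric: str, zone: str):
    """Return an alert for amber/red zones; green needs no action."""
    if zone == "green":
        return None
    return Alert(
        service=service,
        metric=metric,
        zone=zone,
        severity=SEVERITY_BY_ZONE[zone],
        runbook=f"https://runbooks.example.internal/{service}/{metric}",
    )
```

The key design point: every alert definition carries its runbook link, so the "what to do" step is never separated from the signal.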

4) Establish forecasting and review cadence

Pick a recurring forum where capacity is reviewed and decisions are made. The cadence should match volatility:

  • High-change environments (frequent releases, variable traffic): review more often.
  • Stable back-office systems: review less often, but still recurring.

Forecasting can be lightweight:

  • Trend analysis from monitoring (growth rate, seasonality)
  • Known business events (campaigns, product launches)
  • Planned engineering changes (new features, new customers)

Output: Calendar invites/agenda templates, capacity review meeting notes, trend reports.
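"Lightweight" trend analysis really can be a few lines: fit a line to recent utilization samples and estimate days until a hard limit is crossed. A minimal sketch (the sample data and limit values are illustrative):

```python
# Hypothetical sketch: lightweight trend forecasting from monitoring samples.
# Fits a least-squares line to (day_index, utilization) points and estimates
# days until the trend crosses a hard limit. Numbers are illustrative.

def days_until_limit(samples, limit):
    """samples: (day_index, value) pairs; returns days from the last sample
    until the fitted trend crosses `limit`, or None if flat/shrinking."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # least-squares slope and intercept
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    slope = num / den
    if slope <= 0:
        return None  # no growth trend toward the limit
    intercept = mean_y - slope * mean_x
    crossing_day = (limit - intercept) / slope
    last_day = samples[-1][0]
    return max(0.0, crossing_day - last_day)
```

For example, storage at 70% growing 2 points/day against a 90% limit yields roughly a week of headroom; if that is shorter than your review cadence, that is the trigger for a capacity change request.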

5) Connect capacity signals to change execution

This is where many programs fail. Define triggers and required actions:

  • Threshold breach → incident or operational ticket
  • Trend indicates limit breach before next review → capacity change request
  • Business event planned → pre-scale plan + rollback plan

Route changes through your standard change process. Keep approvals and implementation records.

Output: Tickets, change requests, CAB approvals (if applicable), implementation evidence, post-change validation.
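The triggers above amount to a small routing table; encoding it (even in a runbook) removes ambiguity about which record a signal must produce. A hypothetical sketch with illustrative labels:

```python
# Hypothetical sketch: mapping capacity signals to the required follow-up
# record, per the triggers above. Signal and ticket labels are illustrative.

def required_action(signal: str) -> str:
    """Route a capacity signal to the record the change process expects."""
    routes = {
        "threshold_breach": "open incident/operational ticket",
        "forecast_breach_before_next_review": "raise capacity change request",
        "planned_business_event": "pre-scale plan + rollback plan",
    }
    return routes.get(signal, "log for next capacity review")
```

The default case matters for audits: signals that do not trigger a change should still leave a trace in the review log.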

6) Include third parties and external dependencies

Capacity issues often sit outside your infrastructure:

  • SaaS rate limits
  • CDN/WAF throughput commitments
  • Managed database sizing constraints
  • MSP-operated components

Add third-party capacity assumptions to:

  • Contracts/SOWs where possible
  • Architecture docs and runbooks
  • Vendor/third-party due diligence where the dependency is critical

Output: Third-party dependency list per service, contract excerpts or emails on limits/quotas, escalation paths.

7) Test capacity where it matters (targeted, not theatrical)

Focus on realistic tests:

  • Load test the top customer workflows after major releases
  • Validate autoscaling behavior (including quota ceilings)
  • Simulate log surges to ensure no drops in security logging

Evidence should show test execution and remediation actions.

Output: Test plan, results summary, remediation tickets.
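Before running a live log-surge test, a back-of-envelope check tells you whether drops are even possible at the planned surge rate. A minimal sketch, assuming illustrative events-per-second rates and buffer sizes:

```python
# Hypothetical sketch: a back-of-envelope check before a log-surge test.
# If surge ingest exceeds pipeline throughput, the backlog grows; logs drop
# once the buffer fills. All rates and sizes are illustrative.

def surge_drops_logs(surge_eps: float, capacity_eps: float,
                     surge_seconds: float, buffer_events: float) -> bool:
    """True if a surge of `surge_eps` events/sec lasting `surge_seconds`
    overflows a buffer of `buffer_events`, given `capacity_eps` throughput."""
    excess = surge_eps - capacity_eps
    if excess <= 0:
        return False  # pipeline keeps up; no backlog builds
    return excess * surge_seconds > buffer_events
```

For example, a 10-minute surge at 50k events/sec against a 30k events/sec pipeline builds a 12M-event backlog; a 10M-event buffer overflows and logs drop. The live test then confirms (or refutes) the model.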

Required evidence and artifacts to retain

Auditors commonly accept a compact evidence set if it demonstrates operation:

| Evidence | What it proves | Owner |
| --- | --- | --- |
| Capacity Management Standard/Procedure | Control design and responsibilities | GRC + Ops |
| Capacity-Critical Services Register | Scope and prioritization | SRE/Platform |
| Metrics/Threshold Matrix | Defined requirements and triggers | Service owners |
| Monitoring dashboards + alert rules | Ongoing detection | SRE/Platform |
| Capacity review notes + action log | Recurring governance | Ops lead |
| Tickets/changes for scaling/optimization | Control operation and follow-through | Engineering/Ops |
| Post-change validation notes | Changes worked and risk reduced | Service owners |
| Third-party limits documentation | External constraints addressed | Procurement/TPRM |

Tip: store evidence in a single “Annex A 8.6” folder with an index page that links to live sources (dashboards, ticket searches) plus static exports for point-in-time proof.

Common exam/audit questions and hangups

Expect questions like:

  • “Which services are in scope for capacity management, and why?”
  • “Show the thresholds for your top critical service and the last time they were reviewed.”
  • “What happens when you breach a threshold? Show me the last two examples.”
  • “How do you ensure security logging capacity is sufficient during incidents?”
  • “How do you account for cloud quotas and third-party rate limits?”
  • “Where is the link between forecasting and approved changes?”

Hangups that slow audits:

  • Dashboards exist, but no documented targets or owners.
  • Reviews occur, but no minutes or action tracking.
  • Lots of alerts, but no evidence of triage and closure.

Frequent implementation mistakes (and how to avoid them)

  1. Treating autoscaling as “capacity management done.”
    Autoscaling fails at quota ceilings, dependency bottlenecks, and cost guardrails. Document quota monitoring and dependency limits.

  2. Ignoring security tooling capacity.
    Dropped logs are a security incident multiplier. Add SIEM/log pipeline throughput, storage, retention, and backlog as capacity-critical items.

  3. No decision trail.
    A pretty dashboard does not show governance. Keep a lightweight action log: decision, ticket link, due date, closure evidence.

  4. Over-scoping.
    Trying to capacity-manage every system creates fatigue. Start with a risk-based list, then expand based on incidents and change volume.

  5. Not assigning one accountable owner per service.
    Shared responsibility becomes no responsibility. Assign a service owner and an ops owner for monitoring/alerts.

Enforcement context and risk implications

No public enforcement cases were provided in the source catalog for this requirement. Operationally, capacity failures usually surface as availability incidents, missed security monitoring, failed backups, and degraded incident response. Those outcomes can become contractual breaches, audit findings, and customer trust issues even if they are not tied to a specific public regulator action in your materials 2.

Practical 30/60/90-day execution plan

First 30 days (stand up the control)

  • Publish a Capacity Management Standard: scope, roles, review cadence, evidence expectations.
  • Identify capacity-critical services and owners.
  • Define top metrics and initial thresholds for each critical service.
  • Ensure monitoring and alert routing exists for those metrics.
  • Create the evidence index (single folder + link map).

By 60 days (operate and prove it)

  • Run recurring capacity reviews; capture minutes and action items.
  • Build trend views for key metrics; record forecasts for known events.
  • Execute at least a few capacity-driven changes (quota increases, scaling policy updates, DB tuning) and retain tickets and validation notes.
  • Document third-party limits for each critical service dependency.

By 90 days (harden and audit-proof)

  • Add capacity testing for the highest-risk workflows and log pipeline.
  • Calibrate thresholds to reduce noise and increase signal quality.
  • Add executive reporting: top constraints, upcoming risks, and planned remediation.
  • Map Annex A 8.6 to your control library and set recurring evidence capture. Daydream can help keep the mapping, evidence requests, and review reminders in one place so the control stays “always audit-ready” instead of rebuilt during assessment season 1.

Frequently Asked Questions

Does Annex A 8.6 require formal capacity modeling and complex forecasting?

No. It requires a working process that anticipates capacity needs and prevents capacity-related failures. Trend-based forecasting plus documented reviews and executed changes is usually sufficient if it is consistent and evidence-backed.

We’re fully cloud-native with autoscaling. Is that enough?

Autoscaling helps, but auditors will still ask about quotas, dependency bottlenecks, and third-party limits. Show you monitor those ceilings and have a path to raise limits or re-architect before outages.

How do we scope “capacity-critical” without boiling the ocean?

Start with services whose failure triggers an incident, impacts customer access, or causes loss of security telemetry or backups. Use incident history and architecture dependency maps to justify the list.

What evidence is most persuasive in an ISO 27001 audit for capacity management?

A clear metric/threshold matrix, monitoring and alert definitions, and a small set of tickets/changes showing capacity signals led to action. Meeting notes that reference the same metrics close the loop.

How should we cover third-party capacity constraints (SaaS limits, managed services)?

Document the specific limits, where they are monitored, and the escalation process. Where possible, keep contractual or support documentation that shows agreed throughput, rate limits, or service quotas.

Who should own the control: Security, IT, or Engineering?

Security can govern the requirement, but service owners and SRE/Platform usually operate it day to day. Assign one accountable owner per service and one program owner responsible for reviews and evidence capture.

Footnotes

  1. ISO/IEC 27001 overview; ISMS.online Annex A control index

  2. ISO/IEC 27001 overview

Operationalize this requirement

Map requirement text to controls, owners, evidence, and review workflows inside Daydream.

See Daydream