Capacity Management
The HITRUST capacity management requirement means you must continuously monitor key system resources, tune them when performance degrades, and forecast future demand so systems meet required performance levels. To operationalize it, define what “required performance” means for critical services, instrument monitoring across storage/compute/memory/network, set actionable thresholds, and run a recurring capacity review that produces documented projections and decisions. (HITRUST CSF v11 Control Reference)
Key takeaways:
- You need proof of ongoing monitoring plus a repeatable capacity planning cadence, not a one-time sizing exercise. (HITRUST CSF v11 Control Reference)
- Forecasting must cover storage, processing power, memory, and network bandwidth, tied to business demand drivers. (HITRUST CSF v11 Control Reference)
- Auditors will look for thresholds, tuning actions taken, and evidence that projections drive change tickets and funding decisions. (HITRUST CSF v11 Control Reference)
Capacity management is a reliability control that auditors treat as operational hygiene: do you know when you are running out of headroom, can you prevent resource exhaustion incidents, and can you show that you plan ahead instead of reacting during outages? HITRUST CSF v11 09.h sets a straightforward expectation: monitor resource use, tune systems based on what you observe, and project future capacity so required performance is maintained across the core resource domains. (HITRUST CSF v11 Control Reference)
For a Compliance Officer, CCO, or GRC lead, the fastest path to “audit-ready” is to translate this into an operating rhythm with unambiguous outputs: a defined scope of in-scope systems, a standard set of capacity metrics and thresholds, a recurring review meeting with documented decisions, and a paper trail that connects forecasts to tickets, budgets, and configuration changes. This page gives you requirement-level guidance you can hand to Infrastructure/SRE/Cloud Ops and then govern through evidence checks, without turning it into a theoretical exercise.
Regulatory text
HITRUST CSF v11 09.h requires that: “The use of resources shall be monitored, tuned, and projections made of future capacity requirements to ensure the required system performance. Capacity planning shall address system resources including storage, processing power, memory, and network bandwidth.” (HITRUST CSF v11 Control Reference)
Operator interpretation (what this means in practice):
- Monitored: You have instrumentation and dashboards (or equivalent reporting) that show real consumption and trends for the required resource categories. (HITRUST CSF v11 Control Reference)
- Tuned: You take action when monitoring indicates risk to performance (config changes, scaling, query optimization, queue tuning, resizing, bandwidth changes), and you keep evidence of those actions. (HITRUST CSF v11 Control Reference)
- Projections made: You produce forward-looking capacity expectations tied to demand (customer growth, new integrations, batch windows, seasonal peaks, new product features), not just current utilization. (HITRUST CSF v11 Control Reference)
- Required system performance: You define performance expectations for critical services (availability and latency targets, batch completion windows, RTO/RPO alignment) and ensure capacity work supports them. (HITRUST CSF v11 Control Reference)
Plain-English requirement
You must prevent “running out of system” as a predictable cause of downtime or degraded service. Do that by (1) measuring core resource usage, (2) keeping enough headroom through tuning and scaling, and (3) documenting forecasts and decisions so performance stays within agreed expectations. (HITRUST CSF v11 Control Reference)
Who it applies to
Entities: All organizations using HITRUST CSF. (HITRUST CSF v11 Control Reference)
Operational context (where auditors expect this to exist):
- Production environments that store, process, or transmit regulated data (including cloud and on-prem).
- Shared services that can create systemic impact (identity services, logging pipelines, EDR consoles, VPN/ZTNA, message queues, databases).
- High-dependency third parties where your performance depends on their capacity (cloud platforms, managed databases, managed EDI, outsourced call centers). You cannot control their internals, but you can monitor consumption signals, manage quotas, and maintain escalation paths.
What you actually need to do (step-by-step)
1) Set scope and “required performance” targets
- List in-scope services/systems that support regulated workloads and business-critical processes.
- Define performance objectives per service in plain terms: user-facing latency expectations, batch job completion times, uptime expectations, and any internal SLO/SLA constructs you already use.
- Map dependencies (database, cache, queue, network egress, third-party APIs) so capacity signals match real bottlenecks.
Practical tip: If you cannot define targets for every system, start with “critical path” services where resource exhaustion would cause an incident, a missed processing window, or a security control failure (for example, logging not ingesting due to storage saturation).
2) Standardize the capacity metrics you will monitor
Create a minimum metric set that covers the four required domains. (HITRUST CSF v11 Control Reference)
Baseline metric set (examples):
- Storage: volume/database growth rate, free space, IOPS, throughput, snapshot/backup repository growth.
- Processing power: CPU utilization (average and peak), run queue, throttling events, autoscaling saturation.
- Memory: committed memory, page faults/swap, OOM kills, cache hit ratios (where relevant).
- Network bandwidth: throughput, packet loss, retransmits, saturation of NICs, egress constraints, load balancer capacity.
Assign an owner for each metric set (SRE, Infra, DBAs, Network) and define where the “system of record” lives (APM, cloud monitoring, NMS, SIEM summaries).
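A quick way to govern coverage is to keep the metric catalog as structured data and check it against the four required domains. The sketch below is a minimal Python example under that assumption; the domain keys and metric names are illustrative placeholders, not a prescribed schema.

```python
# Hypothetical coverage check: verify a service's metric catalog covers the
# four resource domains HITRUST 09.h names (storage, processing power,
# memory, network bandwidth). Metric names are illustrative only.
REQUIRED_DOMAINS = {"storage", "cpu", "memory", "network"}

def missing_domains(metric_catalog: dict[str, list[str]]) -> set[str]:
    """Return required resource domains with no metrics defined."""
    covered = {domain for domain, metrics in metric_catalog.items() if metrics}
    return REQUIRED_DOMAINS - covered

catalog = {
    "storage": ["free_space_pct", "growth_gb_per_day"],
    "cpu": ["utilization_p95"],
    "memory": [],                      # gap: no memory metrics defined yet
    "network": ["egress_mbps"],
}

print(sorted(missing_domains(catalog)))  # prints ['memory']
```

Running a check like this per critical service turns “only CPU is monitored” from an audit finding into a gap list you can assign for remediation.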
3) Define thresholds, headroom rules, and alert routing
- Set thresholds that are meaningful for performance degradation (warning vs critical).
- Define headroom targets for key resources (for example, “keep enough free storage to tolerate growth and restore operations”).
- Route alerts to an on-call or operations queue with clear runbooks: what to check, what to change, and when to escalate.
Audit-ready expectation: Alerts alone are not enough. You need evidence that alerts drive action and that tuning happens before performance is harmed. (HITRUST CSF v11 Control Reference)
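To make the warning/critical distinction concrete, a severity classifier can map current utilization against documented thresholds so each alert carries the severity that drives routing. This is a hedged sketch with example threshold values; your actual thresholds should come from your own performance analysis.

```python
# Illustrative threshold classifier: maps a utilization reading against
# per-resource warning/critical thresholds. The numbers are example values,
# not recommendations; document your own thresholds per service.
THRESHOLDS = {
    "storage_used_pct": {"warning": 70.0, "critical": 85.0},
    "cpu_used_pct":     {"warning": 75.0, "critical": 90.0},
}

def classify(metric: str, value: float) -> str:
    t = THRESHOLDS[metric]
    if value >= t["critical"]:
        return "critical"   # page on-call, follow runbook, escalate
    if value >= t["warning"]:
        return "warning"    # open ticket, review at the capacity meeting
    return "ok"

print(classify("storage_used_pct", 88.2))  # prints "critical"
print(classify("cpu_used_pct", 60.0))      # prints "ok"
```

The point for evidence purposes is that severity levels, routing, and the runbook actions tied to each level are documented in one place an auditor can trace.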
4) Establish a recurring capacity review with forecasting output
Run a recurring capacity review meeting (Ops + service owners + GRC observer as needed) with consistent agenda and artifacts:
- Trend review: top resource consumers, growth rates, recurring saturation points.
- Demand drivers: planned releases, onboarding, migrations, marketing events, seasonality, new data feeds.
- Forecast: projected resource requirements for each critical service.
- Decisions: scale up/out actions, tuning work, quota increases, architecture changes, budget requests.
- Tickets created: link decisions to change records and backlog items.
Minimum viable forecast: A documented projection per critical service for storage, compute, memory, and network, plus assumptions and confidence notes. (HITRUST CSF v11 Control Reference)
5) Tuning and remediation workflow (make it provable)
- Create a standard change path for capacity changes (normal change vs emergency).
- Capture tuning actions: what changed, why, before/after graphs, and validation notes.
- Post-incident integration: if any incident involved resource saturation, create a preventive capacity action item and track it to closure.
6) Include third-party capacity dependencies
Where a third party constrains capacity (API rate limits, egress caps, service quotas, managed database limits):
- Track quotas/limits and current consumption.
- Keep support escalation procedures and response expectations.
- Ensure your forecasts include third-party constraints so you request quota changes before you hit limits.
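Quota tracking can follow the same pattern as internal thresholds: record each limit and its current consumption, and flag anything above a trigger fraction so the increase request goes out before the limit bites. A minimal sketch, assuming you keep quotas as structured records; the quota names and the 80% trigger are hypothetical.

```python
# Hedged example: flag third-party quotas that need an increase request.
# Quota names and the utilization trigger are hypothetical placeholders;
# set the trigger to leave enough lead time for the vendor to respond.
def quotas_needing_increase(quotas: list[dict],
                            utilization_trigger: float = 0.8) -> list[str]:
    """Return names of quotas at or above the trigger fraction of their limit."""
    return [q["name"] for q in quotas
            if q["used"] / q["limit"] >= utilization_trigger]

quotas = [
    {"name": "api_requests_per_min", "used": 850, "limit": 1000},
    {"name": "managed_db_storage_gb", "used": 300, "limit": 500},
]
print(quotas_needing_increase(quotas))  # prints ['api_requests_per_min']
```

Reviewing this list in the recurring capacity meeting keeps third-party constraints first-class alongside your own storage, compute, memory, and network headroom.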
Daydream fit (where it naturally helps): If your capacity evidence is spread across cloud consoles, monitoring tools, and tickets, Daydream can act as the control workspace where you map the requirement to owners, collect the recurring artifacts, and keep the narrative tight for assessment without chasing screenshots across teams.
Required evidence and artifacts to retain
Auditors will ask for proof that monitoring, tuning, and forecasting are operating, not aspirational. Keep:
- Capacity management standard / procedure (scope, roles, metrics, review cadence, escalation).
- Service inventory with criticality and performance expectations.
- Dashboards or periodic reports for storage/CPU/memory/network, including trend views.
- Alert policy configuration and routing (who receives what, during what hours).
- Capacity review minutes with forecasts, assumptions, and decisions.
- Change tickets for capacity-related scaling/tuning, with before/after evidence (graphs, logs, test results).
- Incident records tied to capacity constraints (if any) and corrective actions.
- Third-party quota/limit documentation and requests for increases (where applicable).
Common exam/audit questions and hangups
Expect questions like:
- “Show me how you monitor storage, compute, memory, and network for the production environment.” (HITRUST CSF v11 Control Reference)
- “Where are thresholds documented, and who responds to alerts?”
- “How do you project future capacity needs? What assumptions do you use?” (HITRUST CSF v11 Control Reference)
- “Give examples of tuning actions taken based on monitoring.” (HITRUST CSF v11 Control Reference)
- “How do you ensure performance requirements are met during growth or peak load?” (HITRUST CSF v11 Control Reference)
- “How do you manage cloud quotas and third-party service limits?”
Hangup to avoid: Treating capacity planning as a procurement spreadsheet with no operational signal. HITRUST expects monitoring + tuning + projections tied together. (HITRUST CSF v11 Control Reference)
Frequent implementation mistakes (and how to avoid them)
- Monitoring exists, but no one owns it. Fix: assign metric owners per platform and make alert response part of on-call responsibilities.
- Only CPU is monitored. Fix: meet the explicit requirement by covering storage, processing power, memory, and network bandwidth across in-scope services. (HITRUST CSF v11 Control Reference)
- Forecasts are verbal. Fix: produce a written forecast artifact each cycle with assumptions, decisions, and links to tickets. (HITRUST CSF v11 Control Reference)
- No evidence of tuning. Fix: require before/after graphs and a short validation note in change tickets.
- Capacity reviews ignore third-party constraints. Fix: track quotas and rate limits as first-class capacity items with lead time for increases.
- No linkage to “required system performance.” Fix: document performance expectations per critical service and reference them in capacity decisions. (HITRUST CSF v11 Control Reference)
Risk implications (why auditors care)
Capacity failures are predictable and preventable. If you cannot show monitoring, tuning, and projections, you are exposed to:
- Availability incidents caused by resource exhaustion.
- Data processing delays that break downstream controls (logging gaps, failed backups, delayed security scans).
- Emergency changes under pressure, which increase operational and security risk. HITRUST frames this as maintaining required system performance through disciplined resource management. (HITRUST CSF v11 Control Reference)
Practical 30/60/90-day execution plan
First 30 days (Immediate)
- Define in-scope systems and owners; document “required performance” for critical services. (HITRUST CSF v11 Control Reference)
- Confirm monitoring coverage for storage/CPU/memory/network; identify gaps and assign remediation.
- Stand up a single capacity dashboard view per critical platform (even if rough).
- Draft the capacity management procedure and evidence checklist.
Days 31–60 (Near-term)
- Implement thresholds and alert routing with runbooks for the most failure-prone resources.
- Start the recurring capacity review; produce the first written forecast and decisions log. (HITRUST CSF v11 Control Reference)
- Ensure capacity-related changes go through ticketing with before/after evidence requirements.
- Add third-party quota tracking for critical dependencies.
Days 61–90 (Operationalize)
- Stabilize the cadence: consistent forecasts, consistent minutes, consistent ticket linkage.
- Validate that at least one tuning/scaling action has complete evidence end-to-end (monitoring signal → decision → change → validation).
- Build an audit packet in Daydream (or your GRC system): procedure, dashboards, sample alerts, meeting minutes, tickets, and a short narrative explaining how projections protect performance. (HITRUST CSF v11 Control Reference)
Frequently Asked Questions
What counts as “projections made of future capacity requirements”?
A written, forward-looking estimate of needed storage/compute/memory/network for critical services, plus assumptions and resulting decisions or tickets. A dashboard with trends helps, but you still need an explicit forecast output. (HITRUST CSF v11 Control Reference)
Do we need capacity planning for every single system?
Start with in-scope production systems that support regulated workloads and business-critical processes, then expand. Auditors mainly test whether your approach is systematic and covers the required resource categories. (HITRUST CSF v11 Control Reference)
How do we prove “tuning” happened?
Keep change tickets that cite a capacity or performance trigger, document what was changed, and attach before/after graphs or measurements plus a validation note. That shows monitoring drove action. (HITRUST CSF v11 Control Reference)
We’re cloud-native with autoscaling. Is that enough?
Autoscaling reduces risk but does not replace capacity management. You still need monitoring, thresholds, and forecasts, and you must account for quotas, scaling limits, and non-scaling constraints like storage growth and bandwidth caps. (HITRUST CSF v11 Control Reference)
How should a GRC team sample evidence without becoming the SRE team?
Define a quarterly evidence pull: pick a few critical services, collect dashboards, a capacity review artifact, and one or two capacity-related change tickets with validation. Track gaps as control issues with owners and dates in your GRC workflow.
What if a third party controls the platform (SaaS/PaaS)?
Monitor what you can (usage, quotas, rate limits, ingestion backlogs), document vendor-provided limits, and keep a process for requesting increases and escalating. Your obligation is to manage performance risk even if you do not manage the underlying hardware. (HITRUST CSF v11 Control Reference)
Authoritative Sources
Operationalize this requirement
Map requirement text to controls, owners, evidence, and review workflows inside Daydream.
See Daydream