Service continuity management

Service continuity management (ISO/IEC 20000-1:2018 Clause 8.7.2) requires you to plan and implement continuity so that critical services stay available through disruptions, define measurable recovery targets (RTO/RPO), and test continuity plans at planned intervals. To operationalize it quickly: inventory critical services, set RTO/RPO per service, build and test runbooks, and retain evidence that tests met their objectives.

Key takeaways:

  • Define service-specific RTO and RPO, then design recovery capabilities to meet them.
  • Test continuity plans on a schedule, capture results, and drive corrective actions to closure.
  • Treat third parties as dependencies; your continuity target fails if a key provider can’t recover.

“Service continuity management requirement” questions usually come down to one thing: can you prove your services will recover within agreed limits, and can you prove you test that claim? ISO/IEC 20000-1:2018 Clause 8.7.2 is short, but it is operationally demanding because it forces measurable objectives (RTO and RPO) and repeatable testing. The clause also implies governance: you need defined ownership, documented plans, and a way to learn from tests and real incidents.

For a Compliance Officer, CCO, or GRC lead, the fastest path is to translate continuity into a service-by-service obligation with clear artifacts: a continuity plan (often a set of runbooks), RTO/RPO definitions, a test plan and schedule, test records, and remediation tracking. Auditors rarely accept “we have backups” or “we have DR” as a control. They want evidence that the recovery design matches the objective and that tests validate the objective.

This page gives requirement-level implementation guidance you can hand to service owners and technology leads and then audit against.

Regulatory text

ISO/IEC 20000-1:2018 Clause 8.7.2 states: “The organization shall plan and implement continuity of services. Continuity plans shall be tested at planned intervals with recovery time and recovery point objectives defined.” [1]

Operator interpretation of the text

  • “Plan and implement continuity of services” means documented continuity plans are not enough. You need implemented capabilities (people, process, tooling, contracts, access, backups/replication, alternate sites, communications) that can actually deliver recovery.
  • “RTO and RPO objectives defined” means you must set explicit, measurable targets. If you cannot state the maximum tolerated downtime (RTO) and maximum tolerated data loss (RPO) for a service, you have not met the requirement.
  • “Tested at planned intervals” means continuity tests are scheduled, repeatable, scoped, and recorded. “We tested once a while ago” fails this clause because it is not planned and not periodic.

Plain-English requirement

You must (1) identify what “continuity” means for each service, (2) set recovery targets (RTO/RPO), (3) build plans and recovery methods that can meet those targets, and (4) test those plans on a schedule, recording results and fixing gaps.

Who it applies to

Entity types

  • Any organization providing or operating services under an ISO/IEC 20000-1 service management system, including internal IT organizations and external service providers. [1]

Operational context

  • Customer-facing production services (SaaS, hosted platforms, managed services).
  • Shared internal services that support regulated or revenue-critical operations (identity, payroll, ERP, network, service desk).
  • Services with external dependencies (cloud providers, payment processors, critical subcontractors). Your continuity posture must cover these dependencies because they affect end-to-end service recovery.

What you actually need to do (step-by-step)

1) Establish scope and ownership

Create a scoped list of “in-scope services” covered by your service management system, then assign:

  • Service owner accountable for continuity outcomes.
  • Technical recovery owner(s) accountable for runbooks and recovery execution.
  • GRC/compliance owner accountable for evidence, test scheduling discipline, and audit readiness.

Deliverable: service register with named owners.

2) Define RTO and RPO per service (and document the rationale)

For each in-scope service, set:

  • RTO (Recovery Time Objective): the maximum time the service can be down before unacceptable impact.
  • RPO (Recovery Point Objective): the maximum acceptable data loss, expressed as a time window.

How to make this audit-proof:

  • Tie RTO/RPO to business impact (customer impact, contractual obligations, safety, operational dependency).
  • Record assumptions (peak hours vs off hours, regional scope, degraded mode acceptance).
  • Get formal approval from the business owner, not only IT.

Deliverables: RTO/RPO register; approvals; dependency notes.
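The register above is simple enough to keep as structured data so that approval gaps are queryable rather than buried in documents. A minimal sketch in Python; the field names, services, and values are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class ContinuityObjective:
    """One RTO/RPO register entry. All field names are illustrative."""
    service: str
    rto_minutes: int          # maximum tolerated downtime
    rpo_minutes: int          # maximum tolerated data loss window
    rationale: str            # business-impact justification
    approved_by: str = ""     # business owner, not only IT
    assumptions: list = field(default_factory=list)

    def is_approved(self) -> bool:
        return bool(self.approved_by)

# Hypothetical register entries for illustration.
register = [
    ContinuityObjective("payments-api", rto_minutes=60, rpo_minutes=5,
                        rationale="Contractual SLA; direct revenue impact",
                        approved_by="VP Payments",
                        assumptions=["Peak-hours outage", "Single-region failure"]),
    ContinuityObjective("internal-wiki", rto_minutes=1440, rpo_minutes=240,
                        rationale="Low operational dependency"),
]

# Entries still awaiting business-owner approval: an easy audit query.
unapproved = [o.service for o in register if not o.is_approved()]
print(unapproved)  # → ['internal-wiki']
```

Keeping the rationale and assumptions next to the numbers is what makes the register "audit-proof": the approver signed off on the reasoning, not just the figure.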

3) Map dependencies that can break recovery

Create a dependency map per critical service:

  • Upstream/downstream internal systems (identity, DNS, monitoring, ticketing).
  • Data stores and replication paths.
  • Third parties required to restore service (cloud, telecom, managed database, security tooling).
  • Required people access (break-glass accounts, MFA recovery, privileged access).

Practical checkpoint: if a third party’s outage would prevent you from meeting your RTO, record that as a continuity risk and address it through architecture changes or contractual commitments.

Deliverables: dependency maps; third-party continuity requirements; risk register entries.
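The practical checkpoint above is mechanical enough to sketch: compare each third party's committed recovery time against the service RTO and flag anything that would break objective attainment. Provider names, numbers, and the function are hypothetical illustrations:

```python
def continuity_risks(service_rto_minutes, dependencies):
    """Flag third parties whose committed recovery time alone would
    exceed the service RTO. `dependencies` maps provider name to its
    contractually committed recovery time in minutes (None means no
    commitment exists, which is itself a risk to record)."""
    risks = []
    for provider, committed_minutes in dependencies.items():
        if committed_minutes is None or committed_minutes > service_rto_minutes:
            risks.append(provider)
    return risks

deps = {"cloud-dns": 30, "managed-db": 240, "telecom": None}
print(continuity_risks(60, deps))  # → ['managed-db', 'telecom']
```

Each flagged provider becomes a risk register entry to address through architecture changes, an alternate provider, or renegotiated contractual commitments.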

4) Build continuity plans that match the objectives

A continuity plan should be executable under stress. Keep it “runbook-first,” with references to architecture diagrams and configuration repositories.

Minimum contents per service:

  • Trigger criteria (what events invoke continuity procedures).
  • Roles and contact paths (primary and alternate).
  • Step-by-step recovery actions (including validation checks).
  • Data recovery steps aligned to RPO (restore points, replication failover, integrity checks).
  • Customer and internal communications steps.
  • Decision points (failover vs restore, partial service, read-only mode).
  • Post-recovery actions (monitoring, backlog handling, incident/problem record linkage).

Deliverables: service continuity runbooks; communications templates; call trees.
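The minimum-contents list above can double as a completeness check before a runbook is published or a test is scheduled. A minimal sketch, assuming runbooks are stored as structured documents; the section names and runbook shape are illustrative:

```python
# Required runbook sections, mirroring the minimum contents listed above.
REQUIRED_SECTIONS = [
    "trigger_criteria", "roles_and_contacts", "recovery_steps",
    "data_recovery", "communications", "decision_points", "post_recovery",
]

def missing_sections(runbook: dict) -> list:
    """Sections that are absent or empty in a runbook document."""
    return [s for s in REQUIRED_SECTIONS if not runbook.get(s)]

# Hypothetical partial runbook for illustration.
runbook = {
    "trigger_criteria": ["Region outage > 15 min", "Data corruption detected"],
    "roles_and_contacts": {"incident_lead": "on-call SRE", "alternate": "SRE manager"},
    "recovery_steps": ["Fail over DNS", "Promote replica", "Run health checks"],
    "data_recovery": ["Verify replica lag against RPO", "Run integrity checks"],
    "communications": [],    # present but empty: still flagged
}
print(missing_sections(runbook))  # → ['communications', 'decision_points', 'post_recovery']
```

A check like this is cheap to run in CI against a runbook repository, so incomplete plans surface before a test or a real invocation does.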

5) Define a test program (scope, schedule, success criteria)

The clause requires testing at planned intervals, so you need a documented test plan that states:

  • Test types you will run (tabletop, technical recovery, failover, restore validation).
  • Scope and boundaries (which regions, which components, what “good” looks like).
  • Success criteria tied to RTO/RPO and service health checks.
  • Evidence to collect during tests (timestamps, logs, screenshots, ticket references).
  • Exception handling (how you record partial tests or deferrals).

Deliverables: continuity test plan; test calendar; success criteria.
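"Planned intervals" is easiest to defend when the calendar is data and overdue tests are computed, not noticed. A minimal sketch; the services, dates, and intervals are illustrative:

```python
from datetime import date, timedelta

def overdue_tests(schedule, today):
    """Return services whose last continuity test is older than the
    planned interval. `schedule` rows: (service, last_test, interval_days)."""
    return [svc for svc, last_test, interval_days in schedule
            if today - last_test > timedelta(days=interval_days)]

# Hypothetical test calendar entries.
schedule = [
    ("payments-api", date(2024, 1, 10), 180),   # semi-annual technical failover
    ("internal-wiki", date(2024, 3, 1), 365),   # annual restore validation
]
print(overdue_tests(schedule, today=date(2024, 9, 1)))  # → ['payments-api']
```

Anything this check flags needs either an executed test or a recorded, approved deferral, which is exactly the exception-handling evidence the test plan should define.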

6) Execute tests and capture results like an auditor will read them

Each test should produce a record that answers:

  • What was tested, by whom, and when.
  • What objective applied (RTO/RPO).
  • What actually happened (timeline of actions, observed recovery time, observed data recovery point).
  • Whether objectives were met and why.
  • Issues found and their corrective actions with owners and due dates.

If you can’t demonstrate RTO/RPO attainment, the correct outcome is not “pass.” It is “did not meet objective,” plus remediation.

Deliverables: test reports; objective attainment statements; raw evidence.
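The attainment statement above reduces to arithmetic on the test timeline: observed recovery time against RTO, and the age of the restore point against RPO. A minimal sketch with hypothetical timestamps and objectives:

```python
from datetime import datetime

def evaluate_test(rto_minutes, rpo_minutes,
                  outage_start, recovery_declared, restore_point):
    """Compare observed test outcomes against defined objectives.
    Returns a record an auditor can read: observed values plus a
    met/not-met result per objective."""
    observed_rto = (recovery_declared - outage_start).total_seconds() / 60
    observed_rpo = (outage_start - restore_point).total_seconds() / 60
    return {
        "observed_rto_minutes": observed_rto,
        "observed_rpo_minutes": observed_rpo,
        "rto_met": observed_rto <= rto_minutes,
        "rpo_met": observed_rpo <= rpo_minutes,
    }

result = evaluate_test(
    rto_minutes=60, rpo_minutes=5,
    outage_start=datetime(2024, 5, 4, 10, 0),
    recovery_declared=datetime(2024, 5, 4, 10, 45),
    restore_point=datetime(2024, 5, 4, 9, 48),   # last replicated write
)
# RTO met (45 min <= 60); RPO missed (12 min > 5): record a finding, not a "pass".
```

Note that this only works if the test captured the three timestamps, which is why the evidence-collection step in the test plan matters as much as the recovery steps.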

7) Close the loop: corrective actions and continual improvement

A continuity program fails in audits when findings linger. Create a simple workflow:

  • Log gaps as problems/risks.
  • Assign an owner and target date.
  • Retest or validate fixes.
  • Update runbooks and training.

Tip: connect continuity findings to change management. If major architecture changes occur, update the plan and consider a targeted retest.

Deliverables: corrective action log; retest records; updated plans.
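The closure workflow above can be policed with a simple query over the corrective action log: anything past due and still open, or closed without retest evidence, is an audit finding waiting to happen. Field names and entries here are illustrative:

```python
from datetime import date

# A corrective action log as a list of records; field names are illustrative.
findings = [
    {"id": "CA-101", "owner": "dba-team", "due": date(2024, 6, 1),
     "closed": True,  "retested": True},
    {"id": "CA-102", "owner": "platform", "due": date(2024, 6, 15),
     "closed": False, "retested": False},
]

def audit_gaps(log, today):
    """Findings that would linger in an audit: past due and still open,
    or closed without a retest/validation record."""
    return [f["id"] for f in log
            if (not f["closed"] and f["due"] < today)
            or (f["closed"] and not f["retested"])]

print(audit_gaps(findings, today=date(2024, 7, 1)))  # → ['CA-102']
```

Running this on a cadence (and before every audit) turns "findings linger" from a failure mode into a report.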

Required evidence and artifacts to retain

Keep evidence in a single, reviewable place (GRC repository, service management tool, or an auditable document store). Minimum evidence set:

  • Approved RTO/RPO register per service.
  • Service continuity plans/runbooks (versioned, with owners).
  • Dependency maps and third-party continuity requirements.
  • Test plan, test schedule, and completed test reports.
  • Test evidence pack: timelines, logs, screenshots, restore validation output, incident tickets.
  • Corrective action tracking and closure evidence.
  • Training/awareness records for on-call and recovery roles (attendance, acknowledgments).
  • Records of continuity plan review after material service changes.

If you manage this in Daydream, structure it as a control with mapped evidence requests to service owners (RTO/RPO approval, latest test report, corrective action closure) so collection stays continuous rather than a pre-audit scramble.

Common exam/audit questions and hangups

Auditors typically probe these areas:

  • RTO/RPO definition quality: “Show me who approved the RTO/RPO and why those numbers are acceptable.”
  • Objective alignment: “Prove your technical design can meet the objective, not just that you want to.”
  • Testing discipline: “Show planned intervals. Were tests executed as scheduled? If not, where is the exception and approval?”
  • Test realism: “Was this a tabletop only? What was technically validated?”
  • Evidence integrity: “How do we know the timestamps and outcomes are accurate?”
  • Third-party dependency coverage: “What happens if your cloud region or key provider fails?”

Hangup to expect: teams often have disaster recovery artifacts but no service-level RTO/RPO traceability. Fixing traceability is usually the fastest audit win.

Frequent implementation mistakes (and how to avoid them)

  1. One RTO/RPO for everything.
    Avoidance: set tiers (critical, high, medium) but still assign a tier per service with explicit RTO/RPO and documented approval.

  2. Plans that are essays, not runbooks.
    Avoidance: write step-by-step recovery actions with prerequisites, commands/links, and validation checks. If a new on-call engineer can’t run it, it’s not a recovery plan.

  3. Testing that doesn’t measure RTO/RPO.
    Avoidance: include a timeline in every test report. Record start, restore milestones, and the point you declare the service recovered. Record the restore point used to evaluate RPO.

  4. No corrective action closure.
    Avoidance: treat missed objectives as formal findings. Track them like production defects with owners and verification.

  5. Ignoring third parties.
    Avoidance: add continuity requirements to third-party onboarding and renewals (recovery capabilities, test participation, notification duties). Record compensating controls if the third party cannot meet your target.

Enforcement context and risk implications

ISO/IEC 20000-1 is a certifiable standard rather than a regulation, so there is no enforcement docket to point to; treat this as a standards-conformance and assurance risk rather than a documented enforcement trend. Operationally, the failure modes are predictable: inability to restore service on time, unplanned data loss beyond the RPO, and audit findings that undermine customer trust and contract eligibility. Continuity testing is also a discovery mechanism; without it, you tend to learn the real recovery time during a real outage.

Practical 30/60/90-day execution plan

First 30 days (stabilize and define)

  • Confirm in-scope services and owners.
  • Draft the RTO/RPO register for critical services first; route for approval.
  • Identify top dependencies and third parties per critical service.
  • Collect existing DR/backup/runbook materials and assess gaps against RTO/RPO.

By 60 days (document and implement)

  • Complete continuity runbooks for critical services, including communications steps.
  • Publish a continuity test plan and test calendar (planned intervals).
  • Set up evidence capture templates (test report format, timeline, success criteria, sign-off).

By 90 days (test, measure, remediate)

  • Execute continuity tests for critical services; record RTO/RPO outcomes.
  • Open corrective actions for gaps; prioritize items that block objective attainment.
  • Update runbooks based on test results; retrain responders where procedures changed.
  • Put continuity evidence collection on a recurring cadence in your GRC workflow (Daydream or your existing system), so each test automatically generates the audit trail.

Frequently Asked Questions

Do we need RTO and RPO for every service, or only “critical” ones?

The clause requires RTO and RPO objectives to be defined, so set them for each in-scope service. You can tier services to simplify, but you still need an explicit RTO/RPO assignment and approval per service. [1]

What counts as “tested at planned intervals”?

You need a documented test plan and schedule, then executed tests that produce records. The standard does not prescribe a specific frequency, so define an interval that matches service criticality and change rate, then follow it. [1]

Are tabletop exercises enough to meet the requirement?

Tabletop exercises help validate roles, decisions, and communications, but they often fail to prove RTO/RPO attainment. For higher criticality services, include technical recovery steps and validation evidence so you can demonstrate measurable outcomes. [1]

How should we handle third-party dependencies in continuity planning?

Treat each critical third party as a continuity dependency with documented expectations, escalation paths, and failure assumptions. If a third party can’t support your RTO/RPO, document compensating controls (architecture changes, alternate providers, manual workarounds) and track the residual risk.

What evidence will auditors ask for first?

Expect requests for the RTO/RPO register with approvals, the latest continuity test report for a critical service, and proof that gaps were tracked and closed. If you can produce those quickly, the audit usually becomes a sampling exercise rather than a debate.

We met RTO but missed RPO in a test. Is that a fail?

Treat it as not meeting the defined objective for that service, and document corrective actions. Auditors focus less on perfection and more on whether you measure against objectives and fix what you find. [1]

Footnotes

  1. ISO/IEC 20000-1:2018, Information technology — Service management — Part 1: Service management system requirements

Operationalize this requirement

Map requirement text to controls, owners, evidence, and review workflows inside Daydream.
