COMPEL Glossary / GL-56

Mean Time To Recovery (MTTR)

The average elapsed time from detection of an AI incident or SLO breach to restoration of the system to an operationally healthy state.
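Written as a formula, the definition above amounts to a simple arithmetic mean. The symbols are notation introduced here for illustration and are not part of the COMPEL text: N is the number of incidents in the reporting window, and the two timestamps mark detection and restoration of incident i.

```latex
\mathrm{MTTR} = \frac{1}{N} \sum_{i=1}^{N} \left( t_i^{\mathrm{restored}} - t_i^{\mathrm{detected}} \right)
```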

What this means in practice

MTTR is measured across the incident lifecycle (detect, triage, mitigate, recover) and is tracked separately for model-specific failures (such as drift and hallucination spikes) and infrastructure failures (such as latency and availability degradations). Tracking the two categories separately gives a concrete reliability signal tied to user impact.
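As a minimal sketch of how separate tracking might look, the Python below averages detection-to-recovery time per failure category. The `Incident` record, the `mttr_by_category` helper, the category labels, and the sample timestamps are all hypothetical and chosen for illustration; COMPEL does not prescribe a data model or tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    category: str          # e.g. "model" or "infrastructure"
    detected_at: datetime  # when the incident or SLO breach was detected
    recovered_at: datetime  # when the system was healthy again


def mttr_by_category(incidents):
    """Return the mean detection-to-recovery time for each category."""
    durations = defaultdict(list)
    for inc in incidents:
        if inc.recovered_at < inc.detected_at:
            raise ValueError("recovery precedes detection")
        durations[inc.category].append(inc.recovered_at - inc.detected_at)
    # Sum the timedeltas and divide by the incident count per category.
    return {cat: sum(ds, timedelta()) / len(ds) for cat, ds in durations.items()}


# Made-up sample data: two model incidents (2 h and 4 h), one infrastructure incident (30 min).
incidents = [
    Incident("model", datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 11, 0)),
    Incident("model", datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 18, 0)),
    Incident("infrastructure", datetime(2024, 1, 7, 2, 0), datetime(2024, 1, 7, 2, 30)),
]
result = mttr_by_category(incidents)
```

Keeping the categories in separate buckets means a few long model-drift incidents cannot be masked by many short infrastructure blips, or the reverse.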

Context in the COMPEL framework

A core metric of the Reliability dimension. Captured in Evaluate and used during Learn to drive runbook and observability improvements.

Where you see this

Mean Time To Recovery (MTTR) is most commonly referenced when teams work across the Evaluate and Learn stages, especially within the Operational Readiness layer. It appears in governance artifacts, assessment instruments, and delivery playbooks wherever COMPEL is operationalized.

Related COMPEL stages

Related domains

Synonyms

mean time to recover, recovery time, MTTR

See also

  • Trust & Performance Dimensions — The eight continuous-measurement axes against which every AI transformation is evaluated in COMPEL: Value, Reliability, Safety, Responsibility, Compliance, Security, Sustainability, and Adoption.
  • Operational Readiness — The assessed capability of an organization to sustain AI operations across 10 interdependent dimensions: strategy alignment, governance maturity, operating model, workforce capability, data readiness, technology infrastructure, monitoring and observability, vendor dependency management, compliance readiness, and change and adoption.
  • Governance Control — A defined mechanism — preventive, detective, or corrective — that enforces policy compliance, mitigates identified risks, or ensures operational integrity for AI systems.