Site Reliability Engineering

Terminology

  • SLI - Service Level Indicator
    • A quantitative measure of some aspect of the level of service (e.g. latency, error rate, availability)
    • What you and your users care about.
    • Should share the user’s pain: when users have a bad experience, the SLI gets worse
  • SLO - Service Level Objective
    • A target value, or range of values, for an SLI
    • You can define more than one SLO per SLI
    • They help to set user expectations
  • SLA - Service Level Agreement
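    • An explicit or implicit contract with users, built on SLOs
    • Usually carries financial consequences (e.g. refunds or penalties) when an SLO is missed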
  • Error Budget
    • For a given SLO and time window, how long the service can be outside of the SLO
    • A service with a 99.9% SLO has a 0.1% error budget, about 43 minutes over a 30-day window
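
The arithmetic behind these terms can be sketched in a few lines of Python. This is a minimal illustration, not a standard API: the function names, the request-based definition of the SLI (good requests divided by total requests), and the sample numbers are all assumptions made for the example.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of events that were good (assumed request-based)."""
    if total_events == 0:
        return 1.0  # assumption: no traffic means nothing failed
    return good_events / total_events


def error_budget_fraction(slo_target: float) -> float:
    """Error budget as a fraction: a 0.999 SLO leaves roughly 0.001."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def allowed_downtime_minutes(slo_target: float, window_days: int) -> float:
    """Minutes the service may be outside the SLO during the window."""
    window_minutes = window_days * 24 * 60
    return error_budget_fraction(slo_target) * window_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = error_budget_fraction(slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical numbers: 99.9% SLO, 30-day window, 1,000,000 requests.
    print(f"SLI: {availability_sli(999_500, 1_000_000):.4%}")
    print(f"Allowed downtime: {allowed_downtime_minutes(0.999, 30):.1f} min")
    print(f"Budget remaining: {budget_remaining(0.999, 999_500, 1_000_000):.0%}")
```

With these sample numbers, 500 failed requests out of 1,000,000 use about half of the budget that a 99.9% SLO allows. A time-based SLI (minutes of downtime) would follow the same pattern, with minutes in place of request counts.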