#42 - Hitting Software SLA Targets through…

May 21, 2024

In this first part of a two-part series, Sebastian Vietz and I work out how to meet SLAs through SLOs and SLIs.

2 Comments

May 21, 2024 (edited)

This is a little off topic, but somewhere along this journey could you discuss the narrative that has emerged over the last few years: that the only thing that should ever be alerted on is an SLO, because that is the only thing worth getting someone out of bed for, and everything else is probably an ephemeral resource that will self-remediate? I am curious whether you think this is the right mindset, and whether you believe it is practical given the specialization of different groups, their individual KPIs, the need to see things before they become problems (potentially managed with severity levels), etc.


Ash Patel

May 22, 2024

Not the ideal mindset, but an "alert only on SLO breaches" policy does make life easier for parties suffering from high cognitive load. Unless ICs and groups are incentivized to do otherwise, they will focus only on things once they have become known problems. One is rarely rewarded in a typical org for fighting (supposedly) imaginary enemies.

Reliability Enablers
