This is a little off topic, but somewhere along this journey could you discuss the narrative that has emerged over the last few years that the only thing that should ever be alerted on is an SLO, because that is the only thing worth getting someone out of bed for, and everything else is probably an ephemeral resource that will self-remediate? I am curious whether you think this is the right mindset, and whether you believe it is practical given the specialization of different groups, their individual KPIs, the need to see things before they become problems (potentially managed with severity levels), etc.
It is not the ideal mindset, but "alert only on SLO breaches" is an easier rule to follow for teams suffering from high cognitive load. Unless ICs and groups are incentivized to do otherwise, they will focus only on things that have already become known problems. One is rarely rewarded in a typical org for fighting (supposedly) imaginary enemies.