Reliability Enablers (SREpath)
Reliability Enablers
#50 Making Better Sense of Observability Data

In this email, we explore ideas that push our thinking about observability data, like a 5th golden signal, adding π to the observability mix, and more.

Jack Neely is a DevOps observability architect at Palo Alto Networks and has a few interesting ways of extracting value from o11y data.

We crammed a lot of ideas into just under 25 minutes, including these 7 takeaways:

  1. Reassert the Need to Monitor the Four Golden Signals: Focus on latency, traffic, errors, and saturation for effective system monitoring and management.

  2. Prioritize Customer Health: In Jack’s words, this is the 5th golden signal. Go beyond traditional metrics and monitor the health of your customers for a more comprehensive view of your system's impact.

  3. Apply Mathematical Techniques: Incorporate mathematical concepts, like the Nyquist–Shannon sampling theorem and the t-digest algorithm, to improve the accuracy of your data and observability metrics.

  4. Build Accurate Percentiles: Implement techniques to accurately reproduce percentiles from raw data to ensure reliable performance metrics.

  5. Manage High Cardinality Data: Develop strategies to handle high cardinality data without overwhelming your resources, ensuring you extract valuable insights.

  6. Standardize Log Records: Using readily available frameworks to emit standardized log records makes data easier to process and visualize.

  7. Handle High-Velocity Data Efficiently: Develop methods for collecting and processing high-velocity data without incurring prohibitive costs.
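To make takeaway 4 concrete, here is a minimal Python sketch of why percentiles need to be rebuilt from raw data (or from a mergeable summary of it) rather than averaged across hosts. Everything in it is an illustrative assumption, not something from the episode: the latency samples, the bucket bounds, and the helper names are hypothetical. The fixed-bucket histogram stands in for a mergeable sketch such as t-digest, which is a more sophisticated structure and is not implemented here.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bucket upper bounds in milliseconds (assumes no sample exceeds 1000).
BUCKETS = [25, 50, 100, 250, 500, 1000]


def percentile(values, q):
    """Nearest-rank percentile for an integer q between 1 and 100."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100  # integer ceiling of q% of n
    return ordered[rank - 1]


def to_histogram(values):
    """Count samples per bucket, keyed by each bucket's upper bound."""
    return Counter(BUCKETS[bisect_left(BUCKETS, v)] for v in values)


def histogram_percentile(hist, q):
    """Estimate a percentile as the upper bound of the bucket holding that rank."""
    total = sum(hist.values())
    rank = (q * total + 99) // 100
    seen = 0
    for bound in BUCKETS:
        seen += hist[bound]
        if seen >= rank:
            return bound
    return BUCKETS[-1]


# Made-up latency samples: a busy host with a slow tail and a quiet, fast host.
host_a = [10] * 980 + [500] * 20
host_b = [10] * 10

# Averaging per-host p99 values ignores how much traffic each host served.
averaged_p99 = (percentile(host_a, 99) + percentile(host_b, 99)) / 2

# Recomputing from the combined raw samples gives the actual p99.
true_p99 = percentile(host_a + host_b, 99)

# Histograms merge by adding bucket counts, so each host can ship a small summary.
merged = to_histogram(host_a) + to_histogram(host_b)
estimated_p99 = histogram_percentile(merged, 99)

print(averaged_p99, true_p99, estimated_p99)
```

In this made-up data set the averaged figure lands well below the p99 of the combined traffic, because the quiet host counts as much as the busy one. The merged histogram stays close to the raw-data answer, with precision limited by the bucket width.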

Watch Jack’s Monitorama talk here:

Software reliability is a tough topic for engineers in many organizations. The Reliability Enablers (Ash Patel and Sebastian Vietz) know this from experience. Join us as we demystify reliability jargon like SRE, DevOps, and more. We interview experts and share practical insights. Our mission is to help you boost your success in reliability-enabling areas like observability, incident response, release engineering, and more.