SLOs eased
You can love running or hate it, but you will definitely love this analogy. Take a fresh look at SLOs!

Since the start of COVID, 👟 Runkeeper reports a 62% global spike in people heading out for a weekly run. To put this statistic in context: there is a 47.3% global increase in people running compared to last year. And every one of those runners has one objective: to run more and run better.

⏱️ Which one of these stats do you associate with better?

  1. ✅ Yesterday, I ran 5 km in 25 minutes.
  2. ❌ Yesterday, I ran at 13 kmph.

❤️ Which one of these do you associate with better?

  1. ✅ During Yesterday's run, my average Heart rate was 140 bpm.
  2. ❌ During Yesterday's run, my heart pumped 3500 times.

Some measurements, like total distance over time, only make sense as a cumulative sum, whereas others, like heart rate, only make sense as an aggregate over time.
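Here is a minimal sketch of that difference in Python. The per-minute samples are made-up values for illustration, not data from a real run.

```python
# Hypothetical per-minute samples from a short run (illustrative values only).
distance_per_minute_km = [0.20, 0.21, 0.19, 0.20, 0.20]
heart_rate_per_minute_bpm = [130, 138, 142, 145, 145]

# Distance accumulates: the meaningful number is the running total.
total_distance_km = sum(distance_per_minute_km)

# Heart rate does not accumulate: summing it is meaningless,
# so we aggregate it over the run instead (here, a simple mean).
average_heart_rate_bpm = sum(heart_rate_per_minute_bpm) / len(heart_rate_per_minute_bpm)

print(f"Total distance: {total_distance_km:.2f} km")
print(f"Average heart rate: {average_heart_rate_bpm:.0f} bpm")
```

Summing the heart rate samples would give you a number, but not one that tells you anything about the run.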

Something about Targets and Objectives

🏃 Every runner sets themselves targets that would look like these:

👟 What do I need to achieve these goals?

  • Consistent motivation
  • Maximum performance. Performance is a relative measure, so we have to define what we need to track in order to measure it.

⏮️ When will you come to know that you have achieved your target?

  • End of the year.

Objectives here are what we call Lagging indicators.

How do I ensure that I consistently progress towards achieving those goals?

  • For this, we need a continuous measure.

The continuous measure here is my average active days per week, or a moderate pace that stays under x min/km. We call these Leading indicators.

When we set annual running targets, we only talk about outcomes. So, for example, we do not speak about Heartbeat as a yearly target.

That is the difference between a Leading Indicator and a Lagging Indicator of performance.

Remember, the best way to run a fast marathon is to run most of your laps at 4 mins/km.

As another variation, most people rely on average heart rate to gauge their capacity to do more in the remainder of the run. If you keep a reasonable heart rate, your chances of hitting the goal improve.

But how do I define a reasonable heart rate?
Option A: 120 bpm in the first minute and 119 bpm in the second
Option B: 209 bpm in the first minute and 20 bpm in the second ☹️

So what do we need to do?

  • We need to increase the number of good minutes. In the case of heart rate: 95% of the time, my heart rate must be < 120 bpm.
  • We need to increase the number of good km in interval time. For example, I must run 90% of my km in under 4.5 minutes each.
  • We need to increase the number of active days. In the case of yearly goals: I must be active on 80% of the days.

Keep the leading indicators in check, and success at your objectives will be a by-product.
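To see why counting good minutes beats averaging, here is a small Python sketch using the two heart rate options above. The "good" band of 100 to 150 bpm is an assumption made purely for illustration; pick whatever range fits your own training.

```python
# Assumed "good" heart rate band in bpm (hypothetical, for illustration only).
GOOD_LOW, GOOD_HIGH = 100, 150

def mean(values):
    """Plain average of the samples."""
    return sum(values) / len(values)

def good_ratio(values, low, high):
    """Fraction of samples that fall inside the good band."""
    good = sum(1 for v in values if low <= v <= high)
    return good / len(values)

option_a = [120, 119]  # bpm in minute 1 and minute 2
option_b = [209, 20]   # bpm in minute 1 and minute 2

for name, samples in (("A", option_a), ("B", option_b)):
    print(
        f"Option {name}: average {mean(samples):.1f} bpm, "
        f"good minutes {good_ratio(samples, GOOD_LOW, GOOD_HIGH):.0%}"
    )
```

Option A averages 119.5 bpm and Option B averages 114.5 bpm, so on averages alone B even looks slightly calmer. Count the good minutes, though, and A scores 100% while B scores 0%.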

🤷🏻‍♂️ How is all of this tied to SREs and SLOs?

The responsibility of choosing the right indicators and knowing what to aggregate is an integral part of an SRE's job. Unfortunately, they often overlook customer experience in favor of perceived "real" metrics like CPU, Memory, Disk, etc.

SREs are trained, even coaxed, to measure what is easy rather than what is right. But in modern cloud-native systems, where infrastructure perishes and resurrects by the minute, monitoring components and servers feels like an outdated trick. In the contemporary age of serverless, services are the only experience the customer cares about. And hence, Service Level Objectives.
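The same "good minutes" idea carries over to a service as "good requests". Below is a rough Python sketch, not a prescribed implementation: the `Request` type, the 300 ms latency threshold, the 95% target, and the sample requests are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Request:
    status: int        # HTTP status code returned to the customer
    latency_ms: float  # time taken to serve the request

# Assumed definitions of "good" and of the objective (hypothetical values).
LATENCY_THRESHOLD_MS = 300.0
SLO_TARGET = 0.95  # 95% of requests should be good

def is_good(req: Request) -> bool:
    """A request is good if it did not fail server-side and was fast enough."""
    return req.status < 500 and req.latency_ms <= LATENCY_THRESHOLD_MS

def sli(requests: list[Request]) -> float:
    """Service level indicator: the fraction of good requests."""
    return sum(is_good(r) for r in requests) / len(requests)

# Made-up traffic: one slow request and one server error out of four.
requests = [
    Request(200, 120.0),
    Request(200, 450.0),
    Request(503, 80.0),
    Request(200, 95.0),
]

service_sli = sli(requests)
print(f"SLI: {service_sli:.0%}, target: {SLO_TARGET:.0%}, met: {service_sli >= SLO_TARGET}")
```

Note that nothing here looks at CPU, memory, or disk. The indicator is built only from what the customer experienced.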

Observe a Service; Not a Server
Gone are the days of yore when we named our servers Etsy, Betsy, and Momo, fed them fish, and cleaned their poop.

🥇 However, relying on a single percentage to identify the overall health of a service may feel risky at first. Please refer to our detailed SLO workbook to learn how to adopt SLOs.

🎁 To summarize,

  • To achieve predictable progress - have a clear objective to know what success looks like.
  • But, objectives can only have success/failure as an outcome. So, to avoid disappointments, you need a Leading Indicator of continuous performance for faster course correction.
  • Service Level Objectives are a practical framework for measuring and preventing undesired outcomes.
Let us know your running, availability, and reliability targets at the beginning of this new year. We might not wake you up every morning to achieve your running target, but we will wake you up whenever your availability and reliability targets are under threat.

Until then, Happy Running! 🏃🏻‍♂️ 👟