SLOs eased
You can either love running or hate running, but you will definitely love this analogy - take a fresh look at SLOs!


Since the start of COVID, 👟 Runkeeper reports a 62% global spike in people heading out for a weekly run. To put this statistic in context, there is a 47.3% global increase in people running compared to last year. And every one of those runners has one objective: to run more and run better.


⏱️ Which one of these stats do you associate with better?

  1. ✅ Yesterday, I ran 5 km in 25 minutes.
  2. ❌ Yesterday, I ran at 13 kmph.

❤️ Which one of these do you associate with better?

  1. ✅ During yesterday's run, my average heart rate was 140 bpm.
  2. ❌ During yesterday's run, my heart pumped 3500 times.

Some measurements, like the total distance covered over time, only make sense as a cumulative sum, whereas others, like heart rate, only make sense as an aggregate over time.
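Here is a minimal Python sketch of that distinction. The per-minute samples are made up for illustration (they are not Runkeeper data), chosen to match the 25-minute run above: distance gets summed, heart rate gets averaged.

```python
# Hypothetical per-minute samples for a 25-minute run (illustrative values only).
distance_km_per_minute = [0.2] * 25   # km covered in each minute
heart_rate_bpm = [140] * 25           # heart rate sampled once per minute

# Distance only makes sense as a cumulative sum.
total_distance_km = sum(distance_km_per_minute)

# Heart rate only makes sense as an aggregate (here, a mean) over time.
average_heart_rate = sum(heart_rate_bpm) / len(heart_rate_bpm)

# Summing heart rate gives total beats, a number that says little about the run.
total_beats = sum(heart_rate_bpm)

print(f"{total_distance_km:.1f} km, {average_heart_rate:.0f} bpm average, {total_beats} beats")
```

Averaging the per-minute distance, or summing the heart rate, is just as easy to compute, but neither number tells you how the run went.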


Something about Targets and Objectives

🏃 Every runner sets themselves targets that would look like these:

👟 What do I need to achieve these goals?

  • Consistent motivation
  • Maximum performance. Performance is a relative measure, so we have to define what we need to track to measure it.

⏮️ When will you come to know that you have achieved your target?

  • End of the year.

Objectives here are what we call Lagging indicators.

⏩ How do I ensure that I consistently progress towards achieving those goals?

  • For this, we need a continuous measure.

The continuous measure here is my average active days per week, or a target such as keeping my moderate pace under x min/km. These are what we call Leading indicators.

When we set annual running targets, we only talk about outcomes. For example, we do not set heartbeat as a yearly target.

That is the difference between a Leading Indicator and a Lagging Indicator of performance.

Remember, the best way to run a fast marathon is to run most of your laps at 4 min/km.

As another variation, most people rely on average heart rate to gauge how much more they can do in the remainder of the run. If you keep a reasonable heart rate, your chances of hitting the goal improve.

But how do I define a reasonable heart rate?
Option A: 120 bpm in the first minute and 119 bpm in the second
Option B: 209 bpm in the first minute and 20 bpm in the second ☹️
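Working the two options through shows why the average alone is a poor answer. In this small Python sketch (the variable names are ours), Option B ends up with the lower average even though neither of its readings looks reasonable.

```python
from statistics import mean

# Heart rate in bpm for minute 1 and minute 2, taken from the two options above.
option_a = [120, 119]
option_b = [209, 20]

mean_a = mean(option_a)
mean_b = mean(option_b)

# The average hides how far the individual minutes swing.
spread_a = max(option_a) - min(option_a)
spread_b = max(option_b) - min(option_b)

print(f"Option A: mean {mean_a} bpm, spread {spread_a} bpm")
print(f"Option B: mean {mean_b} bpm, spread {spread_b} bpm")
```

Option A averages 119.5 bpm with a 1 bpm spread. Option B averages 114.5 bpm with a 189 bpm spread.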

So what do we need to do?

  • We need to improve the number of good minutes. In the case of heart rate: 95% of the time, my heart rate must be < 120 bpm.
  • We need to improve the number of good km in the interval. For example, I must run 90% of my km in < 4.5 minutes each.
  • We need to improve the number of active days. In the case of yearly goals, I must be active on 80% of the days.
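All three targets share one shape: count the good events, divide by the total, and compare against a target ratio. Below is a rough Python sketch of that idea. The helper names (`compliance`, `meets_target`) and the sample data are hypothetical, picked only to illustrate the thresholds listed above.

```python
def compliance(samples, is_good):
    """Return the fraction of samples that count as good."""
    if not samples:
        raise ValueError("need at least one sample")
    good = sum(1 for sample in samples if is_good(sample))
    return good / len(samples)


def meets_target(samples, is_good, target):
    """True when the share of good samples reaches the target ratio."""
    return compliance(samples, is_good) >= target


# Hypothetical data, for illustration only.
heart_rate_per_minute = [118] * 19 + [121]   # bpm, one reading per minute
minutes_per_km = [4.3] * 9 + [4.6]           # time taken for each km
active_days = [True] * 5 + [False] * 2       # one entry per day of a week

heart_ok = meets_target(heart_rate_per_minute, lambda bpm: bpm < 120, 0.95)
pace_ok = meets_target(minutes_per_km, lambda minutes: minutes < 4.5, 0.90)
active_ok = meets_target(active_days, lambda active: active, 0.80)

print(heart_ok, pace_ok, active_ok)
```

With this sample data, the heart-rate and pace targets are met (19 of 20 and 9 of 10 samples are good), while the active-days target is missed (5 of 7 days is about 71%).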
💡
Keep the leading indicators in check, and success at your objectives will be a by-product.

🤷🏻‍♂️ How is all of this tied to SREs and SLOs?

Choosing the right indicators and knowing what to aggregate is an integral part of an SRE's job. Unfortunately, SREs often overlook customer experience in favor of perceived "real" metrics like CPU, memory, and disk.

SREs are trained, or coaxed, to measure what is easy rather than what is right. But in modern cloud-native systems, where infrastructure perishes and resurrects by the minute, monitoring components and servers feels like an outdated trick. In the contemporary age of serverless, services are the only experience the customer cares about. Hence, Service Level Objectives.

Observe a Service; Not a Server
Gone are the days of yore when we named our servers Etsy, Betsy, and Momo, fed them fish, and cleaned their poop.

🥇 However, relying on a single percentage to identify the overall health of a service may feel risky at first. Please refer to our detailed SLO workbook to learn how to adopt SLOs.
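For a service, the "good minutes" of the running analogy become good requests. Here is a rough Python sketch of how such a single percentage could be computed. The `Request` type, the 300 ms latency threshold, the 99% target, and the traffic numbers are all assumptions made for illustration, not values from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int        # HTTP status code returned to the customer
    latency_ms: float  # time taken to answer


def is_good(request, max_latency_ms=300):
    """Assumed definition of good: no server error, and fast enough."""
    return request.status < 500 and request.latency_ms <= max_latency_ms


def service_level(requests):
    """Share of requests that were good from the customer's point of view."""
    good = sum(1 for request in requests if is_good(request))
    return good / len(requests)


# Hypothetical traffic: 97 fast successes, 2 slow successes, 1 server error.
requests = (
    [Request(200, 120.0)] * 97
    + [Request(200, 900.0)] * 2
    + [Request(503, 40.0)]
)

slo_target = 0.99  # assumed objective: 99% of requests are good
level = service_level(requests)

print(f"service level {level:.2%}, target {slo_target:.0%}, met: {level >= slo_target}")
```

Nothing here looks at CPU, memory, or disk. The percentage is built only from what the customer experienced.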


🎁 To summarize,

  • To achieve predictable progress, have a clear objective so you know what success looks like.
  • But objectives can only have success or failure as an outcome. So, to avoid disappointment, you need a Leading Indicator of continuous performance for faster course correction.
  • Service Level Objectives are a practical framework for measuring and preventing undesired outcomes.
💡
Let us know your running, availability, and reliability targets at the beginning of this new year. We might not wake you up every morning to achieve your running target, but we will wake you up whenever your availability and reliability targets are under threat.

Until then, Happy Running! 🏃🏻‍♂️ 👟