Sample vs Metrics vs Cardinality
When dealing with Time Series databases, I always got confused with Sample vs Metrics vs Cardinality. Here’s an explanation as I have understood it.
Sample vs Metrics vs Cardinality

When dealing with Time Series databases, I always got confused with Sample vs Metrics vs Cardinality. Here’s an explanation as I have understood it.

A sample is the most atomic record of observability that cannot be explored beyond its attributes.

A sample has a few key attributes.

  • Timestamp: Of the observation.
  • Value: Of the observation.
  • Name: What is being recorded?
  • Duration: Time between two recordings.
Dimensions are City: Austin, State: Texas, Country: USA, ZIP: 94043 Metric name is `total_water_consumed` The period is 19:00–19:10 on 20th January 2022

Key things

  • The data is not AT the time of observation, but the duration between two observations. Ex. 17th Jan 2022, 18:05 till 17th Jan 2022, 18:06
  • Some call this Period.
  • No matter your service, the data is always an aggregate for a minimum time range.

As your Observability grows, you face three key challenges.

  • More metrics are to be observed. Also called Coverage.
  • Same metric for more entities. Also called Cardinality.
  • Save metrics for a longer duration. Also called Retention.

Retention

As you start saving data for longer, at a given time, you have more historical data that can be used for trend analysis. But like any Data system, this adds additional strain to keep the system reliable for a larger volume of data.

We were earlier saving data for only 10 minutes. When we increase the retention to 30 minutes, 2 more samples will be preserved.

Cardinality

It is the number of unique attributes observed for a given metric. Cardinality is the most formidable challenge because the computation and memory needed to tackle queries grow exponentially.

We were monitoring total_water_consumed for a particular zip code, say we need 2 additional zip codes to be monitored, It would mean 2 more samples for the same metric + duration.

More Observability

Coverage increases when you observe more metrics or more dimensions. But, Saving more metrics does not mean better Observability; it may also just mean more unused data.

For the same duration AND for all the 3 ZIP Codes, we decide to observe total_power_consumed in addition to the total_water_consumed. Therefore, we will save 3 more samples, 1 for each zip_code AND timestamp.

Growth in Real Data

Modern Time Series systems don’t have to grow along a single axis of Cardinality, Coverage, or Retention alone. Instead, the rate of ingestion and exploration warrants an expansion on all three axes.

It’s crucial to understand how samples get involved while querying and the scale and performance limitations.

Say, I want water AND power consumption for 20 minutes but only for {austin, texas, USA, 94043}.

It will only involve one Dimension, two Coverage, and two Time units.

Say, I want ONLY water consumption for 20 minutes, but for {austin, texas, USA, 94043} AND {austin, texas, USA, 95058}

It only involves one Coverage, two Dimensions, and two Time units.

Conversely, If I want Power AND Water consumption across {austin, texas, USA, 94043} AND {austin, texas, USA, 95058} but ONLY for 10 minutes.

It will involve one Time, two Dimensions, and two Coverage units.

Whereas, If I want Water and Power consumed by all Females of {austin, texas, USA, 94043}, I won’t find any results.

Remember: A sample is the most atomic record of observability that cannot be explored beyond its attributes.

Concluding

  • For better observability, you need to have more exemplary attributes. Resulting in more samples.
  • For better coverage, you need more metrics. Resulting in more samples.
  • For historical knowledge, you need more retention. Resulting in more samples.
A typical time-series database is subjected to various workloads, all requesting different kinds and ranges of data.

Found this interesting? Got a different PoV? Reach out to me on Twitter — @realmeson10 . Oh, also, you should check out last9.io; we're building reliability tools to make running systems at scale, fun, and embarrassingly easy. Check us out. :)

Share to:
Twitter
Reddit
Linkedin
#devops #sre #SRE Tooling #Last9 #Last9 Engineering #Observability #monitoring #reliability #Time Series Database

You might also like...

How we won Dukaan over
How we won Dukaan over

5 meetings. 1 month. From introductions, to a demo, and ultimately winning Dukaan over. Subhash and his team’s velocity on decision-making, moving fast, and radical candor, is a breath of fresh air in the Indian startup ecosystem.

Read ->
How to calculate HTTP content-length metrics on cli
How to calculate HTTP content-length metrics on cli

A simple guide to crunch numbers for understanding overall HTTP content length metrics.

Read ->
Last9 completes SOC II Type 2 Certification
Last9 completes SOC II Type 2 Certification

The comprehensive audit validates Last9 as a trusted SRE partner; a crucial process to work with highly regulated industries.

Read ->

SRE with Last9 is incredibly easy. But don’t just take our word for it.