1 of 50

Unlocking Cost Savings & New Possibilities:

Your Guide to Prometheus Remote Write 2.0

PromCon; Berlin; 11.09.2024

Alex Greenbank, Senior SWE at Grafana Labs

Bartek Płotka, Senior SWE at Google

alexgreenbank

bwplotka

8 of 50

Remote Storage

🎉🎉🎉🎉

10 of 50

Who are we?

Alex Greenbank

Senior Software Engineer @ Grafana Labs

  • Far too many years dev/maint for IBM Tivoli Netcool/OMNIbus
  • Escaped to do more with OSS
  • Newly minted Prometheus maintainer, OTLP contributor
  • 5-a-side football

13 of 50

Who are we?

Bartłomiej Płotka

Senior Software Engineer @ Google

  • Tech Lead for Google Cloud Managed Service for Prometheus
  • OSS Maintainer e.g. Prometheus, client_golang, Thanos and more (mostly Go libs/projects)
  • Tech Lead for CNCF TAG Observability
  • Efficient Go book author: https://www.bwplotka.dev/book
  • Motorcycling!


14 of 50

Why Remote Write 2.0

16 of 50

8 Years Ago


17 of 50

Adoption


18 of 50

Retroactive 1.0 Spec


19 of 50

Stable for years

8y ago

1.0 today


20 of 50

Closely following Prometheus storage data model

1.0 today

PromQL


21 of 50

…but Prometheus storage evolves!

1.0 today

  • Native Histograms (both exponential and custom bucketing)

  • Exemplars
  • Metadata
  • Created Timestamp
  • UTF-8

?


22 of 50

1.x Experiments

1.0 today

Experimental 1.x proto today


23 of 50

So the 2.0 work started!


24 of 50

Requirements

Prometheus RW requirements:

  • low complexity, light dependencies
  • easy upgradability of existing clients
  • easy scalability of receivers
  • close to Prometheus storage model

29 of 50

Ideas and alternatives considered

Prometheus RW requirements:

  • low complexity, light dependencies
  • easy upgradability of existing clients
  • easy scalability of receivers
  • close to Prometheus storage model

What if…

…we bring back gRPC? ✖️

…we make the protocol stateful? ✖️

…we use OpenTelemetry’s OTLP? ❓

…we use Arrow format? ❓

NOTE: We plan to revisit and experiment with those in the future! 🤗


30 of 50

Development Process

  • Originally a Grafana Labs Hackathon project by @cstyan
  • Working group on CNCF Slack: #prometheus-prw2-dev
  • Fortnightly meetings
  • prometheus/proposals repo (“Why decisions were made”)
  • Spec file in the prometheus/docs repo (“What has been decided”)


31 of 50

What Does 2.0 Enable?


32 of 50

Spec: Partial write statistics

  • Problem:
    • Send 100 metrics via Remote Write: 99 are ingested fine, one has a problem (too old, duplicate, etc.)
    • How do you know?
      • A single HTTP status code is too coarse
  • Solution:
    • Response headers that provide insight into partial writes
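
A minimal sketch of how a sender might use such headers to spot a partial write. The three header names are recalled from the Remote Write 2.0 spec and should be checked against it; the helper name, its return shape, and treating a missing header as zero written are assumptions made for illustration only.

```python
# Header names as recalled from the Remote Write 2.0 spec (verify before use).
WRITTEN_HEADERS = {
    "samples": "X-Prometheus-Remote-Write-Samples-Written",
    "histograms": "X-Prometheus-Remote-Write-Histograms-Written",
    "exemplars": "X-Prometheus-Remote-Write-Exemplars-Written",
}


def unwritten_counts(sent, response_headers):
    """Return, per element type, how many sent elements were not confirmed.

    `sent` maps "samples"/"histograms"/"exemplars" to the number sent.
    Hypothetical helper: a missing header is treated as zero written,
    which is a simplifying assumption of this sketch.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    missing = {}
    for kind, header in WRITTEN_HEADERS.items():
        written = int(headers.get(header.lower(), 0))
        missing[kind] = sent.get(kind, 0) - written
    return missing


# The slide's scenario: 100 sent, the receiver confirms 99.
result = unwritten_counts(
    {"samples": 100, "histograms": 0, "exemplars": 0},
    {"X-Prometheus-Remote-Write-Samples-Written": "99"},
)
print(result)
```

With counts like these a sender can report exactly how much was dropped instead of guessing from the status code alone.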


33 of 50

Spec: Native Histograms, CT, Exemplars, UTF-8

  • The PRW 2.0 spec natively supports many recently added features
    • Native Histograms
    • Created Timestamp
    • Exemplars
    • UTF-8 support
  • Structure and Transactionality
  • Separates Remote Read and Remote Write


34 of 50

Spec: Metadata Always On

  • Problem(s):
    • Metadata cache only updated upon scrape
    • Only per metric family name (unique “__name__” value)
  • Solutions:
    • Metadata in the WAL (thanks: Paschalis Tsilias et al)
      • Could remove chunks of caching code
    • More repeated Data vs Stateful Implementation
      • Trade-off of more data balanced by string interning
    • Metadata now per series


35 of 50

Wait, more features == overhead, no?


36 of 50

Spec: String Interning


37 of 50

Spec: String Interning

{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglefoo", node="node_a"}
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglefoo", node="node_b"}
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglebar", node="node_a"}
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglebar", node="node_b"}

"__name__" -> 1
"kube_pvc_capacity_bytes" -> 2
"cluster" -> 3
"ops-us-east-0" -> 4
"job" -> 5
"arglefoo" -> 6
"node" -> 7
"node_a" -> 8
"node_b" -> 9
"arglebar" -> 10
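
The mapping above can be reproduced with a short sketch. It assumes each string gets the next free index the first time it is seen, and that index 0 is reserved for the empty string (as recalled from the 2.0 spec, which is why the first real symbol lands on 1). The function and variable names are hypothetical, not taken from the Prometheus code base.

```python
def intern_series(series_list):
    """Build one shared symbols table plus per-series label references.

    Each series is a list of (name, value) label pairs. The result is the
    symbols list and, per series, a flat list of indices into it in the
    order name, value, name, value, ...
    """
    symbols = [""]        # index 0 reserved for the empty string (assumption)
    index = {"": 0}       # string -> position in `symbols`
    all_refs = []
    for labels in series_list:
        refs = []
        for name, value in labels:
            for text in (name, value):
                if text not in index:
                    index[text] = len(symbols)
                    symbols.append(text)
                refs.append(index[text])
        all_refs.append(refs)
    return symbols, all_refs


def pvc_series(job, node):
    """One of the four example series from the slide."""
    return [
        ("__name__", "kube_pvc_capacity_bytes"),
        ("cluster", "ops-us-east-0"),
        ("job", job),
        ("node", node),
    ]


series = [
    pvc_series("arglefoo", "node_a"),
    pvc_series("arglefoo", "node_b"),
    pvc_series("arglebar", "node_a"),
    pvc_series("arglebar", "node_b"),
]
symbols, refs = intern_series(series)
for series_refs in refs:
    print(series_refs)
```

Each of the 32 label strings in the four series is now sent as a small integer, and the ten distinct strings travel once in the symbols table.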

39 of 50

Spec: String Interning

Remote Write 1.0

// RW 1.0
message TimeSeries {
  repeated Label labels = 1;
}

message Label {
  string name = 1;
  string value = 2;
}

VS

Remote Write 2.0

// RW 2.0
message WriteRequest {
  repeated string symbols = 1;
}

message TimeSeries {
  // labels_refs is a list of label name-value pair references,
  // encoded as indices to the WriteRequest.symbols array.
  repeated uint32 labels_refs = 1;
}


40 of 50

[1.0] vs [2.0 + features]

+7.83%

-23.06%

-68.49%


41 of 50

1.0 + features vs 2.0 + features

-46.68%

-61.48%

-84.32%


42 of 50

1.0 + features vs 2.0 + features & NHCB

-58.16%

-72.64%

-88.86%


43 of 50

How can you use or adopt it?


44 of 50

How to use it

  • Enable feature flag to write Metadata records to WAL
    • --enable-features=metadata-wal-records
  • Enable config for remote write: specify io.prometheus.write.v2.Request as the protobuf message


45 of 50

How to adopt it

  • Upgrade your client/receiver to support the new proto, the simple content negotiation, and write statistics
    • If you use Go and the Prometheus remote package (client or handler): simply upgrade and tell it which protocol you want to use/accept.
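
For senders not built on the Prometheus Go packages, the content negotiation can be sketched roughly as follows. The header values and the fallback on HTTP 415 are recalled from the Remote Write specs and should be verified there; the function names are hypothetical and no real HTTP request is made.

```python
PROTO_V1 = "prometheus.WriteRequest"
PROTO_V2 = "io.prometheus.write.v2.Request"


def request_headers(proto_message):
    """Headers a sender would attach for the chosen protobuf message.

    Values are as recalled from the specs (an assumption of this sketch).
    """
    version = "2.0.0" if proto_message == PROTO_V2 else "0.1.0"
    return {
        "Content-Encoding": "snappy",
        "Content-Type": f"application/x-protobuf;proto={proto_message}",
        "X-Prometheus-Remote-Write-Version": version,
    }


def next_protocol(proto_message, status_code):
    """Choose the protobuf message for the next attempt.

    Falls back to 1.0 when a 2.0 request is answered with
    415 Unsupported Media Type; otherwise keeps the current choice.
    """
    if proto_message == PROTO_V2 and status_code == 415:
        return PROTO_V1
    return proto_message


print(request_headers(PROTO_V2)["Content-Type"])
print(next_protocol(PROTO_V2, 415))
```

Keeping the choice in the Content-Type header means a receiver can accept both versions on the same endpoint.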

Planned:

  • Senders are welcome to integrate with our Prometheus compliance tests.
  • For testing and benchmarking we plan to upstream the promtool rwtest tool.
  • For load generation you can use avalanche and k6.

48 of 50

Summary


49 of 50

Key takeaways

  • 2.0 enables new features while staying efficient.
  • Specify io.prometheus.write.v2.Request to use it with Prometheus.
  • If you own a third-party project or tool: upgrade, test compatibility, and enjoy!
  • Help us evolve the remote storage story!


50 of 50

Thank You! Questions?

Alex Greenbank, Senior SWE at Grafana

Bartek Płotka, Senior SWE at Google

Kudos to (hidden heroes)

  • Callum, Nico, Paschalis, Tom
  • Juraj, Arthur
  • Original 1.0 authors!
