Unlocking Cost Savings & New Possibilities:
Your Guide to Prometheus Remote Write 2.0
PromCon; Berlin; 11.09.2024
Alex Greenbank, Senior SWE at Grafana Labs
Bartek Płotka, Senior SWE at Google
alexgreenbank
bwplotka
Remote Storage
🎉🎉🎉🎉
Who are we?
Alex Greenbank
Senior Software Engineer @ Grafana Labs
Bartłomiej Płotka
Senior Software Engineer @ Google
Why Remote Write 2.0
8 Years Ago
Adoption
Retroactive 1.0 Spec
Stable for years
8y ago
1.0 today
Closely following Prometheus storage data model
1.0 today
PromQL
…but Prometheus storage evolves!
1.0 today
Native histograms (both exponential and custom bucketing)
?
1.x Experiments
1.0 today
Experimental 1.x proto today
So the 2.0 work started!
Requirements
Prometheus RW requirements:
Ideas and alternatives considered
Prometheus RW requirements:
What if…
…we bring back gRPC? ✖️
…we make the protocol stateful? ✖️
…we use OpenTelemetry’s OTLP? ❓
…we use Arrow format? ❓
NOTE: We plan to revisit and experiment with those in the future! 🤗
Development Process
What Does 2.0 Enable?
Spec: Partial write statistics
Spec: Native Histograms, Created Timestamps (CT), Exemplars, UTF-8
Spec: Metadata Always On
Wait, more features == overhead, no?
Spec: String Interning
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglefoo", node="node_a"}
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglefoo", node="node_b"}
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglebar", node="node_a"}
{"__name__"="kube_pvc_capacity_bytes", cluster="ops-us-east-0", job="arglebar", node="node_b"}

"__name__" -> 1
"kube_pvc_capacity_bytes" -> 2
"cluster" -> 3
"ops-us-east-0" -> 4
"job" -> 5
"arglefoo" -> 6
"node" -> 7
"node_a" -> 8
"node_b" -> 9
"arglebar" -> 10
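
The mapping above can be reproduced with a small sketch. This is illustrative Python, not the Prometheus implementation (which is written in Go); `intern_series` and `SERIES` are hypothetical names chosen for this example. Index 0 is assumed to be reserved for the empty string, so that the first real symbol gets reference 1, matching the numbering on the slide.

```python
def intern_series(series_list):
    """Build a shared symbols table and per-series reference lists.

    Assumption for this sketch: index 0 holds the empty string, so real
    symbols start at 1, as in the mapping shown on the slide.
    """
    symbols = [""]
    index = {"": 0}

    def ref(text):
        # Reuse the existing index if the string was seen before;
        # otherwise append it to the table and remember its position.
        if text not in index:
            index[text] = len(symbols)
            symbols.append(text)
        return index[text]

    all_refs = []
    for labels in series_list:
        refs = []
        for name, value in labels:
            # Each label contributes two references: name, then value.
            refs.append(ref(name))
            refs.append(ref(value))
        all_refs.append(refs)
    return symbols, all_refs


# The four label sets from the slide, as (name, value) pairs.
SERIES = [
    [("__name__", "kube_pvc_capacity_bytes"), ("cluster", "ops-us-east-0"),
     ("job", "arglefoo"), ("node", "node_a")],
    [("__name__", "kube_pvc_capacity_bytes"), ("cluster", "ops-us-east-0"),
     ("job", "arglefoo"), ("node", "node_b")],
    [("__name__", "kube_pvc_capacity_bytes"), ("cluster", "ops-us-east-0"),
     ("job", "arglebar"), ("node", "node_a")],
    [("__name__", "kube_pvc_capacity_bytes"), ("cluster", "ops-us-east-0"),
     ("job", "arglebar"), ("node", "node_b")],
]

symbols, refs = intern_series(SERIES)
for series_refs in refs:
    print(series_refs)
```

Each series then carries eight small integers instead of eight strings, and the ten distinct strings are sent once per request.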
Remote Write 1.0

// RW 1.0
message TimeSeries {
  …
  repeated Label labels = 1;
  …
}
…
message Label {
  string name = 1;
  string value = 2;
}

VS

Remote Write 2.0

// RW 2.0
message WriteRequest {
  …
  repeated string symbols = 1;
  …
}
message TimeSeries {
  // labels_refs is a list of label
  // name-value pair references,
  // encoded as indices to the
  // WriteRequest.symbols array.
  …
  repeated uint32 labels_refs = 1;
  …
}
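
On the receiving side, `labels_refs` has to be turned back into label pairs by looking each index up in the symbols array. A minimal sketch of that lookup, assuming references come as consecutive name/value pairs as the comment in the 2.0 message describes; `resolve_labels` and the shortened `SYMBOLS` table are hypothetical and exist only for this example:

```python
def resolve_labels(symbols, labels_refs):
    """Turn a flat list of symbol indices back into (name, value) pairs."""
    if len(labels_refs) % 2 != 0:
        # References come in name/value pairs, so an odd count is malformed.
        raise ValueError("labels_refs must contain an even number of entries")

    labels = []
    for i in range(0, len(labels_refs), 2):
        name_ref, value_ref = labels_refs[i], labels_refs[i + 1]
        if name_ref >= len(symbols) or value_ref >= len(symbols):
            raise ValueError("label reference points outside the symbols table")
        labels.append((symbols[name_ref], symbols[value_ref]))
    return labels


# Shortened, illustrative symbols table (index 0 assumed to be the empty string).
SYMBOLS = ["", "__name__", "kube_pvc_capacity_bytes", "job", "arglefoo"]

print(resolve_labels(SYMBOLS, [1, 2, 3, 4]))
```

The bounds and parity checks are a design choice for the sketch: a receiver cannot trust that every reference it is sent is valid.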
[1.0] vs [2.0 + features]
+7.83%
-23.06%
-68.49%
1.0 + features vs 2.0 + features
-46.68%
-61.48%
-84.32%
1.0 + features vs 2.0 + features & NHCB
-58.16%
-72.64%
-88.86%
How can you use or adopt it?
How to use it
How to adopt it
Planned:
Summary
Key takeaways
Thank You! Questions?
Alex Greenbank, Senior SWE at Grafana Labs
Bartek Płotka, Senior SWE at Google
Kudos to (hidden heroes)