1 of 34

The future is bright, the future is remote write.

Tom Wilkie, Grafana Labs

2 of 34

Tom Wilkie, VP Product

Prometheus team, Cortex and Loki Author

@tom_wilkie tomwilkie

3 of 34

What is Remote Write?

V1: Standardising Remote Write

Next: Metadata, Exemplars

Remote Write NG: Future Ideas

4 of 34

What is Remote Write?

5 of 34

6 of 34

us-central-1

asia-east-1

eu-west-1

Central Cortex Cluster

7 of 34

8 of 34

https://prometheus.io/docs/operating/integrations/

9 of 34

10 of 34

us-central-1

asia-east-1

eu-west-1

Global Federation

11 of 34

us-central-1

asia-east-1

eu-west-1

Central Prometheus

12 of 34

V1: Standardising Remote Write

13 of 34

14 of 34

15 of 34

V1: Standardising Remote Write

A precise, RFC 2119 “style” specification of the remote write protocol.

Explains the “why” behind a lot of the decisions.

Includes design for “future-proofing” and upgrades.
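To illustrate what an RFC 2119 style requirement looks like in practice, the sketch below checks the request headers that the published Remote Write 1.0 specification says senders MUST set. The helper name `header_violations` and its return format are hypothetical. The header names and values come from the specification, not from these slides.

```python
# Headers with fixed values that a sender MUST set, per the
# Prometheus Remote Write 1.0 specification.
REQUIRED_HEADERS = {
    "content-encoding": "snappy",
    "content-type": "application/x-protobuf",
    "x-prometheus-remote-write-version": "0.1.0",
}


def header_violations(headers):
    """Return a list of readable violations (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    seen = {name.lower(): value.strip() for name, value in headers.items()}
    problems = []
    for name, expected in REQUIRED_HEADERS.items():
        actual = seen.get(name)
        if actual is None:
            problems.append(f"missing header: {name}")
        elif actual.lower() != expected:
            problems.append(f"{name}: expected {expected!r}, got {actual!r}")
    # The spec also requires a User-Agent header but does not fix its value.
    if not seen.get("user-agent"):
        problems.append("missing header: user-agent")
    return problems
```

An empty list means the request passes these particular checks. It says nothing about the body, which must be a snappy-compressed protobuf message.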

16 of 34

Testing Against the Standard

Agent

Test

scrape

remote_write

17 of 34

Tests: Gauge, Sorted Labels, Histograms, Job Label, Up Metric, Staleness Markers

Senders:

Prometheus 2.26.0

Grafana Agent 0.13.1

vmagent 1.58.0 ❌*

Influx Telegraf 1.18.0 ✅*

OpenTelemetry Collector 0.24.0

18 of 34

Next: Metadata, Exemplars

19 of 34

Metric Metadata

20 of 34

Metric Metadata

21 of 34

Exemplars

22 of 34

Exemplars

23 of 34

Remote Write NG: Future Ideas

24 of 34

Atomicity

# HELP prometheus_http_request_duration_seconds Histogram of latencies for HTTP requests.

# TYPE prometheus_http_request_duration_seconds histogram

prometheus_http_request_duration_seconds_bucket{handler="/",le="0.1"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="0.2"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="0.4"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="1"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="3"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="8"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="20"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="60"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="120"} 17

prometheus_http_request_duration_seconds_bucket{handler="/",le="+Inf"} 17

25 of 34

Atomicity

Prometheus ensures that queries only see a “consistent” snapshot of a scrape.

Remote write does not.
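One way the missing atomicity can surface is a histogram whose cumulative buckets briefly stop being non-decreasing on the receiving side. The sketch below assumes a hypothetical next scrape in which every bucket has moved from 17 to 18, and a hypothetical split in which only the first five bucket series have reached the receiver. The names `buckets_consistent` and `partial_view` are made up for the example.

```python
import math


def parse_le(le):
    """Convert an le label value to a float, treating "+Inf" as infinity."""
    return math.inf if le == "+Inf" else float(le)


def buckets_consistent(buckets):
    """buckets: dict of le label -> cumulative count. True if non-decreasing."""
    ordered = sorted(buckets.items(), key=lambda item: parse_le(item[0]))
    counts = [count for _, count in ordered]
    return all(a <= b for a, b in zip(counts, counts[1:]))


LE_VALUES = ["0.1", "0.2", "0.4", "1", "3", "8", "20", "60", "120", "+Inf"]
old_scrape = {le: 17 for le in LE_VALUES}  # the values shown on the slide
new_scrape = {le: 18 for le in LE_VALUES}  # hypothetical: one more fast request

# Hypothetical split: the receiver has applied the new samples for the first
# five buckets only; the remaining buckets still hold the previous scrape.
partial_view = dict(old_scrape)
partial_view.update({le: new_scrape[le] for le in LE_VALUES[:5]})
```

Here `buckets_consistent` holds for `old_scrape` and `new_scrape` but not for `partial_view`, where `le="3"` reports 18 while `le="8"` still reports 17. A query that lands in that window sees a histogram no single scrape ever produced.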

26 of 34

429 Handling

WAL

27 of 34

429 Handling

WAL

28 of 34

429 Handling

WAL

29 of 34

429 Handling

WAL

30 of 34

429 Handling

WAL
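The slides above carry only titles, so the following is a sketch of the retry rules as the Remote Write 1.0 specification states them, not of anything shown in the diagrams. Senders must retry on 5xx responses with backoff, must not retry on other 4xx responses, and may retry on 429. The function names, the `retry_on_429` parameter, and the millisecond values are hypothetical.

```python
def should_retry(status, retry_on_429=False):
    """Decide whether a failed remote write request should be sent again."""
    if 500 <= status <= 599:
        return True  # server-side failure: retry with backoff
    if status == 429:
        return retry_on_429  # optional: the sender may choose to retry
    return False  # 2xx succeeded; other 4xx will not succeed on retry


def backoff_delays(min_backoff_ms, max_backoff_ms, attempts):
    """Exponential backoff: double the delay each attempt, capped at the max."""
    delay = min_backoff_ms
    delays = []
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_backoff_ms)
    return delays
```

While a sender is retrying, samples it has not yet delivered remain in the WAL. Retrying on 429 therefore trades dropped data for a sender that can fall behind if the receiver stays saturated.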

31 of 34

Bandwidth

For our internal monitoring at Grafana Labs, we’re doing >90MB/s in remote write.

32 of 34

What is Remote Write?

V1: Standardising Remote Write

Next: Metadata, Exemplars

Remote Write NG: Future Ideas

33 of 34

Thank you!

Questions?

34 of 34

Update 2021-05-03

Sender              Version   Score
Prometheus          2.26.0    100%
Grafana Agent       0.13.1    100%
vmagent             1.58.0    76%
Influx Telegraf     1.18.0    64%
OpenTel Collector   0.25.0    47%
OpenTel Collector   (main)    59%