The future is bright, the future is remote write.
Tom Wilkie, Grafana Labs
Tom Wilkie, VP Product
Prometheus team, Cortex and Loki Author
@tom_wilkie tomwilkie
What is Remote Write?
V1: Standardising Remote Write
Next: Metadata, Exemplars
Remote Write NG: Future Ideas
What is Remote Write?
Diagram: Central Cortex Cluster (us-central-1, asia-east-1, eu-west-1)
https://prometheus.io/docs/operating/integrations/
Diagram: Global Federation (us-central-1, asia-east-1, eu-west-1)
Diagram: Central Prometheus (us-central-1, asia-east-1, eu-west-1)
V1: Standardising Remote Write
A precise, RFC 2119 “style” specification of the remote write protocol.
Explains the “why” behind a lot of the decisions.
Includes design for “future-proofing” and upgrades.
Testing Against the Standard
Diagram: Agent and Test, connected by scrape and remote_write
| Sender | Gauge | Sorted Labels | Histograms | Job Label | Up Metric | Staleness Markers |
| --- | --- | --- | --- | --- | --- | --- |
| Prometheus 2.26.0 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Grafana Agent 0.13.1 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| vmagent 1.58.0 | ✅ | ✅ | ✅ | ✅ | ❌* | ❌ |
| Influx Telegraf 1.18.0 | ✅ | ✅ | ✅ | ✅* | ❌ | ❌ |
| OpenTelemetry Collector 0.24.0 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
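To make two of the columns above concrete, here is a minimal sketch of what a "Sorted Labels" and a "Job Label" check might look like on the receiving side. The function names and the representation of a series as a list of (name, value) pairs are assumptions for illustration; this is not the actual compliance test code.

```python
from typing import List, Tuple

Label = Tuple[str, str]  # (label name, label value); hypothetical representation


def labels_sorted(labels: List[Label]) -> bool:
    """True if label names are in strictly increasing lexicographic order.

    Strict ordering also rejects duplicate label names.
    """
    names = [name for name, _ in labels]
    return all(a < b for a, b in zip(names, names[1:]))


def has_job_label(labels: List[Label]) -> bool:
    """True if the series carries a non-empty "job" label."""
    return any(name == "job" and value != "" for name, value in labels)


good = [("__name__", "up"), ("instance", "localhost:9090"), ("job", "test")]
bad = [("job", "test"), ("__name__", "up")]

good_result = (labels_sorted(good), has_job_label(good))
bad_result = (labels_sorted(bad), has_job_label(bad))
```

With these inputs, `good` passes both checks, while `bad` fails the ordering check but still has a job label.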
Next: Metadata, Exemplars
Metric Metadata
Exemplars
Remote Write NG: Future Ideas
Atomicity
# HELP prometheus_http_request_duration_seconds Histogram of latencies for HTTP requests.
# TYPE prometheus_http_request_duration_seconds histogram
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.1"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.2"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.4"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="1"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="3"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="8"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="20"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="60"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="120"} 17
prometheus_http_request_duration_seconds_bucket{handler="/",le="+Inf"} 17
Prometheus ensures that queries only see a “consistent” snapshot of a scrape.
Remote write does not.
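A toy simulation of why this matters, using the ten histogram buckets shown earlier. It assumes (hypothetically) that one scrape's samples are split across several remote write batches and that the receiver exposes whatever it has applied so far; the `Receiver` class, the batch size of 4 and the count of 18 for the second scrape are all made up for illustration.

```python
from typing import Dict, List, Tuple

# The ten "le" bucket bounds from the histogram on the slide.
BUCKETS = ["0.1", "0.2", "0.4", "1", "3", "8", "20", "60", "120", "+Inf"]

Sample = Tuple[str, int, int]  # (le, scrape timestamp, cumulative count)


def scrape(ts: int, count: int) -> List[Sample]:
    """One scrape of the histogram: every bucket has the same cumulative count."""
    return [(le, ts, count) for le in BUCKETS]


def batches(samples: List[Sample], size: int) -> List[List[Sample]]:
    """Split one scrape's samples into fixed-size write batches."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


class Receiver:
    """Hypothetical remote store that keeps the latest sample per bucket."""

    def __init__(self) -> None:
        self.latest: Dict[str, Tuple[int, int]] = {}

    def write(self, batch: List[Sample]) -> None:
        for le, ts, count in batch:
            self.latest[le] = (ts, count)

    def is_consistent(self) -> bool:
        """True if every bucket comes from the same scrape."""
        return len({ts for ts, _ in self.latest.values()}) == 1


receiver = Receiver()
receiver.write(scrape(ts=1, count=17))       # first scrape, fully applied

first, *rest = batches(scrape(ts=2, count=18), size=4)
receiver.write(first)                        # only part of the second scrape
consistent_midway = receiver.is_consistent()
low_bucket = receiver.latest["0.1"][1]       # already from the second scrape
inf_bucket = receiver.latest["+Inf"][1]      # still from the first scrape

for batch in rest:
    receiver.write(batch)
consistent_after = receiver.is_consistent()
```

Midway through, the `le="0.1"` bucket reads 18 while `le="+Inf"` still reads 17, so a query at that moment would see a cumulative histogram that is not monotonic.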
429 Handling
WAL
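One possible shape for 429 handling in a sender that reads from the WAL, sketched under assumptions that are not taken from the slides: 2xx advances the WAL read position, 5xx and 429 are retried with capped exponential backoff (honouring a Retry-After hint on 429 if present), and any other 4xx drops the batch. The function name, action names and default timings are hypothetical.

```python
from typing import Optional, Tuple


def decide(
    status: int,
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.03,  # assumed initial backoff, in seconds
    cap: float = 5.0,    # assumed maximum backoff, in seconds
) -> Tuple[str, float]:
    """Map an HTTP status to (action, delay in seconds).

    "advance": the batch was accepted, move the WAL read position forward.
    "retry":   keep the WAL position and resend after the delay.
    "drop":    the batch will never be accepted, skip it.
    """
    if 200 <= status < 300:
        return ("advance", 0.0)
    if status == 429 or 500 <= status < 600:
        delay = min(cap, base * (2 ** attempt))
        if status == 429 and retry_after is not None:
            # Never come back sooner than the receiver asked for.
            delay = max(delay, retry_after)
        return ("retry", delay)
    return ("drop", 0.0)


accepted = decide(200, attempt=0)
throttled = decide(429, attempt=0, retry_after=2.0)
overloaded = decide(503, attempt=10)
rejected = decide(400, attempt=3)
```

The trade-off in this sketch is that retrying on 429 stalls the WAL reader behind the throttled batch, whereas dropping on 429 loses data.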
Bandwidth
For our internal monitoring at Grafana Labs, we send more than 90 MB/s over remote write.
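For a sense of scale, a quick back-of-the-envelope conversion of that rate into daily volume. It treats 90 MB/s as a sustained lower bound and uses decimal units (1 TB = 1,000,000 MB); both are assumptions.

```python
MB_PER_SECOND = 90                  # lower bound quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400

mb_per_day = MB_PER_SECOND * SECONDS_PER_DAY   # 7,776,000 MB
tb_per_day = mb_per_day / 1_000_000            # about 7.8 TB per day
```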
What is Remote Write?
V1: Standardising Remote Write
Next: Metadata, Exemplars
Remote Write NG: Future Ideas
Thank you!
Questions?
Update (2021-05-03)
| Sender | Score |
| --- | --- |
| Prometheus 2.26.0 | 100% |
| Grafana Agent 0.13.1 | 100% |
| vmagent 1.58.0 | 76% |
| Influx Telegraf 1.18.0 | 64% |
| OpenTel Collector 0.25.0 | 47% |
| OpenTel Collector (main) | 59% |