The Prometheus Time Series Database
Björn “Beorn” Rabenstein, Production Engineer, SoundCloud Ltd.
This is about sample storage…
…not indexing (as needed for PromQL)
Sample: 64-bit timestamp + 64-bit floating point value
The fundamental problem of TSDBs
Orthogonal write and read patterns.
[Diagram: writes touch all of ~millions of series at a single point in time, while reads scan a single series across ~weeks of time — the two access patterns are orthogonal.]
External storage needed
Key-Value store (with BigTable semantics) seems suitable.
...
http_requests_total{status="200",method="GET"}@1434317560938 ⇒ 94355
http_requests_total{status="200",method="GET"}@1434317561287 ⇒ 94934
http_requests_total{status="200",method="GET"}@1434317562344 ⇒ 96483
http_requests_total{status="404",method="GET"}@1434317560938 ⇒ 38473
http_requests_total{status="404",method="GET"}@1434317561249 ⇒ 38544
http_requests_total{status="404",method="GET"}@1434317562588 ⇒ 38663
http_requests_total{status="200",method="POST"}@1434317560885 ⇒ 4748
http_requests_total{status="200",method="POST"}@1434317561483 ⇒ 4795
http_requests_total{status="200",method="POST"}@1434317562589 ⇒ 4833
http_requests_total{status="404",method="POST"}@1434317560939 ⇒ 122
...
KEY = metric name + dimensions aka labels + timestamp
VALUE = sample value
Google Cloud Bigtable Schema Design
https://cloud.google.com/bigtable/docs/schema-design-time-series
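A minimal sketch of how a sample could be mapped onto such a key-value pair (illustrative names only, not SoundCloud's actual schema): the metric name, the sorted label set, and the timestamp form the row key, so samples of one series sort together and by time.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildKey assembles a Bigtable-style row key from metric name, sorted
// labels, and timestamp. Sorting the label names makes the key
// deterministic for the same label set.
func buildKey(metric string, labels map[string]string, ts int64) string {
	names := make([]string, 0, len(labels))
	for n := range labels {
		names = append(names, n)
	}
	sort.Strings(names)
	pairs := make([]string, 0, len(names))
	for _, n := range names {
		pairs = append(pairs, n+"="+labels[n])
	}
	return fmt.Sprintf("%s{%s}@%d", metric, strings.Join(pairs, ","), ts)
}

func main() {
	k := buildKey("http_requests_total",
		map[string]string{"status": "200", "method": "GET"}, 1434317560938)
	fmt.Println(k) // http_requests_total{method=GET,status=200}@1434317560938
}
```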
Why is in-memory compression needed?
Gorilla vs. Prometheus
Gorilla: A Fast, Scalable, In-Memory Time Series Database, T. Pelkonen et al.
Proceedings of the VLDB Endowment
Volume 8 Issue 12, August 2015
Pages 1816-1827
Gorilla: 1.37 bytes/sample (ΔΔ encoding).
Prometheus v1: 3.3 bytes/sample.
Prometheus v2 adopts a Gorilla-style ΔΔ encoding; the chunk encoding is selected via:
-storage.local.chunk-encoding-version
Prometheus’s chunked storage
[Diagram: a series iterator fans out to chunk iterators, one per chunk.]
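The layering above could be sketched with interfaces like these (hypothetical names; the real implementation lives in Prometheus's storage/local package and decodes compressed 1024-byte buffers on the fly):

```go
package main

import "fmt"

// Sample is a timestamp/value pair, as in Prometheus.
type Sample struct {
	T int64
	V float64
}

// chunkIterator walks the samples of one immutable chunk.
type chunkIterator interface {
	Next() (Sample, bool)
}

// sliceChunk is a toy chunk backed by a plain slice.
type sliceChunk struct {
	s []Sample
	i int
}

func (c *sliceChunk) Next() (Sample, bool) {
	if c.i >= len(c.s) {
		return Sample{}, false
	}
	s := c.s[c.i]
	c.i++
	return s, true
}

// seriesIterator chains the chunk iterators of one series,
// presenting them as a single stream of samples.
type seriesIterator struct {
	chunks []chunkIterator
}

func (it *seriesIterator) Next() (Sample, bool) {
	for len(it.chunks) > 0 {
		if s, ok := it.chunks[0].Next(); ok {
			return s, true
		}
		it.chunks = it.chunks[1:]
	}
	return Sample{}, false
}

func main() {
	it := &seriesIterator{chunks: []chunkIterator{
		&sliceChunk{s: []Sample{{1000, 1}, {1015, 2}}},
		&sliceChunk{s: []Sample{{1030, 3}}},
	}}
	for s, ok := it.Next(); ok; s, ok = it.Next() {
		fmt.Println(s.T, s.V)
	}
}
```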
Timestamp compression
“Pretty” regular sample intervals.
sample count → timestamp
0 → 1000s
1 → 1015s
2 → 1029s
3 → 1046s
4 → 1060s

Δ from first timestamp: 15s, 29s, 46s, 60s
Δ between successive samples: 15s, 14s, 17s, 14s

v1: deviation from the 15s base Δ: -1s, 1s, 0s
→ store in fixed bit width (8, 16, 32) as required by sample set

Gorilla/v2: ΔΔ (delta of deltas): -1s, 3s, -3s
→ store with variable bit width (1, 9, 12, 16, 36) as required per sample
Prometheus v2 timestamp encoding
Almost like Gorilla, with different bit buckets…
BUT:
Value compression
Way more tricky...
64bit floating point numbers. Ugh...
Constant value time series
Prometheus v1/2 → 0 bits/sample.
Gorilla → 1 bit/sample.
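Gorilla XORs each value's IEEE-754 bits with the previous value's: an unchanged value XORs to zero and still costs one control bit per sample, while Prometheus can mark a whole chunk as constant-value and spend nothing per sample. A sketch of the XOR check:

```go
package main

import (
	"fmt"
	"math"
)

// xorBits returns the bitwise XOR of two float64 values' IEEE-754
// representations, the starting point of Gorilla's value encoding.
func xorBits(prev, cur float64) uint64 {
	return math.Float64bits(prev) ^ math.Float64bits(cur)
}

func main() {
	fmt.Println(xorBits(42.0, 42.0))      // 0: a single "same as before" bit
	fmt.Println(xorBits(42.0, 42.5) != 0) // true: leading/trailing zeros + meaningful bits follow
}
```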
The best case for a Prometheus v2 chunk
Constant metric value, perfectly regular scraping.
124,547 samples (~3 weeks at a 15s scrape interval)
0.066 bits/sample
Regularly increasing values
[Chart: bytes/sample for regularly increasing values — Prometheus v1 vs. Gorilla.]
More or less random values
[Chart: bytes/sample for more or less random values — Prometheus vs. Gorilla.]
Prometheus v2 value encoding
Picks the first encoding that works from an ordered list of candidate encodings.
If you dare, check out storage/local/varbit.go.
1.28 bytes/sample (typical SoundCloud server)
Constant-size chunks.
1024 bytes.
[Diagram: sample ingestion appends to a per-series head chunk (incomplete, in memory). Complete chunks are immutable; they sit in memory as evictable chunks (LRU) and are persisted to disk, one file per time series. The PromQL query engine reads complete chunks from memory.]
-storage.local.max-chunks-to-persist
-storage.local.memory-chunks
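The constant 1024-byte chunk size keeps memory accounting simple: the flags above bound memory use directly in numbers of chunks. A toy head-chunk append (assumed raw layout for illustration; the real chunks use the compressed encodings described earlier):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const chunkSize = 1024 // every Prometheus v1 chunk is exactly 1 KiB

// chunk is a toy fixed-size chunk storing raw 16-byte samples
// (8-byte timestamp + 8-byte value bits).
type chunk struct {
	buf  [chunkSize]byte
	used int
}

// add appends a sample; ok=false means the chunk is full and the
// caller must cut a new head chunk, as ingestion does.
func (c *chunk) add(ts int64, vbits uint64) (ok bool) {
	if c.used+16 > chunkSize {
		return false
	}
	binary.LittleEndian.PutUint64(c.buf[c.used:], uint64(ts))
	binary.LittleEndian.PutUint64(c.buf[c.used+8:], vbits)
	c.used += 16
	return true
}

func main() {
	var c chunk
	n := 0
	for c.add(int64(n), 0) {
		n++
	}
	fmt.Println("samples per chunk:", n) // 64 with this naive layout
}
```

With compression, the same 1024 bytes hold thousands of samples instead of 64, which is exactly why the in-memory encodings matter.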
Series maintenance.
[Diagram: chunks on disk older than the retention time are dropped, but a series file is only rewritten when the stale part is larger than 10% of the file.]
-storage.local.retention
-storage.local.series-file-shrink-ratio
-storage.local.series-sync-strategy
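The 10% condition corresponds to the default value (0.1) of -storage.local.series-file-shrink-ratio: a series file is only rewritten when enough of it is stale to justify the I/O. A sketch of that decision (hypothetical function name):

```go
package main

import "fmt"

// shouldShrink reports whether a series file is worth rewriting:
// only if the chunks older than the retention time make up more
// than shrinkRatio of the file (default 0.1, i.e. 10%).
func shouldShrink(staleBytes, fileBytes int64, shrinkRatio float64) bool {
	if fileBytes == 0 {
		return false
	}
	return float64(staleBytes)/float64(fileBytes) > shrinkRatio
}

func main() {
	fmt.Println(shouldShrink(50*1024, 1024*1024, 0.1))  // false: only ~5% stale
	fmt.Println(shouldShrink(200*1024, 1024*1024, 0.1)) // true: ~20% stale
}
```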
Chunk preloading.
[Diagram: the PromQL query engine triggers preloading of the needed chunks from disk into memory before evaluation.]
Checkpointing.
On shutdown and regularly to limit data loss in case of a crash.
[Diagram: in-memory state (head chunks and chunks not yet persisted) is written to a checkpoint file on disk.]
-storage.local.checkpoint-interval
-storage.local.checkpoint-dirty-series-limit