Open Source Time Series DB Comparison
read this blog before commenting | DalmatinerDB | InfluxDB | Prometheus | Riak TS | OpenTSDB | KairosDB | Elasticsearch | Druid | Blueflood | Graphite (whisper) | Atlas | Chronix Server | Hawkular | Warp 10 (distributed) | Heroic | Akumuli | BtrDB | MetricTank | Tgres | Gnocchi
Website | https://dalmatiner.io/ | https://influxdata.com/ | https://prometheus.io/ | http://basho.com/products/riak-ts/ | http://opentsdb.net/ | https://kairosdb.github.io/ | https://www.elastic.co/products/elasticsearch | http://druid.io/ | http://blueflood.io/ | https://graphiteapp.org/ | https://github.com/Netflix/atlas | http://www.chronix.io/ | http://www.hawkular.org/ | http://www.warp10.io/ | https://spotify.github.io/heroic | http://akumuli.org/ | http://btrdb.io/ | https://github.com/raintank/metrictank | https://github.com/tgres/tgres | http://gnocchi.xyz
Description | Fast distributed purpose built metric store | Performant and simple to use time series database | An open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach. | Enterprise grade time series database engineered to be faster than Cassandra | Stores and serves massive amounts of time series data without losing granularity. | Fast Time Series Database on top of Cassandra. | Distributed, scalable, and highly available lucene based document store. Built for full text searches over event data. | High-performance, column-oriented, distributed data store. | Multi-tenant distributed metric processing system. | Graphite is an enterprise-ready monitoring tool that runs equally well on cheap hardware or Cloud infrastructure. | In memory database built for an extreme volume of metrics with real time search over 2 - 3 weeks of historic data | Fast and efficient time series storage based on Apache Lucene and Apache Solr. | Hawkular Metrics is a scalable, asynchronous, multi tenant, long term metrics storage engine that uses Cassandra as the data store and REST as the primary interface. | The differentiating factor of Warp 10 is that both space (location) and time are considered first class citizens. | Large scale time series database written by Spotify | Fast, efficient, standalone time series database written in C++ | Very fast storage of scalar-valued timeseries data. | Cassandra-backed, metrics2.0 based, multi-tenant timeseries database for Graphite and friends | (Still under development) PostgreSQL-backend Golang implementation of most of Graphite API and Statsd. Internally borrows a lot from RRDTool. | Gnocchi is a multi-tenant timeseries, metrics and resources database
Category | Real-time Analytics | Real-time Analytics | Monitoring System | Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Search | Batch Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics | Batch and Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics | Real-time Analytics
Supported Measurements | metrics | metrics, events | metrics | metrics | metrics | metrics | metrics, events | metrics | metrics | metrics | metrics | metrics | metrics, events | metrics, events | metrics, events | metrics | metrics | metrics | metrics
Consistency Model (CAP theorem) | AP (EC) | - | - | AP | AP | AP | CP (weak consistency) | AP | AP | - | AP | AP | AP | AP | AP | - | AP | AP | CP
Sharding and Replication | Automatic | Manual | Manual (supports federation) | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic | Manual | Automatic | Automatic | Automatic | Automatic | Automatic | Manual | Automatic | Manual | Replication (master / slave)
High Availability (HA) | Clustering | Double writing 2 servers | Double writing 2 servers | Clustering | Clustering | Clustering | Clustering | Clustering | Clustering | Manual | Clustering | Clustering | Clustering (multi-dc) | Clustering | Clustering | Manual | Clustering | Manual | Failover
Underlying Technology | Erlang, Riak Core, ZFS, PostgreSQL | Golang | Golang | Erlang, Riak KV | Java, Hadoop | Java, Cassandra | Java | Java, Zookeeper, Postgres/MySQL, HDFS/S3 | Java, Cassandra, Elasticsearch | Python | Scala, S3, EMR | Java, Solr | Java, Cassandra | Java, Hbase, Kafka, Zookeeper | Java, Cassandra, Kafka, Zookeeper, Elasticsearch | C++ | Golang, MongoDB, Ceph (optional) | Golang, Cassandra, Elasticsearch (optional) | Golang, PostgreSQL | Python, Ceph
Operational Complexity | Medium | Low (medium with HA) | Low | Medium | High | Medium | Medium | High | High | Medium | High | High | Medium | High | High | Low | Medium | Medium | Medium
Storage Backend | Custom | Custom | Custom | leveldb | Hadoop (Columnar) | Cassandra (Columnar) | Document | Columnar | Cassandra (Columnar) | Custom | Custom | Document | Cassandra (Columnar) | HBase (Columnar) | Cassandra (Columnar) | Custom | Custom | Cassandra (Columnar) | PostgreSQL arrays | File (default), Ceph, OpenStack Swift, S3 or Redis
Supported Data Types | float62, int56 | int64, float64, bool, and string | float64 | string, int64, double, bool, timestamp | int64, float32, float64 | string, float32, float64 | string, int32, int64, float32, float64, bool, null | int32, float64 | float64 | float64 | float64 | float64 | float32, float64, string | int64, float64, bool, string | float64, string | float64 | float64 | float64 | float64
Bytes per point after compression12.21.312122212812
0.4 (with lossy compression)
129122.51.38
Metric Precision | variable per bucket (milli second) | nano second | milli second | milli second | milli second | milli second | milli second | milli second | milli second | second | milli second | milli second | milli second | nano second | milli second | nano second | nano second | second | millisecond
Recording typefixed intervaleventsfixed intervaleventsfixed intervalfixed intervaleventsfixed intervalfixed intervalfixed intervaleventsfixed interval
Write Performance - Single Node2.5 - 3.5 million metrics / sec
470k metrics / sec (custom HW)
800k metrics / sec
32k metrics / sec (calculated 130/5/0.8)
32k metrics / sec (calculated 16*2k)
60k metrics / sec30k metrics / sec25k metrics / sec60k metrics / sec300k metrics / sec60k metrics / sec60k metrics / sec2 million metrics / sec60k metrics / sec
Write Performance - 5 Node Cluster
15 - 20 million metrics / sec (calculated based on past tests)
--130k metrics / sec
128k metrics / sec (calculated 1 server * 5 * 0.8)
250k metrics / sec (calculated)
120k metrics / sec (calculated)100k metrics / sec (calculated)250k metrics / sec (calculated)-250k metrics / sec250k metrics /sec-250k metrics /sec
Query PerformanceFastMedium to FastModerateModerateModerateSlowModerateModerateSlowModerateSlowmoderate
Query Language | DQL (SQL like) | InfluxQL (SQL like) | PromQL | SQL subset | lookup only | lookup only | Query DSL | lookup only | lookup only | lookup only | stack language | Solr query | lookup only | Warpscript | HQL | lookup only | lookup only | Graphite-like DSL | Graphite-like DSL, SQL
Data Model | metric names, namespaced dimensions | metric names, fields, labels | metric names, labels | metric names, labels | metric names, labels | metric names, labels | metric names, labels | metric names, labels | metric names | metric names | metric names, labels | metric names, labels | metric names, labels | metric names, labels, attributes | metric names, labels | metric names, labels | metric names, labels | metric names, labels | metric names
Ingresstcp (binary protocol), OpenTSDB (text), Graphite (text), Prometheus (text), Metrics 2.0 (text), InfluxDB (http)InfluxDB (http), InfluxDB (udp), OpenTSDB (text), OpenTSDB (http), Graphite (text) and a few othersscraping (text, protobuff)tcp (text, protobuff)http, tcp(text)tcp (text protocol), httphttphttphttp, udp (text protocol)udp (text protocol), tcp (text protocol, pickle), picklehttphttphttphttpkafka, binary (collectd binary protocol)tcp (redis text protocol)tcp (capn proto)udp (graphite), udp (statsd), tcp (graphite), http (pixel), pickle (graphite)
Egresshttp, tcp raw binary (no dql)httphttptcp (text, protobuff)httphttphttphttphttphttphttphttphttphttphttphttphttp, tcp (capn proto)httphttp, postgres
Query Language Functionality3/54/55/52/53/53/53/54/51/53/53/53/55/53/52/51/53/53/5
Query Language Usability | 4/5 | 5/5 | 4/5 | 4/5 | 1/5 | 1/5 | 3/5 | 4/5 | 1/5 | 4/5 | 1/5 | 3/4 | 1/5 | 1/5 | 4/5 | 1/5 | 1/5 | 4/5 | 4/5
Dynamic Cluster ManagementYes--YesYesYesYesNoYesNoYesYesYesYesNoNoNo
Continuous Query / Rollups / Downsampling
NoYesYesNoNoNoNoNoNoYes (rollups, downsampling)NoNoYes (downsampling)YesNoYes (continuous queries)NoNo
Security and ACL'sNoYesNoNoNoNoYesNoNoNoNoNoNoYesNoNoNoNo
Data TTL (retention policy)per bucketper database (retention policy)globalglobalnoneper metricper metricnoneglobalglobal, per metric (regex)globalper tenantYesglobalnoneNoglobalRound-Robin, per metric
Commercial Support | Yes | Yes | Yes | Yes | No | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No
Commercial Support Link | https://project-fifo.net/#support | https://portal.influxdata.com/ | http://www.robustperception.io/ | http://basho.com/contact/ | - | - | https://www.elastic.co/subscriptions | - | - | - | - | - | - | http://www.cityzendata.com/ | - | - | - | - | -
Community Size | small | large | large | medium | medium | small | large | medium | small | large | small | small | medium | medium | small | small | small | small | tiny
License | MIT | MIT | Apache 2 | Apache 2 | LGPLv2.1+ and GPLv3+. | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | Apache 2 | GPL 3 | AGPL 3 | Apache 2 | Apache 2
Latest Version | v0.2.1 | v1.3.5 | v2.0.0-beta.2 | v1.4 | v2.3.0 | v1.1 | v5 | v0.9 | v2 | v0.9 | v1.5 | v0.3 | v0.18 | v1.2.1 | - | v0.3 | v3.4 | v0.5.2 | - | v4.0.2
Maturity | Early adopter | Stable | Stable | Early adopter | Stable | Stable | Stable | Stable | Early adopter | Stable | Early adopter | Early Adopter | Early Adopter | Stable | Early Adopter | Early Adopter | Very Early Adopter | Very Early Adopter | Very Early Adopter | Stable
Pros
DalmatinerDB: Reasonable to operate and scale (built on well known mature technologies). Clustering and fault tolerance are first class citizens. High performance reads and writes and an expressive query language. A steadily growing number of functions. The best option if you want TSDB features and need to scale to high reads and writes in future.
InfluxDB: Easy to operate, highly customisable, lots of cool features and good performance on a single node. Documentation is well polished. The best option if you only want TSDB features and don't need to horizontally scale.
Prometheus: Easy to operate, good data model, high performance, lots of query functionality. The best option if you want an all in one monitoring system with a few weeks of history. Fits in really well with the container ecosystem.
Riak TS: Extremely simple to operate, good set of features and moderate performance. Documentation and community look good. Based on Riak KV which is excellent.
OpenTSDB: Tried and tested and scales reasonably well. Was one of the first databases to use metric labels in its data model.
KairosDB: Reasonable to operate, moderately fast writes and good data model.
Elasticsearch: Easy to operate, highly customisable, moderately fast. A good option if you already have Elasticsearch in-house and don't have too much data or high performance requirements.
Druid: Good data model and cool set of analytics features. Mostly designed for fast queries over large batch loaded data sets which it's great at.
Blueflood: Good performance, highly scalable.
Graphite (whisper): Simple to operate. Very popular online so lots of helpful blogs. The data isn't dimensional but the Graphite API makes up for some of that.
Atlas: Very fast and highly scalable if you have lots of money for RAM. Probably good if you are Netflix or Facebook (who created Gorilla which looks similar but isn't open sourced yet).
Chronix Server: Some cutting edge ideas like semantic compression and analysis functions that can search for similar metrics.
Hawkular: Backed by Red Hat and used in ManageIQ so should be good quality.
Warp 10 (distributed): Looks good for sensor data use cases given the geo features. Extremely powerful query language. Has security ACLs which is rare. You can also set up runners to execute WarpScript jobs to do rollups and downsampling and a bunch of other stuff.
Heroic: The transparent federation between clusters is pretty cool. Heroic has the concept of 'suggestions' to help browse the data easily.
Akumuli: Incredibly fast for reads and writes. Storage compression is also impressive. Looks like a well designed database. Has some cool analytics features for anomaly detection. Looks like a great database to use alongside a C++ app.
BtrDB: Exciting to see research on different time series storage mechanisms. Not yet released fully so need a bit more info than the current docs provide.
MetricTank: The native metrics v2.0 support could be cool in future if Grafana can make use of it to improve widgets (and alerts when that's released).
Tgres: PostgreSQL. Should outperform Graphite/Statsd on incoming data. Data can share the database with your application(s), and tgres can be used as a Go package to easily add TS functionality to any program. There is work under way to add tags. As Tgres is agnostic to what happens at the Postgres layer there are options for active-active clustering.
Cons
DalmatinerDB: Works best with locally attached storage (for ZFS). Erlang may make it harder for people to dig into the code and troubleshoot or submit changes. Not much community activity and the docs are all over the place. Client library support is limited, however, a metrics proxy supporting common protocols can be used.
InfluxDB: History of bugs and breaking changes although seems better recently. Clustering is no longer developed in the open source edition which would make it terribly difficult to scale.
Prometheus: More than just a TSDB and not designed to be used as a backend. Designed to use an alternative backend for long term storage which is a pro for a resilient monitoring system but a con for a time series database comparison.
Riak TS: Very new database. Unknown storage efficiency.
OpenTSDB: Painful to operate. The Hadoop dependency usually scares most people away.
KairosDB: Quite slow to query. Storage is slightly inefficient.
Elasticsearch: Wasn't really designed for time series. Inefficient storage.
Druid: Painful to operate, not very fast write throughput. Real time ingestion is tricky to set up.
Blueflood: Outdated data model. Needs to support labels to move up the ranking.
Graphite (whisper): Outdated data model. Scaling it is dreadful.
Atlas: In memory queries mean Atlas is only good for near real time (a few weeks of data). Query language is a bit weird. More of the Netflix software around the edges of Atlas needs to be released to make it work well.
Chronix Server: Very new project. Difficult to work out write performance from figures in presentations. Query performance benchmarks appear to be on a relatively small data set.
Hawkular: Same as all of the other Cassandra backed time series databases.
Warp 10 (distributed): No idea about performance. The distributed version depends on HBase which can be onerous to operate. There is a standalone version based on LevelDB as an alternative. The query language may be an issue for people. See http://www.warp10.io/howto/from-influxdb/ for examples.
Heroic: It's really new and there aren't any releases or even tags on GitHub, just commits to master. Not much data released in the way of benchmarks so I've estimated based on typical performance of Cassandra storage and Elasticsearch as an index.
Akumuli: Designed to be standalone. Still quite a new project although it looks pretty usable. Doesn't focus on a pretty query language so not quite as nice to use as something like InfluxDB.
BtrDB: Looks incredibly impressive on the page. However, I can't use any of the benchmark claims in the table. 16.7 million writes per second is in batches of 4k metrics. This massively inflates the numbers. Similarly I can't work out bytes per datapoint from the advertised 2.93x compression.
MetricTank: Very early stage so doesn't include a few features like clustering on top of Cassandra. No data around storage or any benchmarks although I would expect it to be comparable to other Cassandra implementations. Could use some features to alleviate the warming of in-memory metadata when adding nodes.
Tgres: Clustering is still flimsy and shares the back-end PG instance.
Leave a comment on a cell to contribute
Reproducible from the project docs
These fields contain links to a benchmarking setup that will reproduce the advertised results.
Not reproducible from the project docs
Benchmark results have been published but can't be verified by an outside source, either due to lack of information or a crazy choice of hardware.
Random source
Provided by a 3rd party unverified source. Some figures for databases based on Cassandra have been inferred from KairosDB and Blueflood results.
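Several write-performance cells in the table are marked "calculated", with notes such as "130/5/0.8" and "1 server * 5 * 0.8". Read together, these suggest a simple rule of thumb: a 5 node cluster is assumed to deliver about 80% of five times the single-node rate. The sketch below only restates that arithmetic; the function names are hypothetical, and treating 0.8 as a general cluster efficiency factor is an assumption drawn from those two notes, not something the sheet states outright.

```python
# Hypothetical helpers that restate the "calculated" notes in the table.
# Assumption: 0.8 is a cluster efficiency factor and 5 is the node count,
# as implied by "130/5/0.8" and "1 server * 5 * 0.8".

def cluster_from_single(single_node_rate, nodes=5, efficiency=0.8):
    """Estimate cluster write rate (metrics/sec) from a single-node rate."""
    return single_node_rate * nodes * efficiency


def single_from_cluster(cluster_rate, nodes=5, efficiency=0.8):
    """Back out a single-node write rate (metrics/sec) from a cluster rate."""
    return cluster_rate / nodes / efficiency


if __name__ == "__main__":
    # "128k metrics / sec (calculated 1 server * 5 * 0.8)" from a 32k single node
    print(f"{cluster_from_single(32_000):,.0f} metrics / sec")
    # "32k metrics / sec (calculated 130/5/0.8)" from a 130k cluster figure
    print(f"{single_from_cluster(130_000):,.0f} metrics / sec")
```

With these inputs the first estimate comes out at 128k and the second at roughly 32.5k, which the table lists as 32k.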