1 of 49

Тестирование производительности

NoSQL БД

Денис Нелюбин

2 of 49

Thumbtack Technology Inc.

3 of 49

4 of 49

A.K.A.:

Citrusleaf

Creator:

Aerospike, August 2012

License:

Proprietary,

Community edition

Category:

Key-value,

Complex data types + Secondary indexes (from v.3.0)

5 of 49

A.K.A.:

CouchDB + Membase

Creator:

Couchbase, Inc.

(CouchOne + Membase), January 2012

License:

Apache 2.0,

Proprietary (Enterprise edition)

Category:

Key-value, Document + Secondary indexes (from v.2.0)

6 of 49

A.K.A.:

Apache Cassandra

Creator:

Facebook, July 2008

License:

Apache 2.0

Category:

Key-value, BigTable, Column-oriented

7 of 49

A.K.A.:

Mongo

Creator:

10gen (MongoDB, Inc.),

March 2010

License:

AGPL, Commercial license

Category:

Document-oriented

8 of 49

A.K.A.:

Yahoo! Cloud Serving Benchmark

Creator:

Yahoo! Research,

June 2010

License:

Apache 2.0

Category:

NoSQL benchmark

YCSB

9 of 49

YCSB

Data set

  • key: "user" + 64-bit Fowler-Noll-Vo hash
  • value: 10 fields of random data

Load

  • insert N records

Run

  • update and read on N records by the key

10 of 49

YCSB

11 of 49

YCSB

Does NOT do, does NOT check:

  • join
  • secondary index
  • where clause
  • partial update

12 of 49

Why YCSB?

  • applicable to any database
  • popular
  • de-facto standard

13 of 49

Hardware

Servers:

4 * (8 * Xeon + 32GB RAM + 4 * 120GB SSD)

Clients:

8 * (4 * i5 + 4GB RAM)

Single client is not enough

14 of 49

Hardware: CPU

8 cores Xeon ≈ 4 cores i5 *

*(unproved)

15 of 49

Hardware: Network

1 Gbps is not enough

1 Gbit/sec / 1 KB of data ≈ 100 000 ops/sec

Single IO queue on single CPU is not enough

# cat /proc/interrupts | grep eth

90: 0 0 0 0 IR-PCI-MSI-edge eth0

91: 275107859 0 0 0 IR-PCI-MSI-edge eth0-TxRx-0

92: 227858040 0 0 0 IR-PCI-MSI-edge eth0-TxRx-1

93: 242082684 0 0 0 IR-PCI-MSI-edge eth0-TxRx-2

94: 230651008 0 0 0 IR-PCI-MSI-edge eth0-TxRx-3

95: 217273950 0 0 0 IR-PCI-MSI-edge eth0-TxRx-4

96: 240149262 0 0 0 IR-PCI-MSI-edge eth0-TxRx-5

97: 194736879 0 0 0 IR-PCI-MSI-edge eth0-TxRx-6

98: 270089080 0 0 0 IR-PCI-MSI-edge eth0-TxRx-7

16 of 49

Hardware: SSD

Overprovisioning

  • hdparm
  • fdisk

17 of 49

OS (GNU/Linux)

ulimit

  • nofile > 4k

RAID (RAID 0?)

  • mdadm
  • lvm

Read-ahead

  • minimal

18 of 49

Test

Data sets:

  • RAM: 50M * 100 byte ≈ 5GB
  • SSD: 500M * 100 byte ≈ 50GB
  • replication factor = 2

Workloads:

  • Heavy Write: 50% update / 50% read
  • Mostly Read: 5% update / 95% read

Consistency:

  • Sync replication
  • Async replication

19 of 49

Insert, RAM

20 of 49

Insert, SSD

21 of 49

Heavy Update, RAM

22 of 49

Heavy Update, SSD

23 of 49

Heavy Update, Latency

24 of 49

Mostly Read, RAM

25 of 49

Mostly Read, SSD

26 of 49

Speed

Insert

Couchbase*

Aerospike

Cassandra

MongoDB

Update

Couchbase*

Aerospike

Cassandra

MongoDB

Read

Aerospike

Couchbase*

MongoDB

Cassandra

* in memory or on smaller data set

27 of 49

Failover test

  • 50%, 75%, 100% of max throughput
  • Heavy Update

  • 10min warmup
  • kill -9
  • 10min without one node
  • service start
  • 20min after restore

28 of 49

Aerospike, Sync, SSD, 50%

29 of 49

Cassandra, Async, SSD, 50%

30 of 49

Couchbase, Async, RAM, 100%

31 of 49

MongoDB, Async, SSD, 50%

32 of 49

33 of 49

34 of 49

35 of 49

36 of 49

Replication

MongoDB

Cassandra

Couchbase/Aerospike

37 of 49

Data storing reliability

Cassandra

MongoDB

Aerospike

Couchbase

archive

live data

fast cache, eviction

cache (async only)

38 of 49

Capacity

Cassandra

MongoDB

Aerospike

Couchbase*

packed archive

unpacked live data

indexes in RAM + SSD

metadata and cache in RAM

* was able to take only 200M records

39 of 49

Deployment

Couchbase

Aerospike

Cassandra

MongoDB

four clicks

powerful config

config+config+calculator

shards of replica-sets

40 of 49

Managing

Couchbase

MongoDB

Cassandra

Aerospike

superduperwebconsole

commands and docs *

exists **

raw ***

*

**

***

use MMS (MongoDB Management Service)

use DataStax products

try AMC (Aerospike Monitoring Console)

41 of 49

Unique features

Aerospike

  • SSD support, speed

Couchbase

  • good web console, easy deployment

Cassandra

  • writes faster than reads ;)

MongoDB

  • documents

42 of 49

Troublesomes

Aerospike

  • eviction
  • secret config options
  • long start

Couchbase

  • big data
  • strange client behaviour
  • long start
  • long shutdown

43 of 49

Troublesomes

Cassandra

  • need to think about the config ;)

MongoDB

  • mongos have to be restarted
  • replica-set is too surviving ;)

44 of 49

When to use: Aerospike

Big Fast Cache

45 of 49

When to use: Couchbase

In-memory Cache with Persistence

46 of 49

When to use: Cassandra

Big-Data Archive

47 of 49

When to use: MongoDB

Universal DB for Web

48 of 49

Not only YCSB

  • From scratch, inspired by YCSB
  • More tests
    • Secondary indexes (cardinality, overhead)
    • Aggregation (average value)
    • Collection data types (stack, array, wide row)

49 of 49

Thanks

Denis Nelubin <dnelubin@gmail.com>

Alexey Remnev <aremnev@thumbtack.net>