Taking Omid to the Clouds: Fast, Scalable Transactions for Real-Time Cloud Analytics
Ohad Shacham
Edward Bortnikov
Eshcar Hillel
Idit Keidar
Aran Bergman
Yonatan Gottesman
Agenda
Introduction to Omid (FAST’17)
Omid low latency optimizations (VLDB’18)
Apache Phoenix integration
Omid (Hope in Persian)
Transactional API over a NoSQL key-value store
Client Library + Runtime Service
Snapshot Isolation consistency
Highly available
Open source (Apache Incubator)
Transactions and Snapshot Isolation
Aborts only on write-write conflicts
Diagram: two concurrent transactions, each bracketed by Begin and Commit. One issues Read(x) and Write(x); the other issues Write(x) and Read(x).
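The write-write rule can be illustrated with a minimal sketch. The dictionary layout (`start`, `commit`, `writes`) and the function name `conflicts` are assumptions made for this example, not part of Omid's API.

```python
def conflicts(txn_a, txn_b):
    """Return True if two transactions conflict under snapshot isolation.

    Each transaction is a dict with 'start' and 'commit' timestamps and a
    'writes' set of keys. Only concurrent transactions with overlapping
    write sets conflict; read sets are ignored.
    """
    concurrent = (txn_a["start"] < txn_b["commit"]
                  and txn_b["start"] < txn_a["commit"])
    return concurrent and bool(txn_a["writes"] & txn_b["writes"])


t1 = {"start": 1, "commit": 4, "writes": {"x"}}
t2 = {"start": 2, "commit": 5, "writes": {"x"}}   # overlaps t1, also writes x
t3 = {"start": 2, "commit": 5, "writes": set()}   # overlaps t1, only reads x
t4 = {"start": 6, "commit": 7, "writes": {"x"}}   # starts after t1 committed

print(conflicts(t1, t2))  # True
print(conflicts(t1, t3))  # False
print(conflicts(t1, t4))  # False
```

Only the pair that overlaps in time and writes the same key must lose one transaction; a concurrent reader of x is left alone.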
Omid Architecture - Write
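Diagram: Client (API), Transaction Manager (conflict detection), commit table, and data stores.
Begin: the Transaction Manager returns timestamp tsr
Put(k1,v1,tsr): the client writes to the data store
Commit(k1,k2…) with tsr: conflict detection, then timestamp tsc
Commit table entry: tsr, tsc
Data store record: K1, tsr, V1, tsc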
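The write path can be sketched as a toy, single-process model. The class `TransactionManager`, the helper `run_write`, and the in-memory dictionaries standing in for the data store and commit table are all assumptions for illustration; the real system runs these as separate services.

```python
class TransactionManager:
    """Toy TM: hands out timestamps, detects conflicts, fills the commit table."""

    def __init__(self):
        self.clock = 0
        self.last_commit = {}   # key -> tsc of the latest committed writer
        self.commit_table = {}  # tsr -> tsc

    def _next_timestamp(self):
        self.clock += 1
        return self.clock

    def begin(self):
        return self._next_timestamp()  # tsr

    def commit(self, tsr, write_set):
        # Conflict: some key was committed by another transaction after tsr.
        if any(self.last_commit.get(key, 0) > tsr for key in write_set):
            return None
        tsc = self._next_timestamp()
        for key in write_set:
            self.last_commit[key] = tsc
        self.commit_table[tsr] = tsc
        return tsc


def run_write(tm, store, tsr, key, value):
    """Put(key, value, tsr), then Commit; returns tsc, or None on abort."""
    store.setdefault(key, {})[tsr] = {"value": value, "tsc": None}  # tentative
    tsc = tm.commit(tsr, {key})
    if tsc is None:
        del store[key][tsr]            # abort: discard the tentative version
    else:
        store[key][tsr]["tsc"] = tsc   # record tsc next to the data
    return tsc


tm, store = TransactionManager(), {}
tsr1, tsr2 = tm.begin(), tm.begin()             # two concurrent transactions
tsc1 = run_write(tm, store, tsr1, "k1", "v1")   # commits
tsc2 = run_write(tm, store, tsr2, "k1", "v2")   # aborts: write-write conflict
```

The second transaction began before the first committed and writes the same key, so the sketch's conflict check rejects it.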
Omid Architecture – Read
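Diagram: Client (API), Transaction Manager, commit table, and data stores.
Begin: the Transaction Manager returns timestamp tsr’
Get(k1,t<=tsr’): the client reads from the data store
Get tsc: commit timestamp lookup in the commit table
Data store record: K1, tsr, V1
Commit table entry: tsr, tsc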
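A minimal sketch of the read rule, assuming each key holds a list of versions and that a missing commit timestamp is resolved through the commit table. The function name `snapshot_read` and the tuple layout are hypothetical.

```python
def snapshot_read(versions, commit_table, ts_reader):
    """Return the value visible to a reader whose read timestamp is ts_reader.

    versions: list of (tsr_writer, value, tsc_or_None) tuples for one key.
    commit_table: dict mapping a writer's tsr to its tsc.
    """
    # Scan from the newest version to the oldest.
    for tsr_writer, value, tsc in sorted(versions, key=lambda v: v[0], reverse=True):
        if tsr_writer > ts_reader:
            continue                              # Get(k1, t <= tsr')
        if tsc is None:
            tsc = commit_table.get(tsr_writer)    # Get tsc
        if tsc is not None and tsc < ts_reader:
            return value                          # committed before the snapshot
    return None


versions = [(1, "a", 2), (3, "b", None), (6, "c", None)]
commit_table = {3: 5}   # the writer with tsr=3 committed at tsc=5

print(snapshot_read(versions, commit_table, 4))  # a
print(snapshot_read(versions, commit_table, 7))  # b
```

A reader at timestamp 4 skips version "b" because its writer committed at 5; a reader at 7 sees "b" but not the still-uncommitted "c".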
Omid Bottleneck
Diagram: Transaction Manager and commit table, labeled as the bottleneck.
Taking Omid to the Cloud
Introduction to Omid
Omid low latency optimizations
Apache Phoenix integration
Omid Low Latency
Distribute commit table writes
Fast path transactions
Distribute Commit Table Updates
Diagram: Client (API), Transaction Manager (conflict detection), commit table, and data stores.
Begin: the Transaction Manager returns timestamp tsr
Put(k1,v1,tsr)
Commit(k1,k2…) with tsr: conflict detection, then timestamp tsc
Commit table entry: tsr, tsc
Data store record: K1, tsr, V1
Not so trivial!
SI Violation Example
Diagram: Client 1 and Client 2 (each with the API), Transaction Manager, commit table, and data stores.
One client: Begin, timestamp tsr, Put, Commit(k1,k2…) with tsr, timestamp tsc
The other client: Get(k1), Get(k2)
Data store records: K1, tsr, V1 and K2, tsr, V2
Commit table entry: tsr, tsc
SI Violation Solution - Invalidation
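Diagram: Client 1 and Client 2 (each with the API), Transaction Manager, commit table, and data stores.
One client: Timestamp - ts, Commit(k1,k2…) with tsr
The other client: Get(k1), then Invalidate tsr
Data store records: K1, tsr, V1 and K2, tsr, V2
Commit table entry: tsr, INVALID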
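One way to picture invalidation is as a race on a single commit table entry, settled by an atomic conditional write. The sketch below assumes that model; `CommitTable`, `put_if_absent`, `writer_commit`, and `reader_resolve` are hypothetical names, and the dictionary `setdefault` call merely stands in for an atomic check-and-put in the underlying store.

```python
INVALID = "INVALID"


class CommitTable:
    def __init__(self):
        self.entries = {}  # tsr -> tsc or INVALID

    def put_if_absent(self, tsr, value):
        """Install value unless an entry already exists; return the winner."""
        return self.entries.setdefault(tsr, value)


def writer_commit(commit_table, tsr, tsc):
    """Writer side: True if the commit took effect, False if invalidated."""
    return commit_table.put_if_absent(tsr, tsc) == tsc


def reader_resolve(commit_table, tsr):
    """Reader side: found a version with no commit entry, so try to invalidate.

    Returns the writer's tsc if it committed first, or None if the
    transaction is now invalid and its version must be skipped.
    """
    entry = commit_table.put_if_absent(tsr, INVALID)
    return None if entry == INVALID else entry


# Reader wins the race: the writer must abort.
ct = CommitTable()
print(reader_resolve(ct, 5))    # None
print(writer_commit(ct, 5, 8))  # False

# Writer wins the race: the reader learns the commit timestamp.
ct = CommitTable()
print(writer_commit(ct, 5, 8))  # True
print(reader_resolve(ct, 5))    # 8
```

Because exactly one of the two conditional writes succeeds, every reader gets the same answer for a given tsr, which is what keeps reads of k1 and k2 consistent in this model.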
Fast Path Transactions
Many workloads have single-key transactions
Wasteful access to TM for timestamps
New API – Only access data table without TM
Fast Path Transactions
Diagram: Client (API), Transaction Manager with a global clock, commit table, and data stores, each with a local clock.
bwc(k1,v1): the client accesses the data store directly
Data store record: K1, tsr, V1, tsc
SI Requires Local Validation
Diagram: Client 1 and Client 2 (each with the API), Transaction Manager with a global clock, commit table, and data stores, each with a local clock.
One client: Begin, timestamp tsr, Put
The other client: bwc(k1,v1)
Conflict Detection
Data store record: K1, tsr, V1, tsc
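A rough sketch of why a fast-path write needs validation at the data store. The rules here are simplifying assumptions for illustration, not the published algorithm: the `Region` class, its methods, and the choice to advance the local clock on every observed timestamp are all hypothetical.

```python
class Region:
    """Toy data store node with a local clock (illustrative assumptions only)."""

    def __init__(self):
        self.local_clock = 0
        self.rows = {}  # key -> list of [version, value, committed]

    def _observe(self, ts):
        # Assumption: the local clock never falls behind a timestamp it has seen.
        self.local_clock = max(self.local_clock, ts)

    def bwc(self, key, value):
        """Fast-path begin-write-commit: no Transaction Manager round trip."""
        versions = self.rows.setdefault(key, [])
        if any(not committed for _, _, committed in versions):
            return None                  # tentative write by a regular txn
        self.local_clock += 1
        version = self.local_clock
        versions.append([version, value, True])
        return version

    def put(self, key, value, tsr):
        """Regular-transaction write, validated locally against fast-path writes."""
        self._observe(tsr)
        versions = self.rows.setdefault(key, [])
        if any(v > tsr and committed for v, _, committed in versions):
            return False                 # newer committed version: conflict
        versions.append([tsr, value, False])
        return True


region = Region()
ok_other = region.put("k2", "x", 10)   # regular txn (tsr=10) touches the region
fast = region.bwc("k1", "v1")          # fast-path write gets version 11
ok_same = region.put("k1", "v2", 10)   # same regular txn now writes k1
blocked = region.bwc("k2", "y")        # k2 still has a tentative version
print(ok_other, fast, ok_same, blocked)  # True 11 False None
```

The regular transaction cannot see version 11 in its snapshot at 10, so letting its Put through would hide a write-write conflict that the Transaction Manager never heard about.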
Evaluation
HBase cluster
YCSB
Transaction sizes 1-10
Throughput Latency
Plot panels: Transaction Size = 1, Transaction Size = 10
Plot annotations: 7X, 2.5X
Latency Breakdown
Chart: single-key transactions, broken down into Begin Time, Data Time, and Commit Time
Agenda
Introduction to Omid
Omid low latency optimizations
Apache Phoenix integration
Apache Phoenix
SQL interface over HBase
Transforms SQL queries into native HBase API calls
Requires transaction manager for
Omid Integration
Support Phoenix coprocessors
Add functionality
Secondary Index Creation
Phoenix on-the-fly index creation
New fence API
Snapshot Isolation Exclude Current - SIX
BEGIN;
INSERT INTO T
SELECT ID+10 FROM T;
COMMIT;
Diagram labels: read, write
Checkpoints
BEGIN;
INSERT INTO T
SELECT ID+10 FROM T;
COMMIT;
Diagram labels: checkpoint, read, write
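The need for excluding the current statement's writes can be shown with a toy model of `INSERT INTO T SELECT ID+10 FROM T`. The `Txn` class, the checkpoint counter, and the row-at-a-time loop are assumptions made for this sketch; they are not the Omid or Phoenix implementation.

```python
class Txn:
    """Toy transaction whose writes are tagged with a checkpoint number."""

    def __init__(self, snapshot):
        self.snapshot = set(snapshot)  # rows visible when the txn began
        self.writes = []               # (checkpoint, row_id)
        self.checkpoint = 0

    def new_checkpoint(self):
        self.checkpoint += 1

    def insert(self, row_id):
        self.writes.append((self.checkpoint, row_id))

    def scan(self, exclude_current=False):
        own = {row for cp, row in self.writes
               if not (exclude_current and cp == self.checkpoint)}
        return sorted(self.snapshot | own)


def insert_select_plus_10(txn, exclude_current, max_steps=20):
    """Run the statement one row at a time; return how many rows it wrote."""
    txn.new_checkpoint()               # each statement starts a checkpoint
    seen, steps = set(), 0
    while steps < max_steps:
        pending = [r for r in txn.scan(exclude_current) if r not in seen]
        if not pending:
            break
        row = pending[0]
        seen.add(row)
        txn.insert(row + 10)
        steps += 1
    return steps


six = Txn({1, 2})
print(insert_select_plus_10(six, exclude_current=True))    # 2
print(six.scan())                                          # [1, 2, 11, 12]

plain = Txn({1, 2})
print(insert_select_plus_10(plain, exclude_current=False)) # 20 (hit max_steps)
```

With the current checkpoint excluded, the statement reads only rows 1 and 2 and stops; without it, the statement keeps consuming its own inserts until the step limit cuts it off.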
Summary
Omid is a mature transactional layer over HBase
Omid Low Latency improves throughput, latency, and scalability
Integrated into Phoenix with new features
Available in Omid release 1.0.1 and Phoenix releases 4.15 and 5.1