1 of 26

Taking Omid to the Clouds:�Fast, Scalable Transactions for Real-Time Cloud Analytics

1

Ohad Shacham

Edward Bortnikov

Eshcar Hillel

Idit Keidar

Aran Bergman

Yonatan Gottesman

2 of 26

Agenda

Introduction to Omid (FAST’17)

Omid low latency optimizations (VLDB’18)

Apache Phoenix integration

2

3 of 26

Agenda

Introduction to Omid

Omid low latency optimizations

Apache Phoenix integration

3

4 of 26

Omid (Hope in Persian)

Transactional API over NoSQL key value

Client Library + Runtime Service

Snapshot Isolation consistency

Highly available

Open source Apache incubator

4

5 of 26

Transactions and Snapshot Isolation

Aborts only write-write conflict

5

Begin

Commit

read(x)

Write(x)

Begin

Commit

Write(x)

Read(x)

6 of 26

Omid Architecture - Write

6

Transaction Manager

Client

Begin

Data store

Data store

Data store

Commit table

Timestamp - tsr

commit

Put(k1,v1,tsr)

API

tsr

tsc

Commit(k1,k2…)

tsr

Conflict Detection

Timestamp – tsc

K1

tsr

V1

tsc

7 of 26

Omid Architecture – Read

7

Transaction Manager

Client

Begin

Data store

Data store

Data store

Commit table

Timestamp – tsr’

Get tsc

API

Get(k1,t<=tsr’)

tsr’

K1

tsr

V1

tsr

tsc

8 of 26

Omid Bottleneck

  • Omid preferred throughput over latency – batch
  • High latency

8

Transaction Manager

Commit table

Bottleneck

9 of 26

Taking Omid to the Cloud

Introduction to Omid

Omid low latency optimizations

Apache Phoenix integration

9

10 of 26

Omid Low Latency

Distribute commit table writes

    • Remove Omid bottleneck

Fast path transactions

    • New API for single key transactions

10

11 of 26

Distribute Commit Table Updates

11

Transaction Manager

Client

Begin

Data store

Data store

Data store

Commit table

Timestamp - tsr

Put(k1,v1,tsr)

API

tsr

tsc

Commit(k1,k2…)

tsr

Conflict Detection

Timestamp – tsc

K1

tsr

V1

Not so trivial!

12 of 26

SI Violation Example

12

Transaction Manager

Client 1

Data store

Data store

Data store

Commit table

API

tsr

tsc

K1

tsr

V1

K2

tsr

V2

Client 2

API

Put

Get(k1)

Begin

Timestamp - tsr

Commit(k1,k2…)

tsr

Timestamp – tsc

Get(k2)

13 of 26

SI Violation Solution - Invalidation

13

Transaction Manager

Client 1

Data store

Data store

Data store

Commit table

API

K1

tsr

tsr

INVALID

V1

K2

tsr

V2

Client 2

API

Get(k1)

Timestamp - ts

Commit(k1,k2…)

tsr

Invalidate tsr

14 of 26

Fast Path Transactions

Many workloads have singe key transactions

Wasteful access to TM for timestamps

New API – Only access data table without TM

    • brc(key)
    • bwc(key,val)
    • br(key) + wc(key,val)

14

15 of 26

Fast Path Transactions

15

Transaction Manager

Client

Data store

Commit table

API

Local Clock

Global Clock

Data store

Local Clock

bwc(k1,v1)

K1

tsr

V1

tsc

16 of 26

SI Requires Local Validation

16

Transaction Manager

Client 1

Commit table

API

Global Clock

Client 2

API

Begin

Timestamp - tsr

bwc(k1,v1)

Put

Conflict Detection

Data store

Local Clock

Data store

Local Clock

K1

tsr

V1

tsc

17 of 26

Evaluation

HBase cluster

YCSB

Transaction sizes 1-10

17

18 of 26

Throughput Latency

18

Transaction Size = 1

Transaction Size = 10

7X

2.5X

19 of 26

Latency Breakdown

19

Single key transactions

Begin Time

Data Time

Commit Time

20 of 26

Agenda

Introduction to Omid

Omid low latency optimizations

Apache Phoenix integration

20

21 of 26

Apache Phoenix

SQL interface over HBase

Transforms SQL queries into native HBase API calls

Requires transaction manager for

  • SQL transactions
  • Consistent secondary index

21

22 of 26

Omid Integration

Support Phoenix coprocessors

    • Integrate Omid’s code within HBase coprocessors

Add functionality

    • Construction of secondary indexes
    • Snapshot isolation exclude current - SIX

22

23 of 26

Secondary Index Creation

Phoenix on the fly index creation

    • Secondary index from non empty table
    • During creation table get updated

New fence API

    • Abort transaction that overlap fence

23

24 of 26

Snapshot Isolation Exclude Current - SIX

BEGIN;

INSERT INTO T

SELECT ID+10 FROM T;

COMMIT;

24

read

write

25 of 26

Checkpoints

BEGIN;

INSERT INTO T

SELECT ID+10 FROM T;

COMMIT;

25

checkpoint

read

write

26 of 26

Summary

Omid is a mature transactional layer over HBase

Omid low latency improves throughput latency and scalability

Integrated into Phoenix with new features

Available in Omid release 1.0.1 and Phoenix releases 4.15 and 5.1

26