1 of 65

Transactions

2 of 65

3 of 65

SQL Writes

UPDATE Product�SET Price = Price – 1.99�WHERE pname = ‘Gizmo’

INSERT INTO SmallProduct(name, price)

SELECT pname, price� FROM Product

WHERE price <= 0.99

DELETE Product� WHERE price <=0.99

4 of 65

SQL Writes

UPDATE Product�SET Price = Price – 1.99�WHERE pname = ‘Gizmo’

INSERT INTO SmallProduct(name, price)

SELECT pname, price� FROM Product

WHERE price <= 0.99

DELETE Product� WHERE price <=0.99

5 of 65

How?

Example

Game App

DB v0

(Recap lectures)

DBMS

Real-Time

User Events

Report & Share�Business/Product Analysis

Q1: 1000 users/sec?

Q2: Offline?

Q3: Support v1, v1’ versions?

Q4: Which user cohorts?

Q5: Next features to build? Experiments to run?

Q6: Predict ads demand?

Mobile Game

Q7: How to model/evolve game data?

Q8: How to scale to millions of users?

Q9: When machines die, restore game state gracefully?

App designer

Systems designer

DB

Product/Biz designer

6 of 65

How?

Example

Game App

DB v0

(Recap lectures)

DBMS

Real-Time

User Events

Report & Share�Business/Product Analysis

Q1: 1000 users/sec?

Q2: Offline?

Q3: Support v1, v1’ versions?

Q4: Which user cohorts?

Q5: Next features to build? Experiments to run?

Q6: Predict ads demand?

Mobile Game

Q7: How to model/evolve game data?

Q8: How to scale to millions of users?

Q9: When machines crash, restore game state gracefully?

App designer

Systems designer

DB

Product/Biz designer

7 of 65

Today’s Lecture

  1. Why Transactions?

  • Transactions

  • Properties of Transactions: ACID

  • Logging

8 of 65

Example

Unpack

ATM DB:

Transaction

Read Balance

Give money

Update Balance

Read Balance

Update Balance

Give money

vs

9 of 65

Visa does > 60,000 TXNs/sec with users & merchants

Want your 4$ Starbucks transaction to wait for a stranger’s 10k$ bet in Las Vegas ?

⇒ Transactions can (1) be quick or take a long time, (2) unrelated to you

10 of 65

Transactions are at the core of

-- payment, stock market, banks, ticketing

-- Gmail, Google Docs (e.g., multiple people editing)

11 of 65

Example Visa DB

Flush/Write to SSD or HDD

Visa

DB

Transaction Queue

  • 60000 user TXNs/sec
  • Monthly 10% Interest TXN

Design#1 [RAM only]

For each Transaction in Queue

  • For relevant records
    • Read, Update records in RAM

[Obviously bad → e.g., turn off power, lose $$s]

Design#2 [RAM + Write/Flush on HDD/SSD]

For each Transaction in Queue

  • For relevant records
    • Read, Update records in RAM
    • Flush/Write updates on SSD/HDD

[Let’s study tradeoffs in next few slides]

12 of 65

Example

Monthly

Visa bank interest transaction

‘T-Monthly-423’

Monthly Interest 10%

4:28 am Starts run on 100M bank accounts

Takes 24 hours to run

Money

Money (@4:29 am day+1)

UPDATE Money

SET Balance = Balance * 1.1

13 of 65

Example

Monthly bank interest transaction

With crash

‘T-Monthly-423’

Monthly Interest 10%

4:28 am Starts run on 100M bank accounts

Takes 24 hours to run

Network outage at 10:29 am,

System access at 10:45 am

Money

Money (@10:45 am)

??

??

??

??

Did T-Monthly-423 complete?

Which tuples are bad?

Case1: T-Monthly-423 crashed

Case2: T-Monthly-423 completed 4002 deposited 20$ at 10:45 am

Problem 1: Wrong :(

14 of 65

Example

Monthly Visa bank interest transaction

Performance

Money

Money (@4:29 am day+1)

Cost to update all data

100M bank accounts → 100M updates?

Worst case

(@10 msec/HDD seek, that’s 1 million secs)

(@10 musec/SSD access, ~1000 secs)

Problem2: SLOW to run :(

Problem3: How does user access money in this time?

15 of 65

Problems

Problem1: How can a {Bank, Visa, Fintech, NASDAQ} update 100 million records correctly?

Problem2: How to update 100 million records in Seconds (not 1000s to 1 million secs)?

Problem3: How to let users have access their accounts?

Problem4: Make it easy for the application engineer!!!

16 of 65

PreviewOur eventual solution (after 2 weeks)

Commit

WAL Flush to disk

Visa

DB

Transaction Queue

  • 60000 user TXNs/sec
  • Monthly 10% Interest TXN

Design#3 VisaDB

For each Transaction in Queue

  • For relevant records
    • Use 2PL to acquire/release locks
    • Read, Process, Write records
    • WAL Logs for updates
  • Commit or Abort

[Note: We’ll focus on the BLUE parts in next lectures]

17 of 65

Motivation for Transactions

Group user actions (reads & writes) into Transactions (TXNs) helps with two goals:

  1. Recovery & Durability: Keep the data consistent and durable. Despite system crashes, user canceling TXN part way, etc.

  • Concurrency: Get better performance by parallelizing TXNs without creating ‘bad data.’ Despite slow disk writes and reads.

This lecture!

Next lecture

Idea: Use LOGS. Support to “commit” or “rollback” TXNs

Idea: Use LOCKS. Run several user TXNs concurrently.

18 of 65

Today’s Lecture

  • Why Transactions?

  • Properties of Transactions: ACID

  • Logging

19 of 65

Transactions: Basic Definition

A transaction (“TXN”) is a sequence of one or more operations (reads or writes) which reflects a single real-world transition.

START TRANSACTION

UPDATE Product� SET Price = Price – 1.99� WHERE pname = ‘Gizmo’

COMMIT

In the real world, a TXN either happened completely or not at all (e.g., you withdrew 100$ from bank. Or not.)

20 of 65

Transactions in SQL

  • In “ad-hoc” SQL, each statement = one transaction

  • In a program, multiple statements can be grouped together as a transaction

START TRANSACTION

UPDATE Bank SET amount = amount – 100

WHERE name = ‘Bob’

UPDATE Bank SET amount = amount + 100

WHERE name = ‘Joe’

COMMIT

21 of 65

Example 1: Protection against crashes / aborts

Client 1:

INSERT INTO CheapProduct(name, price)

SELECT pname, price� FROM Product

WHERE price <= 0.99

DELETE Product

WHERE price <=0.99

What goes wrong?

Crash / abort!

Scenario: Make a CheapProducts table, from a Products table

22 of 65

Client 1:

START TRANSACTION

INSERT INTO CheapProduct(name, price)

SELECT pname, price� FROM Product

WHERE price <= 0.99

DELETE Product� WHERE price <=0.99

COMMIT

Now we’d be fine! We’ll see how / why this lecture

23 of 65

Example 2: Multiple users: single statements

Client 1: [at 10:01 am]

UPDATE Product� SET Price = Price – 1.99� WHERE pname = ‘Gizmo’

��

Client 2: [at 10:01 am]

UPDATE Product� SET Price = Price*0.5� WHERE pname=‘Gizmo’

��

Two managers attempt to change prices at same time -

What could go wrong?

24 of 65

Client 1: START TRANSACTION

UPDATE Product� SET Price = Price – 1.99� WHERE pname = ‘Gizmo’

COMMIT��

Now works like a charm- we’ll see how / why next lecture…

Client 2: START TRANSACTION

UPDATE Product� SET Price = Price*0.5� WHERE pname=‘Gizmo’

COMMIT��

25 of 65

3. Properties of Transactions

26 of 65

What you will learn about in this section

  1. Atomicity
  2. Consistency
  3. Isolation
  4. Durability

27 of 65

ACID: Atomicity

  • TXN is all or nothing
    • Commits: all the changes are made

    • Aborts: no changes are made

28 of 65

ACID: Consistency

  • The tables must always satisfy user-specified integrity constraints

(E.g., Account number is unique, Sum of debits and of credits is 0)

  • How DBs support consistency? Split responsibility
      • Programmer knows needed logic. Will assert a TXN will go from one consistent state to a consistent state
      • System asserts the TXN is atomic (e.g., if EXCEPTION, rolls back)

29 of 65

ACID: Isolation

  • A TXN executes concurrently with other TXNs

  • Effect of TXNs is the same as TXNs running one after another

Conceptually,

    • similar to OS “sandboxes”
    • E.g. TXNs can’t observe each other’s “partial updates”

30 of 65

ACID: Durability

  • The effect of a TXN must persist after the TXN
    • And after the whole program has terminated
    • And even if there are power failures, crashes, etc.

  • Write data to durable IO (e.g., disk)

31 of 65

ACID Summary

  • Atomic
    • State shows either all the effects of TXN, or none of them
  • Consistent
    • TXN moves from a state where integrity holds, to another where integrity holds
  • Isolated
    • Effect of TXNs is the same as TXNs running one after another
  • Durable
    • Once a TXN has committed, its effects remain in the database

32 of 65

A Note: ACID is one popular option!

  • Many debates over ACID, both historically and currently

  • Some “NoSQL” DBMSs relax ACID

  • In turn, now “NewSQL” reintroduces ACID compliance to NoSQL-style DBMSs…

⇒ Usually, depends on what consistency and performance your application needs

ACID is an extremely important & successful paradigm, but still debated!

33 of 65

4. Atomicity & Durability via Logging

34 of 65

Conceptual

Idea:

Plan a trip with friends.

⇒ TODOs in NY, LA, SF, and Berlin

Execute your TODO list in random sequence, as soon as the ideas flow in from friends?

OR

Plan your TODOs in NY, in LA, in Berlin? And commit by buying Tickets?

35 of 65

Big Idea

LOGS!

(aka TODO/

ledger)

Recall (on disks)

    • For SMJ/Indexing, we exploited – Block reads FASTER than random reads
    • Now – Appends are FASTER than random writes
    • [ See Notebook – ]

Many kinds of LOGs. We’ll study a few key ones!

Big Idea: LOGs (or log files or ledger)

    • Any value that changes? Append to LOG!
      • LOG is a compact “todo” list of data updates
    • Intuition:
      • Data pages: (a) Update in RAM (fast) (b) Update on disk later (slow)
      • LOGs: ( c) Append “todo” in LOGs and (d) control when you Flush LOGs to disk

36 of 65

Basic Idea: (Physical) Logging

Idea:

    • Log consists of an ordered list of Update Records
    • Log record contains UNDO information for every update!

<TransactionID, &reference, old value, new value>

(e.g., key)

What DB does?

    • Owns the log “service” for all applications/transactions.
    • Appends to log. Flush when necessary — force writes to disk

This is sufficient to UNDO any transaction!

37 of 65

Couple of changes,

Planning ahead for next 4 weeks

Bigger CS class

  • More diverse in backgrounds
  • Ed posts are growing faster than we can search, or respond to
    • On k topics, we now have ~2^k threads (various combos)
  • Recalibrating before Transactions

Transactions are important, and tricky

  1. Released short videos on focused topics [two on Logging, ~10 mins each]
  2. Goal:
    1. Set up as videos you can view separately on focused sub-topics
    2. We’ll use these examples in class to spark questions
    3. Set up focused Ed posts – one on LOGGING, one on LOCKING, etc. Easier for students to find, navigate, connect similar questions other’s have or are exploring, and help peers on this tricky topic. (And for us to answer connected questions)

Why are Transactions tricky, and important?

  1. Old, New, and Evolving
    1. Old: Classic Jim Gray text book on Transaction Processing on HDDs [e.g., cs 345]
    2. Evolving: Machine architectures, HDDs vs SSDs performance
    3. Industry + Research – new problems in OLAP → OLTP
      1. Past 10 yrs lots on OLAP - e.g., BQ/RedShift, etc.
      2. Next 10 yrs – Big investments are happening in OLTP now

38 of 65

Problems

Problem1: How can a {Bank, Visa, Fintech, NASDAQ} update 100 million records correctly?

Problem2: How to update 100 million records in Seconds (not 1000s to 1 million secs)?

Problem3: How to let users have access their accounts?

Problem4: Make it easy for the application engineer!!!

39 of 65

Example1: “TXN-INC5”

1 TXN, 1 value

numLikes = 30 // Stored in an IO block in DB

numLikes++ 5 times

Possible values for numLikes: // Updated in IO block in DB

TXN Aborts (fails/crashes) ⇒ 30

TXN Commit (aka succeeds) ⇒ 35

40 of 65

End to end Example

41 of 65

Key questions

Recall: A TXN can either COMMIT or ABORT. DB guarantees below, whether the TXN succeeds or fails.

  • Atomicity
  • Consistency
  • Isolation
  • Durability

For 1 TXN, 1 value, we saw a full run. But . . .

  1. Is it correct? What happens if there’s a crash?
  2. Why is there a speedup?
  3. How to extend?

⇒ Let’s focus on A and D in ACID.

42 of 65

Example1

Review TXN if there’s a crash

How to recover?

43 of 65

Extending to 1 TXN, 2 values

Extension #1

7

7

7

13

Consider updating <numLikes, numViews>.

⇒ Same process as earlier. (2) Read each value from Blocks, (3) update in LOG.

numViews = f(...)

numViews = f(...)

7 -> 13

44 of 65

General Algorithm. . .

45 of 65

WAL Logging

46 of 65

Definitions

<Tid, &numLikes, 31,32>

Update record

(aka Undo or Undo/Redo logs)

COMMIT record

Flushing to disk

  1. Writing to disk from RAM
  2. For LOGs, appending to LOG-Disk

Transaction from DB’s POV

  1. Write LOGs to LOG-Disk
  2. Notify user (or application), that DB COMMITs and takes full responsibility
  3. Until then, DB may ABORT TXN. User (or application) may need to rerun TXN.

47 of 65

What if no more space in RAM? Or in LOG-RAM?

Extension #2

We may need to append partial results of TXNs on LOG-DISK due to

    • Memory constraints (e.g. , billions of updates)
    • Time constraints (what if one TXN takes very long?)

⇒ We could use the same LOG-DISK for UNDOing partial TXNs. That is, we use the LOG for Atomicity (in addition to Durability).

48 of 65

Formalizing

Write-Ahead Logging (WAL)�TXN Commit Protocol

49 of 65

Write-Ahead Logging (WAL)

Atomicity

Durability

Data Flush

Flush Update

Record to LOG-Disk

Rule1: For each tuple update

Flush COMMIT Record to LOG-Disk

TXN

COMMIT

Rule2: Before TXN commits

Time > 0

Time > 0

Algorithm: WAL

For each tuple update, write Update Record into LOG-RAM

Follow two Flush rules for LOG

  • Rule1: Flush Update Record into LOG-Disk before corresponding data page goes to storage
  • Rule2: Before TXN commits,
  • Flush all Update Records to LOG-Disk
  • Flush COMMIT Record to LOG-Disk

Transaction is committed once COMMIT record is on stable storage

Note: fixing an unintended visual association. We get the combo of Durabiity + Atomicity, from Rules 1 and 2.Adding a ‘curly brace’

50 of 65

Records

Logs

A, B

R(A), R(B)

A, B

W(A), W(B)

A’, B’

U(A’)

Start TXN

U(A’)

U(B’)

A’, B

U(B’)

A’, B’

CommitRecord

CommitRecord

COMMIT TXN

1. Can A’ be flushed to disk before U(A’)?

2. Can A’ be flushed to disk before CommitRecord?

3. Can B’ be flushed to disk after CommitRecord?

UpdateRecord U(A’) for

<A, A’, TXN>

Example WAL Sequence

Time

Later …

No [rule 1]

Yes

Yes

Rows A, B are in 2 IOblocks

51 of 65

Example: Bank Transaction

52 of 65

Example

Monthly bank interest transaction

Full run

‘T-Monthly-423’

Monthly Interest 10%

4:28 am Starts run on 100M bank accounts

Takes 24 hours to run

WA Log (@4:29 am day+1)

Money

Money (@4:29 am day+1)

START TRANSACTION

UPDATE MoneySET Amt = Amt * 1.10

COMMIT

Update

Records

Commit

Record

53 of 65

Example

Monthly bank interest transaction

With crash

TXN ‘T-Monthly-423’

Monthly Interest 10%

4:28 am Starts run on 100M bank accounts

Takes 24 hours to run

Network outage at 10:29 am,

System access at 10:45 am

Money

Money (@10:45 am)

WA Log (@10:29 am)

??

??

??

??

Did T-Monthly-423 complete?

Which tuples are bad?

Case1: T-Monthly-423 was crashed

Case2: T-Monthly-423 completed. 4002 deposited 20$ at 10:45 am

Can you infer from RED Update logs?

⇒ Yup. There is NO TXN Commit.As part of recovery, undo any data updates already flushed.

54 of 65

Example

Monthly bank interest transaction

Recovery

Money (@10:45 am)

Money (after recovery)

WA Log (@10:29 am)

System recovery (after 10:45 am)

  1. Rollback uncommitted TXNs as ABORTed

- Restore old values from Log-Disk (if any)

- Notify developers about ABORTed TXN

  • Redo Recent COMMIT-ed TXNs (w/ new values)
  • Back in business

55 of 65

Example

Monthly bank interest transaction

Performance

WAL (@4:29 am day+1)

Money

Money (@4:29 am day+1)

Cost to update all data

100M bank accounts → 100M seeks? (worst case)

Cost to Append to log

+ 1 access to get ‘end of log’

+ write 100M log entries sequentially

(fast!!! < 10 sec)

[Lazily update data on disk later,

when convenient.]

(@10 msec/seek, that’s 1 Million secs)

Speedup for TXN Commit

1 Million secs vs 10 sec!!!

(Try other examples in Notebook)

56 of 65

Extensions of TXNs

57 of 65

LOG on Cluster model

A popular alternative (with tradeoffs)

Commit

WAL Flush to HDD/SSD

COMMIT by flushing log and/or data to ‘n’ other machines (e.g. n =2 )

[On same rack, different rack or different datacenter]

58 of 65

Example

Cluster LOG model

Performance

Failure model

Main model: RAM could fail, Disk is durable

VS

Cluster LOG model:

RAM on different machines don’t fail at same time

Power to racks is uncorrelated

59 of 65

Example: Youtube DB

60 of 65

Example

Youtube writes

Performance

Critique?

Correct?

Write Speed? Cost? Storage?

Bottlenecks?

Design 1: WAL Log for Video Likes

<videoid, numLikescount, new numLikes count>

Example: Increment number of LIKEs per Video

61 of 65

Example

Youtube writes

Performance

Critique?

Correct?

Write Speed? Cost? Storage?

Bottlenecks?

System recovery?

Design 2:

  • Replicate numLikes count in cluster.
  • Batch updates in Log (e.g. batch increase counts by 100 or 1000 to amortize writes, based on video popularity)

Flush to disk

Log “+n”

Update RAM on

n=3 machines

(<videoid, #likes>)

62 of 65

Example

Youtube writes

Performance

Vs Cost

Critique?

Correct?

Write Speed? Cost? Storage?

Bottlenecks?

System recovery?

Popular video

World’s most boring video (Unpopular)

Design #3

For most videos, Design 1 (full WAL logs)

For popular videos, Design 2

63 of 65

Summary

Design Questions?

Correctness: Need true ACID? Pseudo-ACID? What losses are OK?

Design parameters:

Any data properties you can exploit? (e.g., ‘+1’, popular vs not)

How much RAM, disks and machines?

How many writes per sec?

How fast do you want system to recover?

Choose: WAL logs, Replication on n-machines, Hybrid? (More in cs345)

64 of 65

Transactions

Summary

Why study Transactions?

Good programming model for parallel applications on shared data !

Atomic

Consistent

Isolation

Durable

Write Logs

WAL

RAM

Cluster

Serializable

Conflict

Serializable

S2PL

Note: this is an intro

Next: Take 346 (Distributed Transactions) or read Jim Gray’s classic

Next Lecture

65 of 65

Logging Summary

  • If DB says TX commits, TX effect remains after database crash

  • DB can undo actions and help us with atomicity

  • This is only half the story…