1 of 19

Databases Demystified

with Michael Kaminsky

Sponsored by

2 of 19

Row Store vs.

Column Store

Databases Demystified

Sponsored by

Sponsored by

3 of 19

Why It Matters

  • We want to make our databases fast
  • By changing how we store our data, we can make the database respond to queries faster
  • But, there are trade-offs here so we want to optimize for either analytical or transactional workloads

Databases Demystified: Row Store vs Column Store

Sponsored by

Sponsored by

4 of 19

Disk Storage

Permanent storage on a “hard drive”

Databases Demystified: Row Store vs Column Store

Sponsored by

Sponsored by

5 of 19

Reading 3 blocks takes longer than

reading one.

We want to put data that are

read together onto the same block!

Databases Demystified: Row Store vs Column Store

Sponsored by

Sponsored by

6 of 19

Data Example

....

....

....

....

Databases Demystified: Row Store vs Column Store

ID

Name

State

Age

1

Juan

NY

29

2

Elizabeth

TX

22

3

Scott

MA

35

6893849

Fernanda

CA

41

Sponsored by

Sponsored by

7 of 19

Row Stores

....

....

....

....

Databases Demystified: Row Store vs Column Store

1 Juan NY 29 | 2 Elizabeth TX 22 | 3 Scott MA 35 | … | 6893849 Fernanda CA 41

ID

Name

State

Age

1

Juan

NY

29

2

Elizabeth

TX

22

3

Scott

MA

35

6893849

Fernanda

CA

41

Sponsored by

Sponsored by

8 of 19

Row Stores

Each person’s data is more likely to be on the same block

Databases Demystified: Row Store vs Column Store

?

Sponsored by

Sponsored by

9 of 19

Row Stores

To read all of Scott’s data, we only need to read one block!

Databases Demystified: Row Store vs Column Store

?

Sponsored by

Sponsored by

10 of 19

Row Stores

To add new data, we just add a row to the end!

Databases Demystified: Row Store vs Column Store

?

Sponsored by

Sponsored by

11 of 19

Column Stores

....

....

....

....

Databases Demystified: Row Store vs Column Store

1 2 3 … 6893849 | Juan Elizabeth Scott ... Fernanda | NY TX MA … CA | 29 22 35 … 41

ID

Name

State

Age

1

Juan

NY

29

2

Elizabeth

TX

22

3

Scott

MA

35

6893849

Fernanda

CA

41

Sponsored by

Sponsored by

12 of 19

Column Stores

Each “type” of data is more likely to be on the same block

Databases Demystified: Row Store vs Column Store

?

Sponsored by

Sponsored by

13 of 19

Column Stores

To read all of the states, we only have to read one block!

Databases Demystified: Row Store vs Column Store

?

Sponsored by

Sponsored by

14 of 19

Column Stores

Compression algorithms work best on columns of similar types

Databases Demystified: Row Store vs Column Store

Sponsored by

Sponsored by

15 of 19

Comparison

Analytical Workloads

Transactional Workloads

Column Store

Row Store

Databases Demystified: Row Store vs Column Store

Sponsored by

Sponsored by

16 of 19

Comparison: Transactional

Column Store

Row Store

Databases Demystified: Row Store vs Column Store

SELECT name, state, age

FROM data

WHERE id = 3;

Sponsored by

Sponsored by

17 of 19

Comparison: Transactional

Column Store

Row Store

Databases Demystified: Row Store vs Column Store

SELECT state, count(1)

FROM data

GROUP BY state;

Sponsored by

Sponsored by

18 of 19

Row Store

Column Store

  • Keep entire columns on the same block
  • Easy to analyze entire columns quickly
  • Easy to compress lots of data
  • Slow to add new data or manipulate old data

  • Keep data for objects on the same block
  • Easy to read and manipulate one object at a time
  • Easy to insert new data
  • Slow to analyze large amounts of data

Databases Demystified: Analytical vs. Transactional

Sponsored by

Sponsored by

19 of 19

Don’t forget to like and subscribe!

Databases Demystified

Sponsored by