1 of 31

CSE 344: Section 8

Parallel DBs

November 30, 2023

2 of 31

Administrivia

  • Homework 6:
    • Due at 10:00 pm on Thursday, November 30th.
    • More transactions!

3 of 31

Parallel DBs

4 of 31

Parallel Query Processing

  • Data Replication: Copying the entire dataset
    • Expensive
    • OLAP workloads (i.e., mostly read-only) are unlikely to cause consistency issues
    • Mostly used for inter-query parallelism
  • Data Partitioning (“sharding”): splitting the dataset into disjoint pieces
    • Avoiding consistency issues is complicated
    • Can accommodate OLTP workloads
      • Rarely compatible with OLAP workloads, which usually require the full dataset
    • Commonly used for intra-operator parallelism

5 of 31

Horizontal vs Vertical Partitioning

  • Horizontal partitioning: Every attribute is available in each partition
    • But not every tuple is present

  • Vertical partitioning: Every tuple is represented in each partition
    • But not every attribute is present (both schemes are sketched below)
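
To make the contrast concrete, here is a minimal sketch in plain Python. The Payroll rows are invented for illustration, and in a real system each fragment would live on a different node rather than in one process:

payroll = [
    {"ID": 1, "Name": "Ann",   "Job": "TA",   "Salary": 50},
    {"ID": 2, "Name": "Bob",   "Job": "RA",   "Salary": 60},
    {"ID": 3, "Name": "Carla", "Job": "Prof", "Salary": 90},
]

# Horizontal: each fragment keeps every attribute, but only some tuples.
horizontal = [payroll[:2], payroll[2:]]

# Vertical: each fragment keeps every tuple, but only some attributes.
# The key (ID) is kept in both fragments so tuples can be rejoined.
vertical = [
    [{"ID": t["ID"], "Name": t["Name"]} for t in payroll],
    [{"ID": t["ID"], "Job": t["Job"], "Salary": t["Salary"]} for t in payroll],
]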

6 of 31

Horizontal Data Partitioning

[Figure: the Payroll relation, with attributes ID, Name, Job, Salary and three example tuples]


8 of 31

Horizontal Data Partitioning

[Figure: Payroll split horizontally into Payroll1, Payroll2, and Payroll3; each fragment keeps all four attributes (ID, Name, Job, Salary) but only a subset of the tuples]

9 of 31

Vertical Data Partitioning

[Figure: the Payroll relation, with attributes ID, Name, Job, Salary and three example tuples]


11 of 31

Vertical Data Partitioning

[Figure: Payroll split vertically into Payroll1, Payroll2, and Payroll3; each fragment keeps every tuple but only a subset of the attributes ID, Name, Job, Salary]

12 of 31

Horizontal Partitioning Strategies

  • Block Partition: Tuples are assigned to chunks of roughly equal size, without regard to attribute values
  • Range Partition: Each chunk holds the tuples whose value of a chosen attribute falls in that chunk’s range
  • Hash Partition: Each node holds the tuples whose hashed attribute value maps to it (all three schemes are sketched below)

  • Variables:
    • Input data size = N
    • Number of servers/nodes = p
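
A minimal sketch of all three strategies in plain Python. The function names and the md5-based stand-in hash h are inventions for illustration; a real system would use its internal hash function and catalog metadata:

import hashlib

def h(v):
    # Deterministic stand-in hash for illustration.
    return int(hashlib.md5(str(v).encode()).hexdigest(), 16)

def block_partition(tuples, p):
    # Chunks of roughly equal size; attribute values are ignored.
    return [tuples[i::p] for i in range(p)]

def range_partition(tuples, key, boundaries):
    # boundaries = [v1, ..., v(p-1)]: chunk 0 holds A <= v1,
    # chunk i holds v(i) < A <= v(i+1), the last chunk holds A > v(p-1).
    chunks = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        i = sum(1 for b in boundaries if t[key] > b)  # boundaries the value exceeds
        chunks[i].append(t)
    return chunks

def hash_partition(tuples, key, p):
    # Chunk i holds the tuples with h(A) % p == i.
    chunks = [[] for _ in range(p)]
    for t in tuples:
        chunks[h(t[key]) % p].append(t)
    return chunks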

13 of 31

Block Partitioning

T(R) = N; each of the p nodes holds one chunk, with T(R1) = T(R2) = … = T(Rp) = N/p

14 of 31

Range Partitioning

Each of the p nodes holds one range of the partitioning attribute A:

R1: -inf < A <= v1,  R2: v1 < A <= v2,  …,  Rp: v(p-1) < A < inf

15 of 31

Hash Partitioning

Each tuple is routed by h(A) % p across the p nodes:

R1 holds the tuples with h(A) % p = 1, R2 those with h(A) % p = 2, …, and Rp those with h(A) % p = 0

16 of 31

Partition Quality

  • Uniform Partition: Chunks have roughly the same size

  • Skewed Partition: Some chunks are much larger than others, creating a computational bottleneck for some queries
    • Block partition: Always uniform
    • Range partition: May be skewed
    • Hash partition: May be skewed, e.g. when one attribute value dominates (see the example below)
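
A quick demonstration of hash skew, reusing the stand-in hash from the sketch above (the value distribution is invented): when 90% of the tuples share one value of the partitioning attribute, they all land on the same node.

import hashlib
from collections import Counter

def h(v):  # same stand-in hash as in the earlier sketch
    return int(hashlib.md5(str(v).encode()).hexdigest(), 16)

values = ["TA"] * 900 + ["RA"] * 80 + ["Prof"] * 20
sizes = Counter(h(v) % 4 for v in values)
print(sizes)  # one of the 4 nodes receives at least 900 of the 1000 tuples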

17 of 31

Worksheet Questions 3 & 4

18 of 31

Question 3

  1. The most common query does a join on the mid attribute of both tables, and a group-by on the same attribute. We’ll get the best performance by hash partitioning both tables on mid, avoiding the shuffle otherwise required for both the join and the group-by. Range partitioning also works but is less common in practice.

  2. While we can’t minimize skew with respect to the above query, we can spread both tables as evenly as possible among the machines with horizontal partitioning. Of the schemes learned in class, we could block partition or hash partition on the primary key of each table. It may even be possible to range partition, but we’d need to know the statistics of the attributes to choose the correct range values.

19 of 31

Question 4

20 of 31

Question 4

  1. Since the query load consists solely of counts of in-transit letters, the majority of the other columns in the table are not used, so one option is to vertically partition the data, separating out the Status attribute.

  2. Since no mention is made of an expected query load, we don’t know which columns are needed, so we should choose a horizontal partitioning scheme. Block partitioning would work well here; range or hash partitioning would also work well, as long as we do not partition on an attribute set consisting solely of the recipient column.

21 of 31

Question 4 cont.

The query load consists primarily but not solely of randomly-sampled letter contents, grouped by sender. The database contains every letter sent since the creation of the Postal Service in 1771.

As before, we should choose a horizontal partitioning scheme (we know the letter-contents attribute will be used eventually, but we have no information on whether or which other columns are needed). Since the queries group primarily by sender, sender is a reasonable attribute to partition on.

(The founding of the postal service is a red herring in this question)

22 of 31

Shared-Nothing Model

23 of 31

Data Partitioning

We focus on shared-nothing architecture and intra-operator parallelism.

  • Block Partition
    • Tuples partitioned by raw size (no ordering considered)
  • Hash partitioned on attribute A
    • Each node holds the tuples whose hash of A maps to it
  • Range partitioned on attribute A
    • Each node holds the tuples whose A value falls in its assigned range

24 of 31

Moving Data

We have a “network layer” to move tuples temporarily between nodes.

Transferring data is expensive, so we need to be efficient (especially on joins and grouping).

25 of 31

Moving Data:

Partitioned Hash-Join Mechanism

  • Join on some attribute (e.g., R.x = S.y)
  • Apply the same hash function h to the join attribute of both relations

Key Points:

  • Shuffle each tuple to the node determined by the hash of its join attribute, so matching tuples end up co-located (simulated after the figure below)

[Figure: nodes 1…n re-shuffle their fragments (R1, S1), …, (Rn, Sn) into (R1′, S1′), …, (Rn′, Sn′); after the shuffle, node i holds exactly the tuples with h(R.x) = h(S.y) = i and can compute its share of the join locally]
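
A minimal simulation of the mechanism, with Python lists standing in for the nodes and the network (shuffle and partitioned_hash_join are hypothetical names; h is the same stand-in hash used earlier):

import hashlib

def h(v):
    return int(hashlib.md5(str(v).encode()).hexdigest(), 16)

def shuffle(fragments, key, p):
    # Every node sends each of its tuples to node h(tuple[key]) % p.
    out = [[] for _ in range(p)]
    for frag in fragments:
        for t in frag:
            out[h(t[key]) % p].append(t)
    return out

def partitioned_hash_join(R_frags, S_frags, rx, sy, p):
    # Shuffle both relations with the same h; afterwards any r, s with
    # r[rx] == s[sy] sit on the same node, so each node joins locally.
    R2, S2 = shuffle(R_frags, rx, p), shuffle(S_frags, sy, p)
    return [[(r, s) for r in R2[i] for s in S2[i] if r[rx] == s[sy]]
            for i in range(p)]

Note that the initial partitioning of R and S is irrelevant: the shuffle re-partitions both relations on the join attribute regardless of how they arrived.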

26 of 31

Moving Data:

Broadcast Join Mechanism

Takes advantage of one relation being small (small enough to fit in main memory on every node)

Key Points:

  • The partitioning of R doesn’t matter
  • S is unpartitioned and small
  • Send a full copy of S to every node holding a fragment of R (simulated below)

[Figure: S, unpartitioned, is broadcast to nodes 1…n; each node then holds its original fragment Ri of R plus a complete copy of S, and can join locally]
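
The same style of sketch for the broadcast join (broadcast_join is a hypothetical name; a Python dict stands in for each node’s in-memory hash table):

def broadcast_join(R_frags, S, rx, sy):
    # S is small and unpartitioned: every node receives a full copy of S
    # and builds an in-memory hash table on the join attribute.
    s_index = {}
    for s in S:
        s_index.setdefault(s[sy], []).append(s)
    # R never moves, so its partitioning scheme is irrelevant: each node
    # probes the table with its own local fragment of R.
    return [[(r, s) for r in frag for s in s_index.get(r[rx], [])]
            for frag in R_frags]

Compared to the partitioned hash-join, roughly n copies of |S| cross the network instead of |R| + |S| tuples, which wins whenever S is much smaller than R.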

27 of 31

Parallel Query Plans

Now, we need to know how to derive parallel plans from single node plans.

  • Which RA operations can you do without talking to other nodes?
  • Which RA operations require moving tuples?

(Selection σ and projection π, for example, run locally on each node with no data movement.)

28 of 31

Parallel DB Practice!

29 of 31

We have a distributed database that holds the relations:

Drug(spec VARCHAR(255), compatibility INT)

Person(name VARCHAR(100) primary key, compatibility INT)

We want to compute:

SELECT P.name, count(D.spec)
FROM Person AS P, Drug AS D
WHERE P.compatibility = D.compatibility
GROUP BY P.name;

Drug is block-partitioned.

Person is hash-partitioned on compatibility, using hash function h.

You have three nodes. Draw a parallel query plan.

30 of 31

γ_{P.name, count(D.spec)}(P ⋈ D)

Parallel plan over the three nodes:

  1. Shuffle: hash-partition Drug on compatibility using h, so each Drug tuple lands on the node holding the matching Person tuples. (Person is already partitioned this way and never moves.)
  2. Each node computes its share of the join P ⋈ D locally.
  3. Each node computes γ_{P.name, count(D.spec)} locally; no final re-shuffle is needed.

Takes advantage of:

  1. The existing hash partitioning of Person on compatibility under h
  2. PK uniqueness of name: every Person tuple, and hence every group, lives on exactly one node

(The block partitioning of Drug carries no useful placement information, which is why Drug is the relation that gets shuffled.)
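
A toy simulation of this plan in the style of the earlier sketches (parallel_plan is a hypothetical name; fragments are Python lists and h is the stand-in hash):

import hashlib
from collections import Counter

def h(v):
    return int(hashlib.md5(str(v).encode()).hexdigest(), 16)

def parallel_plan(person_frags, drug_frags, p):
    # Step 1: shuffle Drug so node i receives the tuples with
    # h(compatibility) % p == i. Person is already partitioned this
    # way, so it never moves.
    drug2 = [[] for _ in range(p)]
    for frag in drug_frags:
        for d in frag:
            drug2[h(d["compatibility"]) % p].append(d)
    # Step 2: join and group entirely locally. Because name is Person's
    # primary key, every group lives on exactly one node, so the
    # per-node counts are final: no merge or re-shuffle is needed.
    results = []
    for i in range(p):
        counts = Counter()
        for person in person_frags[i]:
            for d in drug2[i]:
                if person["compatibility"] == d["compatibility"]:
                    counts[person["name"]] += 1
        results.append(counts)
    return results

Concatenating the per-node Counters gives the full query answer.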

31 of 31

Section Worksheet!