Lecture 18: Disjoint Sets
CSE 373: Data Structures and Algorithms
CSE 373 2` SP β CHAMPION
1
Announcements
Practice - no poll today
CSE 373 SP 21 - CHAMPION
3
6
4
5
0
3
1
2
8
10
12
9
11
7
13
Practice
CSE 373 SP 21 - CHAMPION
4
6
4
5
0
3
1
2
8
10
12
9
11
7
13
Practice
CSE 373 SP 21 - CHAMPION
5
6
4
5
0
3
1
2
8
10
12
9
11
7
13
Practice
CSE 373 SP 21 - CHAMPION
6
6
4
5
0
3
1
2
8
10
12
9
11
7
13
Practice
CSE 373 SP 21 - CHAMPION
7
8
10
12
9
11
6
4
5
0
3
1
2
7
13
New ADT
CSE 373 SP 18 - KASEY CHAMPION
8
Set ADT
create(x) - creates a new set with a single member, x
Count of Elements
state
behavior
Set of elements
add(x) - adds x into set if it is unique, otherwise add is ignored
remove(x) β removes x from set
size() β returns current number of elements in set
Disjoint-Set ADT
makeSet(x) β creates a new set within the disjoint set where the only member is x. Picks representative for set
Count of Sets
state
behavior
Set of Sets
findSet(x) β looks up the set containing element x, returns representative of that set
union(x, y) β looks up set containing x and set containing y, combines two sets into one. Picks new representative for resulting set
D
B
C
A
D
C
F
B
A
G
H
Implementation
CSE 373 SP 18 - KASEY CHAMPION
9
TreeDisjointSet<E>
makeSet(x)-create a new tree of size 1 and add to our forest
state
behavior
Collection<TreeSet> forest
findSet(x)-locates node with x and moves up tree to find root
union(x, y)-append tree with y as a child of tree with x
Disjoint-Set ADT
makeSet(x) β creates a new set within the disjoint set where the only member is x. Picks representative for set
Count of Sets
state
behavior
Set of Sets
findSet(x) β looks up the set containing element x, returns representative of that set
union(x, y) β looks up set containing x and set containing y, combines two sets into one. Picks new representative for resulting set
Dictionary<NodeValues, NodeLocations> nodeInventory
TreeSet<E>
TreeSet(x)
state
behavior
SetNode overallRoot
add(x)
remove(x, y)
getRep()-returns data of overallRoot
SetNode<E>
SetNode(x)
state
behavior
E data
addChild(x)
removeChild(x, y)
Collection<SetNode> children
Review QuickFind vs. QuickUnion
Joyce, Sam, Ken, Alex
Aileen, Santino
Paul
DISJOINT SETS ADT
QuickFind
QuickUnion
map from value to representative ID
Aileen
Joyce
Santino
Sam
Ken
1
2
2
2
1
Alex
Paul
2
3
Aileen (1)
Santino
Paul (3)
Joyce (2)
Ken
Sam
Alex
trees of values with representative ID at each root
| (Baseline) | QuickFind | QuickUnion |
makeSet(value) | π―(1) | π―(1) | π―(1) |
find(value) | π―(n) | π―(1) | π―(n) |
union(x, y) | π―(n) | π―(n) | π―(1) |
Could also use one element from each set (e.g. the root) as its representative: only uniqueness matters
Review QuickUnion: Why Use Both Roots?
Example: result of union(Ken, Santino) on these Disjoint Sets given three possible implementations:
union(A, B):
rootA = find(A)
rootB = find(B)
set rootA to point to rootB
union(A, B):
rootB = find(B)
set A to point to rootB
union(A, B):
rootA = find(A)
set rootA to point to B
Aileen (1)
Santino
Paul (3)
Joyce
Ken
Sam
Alex
Aileen (1)
Santino
Paul (3)
Joyce (2)
Ken
Sam
Alex
Aileen (1)
Santino
Paul (3)
Joyce
Ken
Sam
Alex
Aileen (1)
Santino
Paul (3)
Joyce (2)
Ken
Sam
Alex
Correct: Everything in Kenβs set now connected to everything in Santinoβs set!
Incorrect: Ken and Joyce were connected before; the union operation shouldnβt remove connections.
Inefficient: Technically correct, but increases height of the up-tree so makes
QuickUnion: Checking in on those runtimes
| Maps to Sets | QuickFind | QuickUnion |
makeSet(value) | π―(1) | π―(1) | π―(1) |
findSet(value) | π―(n) | π―(1) | π―(n) |
union(x, y) | π―(n) | π―(n) | π―(1) |
union(A, B):
rootA = find(A)
rootB = find(B)
set rootA to point to rootB
kruskalMST(G graph)
DisjointSets<V> msts; Set finalMST;
initialize msts with each vertex as single-element MST
sort all edges by weight (smallest to largest)
for each edge (u,v) in ascending order:
uMST = msts.find(u)
vMST = msts.find(v)
if (uMST != vMST):
finalMST.add(edge (u, v))
msts.union(uMST, vMST);
*
*
QuickUnion: Letβs Build a Worst Case
union(A, B):
rootA = find(A)
rootB = find(B)
set rootA to point to rootB
find(A):
jump to A node
travel upward until root
return ID
Even with the βuse-the-rootsβ implementation of union, try to come up with a series of calls to union that would create a worst-case runtime for find on these Disjoint Sets:
A
B
C
D
QuickUnion: Letβs Build a Worst Case
union(A, B):
rootA = find(A)
rootB = find(B)
set rootA to point to rootB
find(A):
jump to A node
travel upward until root
return ID
Even with the βuse-the-rootsβ implementation of union, try to come up with a series of calls to union that would create a worst-case runtime for find on these Disjoint Sets:
A
B
C
D
union(A, B)
union(B, C)
union(C, D)
find(A)
B
A
C
D
Analyzing the QuickUnion Worst Case
B
A
C
D
union(C, D)
B
A
C
D
What currently happens
What would help avoid degenerate tree
WeightedQuickUnion
union(A, B):
rootA = find(A)
rootB = find(B)
put lighter root under heavier root
union(A, B)
union(B, C)
union(C, D)
find(A)
A
B
C
D
Now what happens?
B
A
C
D
Perfect! Best runtime we can get.
WeightedQuickUnion: Performance
union(A, B):
rootA = find(A)
rootB = find(B)
put lighter root under heavier root
WeightedQuickUnion: Performance
0
N | H |
1 | 0 |
WeightedQuickUnion: Performance
0
1
N | H |
1 | 0 |
2 | 1 |
WeightedQuickUnion: Performance
0
1
2
3
N | H |
1 | 0 |
2 | 1 |
4 | ? |
WeightedQuickUnion: Performance
0
1
2
3
N | H |
1 | 0 |
2 | 1 |
4 | 2 |
WeightedQuickUnion: Performance
0
1
2
3
4
5
6
7
N | H |
1 | 0 |
2 | 1 |
4 | 2 |
8 | ? |
WeightedQuickUnion: Performance
0
1
2
3
N | H |
1 | 0 |
2 | 1 |
4 | 2 |
8 | 3 |
4
5
6
7
WeightedQuickUnion: Performance
0
1
2
3
N | H |
1 | 0 |
2 | 1 |
4 | 2 |
8 | 3 |
16 | 4 |
4
5
6
7
8
9
10
11
12
13
14
15
Why Weights Instead of Heights?
1
2
0
4
6
5
3
8
9
7
+
1
2
0
4
6
5
3
8
9
7
WeightedQuickUnion Runtime
| Maps to Sets | QuickFind | QuickUnion | WeightedQuickUnion |
makeSet(value) | π―(1) | π―(1) | π―(1) | π―(1) |
find(value) | π―(n) | π―(1) | π―(n) | π―(log n) |
union(x, y) assuming root args | π―(n) | π―(n) | π―(1) | π―(1) |
union(x, y) | π―(n) | π―(n) | π―(n) | π―(log n) |
Aileen
Joyce
Santino
Sam
Ken
1
2
2
2
1
B
A
C
D
1
2
3
4
5
6
Modifying Data Structures for Future Gains
Path Compression: Idea
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Path Compression: Idea
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Path Compression: Details and Runtime
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Path Compression: Runtime
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Path Compression: Runtime
N | log* N |
1 | 0 |
2 | 1 |
4 | 2 |
16 | 3 |
65536 | 4 |
265536 | 5 |
216
Number of atoms in the known universe is 2256ish
WQU + Path Compression Runtime
| (Baseline) | QuickFind | QuickUnion | WeightedQuickUnion | WQU + Path Compression |
makeSet(value) | π―(1) | π―(1) | π―(1) | π―(1) | π―(1) |
find(value) | π―(n) | π―(1) | π―(n) | π―(log n) | π―(log* n) |
union(x, y) assuming root args | π―(n) | π―(n) | π―(1) | π―(1) | π―(1) |
union(x, y) | π―(n) | π―(n) | π―(n) | π―(log n) | π―(log* n) |
In-Practice Runtimes:
Disjoint Sets Implementation
| (Baseline) | QuickFind | QuickUnion | WeightedQuickUnion | WQU + Path Compression |
makeSet(value) | π―(1) | π―(1) | π―(1) | π―(1) | π―(1) |
find(value) | π―(n) | π―(1) | π―(n) | π―(log n) | π―(log* n) |
union(x, y) assuming root args | π―(n) | π―(n) | π―(1) | π―(1) | π―(1) |
union(x, y) | π―(n) | π―(n) | π―(n) | π―(log n) | π―(log* n) |
In-Practice Runtimes:
Aileen
Joyce
Santino
Sam
Ken
1
2
2
2
1
B
A
C
D
0
1
2
3
4
5
6
7
0
1
2
3
4
5
Kruskalβs Runtime
kruskalMST(G graph)
DisjointSets<V> msts; Set finalMST;
initialize msts with each vertex as single-element MST
sort all edges by weight (smallest to largest)
for each edge (u,v) in ascending order:
uMST = msts.find(u)
vMST = msts.find(v)
if (uMST != vMST):
finalMST.add(edge (u, v))
msts.union(uMST, vMST)
Β
Β
Β
Β
Β
Β
Β
Β
Β
Using Arrays for Up-Trees
Aileen (2)
Santino
Paul (4)
Joyce (0)
Ken
Sam
Alex
0 | 1 | 2 | 3 | 4 | 5 | 6 |
-1 | 0 | -1 | 6 | -1 | 2 | 0 |
Joyce | Sam | Aileen | Alex | Paul | Santino | Ken |
Using Arrays: Find
0 | 1 | 2 | 3 | 4 | 5 | 6 |
-1 | 0 | -1 | 6 | -1 | 2 | 0 |
Joyce | Sam | Aileen | Alex | Paul | Santino | Ken |
Aileen (2)
Santino
Paul (4)
Joyce (0)
Ken
Sam
Alex
Alex
Aileen
Sam
β¦
find(A):
index = jump to A nodeβs index
while array[index] > 0:
index = array[index]
path compression
return index
1
2
find(Alex)
1
2
= 0
0
3
3
Using Arrays: Union
0 | 1 | 2 | 3 | 4 | 5 | 6 |
-4 | 0 | -2 | 6 | -1 | 2 | 0 |
Joyce | Sam | Aileen | Alex | Paul | Santino | Ken |
Aileen (2)
Santino
Joyce (0)
Ken
Sam
Alex
union(A, B):
rootA = find(A)
rootB = find(B)
use -1 * array[rootA] and -1 *
array[rootB] to determine weights
put lighter root under heavier root
weight 4
weight 2
union(Ken, Santino)
Paul (4)
weight 1
Using Arrays: Union
0 | 1 | 2 | 3 | 4 | 5 | 6 |
-4 | 0 | -2 | 6 | -1 | 2 | 0 |
Joyce | Sam | Aileen | Alex | Paul | Santino | Ken |
union(A, B):
rootA = find(A)
rootB = find(B)
use -1 * array[rootA] and -1 *
array[rootB] to determine weights
put lighter root under heavier root
-6
0
Aileen (2)
Santino
Joyce (0)
Ken
Sam
Alex
weight 6
Paul (4)
weight 1
Aileen
union(Ken, Santino)
Array Implementation
CSE 373 SP 18 - KASEY CHAMPION
40
1
6
3
rank = 0
4
2
10
5
7
0
9
8
11
15
13
rank = 3
14
12
17
16
18
rank = 3
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
| | | | | | | | | | | | | | | | | | |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
-1 | -1 | 1 | 2 | 2 | 2 | 1 | 6 | 7 | 7 | 6 | -1 | 11 | 12 | 12 | 11 | 15 | 15 | 17 |
Store (rank * -1) - 1
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
-1 | -4 | 1 | 2 | 2 | 2 | 1 | 6 | 7 | 7 | 6 | -4 | 11 | 12 | 12 | 11 | 15 | 15 | 17 |
Each βnodeβ now only takes 4 bytes of memory instead of 32
Practice
CSE 373 SP 18 - KASEY CHAMPION
41
3
0
rank = 0
4
11
1
5
2
13
12
rank = 2
10
9
14
15
8
rank = 2
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| | | | | | | | | | | | | | | | |
rank = 1
6
7
16
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
3 | 0 | 0 | -6 | 3 | -1 | -2 | 6 | 12 | 13 | 13 | 0 | 13 | -3 | 12 | 12 | 12 |
Using Arrays for WQU+PC
| (Baseline) | QuickFind | QuickUnion | WeightedQuickUnion | WQU + Path Compression | ArrayWQU+PC |
makeSet(value) | π―(1) | π―(1) | π―(1) | π―(1) | π―(1) | π―(1) |
find(value) | π―(n) | π―(1) | π―(n) | π―(log n) | π―(log* n) | π―(log* n) |
union(x, y) assuming root args | π―(n) | π―(n) | π―(1) | π―(1) | π―(1) | π―(1) |
union(x, y) | π―(n) | π―(n) | π―(n) | π―(log n) | π―(log* n) | π―(log* n) |
Appendix