Lecture 10:
Binary Search Trees
CSE 247 / CSE 502N
Fall 2016
Ron Cytron
1
Where are we?
2
Trees
3
API of SortedSet<E>
4
Using an unordered List?
5
Using an ordered List?
6
Using an unordered Array?
7
Using an ordered Array?
8
Using a Hash Table? (worked well for Set)
9
Using a Priority Queue?
10
Binary search trees
11
p
a
b
Binary search trees
12
p
a
b
p
a
b
Binary search trees
13
p
a
b
p
a
b
Binary search trees
14
p
a
b
p
a
b
All values over here are no greater than p’s value.
In a set they are strictly smaller.
Binary search trees
15
p
a
b
p
a
b
All values over here are no greater than p’s value.
In a set they are strictly smaller.
All values over here are no smaller than p’s value.
In a set they are strictly larger.
Insertions
16
Insert 6, 5, 2, 7, 5, 8
First insertion is always the root node of the tree
6
Insertions
17
Insert 6, v
If v ≤ p.value, go left.
p
a
b
v ≤ p.value
Insertions
18
Insert 6, v
If v ≥ p.value, go right.
p
a
b
v ≥ p.value
Insertions
19
Insert 6, v
If v = p.value, go either way (flip a coin for balance).
p
a
b
v ≥ p.value
v ≤ p.value
Insertions
20
Insert 6, v
p
a
b
v ≥ p.value
Recursively apply this idea through the tree until finding a place for v.
Let’s get the idea from some insertions
21
Insert 6, 5, 2, 7, 5, 8
Figure 12.1 from text
Let’s get the idea from some insertions
22
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
23
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
24
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
25
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
26
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
27
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
28
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
29
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
30
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
31
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
32
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
33
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
34
Insert 6, 5, 2, 7, 5, 8
Could go either way for next move
Let’s get the idea from some insertions
35
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
36
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
37
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
38
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
39
Insert 6, 5, 2, 7, 5, 8
Let’s get the idea from some insertions
40
Insert 6, 5, 2, 7, 5, 8
For each insertion v, there is always a place to put v.
Tree depends on the order of insertion
41
Insert 6, 5, 2, 7, 5, 8
Insert 2, 5, 7, 6, 8, 5
Another insertion order
42
Insert 2, 5, 7, 6, 8, 5
Another insertion order
43
Insert 2, 5, 7, 6, 8, 5
Another insertion order
44
Insert 2, 5, 7, 6, 8, 5
Another insertion order
45
Insert 2, 5, 7, 6, 8, 5
Another insertion order
46
Insert 2, 5, 7, 6, 8, 5
Another insertion order
47
Insert 2, 5, 7, 6, 8, 5
Another insertion order
48
Insert 2, 5, 7, 6, 8, 5
Another insertion order
49
Insert 2, 5, 7, 6, 8, 5
Another insertion order
50
Insert 2, 5, 7, 6, 8, 5
Another insertion order
51
Insert 2, 5, 7, 6, 8, 5
Another insertion order
52
Insert 2, 5, 7, 6, 8, 5
Another insertion order
53
Insert 2, 5, 7, 6, 8, 5
Another insertion order
54
Insert 2, 5, 7, 6, 8, 5
Another insertion order
55
Insert 2, 5, 7, 6, 8, 5
Another insertion order
56
Insert 2, 5, 7, 6, 8, 5
Another insertion order
57
Insert 2, 5, 7, 6, 8, 5
Another insertion order
58
Insert 2, 5, 7, 6, 8, 5
Another insertion order
59
Insert 2, 5, 7, 6, 8, 5
Another insertion order
60
Insert 2, 5, 7, 6, 8, 5
Could go either way for next move
Another insertion order
61
Insert 2, 5, 7, 6, 8, 5
Another insertion order
62
Insert 2, 5, 7, 6, 8, 5
Another insertion order
63
Insert 2, 5, 7, 6, 8, 5
Another insertion order
64
Insert 2, 5, 7, 6, 8, 5
Two different trees for same data
65
Insertion
66
Listing the tree’s items in sorted order
67
p
a
b
Tree traversal
68
p
Tree traversal
69
p
Process everything ≤ p’s value found to the left.
Tree traversal
70
p
If something is < p’s value, it has already been processed.
Nothing < p’s value can be found elsewhere in the tree. Things > p’s value are to the right.
So now it’s p’s turn, so act upon it.
Tree traversal
71
p
Process everything ≥ p’s value found to the right.
Tree traversal
72
p
More implementation
73
Is an element in the tree?
74
Finding the least and greatest elements
75
Where is the least element of a subtree s?
s
Where is the least element of a subtree?
s
Where is the least element of a subtree s?
s
Where is the least element of a subtree?
s
Where is the least element of a subtree?
s
Gratuitous WUTexter participation poll
Do you want to see this done
Where is the greatest element of subtree s?
s
Where is the greatest element of subtree s?
s
Where is the greatest element of subtree s?
s
For a tree of height h, how much time do all of these methods take so far?
Where is the greatest element of subtree s?
s
For a tree of height h, how much time do all of these methods take so far?
Θ(h)
In terms of the number of nodes, how do we describe h?
Where is the greatest element of subtree s?
s
For a tree of height h, how much time do all of these methods take so far?
Θ(h)
In terms of the number of nodes, how do we describe h?
h is Ω(log n) balanced tree
h is O(n) very unbalanced tree
Removing a node z
86
A node z with no children contains val
87
z
Looking for val at z, navigating left or right as needed
A node z with no children contains val
88
z
Looking for val at z, navigating left or right as needed
A node z with no children contains val
89
z
Looking for val at z, navigating left or right as needed
Found it!
A node z with no children contains val
90
Node z’s parent’s left child is set to null.
We can do this by assignment as we did with insert, recursively.
A node z with no children contains val
91
z
Node z’s parent’s left child is set to null.
We can do this by assignment as we did with insert, recursively.
z.left = deleteHelper(tree.left, val)
A node z with no children contains val
92
Node z’s parent’s left child is set to null.
We can do this by assignment as we did with insert, recursively.
if z is the node to be removed and it has no children, return null to the caller for assignment.
z
return null
A node z with no children contains val
93
z
Node z’s parent’s left child is set to null.
We can do this by assignment as we did with insert, recursively.
if z is the node to be removed and it has no children, return null to the caller for assignment.
z.left = null
A node z with one child contains val
94
z
Looking for val at z, navigating left or right as needed
A node z with one child contains val
95
z
Looking for val at z, navigating left or right as needed.
To delete z, have its only child take z’s place
return z.left
A node z with one child contains val
96
z
Looking for val at z, navigating left or right as needed.
To delete z, have its only child take z’s place
return z.left
This node has value less than z, but greater than z’s parent, so it is suitable to take z’s place.
A node with two children
z
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
A node with two children
We can’t pick either of z’s children necessarily to replace z.
Not a binary tree!
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
Any of these nodes would do
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
Any of these nodes would do
But less than me!
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
Any of these nodes would do
But less than me!
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
Pick one of us!
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
….and less than all the remaining pink values.
Pick one of us!
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
….and less than all the remaining pink values.
Pick one of us!
Oh, then pick the smallest among us
A node with two children
z
We can’t pick either of z’s children necessarily to replace z.
What we place here has to be bigger than all the yellow values.
….and less than all the remaining pink values.
Pick one of us!
That would be me
A node with two children
z
From z, go right, and then go left as far as possible.
Place that value at z.
A node with two children
z
From z, go right, and then go left as far as possible.
Place that value at z.
Then delete the pink node.
It has to be one of the earlier simple cases: it cannot have two children
Properties of red-black trees
110
111
Example from text
112
Black height: these integers show the number of black nodes from this node to any leaf, not including this node.
Example from text
113
Black height: these integers show the number of black nodes from this node to any leaf, not including this node.
Example from text
114
Black height: these integers show the number of black nodes from this node to any leaf, not including this node.
Example from text
115
Black height: these integers show the number of black nodes from this node to any leaf, not including this node.
Recall this number is the same on any paths to a leaf.
For node x, bh(x) is the black height of that node. The integer next to node x is bh(x).
Example from text
116
Black height: these integers show the number of black nodes from this node to any leaf, not including this node.
Recall this number is the same on any paths to a leaf.
For node x, bh(x) is the black height of that node. The integer next to node x is bh(x).
bh(root) is the black-height of the tree. This tree’s black height is 3.
Example from text
117
The root is always black
Example from text
118
If a node is red
Example from text
119
If a node is red
Then both its children are black
Example from text
120
If a node is red
Then both its children are black
This keeps us from getting two reds in a row heading toward a leaf
Example from text
121
These are not null references, they are each a real node named NIL.
These nodes serve as sentinels to make the code easier to write and the proofs simpler.
Example from text
122
These are not null references, they are each a real node named NIL.
These nodes serve as sentinels to make the code easier to write and the proofs simpler.
They add one more black node on every path, so you would think we could do without them. They are not there to make the black node count right. They are there to simplify the code (which you will appreciate later).
Example from text
123
These are not null references, they are each a real node named NIL.
But because they are all the same, we don’t need them to be different nodes, so…..
They add one more black node on every path, so you would think we could do without them. They are not there to make the black node count right. They are there to simplify the code (which you will appreciate later).
Example from the text
124
They can all point to the same node, called T.nil in the text.
Example from the text
125
All nodes except T.nil have a parent reference.
It helps for the root to have one too, and it points to T.nill
We draw the trees like this
126
but every unspecified reference, including those on apparent leaves, goes to T.nil, which we just won’t draw to keep the pictures clean
Height of a red-back tree of n nodes is O(log n)
Lemma 13.1 of text: A red-black tree with n internal nodes has height at most 2 lg(n + 1)
Proof by induction on k using
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
P(0): node x has height 0 → x is T.nil → bh(x) = 0
then x has 0 internal nodes
0 = 20 - 1 so P(0) is true
127
restatement
Height of a red-back tree of n nodes is O(log n)
Lemma 13.1 of text: A red-black tree with n internal nodes has height at most 2 lg(n + 1)
Proof by induction on k using
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
Now consider a node x of positive height, so
it must be an internal node
it must have 2 children (all internal nodes do: T.nil if nothing else)
128
Consider that node x
129
x
height: h > 0
black height: bh(x)
Consider that internal node x
130
x
height: h > 0
black height: bh(x)
x must have 2 children
These nodes have height h-1
Consider that internal node x
131
x
height: h > 0
black height: bh(x)
x must have 2 children
These nodes have height h-1
For each of these children, there are two cases
Consider that internal node x
132
x
height: h > 0
black height: bh(x)
x must have 2 children
These nodes have height h-1
For each of these children, there are two cases
If the child is black, then it has black height bh(x)-1
Consider that internal node x
133
x
height: h > 0
black height: bh(x)
x must have 2 children
These nodes have height h-1
For each of these children, there are two cases
If the child is black, then it has black height bh(x)-1
If the child is red, then it has black height bh(x)
Consider that internal node x
134
x
height: h > 0
black height: bh(x)
x must have 2 children
These nodes have height h-1
For each of these children, there are two cases
If the child is black, then it has black height bh(x)-1
If the child is red, then it has black height bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
The children are covered by P(h-1)
Consider that internal node x
135
x
height: h > 0
black height: bh(x)
x must have 2 children
These nodes have height h-1
For each of these children, there are two cases
If the child is black, then it has black height bh(x)-1
If the child is red, then it has black height bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
The children are covered by P(h-1)
→ it has at least 2bh(x)-1-1 internal nodes
→ it has at least 2bh(x)-1 internal nodes
The worst case
Consider that internal node x
136
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Consider that internal node x
137
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
Consider that internal node x
138
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+
Consider that internal node x
139
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+ 2bh(x)-1-1
+
Consider that internal node x
140
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+ 2bh(x)-1-1
+ 1
Consider that internal node x
141
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+ 2bh(x)-1-1
+ 1
----------------------
2 x 2bh(x)-1 -2 + 1
Consider that internal node x
142
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+ 2bh(x)-1-1
+ 1
----------------------
2 x 2bh(x)-1 -2 + 1
= 2bh(x) - 1
Consider that internal node x
143
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+ 2bh(x)-1-1
+ 1
----------------------
2 x 2bh(x)-1 -2 + 1
= 2bh(x) - 1
Consider that internal node x
144
x
height: h > 0
black height: bh(x)
P(k): node x has height k → subtree rooted at x has at least 2bh(x)-1 internal nodes
each child has at least 2bh(x)-1-1 internal nodes
Let’s add up what we have
2bh(x)-1-1
+ 2bh(x)-1-1
+ 1
----------------------
2 x 2bh(x)-1 -2 + 1
= 2bh(x) - 1
QED
To finish the argument
145
We balance by rotating
146
Maintains: α < x < β < y < γ
Example from text
147
Insertion into red-black tree
148
Demo
1, 2, 25, 35, 3, 5, 8, 55, 60, 90, 80, 12, 75, 25, 50
149
The three cases
150
In this loop, z is always a red node whose parent is also red.
The three cases
151
The .p is the “parent of”
The three cases
152
Code is written for z’s parent being a left child.
The three cases
153
How do we know y exists? What if 7 had no right child?
The three cases
154
How do we know y exists? What if 7 had no right child?
y?
The three cases
155
How do we know y exists? Thanks to NIL it always does.
y?
NIL
The three cases
156
Case 1: transfers z’s grandparent’s black to z’s parent and aunt.
The three cases
157
To preserve black heights, 5 and 8 will become black and 7 will become red.
The three cases
158
z moves up the tree by two nodes, but that node (7) is definitely red. So the loop may continue.
The three cases
159
Otherwise the red-red problem is fixed by either one right rotation
The three cases
160
Otherwise the red-red problem is fixed by either one right rotation, or a left rotation followed by a right rotation.
The three cases
161
When done, z’s parent is black so the loop will exit.
The three cases
162
The three cases
163
The three cases
164
Case 1 no longer applies: z’s aunt is not red
The three cases
165
The three cases
166
The three cases
167
We know case 2 leaves a problem that is solved by case 3
The three cases
168
We know case 2 leaves a problem that is solved by case 3
The three cases
169
In case 3, z is its parent’s left child
The three cases
170
The three cases
171
z’s parent is black so the loop exits
The black height of the tree never changed
172
Black height was 2 before inserting 4.
It’s also 2 with the red-red problem.
The black height of the tree never changed
173
Still 2.
The black height of the tree never changed
174
Still 2.
The black height of the tree never changed
175
Still 2.
The black height of the tree never changed
176
Then what causes the black height of the tree to increase on insertion?
The black height of the tree never changed
177
Then what causes the black height of the tree to increase on insertion?
The tree’s black height must be the same on all paths to leaves.
The black height of the tree never changed
178
Then what causes the black height of the tree to increase on insertion?
The tree’s black height must be the same on all paths to leaves.
So the root must change to red, preserving the height, and then to black, increasing the height, for the tree’s height to increase.