1 of 57

CS61B, 2019

Lecture 16: ADTs and BSTs

  • Abstract Data Types
  • Binary Search Tree (intro)
  • BST Definitions
  • BST Operations
  • Sets vs. Maps, Summary

2 of 57

Abstract Data Types

3 of 57

Interfaces vs. Implementation

In class:

  • Developed ALists and SLLists.
  • Created an interface List61B.
    • Modified AList and SLList to implement List61B.
    • List61B provided default methods.

In projects:

  • Developed ArrayDeque and LinkedListDeque.
  • Created an interface Deque.
    • Modified AD and LLD to implement Deque.

List61B

AList

SLList

Deque

Array

Deque

LinkedList

Deque

4 of 57

Interfaces vs. Implementation

With DisjointSets, we saw a much richer set of possible implementations.

DisjointSets

ListOfSetsDS

QuickFindDS

QuickUnionDS

WeightedQuickUnionDS

5 of 57

Abstract Data Types

An Abstract Data Type (ADT) is defined only by its operations, not by its implementation.

Deque ADT:

  • addFirst(Item x);
  • addLast(Item x);
  • boolean isEmpty();
  • int size();
  • printDeque();
  • Item removeFirst();
  • Item removeLast();
  • Item get(int index);

Deque

Array

Deque

LinkedList

Deque

ArrayDeque and LinkedList Deque are implementations of the Deque ADT.

6 of 57

Another example of an ADT: The Stack

The Stack ADT supports the following operations:

  • push(int x): Puts x on top of the stack.
  • int pop(): Removes and returns the top item from the stack.

insertBack()

getBack()

get(int i)

deleteBack()

push(int x)

pop()

4

6

2

7 of 57

The Stack ADT: yellkey.com/?

The Stack ADT supports the following operations:

  • push(int x): Puts x on top of the stack.
  • int pop(): Removes and returns the top item from the stack.�

Which implementation do you think would result in faster overall performance?

  1. Linked List
  2. Array

insertBack()

getBack()

get(int i)

deleteBack()

push(int x)

pop()

4

8 of 57

The Stack ADT: yellkey.com/?

The Stack ADT supports the following operations:

  • push(int x): Puts x on top of the stack.
  • int pop(): Removes and returns the top item from the stack�

Which implementation do you think would result in faster overall performance?

  • Linked List
  • Array

Both are about the same. No resizing for linked lists, so probably a lil faster.

insertBack()

getBack()

get(int i)

deleteBack()

push(int x)

pop()

4

9 of 57

The GrabBag ADT: yellkey.com/?

The GrabBag ADT supports the following operations:

  • insert(int x): Inserts x into the grab bag.
  • int remove(): Removes a random item from the bag.
  • int sample(): Samples a random item from the bag (without removing!)
  • int size(): Number of items in the bag.

Which implementation do you think would result in faster overall performance?

  1. Linked List
  2. Array

insertBack()

getBack()

get(int i)

deleteBack()

remove()

insert(int x)

sample()

size(int i)

10 of 57

The GrabBag ADT: yellkey.com/?

The GrabBag ADT supports the following operations:

  • insert(int x): Inserts x into the grab bag.
  • int remove(): Removes a random item from the bag.
  • int sample(): Samples a random item from the bag (without removing!)
  • int size(): Number of items in the bag.

Which implementation do you think would result in faster overall performance?

  • Linked List
  • Array

insertBack()

getBack()

get(int i)

deleteBack()

remove()

insert(int x)

sample()

size(int i)

11 of 57

Abstract Data Types in Java

One thing I particularly like about Java is the syntax differentiation between abstract data types and implementations.

  • Note: Interfaces in Java aren’t purely abstract as they can contain some implementation details, e.g. default methods.

Example: List<Integer> L = new ArrayList<>();

List

ArrayList

Linked

List

12 of 57

Collections

Among the most important interfaces in the java.util library are those that extend the Collection interface (btw interfaces can extend other interfaces).

  • Lists of things.
  • Sets of things.
  • Mappings between items, e.g. jhug’s grade is 88.4, or Creature c’s north neighbor is a Plip.
    • Maps also known as associative arrays, associative lists (in Lisp), symbol tables, dictionaries (in Python).

Collection

List

Set

Map

13 of 57

Map Example

Maps are very handy tools for all sorts of tasks. Example: Counting words.

Map<String, Integer> m = new TreeMap<>();

String[] text = {"sumomo", "mo", "momo", "mo",

"momo", "no", "uchi"};

for (String s : text) {

int currentCount = m.getOrDefault(s, 0);

m.put(s, currentCount + 1);

}

m = {}

text = ["sumomo", "mo", "momo", "mo", \

"momo", "no", "uchi"]

for s in text:

current_count = m.get(s, 0)

m[s] = current_count + 1

Python equivalent

sumomo

1

mo

2

momo

2

no

1

uchi

1

14 of 57

Java Libraries

The built-in java.util package provides a number of useful:

  • Interfaces: ADTs (lists, sets, maps, priority queues, etc.) and other stuff.
  • Implementations: Concrete classes you can use.

Today, we’ll learn the basic ideas behind the TreeSet and TreeMap.

Collection

List

Set

Map

ArrayList

Linked

List

TreeSet

HashSet

TreeMap

HashMap

15 of 57

Binary Search Trees

16 of 57

Analysis of an OrderedLinkedListSet<Character>

In an earlier lecture, we implemented a set based on unordered arrays. For the order linked list set implementation below, name an operation that takes worst case linear time, i.e. Θ(N).

size

contains

add

iterator

A

C

B

D

E

F

G

sent

7

size

17 of 57

Analysis of an OrderedLinkedListSet<Character>

In an earlier lecture, we implemented a set based on unordered arrays. For the order linked list set implementation below, name an operation that takes worst case linear time, i.e. Θ(N).

size

contains

add

iterator

A

C

B

D

E

F

G

sent

7

size

18 of 57

Optimization: Extra Links

Fundamental Problem: Slow search, even though it’s in order.

A

C

B

D

E

F

G

  • Add (random) express lanes. Skip List (won’t discuss in 61B)

19 of 57

Optimization: Change the Entry Point

Fundamental Problem: Slow search, even though it’s in order.

  • Move pointer to middle.

A

C

B

D

E

F

G

20 of 57

Optimization: Change the Entry Point, Flip Links

Fundamental Problem: Slow search, even though it’s in order.

  • Move pointer to middle and flip left links. Halved search time!

A

C

B

D

E

F

G

21 of 57

Optimization: Change the Entry Point, Flip Links

Fundamental Problem: Slow search, even though it’s in order.

  • How do we do even better?
  • Dream big!

A

C

B

D

E

F

G

22 of 57

Optimization: Change Entry Point, Flip Links, Allow Big Jumps

Fundamental Problem: Slow search, even though it’s in order.

  • How do we do better?

A

C

B

D

E

F

G

A

C

B

D

E

F

G

23 of 57

BST Definitions

24 of 57

Tree

A tree consists of:

  • A set of nodes.
  • A set of edges that connect those nodes.
    • Constraint: There is exactly one path between any two nodes.

Green structures below are trees. Pink ones are not.

25 of 57

Rooted Trees and Rooted Binary Trees

In a rooted tree, we call one node the root.

  • Every node N except the root has exactly one parent, defined as the first node on the path from N to the root.
  • Unlike (most) real trees, the root is usually depicted at the top of the tree.
  • A node with no child is called a leaf.

In a rooted binary tree, every node has either 0, 1, or 2 children (subtrees).

A

B

C

A

B

C

A

C

C

B

For each of these:

  • A is the root.
  • B is a child of A. (and C of B)
  • A is a parent of B. (and B of C)

Not binary!

26 of 57

Binary Search Trees

A binary search tree is a rooted binary tree with the BST property.

BST Property. For every node X in the tree:

  • Every key in the left subtree is less than X’s key.
  • Every key in the right subtree is greater than X’s key.�

dog

bag

flat

alf

cat

elf

glut

debt

bus

ears

axe

cow

fish

gut

Binary Tree, but not a Binary Search Tree

Binary Search Tree

27 of 57

Binary Search Trees

Ordering must be complete, transitive, and antisymmetric. Given keys p and q:

  • Exactly one of p ≺ q and q ≺ p are true.
  • p ≺ q and q ≺ r imply p ≺ r.

One consequence of these rules: No duplicate keys allowed!

  • Keeps things simple. Most real world implementations follow this rule.

dog

bag

flat

alf

cat

elf

glut

debt

bus

ears

axe

cow

fish

gut

Binary Tree, but not a Binary Search Tree

Binary Search Tree

28 of 57

BST Operations: Search

29 of 57

Finding a searchKey in a BST (come back to this for the BST lab)

If searchKey equals T.key, return.

  • If searchKey T.key, search T.left.
  • If searchKey ≻ T.key, search T.right.

dog

bag

flat

alf

cat

elf

glut

30 of 57

Finding a searchKey in a BST

If searchKey equals T.key, return.

  • If searchKey T.key, search T.left.
  • If searchKey ≻ T.key, search T.right.

static BST find(BST T, Key sk) {

if (T == null)

return null;

if (sk.equals(T.key))

return T;

else if (sk ≺ T.key)

return find(T.left, sk);

else

return find(T.right, sk);

}

dog

bag

flat

alf

cat

elf

glut

31 of 57

BST Search: http://yellkey.com/?

What is the runtime to complete a search on a “bushy” BST in the worst case, where N is the number of nodes.

  1. Θ(log N)
  2. Θ(N)
  3. Θ(N log N)
  4. Θ(N2)
  5. Θ(2N)

“bushiness” is an intuitive concept that we haven’t defined.

32 of 57

BST Search

What is the runtime to complete a search on a “bushy” BST in the worst case, where N is the number of nodes.

  1. Θ(log N)

Height of the tree is ~log2(N)

33 of 57

BSTs

Bushy BSTs are extremely fast.

  • At 1 microsecond per operation, can find something from a tree of size 10300000 in one second.

Much (perhaps most?) computation is dedicated towards finding things in response to queries.

  • It’s a good thing that we can do such queries almost for free.

34 of 57

BST Operations: Insert

35 of 57

Inserting a New Key into a BST

Search for key.

  • If found, do nothing.
  • If not found:
    • Create new node.
    • Set appropriate link.

Example:

insert “eyes”

dog

bag

flat

alf

cat

elf

glut

36 of 57

Inserting a New Key into a BST

Search for key.

  • If found, do nothing.
  • If not found:
    • Create new node.
    • Set appropriate link.

eyes

A common rookie bad habit to avoid:

static BST insert(BST T, Key ik) {

if (T == null)

return new BST(ik);

if (ik ≺ T.key)

T.left = insert(T.left, ik);

else if (ik ≻ T.key)

T.right = insert(T.right, ik);

return T;

}

if (T.left == null)

T.left = new BST(ik);

else if (T.right == null)

T.right = new BST(ik);

dog

bag

flat

alf

cat

elf

glut

ARMS LENGTH RECURSION!!!! No good.

37 of 57

BST Operations: Delete

38 of 57

Deleting from a BST

3 Cases:

  • Deletion key has no children.
  • Deletion key has one child.
  • Deletion key has two children.

eyes

dog

bag

flat

alf

cat

elf

glut

39 of 57

Case 1: Deleting from a BST: Key with no Children

Deletion key has no children (“glut”):

  • Just sever the parent’s link.
  • What happens to “glut” node?

eyes

dog

bag

flat

alf

cat

elf

glut

40 of 57

Case 1: Deleting from a BST: Key with no Children

Deletion key has no children (“glut”):

  • Just sever the parent’s link.
  • What happens to “glut” node?
    • Garbage collected.

eyes

dog

bag

flat

alf

cat

elf

glut

41 of 57

Case 2: Deleting from a BST: Key with one Child

Example: delete(“flat”):

Goal:

  • Maintain BST property.
  • Flat’s child definitely larger than dog.
    • Safe to just move that child into flat’s spot.

Thus: Move flat’s parent’s pointer to flat’s child.

eyes

dog

bag

flat

alf

cat

elf

42 of 57

Case 2: Deleting from a BST: Key with one Child

Example: delete(“flat”):

Goal:

  • Maintain BST property.
  • Flat’s child definitely larger than dog.
    • Safe to just move that child into flat’s spot.

Thus: Move flat’s parent’s pointer to flat’s child.

  • Flat will be garbage collected (along with its instance variables).

eyes

dog

bag

flat

alf

cat

elf

43 of 57

Hard Challenge

Delete k.

e

b

g

a

d

f

v

p

y

m

r

x

z

k

44 of 57

Hard Challenge

Delete k.

e

b

g

a

d

f

v

p

y

m

r

x

z

k

45 of 57

Case 3: Deleting from a BST: Deletion with two Children (Hibbard)

Example: delete(“dog”)

Goal:

  • Find a new root node.
  • Must be > than everything in left subtree.
  • Must be < than everything right subtree.

�Would bag work?

eyes

dog

bag

flat

alf

cat

elf

glut

46 of 57

Case 3: Deleting from a BST: Deletion with two Children (Hibbard)

Example: delete(“dog”)

Goal:

  • Find a new root node.
  • Must be > than everything in left subtree.
  • Must be < than everything right subtree.

�Choose either predecessor (“cat”) or successor (“elf”).

  • Delete “cat” or “elf”, and stick new copy in the root position:
    • This deletion guaranteed to be either case 1 or 2. Why?
  • This strategy is sometimes known as “Hibbard deletion”.

eyes

dog

bag

flat

alf

cat

elf

glut

47 of 57

Hard Challenge (Hopefully Now Easy)

Delete k.

e

b

g

a

d

f

v

p

y

m

r

x

z

k

48 of 57

Hard Challenge (Hopefully Now Easy)

Delete k. Two solutions: Either promote g or m to be in the root.

  • Below, solution for g is shown.

e

b

g

a

d

f

v

p

y

m

r

x

z

k

49 of 57

Hard Challenge (Hopefully Now Easy)

Two solutions: Either promote g or m to be in the root.

  • Below, solution for g is shown.

e

b

g

a

d

f

v

p

y

m

r

x

z

50 of 57

Sets vs. Maps, Summary

51 of 57

Sets vs. Maps

Can think of the BST below as representing a Set:

  • {mo, no, sumomo, uchi, momo}

sumomo

momo

mo

no

uchi

sumomo

mo

momo

no

uchi

52 of 57

Sets vs. Maps

Can think of the BST below as representing a Set:

  • {mo, no, sumomo, uchi, momo}

But what if we wanted to represent a mapping of word counts?

sumomo

momo

mo

no

uchi

sumomo

mo

momo

no

uchi

sumomo

1

mo

2

momo

2

no

1

uchi

1

????

53 of 57

Sets vs. Maps

To represent maps, just have each BST node store key/value pairs.

Note: No efficient way to look up by value.

  • Example: Cannot find all the keys with value = 1 without iterating over ALL nodes. This is fine.

sumomo 1

momo 2

mo 2

no 2

uchi 1

sumomo

1

mo

2

momo

2

no

1

uchi

1

54 of 57

Summary

Abstract data types (ADTs) are defined in terms of operations, not implementation.

Several useful ADTs: Disjoint Sets, Map, Set, List.

  • Java provides Map, Set, List interfaces, along with several implementations.

We’ve seen two ways to implement a Set (or Map): ArraySet and using a BST.

  • ArraySet: Θ(N) operations in the worst case.
  • BST: Θ(log N) operations in the worst case if tree is balanced.

BST Implementations:

  • Search and insert are straightforward (but insert is a little tricky).
  • Deletion is more challenging. Typical approach is “Hibbard deletion”.

55 of 57

BST Implementation Tips

56 of 57

Tips for BST Lab

  • Code from class was “naked recursion”. Your BSTMap will not be.
  • For each public method, e.g. put(K key, V value), create a private recursive method, e.g. put(K key, V value, Node n)
  • When inserting, always set left/right pointers, even if nothing is actually changing.
  • Avoid “arms length base cases”. Don’t check if left or right is null!

static BST insert(BST T, Key ik) {

if (T == null)

return new BST(ik);

if (ik ≺ T.label()))

T.left = insert(T.left, ik);

else if (ik ≻ T.label())

T.right = insert(T.right, ik);

return T;

}

Always set, even if nothing changes!

Avoid “arms length base cases”.

if (T.left == null)

T.left = new BST(ik);

else if (T.right == null)

T.right = new BST(ik);

57 of 57

Citations

Probably photoshopped binary tree: http://cs.au.dk/~danvy/binaries.html