1 of 17

Binary Search Trees

Previously…

2 of 17

Binary Search Tree

The binary search tree (BST) is a hierarchical node-based data structure that aims to combine:

  • The efficient Binary Search algorithm (a strength of arrays)
  • The efficient operations for adding or removing elements (a strength of nodes).

Efficient add() and remove() from LinkedLists!

+

Previously…

3 of 17

Binary Search Tree Invariants

Each node in a binary search tree has a left and right child node, where:

  • All the elements to the left are less than the current element
  • All the elements to the right are greater than the current element

Previously…

4 of 17

Set & Map Abstract Data Types

Binary search trees are commonly used to implement sets or maps.

Set ADT

A collection of unique elements or “keys”. Unlike lists and deques, sets do not maintain indices for each element, which enables more efficient data structure implementations.

Map ADT

A collection that associates each unique key with a value. Maps are like sets except each key can have a (not necessarily unique) value.

Previously…

5 of 17

Adding to a BST

In a binary search tree, the add method inserts new values as leaf nodes. Consider different orders for adding the values 1, 2, 3, 4, 5, 6, 7 into a BST.

An insertion order that results in the tallest possible tree is [1, 2, 3, 4, 5, 6, 7]. This results in the worst-case runtime, Θ(N), for add/remove and search in the BST.

Give an insertion order that results in the shortest possible tree

and the big-theta bound Θ for this best case scenario of

add/remove and search in a BST.

1

6 of 17

Pre-Class Quiz: Question 1

7 of 17

Tries

8 of 17

Tries

  • Each node represents a single character within a string
  • The number of children per trie node can vary and depends on the size of the alphabet
  • To indicate that a node represents a complete word:
    • Trie Map: the value associated with the node is null
    • Trie Set: the isTerm value associated with the node is true

9 of 17

Runtime

  • Add/remove/search for a single string: the length of the string
    • Rather than the total number of strings!
  • We define the length of the string by the variable, L

Best Case and Worst Case Asymptotic Runtime: Θ(L)

10 of 17

Tries

Draw the trie containing all the words in: “she sells sea shells by the sea shore”. Omit null references to keep your drawing simple.

Then, identify the subtree containing all words starting with “se”.

2

11 of 17

Large Memory Usage

  • Tries need to consume a lot of memory space in order to store such a large number of R-length arrays.
  • The unicode alphabet that represents all possible char values has R = 65536. In other words, each node in this kind of trie would need to maintain its own 65536-length array.
  • How might we reduce the memory space usage?

12 of 17

Ternary Search Trees (TSTs)

13 of 17

Ternary Search Tree (TST)

  • Just like in a trie, each node represents a single character within a string
  • Whereas a trie can have up to 65536 non-null children, TST nodes can only have up to 3 non-null children.
    • This makes TSTs more memory efficient!!
  • Also uses isTerm to mark end of the word

14 of 17

TST Invariants

Left child

  • All strings not using the current character, and before the current string in the alphabet.

Middle child

  • All strings using the current character.

Right child

  • All strings not using the current character, and after the current string in the alphabet.

15 of 17

Ternary search trees

What node represents the string CAC? Where would we insert the string CCC?

3

A

C

C

G

G

C

G

C

C

A

G

C

C

1

2

3

4

5

6

*

*

*

*

*

*

*

16 of 17

Pre-Class Quiz: Question 1

17 of 17

Collecting strings

Consider the collect method, which can be used to collect all the strings in a TST.

List<String> list = new LinkedList<>();�collect(overallRoot, "", list);

Why add node.data in the mid case? Why are the recursive calls in this order?

4

void collect(Node node, String prefix, List<String> list) {

if (node == null) {

return;

}

collect(node.left, prefix, list);

if (node.isTerm) {

list.add(prefix + node.data);

}

collect(node.mid, prefix + node.data, list);

collect(node.right, prefix, list);

}