Lecture 21:
Hash Tables, Part 1
CS 136: Spring 2024
Katie Keith
Record on Zoom
📣 Announcements
📚Readings
Multiple data structures can implement the same ADT
Binary Search Tree
Data Structures
Linked List
Hash Tables
List
Finite number of elements
(same element may occur more than once)
Two arrays (sorted)
Stack
Last in first out (LIFO) operations
Symbol Table
Associates a key with a value
Graph
A set of vertices, pairs of which are connected by edges
Queue
First in first out (FIFO) operations
String[] keys;
int[] values;
Mon
Mon
Weds
Lab 6: Implement!
Today!
ADT
Review: Python Dictionaries are Symbol Tables
gold_medals = {
"USA": 40,
"China": 40,
"Japan": 20,
"Saint Lucia": 1,
}
gold_medals["New Zealand"] = 10
print(gold_medals["USA"])
put operation
get operation (search!)
Key of a dictionary
Value of a dictionary
Preview: Python dictionaries are implemented with Hash Tables!
🎯 Today’s Learning Objectives
Binary Search Tree: Inner Node Class
public class BST<Key, Value> implements SymbolTable<Key, Value> {
private Node root; // root node of BST
private int numPairs; // number of key-value pairs in the symbol table
private class Node {
private Key key; // BSTs maintain left/right subtree ordering by key
private Value val; // value for the node
private Node left; // node at the beginning of the left subtree
private Node right; // node at the beginning of the right subtree
...
The beginning of a Java implementation of a Binary Search Tree
BST put example
'E'
root node
the key of this node is ‘E’
105
'S'
30
'X'
106
'A'
35
'C'
12
'H'
24
'R'
45
the value of this node is 105
Step 1. Search for ‘L’. This ends in a null link.
Step 2. Create new node and update the link of the parent node.
'L’
76
BST<Character, Integer> bst = new BST<Character, Integer>();
bst.put(‘S’, 30);
bst.put(‘E’, 105);
…
bst.put(‘L’, 76);
The method below finds the smallest common ancestor node of two nodes’ keys. What is printed?
public static int sca(Node current, int pKey, int qKey){
while (current != null) {
if (pKey > current.key && qKey > current.key) {
current = current.right;
}
else if (pKey < current.key && qKey < current.key) {
current = current.left;
}
else {
return current.key;
}
}
return -1;
}
public static void main(String[] args){
BinarySearchTree<Integer, Integer> ss;
ss = new BinarySearchTree<Integer, Integer>();
ss.put(6, 1);
ss.put(2, 3);
ss.put(8, 9);
ss.put(0, 5);
ss.put(4, 2);
ss.put(7, 1);
ss.put(9, 9);
ss.put(3, 4);
ss.put(5, 0);
int low = sca(ss.root, 3, 0);
System.out.println(low);
}
put(key, value)
Keys are integers so we have dropped the comparator from put
💡Think-pair-share
Analysis of Symbol Table’s Data Structures
Data Structure | put | get |
(Unordered) Linked List | O(n) | O(n) |
Two arrays (keys sorted, binary search) | O(n) | O(log n) |
Binary Search Tree | Ave = O(log n) Worst = O(n) | Ave = O(log n) Worst = O(n) |
Hash Table (with separate chaining) | | |
Hash table (with linear probing) | | |
✅
🎯 Today’s Learning Objectives
Precursor idea: Arrays
Arrays are fast!
Performance guarantees:
“pqr”
null
“xyz”
“ijk”
Why?
In Java, all elements in an array are equally sized and stored in a contiguous block of memory, so obtaining the memory addresses is a simple arithmetic operation, O(1)
Precursor idea: Arrays
If keys were small integers, we could use an array to implement an unordered symbol table.
Key | Value |
‘a’ | “xyz” |
‘b’ | “pqr” |
‘c’ | “ijk” |
Key | Hash Code |
‘a’ | 2 |
‘b’ | 0 |
‘c’ | 3 |
Array index (where the key’s value will be stored)
“pqr”
null
“xyz”
“ijk”
hash code = index = 0
1
2
3
Need: A hash function to map keys to hash codes.
Hash function properties
We want to create hash functions that obey the following properties:
Examine the method below. Is this a “good” hash function? Which properties of hash functions does it align (not align) with?
💡Think-pair-share
Modular hashing with primes
Example:
Key | Hash code = Key % 4 | Hash code = Key % 7 |
10 | 2 | 3 |
20 | 0 | 6 |
30 | 2 | 2 |
40 | 0 | 5 |
50 | 2 | 1 |
60 | 0 | 4 |
Using M=4 (not prime), hash codes are all 0 or 2 (not evenly distributed).
Using M=7 (prime) hash codes are evenly distributed.
Hash function:
Key % M where M is prime
Why? Prime numbers have no divisors other than 1 and themselves, reducing the chance that patterns in the keys align with %M.
Scenario:
Keys: n=0 to n=499 (ints)
M = 97
Hash function: n % M for each n
Questions:
Figure credit: Adobe stock
💡Think-pair-share
Preview: Monday
Two approaches to collision resolution
We’ll talk about the time-space tradeoff!
✅
✅
🎯 Today’s Learning Objectives
Java’s hashCode()
Goal: Reproduce Java’s hash code for Strings
Key | Hash code |
“Blah” | 2073105 |
“great” | ??? |
“dogs!” | ??? |
Integers
String s = "Blah";
System.out.println(s.hashCode());
Prints: 2073105
How? Why?
ASCII code
https://www.cs.cmu.edu/~pattis/15-1XX/common/handouts/ascii.html
Aside: Unicode
Goal: Reproduce Java’s hash code for Strings
// Method that replicates how Java makes a hashCode for a String
public static int ourStringHash(String s){
int R = 31; // Java chooses R as this small prime
int hash = 0;
for (int i = 0; i < s.length(); i++)
hash = R * hash + s.charAt(i);
return hash;
}
“Polynomial rolling hash function”: ensures that all characters in the string influence the final hash value, reducing the likelihood of collisions
Adds ASCII value of current character
HashCodePlay.java
💻
Java: Modular hashing function
public static int hash(Key key) {
return (key.hashCode() & 0x7fffffff) % M;
}
Java’s required hash code method for every data type
Hexadecimal number which is the maximum positive value for a 32-bit signed binary number in Java
Bitwise AND operation
Strips off the sign bit of the hash code, ensuring that result is a non-negative integer.
Mod operator
Size of array (typically we choose a very large prime number)
Custom classes: Must write .hashCode and .equals
class CustomClass{
...
public boolean equals(Object obj){
...
}
public int hashCode(){
...
}
}
Make sure these are consistent!
✅
✅
✅
🎯 Today’s Learning Objectives