Hashmaps
Lab 7
CS61B Spring 2025
Announcements
Project 2A due 3/12 at 11:59 pm
Project 2B released! Checkpoint + Design Document due 3/21 at 11:59
CS61B Spring 2025
Throwback: Arrays and Maps
CS61B Spring 2025
Arrays
Question: How fast does it take for an array to lookup an item, given an index?
CS61B Spring 2025
Arrays
Question: How fast does it take for an array to lookup an item, given an index?
CS61B Spring 2025
Arrays
Question: How fast does it take for an array to lookup an item, given an index?
It’s pretty fast! We should try to take advantage of this characteristic to create another data structure.
CS61B Spring 2025
Maps
Let’s talk Maps for a minute. Remember, Maps in Java represent a mapping between a key and value pairing (<key, value>)
We’re going to try combining mapping with the constant lookup time of arrays!
CS61B Spring 2025
Maps
Let’s talk Maps for a minute. Remember, Maps in Java represent a mapping between a key and value pairing (<key, value>)
We’re going to try combining mapping with the constant lookup time of arrays!
So, how do we insert these pairings into an array?
CS61B Spring 2025
Hash Functions
CS61B Spring 2025
Hash Functions
What are hash functions?
CS61B Spring 2025
Hash Functions
What are hash functions?
Hash codes are of type int, which means they can take on any value of -2,147,483,648 to 2,147,483,647.
Do we just make… 4,294,967,296 slots in our underlying array for each <key, value> pairing?
CS61B Spring 2025
Modulo
So here’s our solution: %!
CS61B Spring 2025
Possible Problem?
However, if we know that our hash code is technically within a certain range, what happens if two keys end up with the same hash code?
This brings us into our next topic: hash collisions!
CS61B Spring 2025
Hash Collisions
CS61B Spring 2025
Hash Collisions
If two keys end up with the same hash code, this is called a hash collision (and they will end up mapping to the same bucket). To deal with them, at least for this lab, we’ll use external chaining.
CS61B Spring 2025
Hash Collisions
If two keys end up with the same hash code, this is called a hash collision (and they will end up mapping to the same bucket). To deal with them, at least for this lab, we’ll use external chaining.
External Chaining: Store all keys with the same hash code in a collection of their own, such as a linked list (i.e. each bucket will have a collection, and if a key maps to that bucket, we store it in the corresponding collection).
CS61B Spring 2025
Duplicates
What about duplicate keys? Assuming we’re using the same hash function, the key will be mapped to the same bucket (deterministic), so how do we check if a key is already in the hashmap?
CS61B Spring 2025
Duplicates
What about duplicate keys? Assuming we’re using the same hash function, the key will be mapped to the same bucket (deterministic), so how do we check if a key is already in the hashmap?
CS61B Spring 2025
Collisions
What happens if we have too many collisions?
CS61B Spring 2025
Collisions
What happens if we have too many collisions?
The less collisions (aka less elements in a single bucket), the better runtime for all HashMap operations!
CS61B Spring 2025
Back to Hash Functions
So if minimizing collisions is our goal, how does that relate to the hash function that we use, and how do we pick a good hash function?
CS61B Spring 2025
Back to Hash Functions
So if minimizing collisions is our goal, how does that relate to the hash function that we use, and how do we pick a good hash function?
CS61B Spring 2025
Resizing
CS61B Spring 2025
Resizing
What happens if we have too many elements in our HashMap, i.e. too many elements in all buckets, total?
CS61B Spring 2025
Resizing
At some point, when there are too many items in the hashmap, we want to resize our hashmap. Specifically, when our load factor surpasses a certain limit.
CS61B Spring 2025
Resizing
At some point, when there are too many items in the hashmap, we want to resize our hashmap. Specifically, when our load factor surpasses a certain limit.
Number of Items
Number of Buckets
Load Factor =
CS61B Spring 2025
Rehashing
We resize our HashMap by increasing the number of buckets and rehashing all the elements?
Thus, with a good resizing mechanism and a good hash function, all Hashmap operations run in amortized O(1) runtime!
CS61B Spring 2025
HashMaps
CS61B Spring 2025
HashMaps
Bringing this all together, we have our data structure: hashmaps!
CS61B Spring 2025
HashMaps
Bringing this all together, we have our data structure: hashmaps!
CS61B Spring 2025
Let’s walk through an example! For now, our hashmap starts out with 2 buckets and the load factor limit is set to 0.75. Our hash function will return a randomized integer. Note: We are only concerned with mapping here, so the value is omitted.
CS61B Spring 2025
hashFunction() → returns a random integer
Object 1
Let’s pass our first object into the hash function, Object1, and it returns an integer of 27. Which bucket does it go into?
CS61B Spring 2025
hashFunction() → returns a random integer
Object 2
Object 1
Using the modulus operator, it goes into the bucket corresponding to index of 1. How about for Object2, assuming it’s hash code is 22444?
CS61B Spring 2025
hashFunction() → returns a random integer
Object 1
Object 2
Reminder: our limit for this hash map is a load factor of 0.75 (we check for when it exceeds the load factor)
It’ll go into the bucket corresponding to index of 0! What do we have to do now?
CS61B Spring 2025
hashFunction() → returns a random integer
Object 1
Object 2
Our current load factor is 1. Our limit is 0.75.
We need to increase the number of buckets in our hashmap! This means that we have to rehash all the objects that are currently in the hashmap.
CS61B Spring 2025
Hash Codes:
Object 1
Object 2
Our current load factor is 1. Our limit is 0.75.
Assuming that our resize factor is 2, what will our hashmap look like? The hash codes of the current objects are provided above.
CS61B Spring 2025
New Hash Code:
Object 1
Object 2
It’ll look something like this! We rehashed all our objects based on their original hash codes and the new number of buckets in our hashmap.
CS61B Spring 2025
Additional Notes
Keep in mind that the key of our <key, value> pairing is what is passed into the hash function.
In addition, make sure that you maintain the association of your key with your value (i.e. that the entirety of the <key, value> pairing is placed into the correct bucket).
If a duplicate key is placed into the hashmap, then the old value is replaced with the new value.
CS61B Spring 2025
Lab Overview
CS61B Spring 2025
An Overview
Lab 07 is due Friday, 3/14 at 11:59 pm.
Deliverables:
Some tips:
CS61B Spring 2025
Some Tips
Map out what you want to do for each method first before you implement them. Potentially, you might find that you can create some helper methods to reduce redundant code and the number of lines of code written.
CS61B Spring 2025
Some Tips
Map out what you want to do for each method first before you implement them. Potentially, you might find that you can create some helper methods to reduce redundant code and the number of lines of code written.
Hashmaps: Roughly, you can break the steps of inserting into a hashmap with the following:
CS61B Spring 2025
Attendance
CS61B Spring 2025