1 of 32

To do: November 16 - Function Game

I have a function rule. Someone will give me an input, and I will give them a corresponding output.

The goal is to figure out what my function rule is.

If you know what the function is, raise your hand but DO NOT SAY IT OUT LOUD!

Instead, I will give you an input and if you give me the correct output, then this will confirm you know the function rule.

2 of 32

What is a function?

It is a relation/mapping. Each element (input) in the domain (pre-image) is mapped to an element (output) in the range (image).

Every input should be mapped to exactly one output. However, two different inputs can be mapped to the same output.

3 of 32

Go over homework

  • Repl.it 06-01 Intro to Sets and 06-02 New Words
  • goFormative - Sets

Compare implementations of New Words (goFormative)

4 of 32

HashSet – a set ordered by each

item’s hashCode that is extremely time efficient. O(1) for search/insert/delete operations

TreeSet – a naturally ordered set that is very efficient, but not as efficient as HashSet.

O(logn) for search/insert/delete operations

Java Set

5 of 32

  • HashSet and HashMap are both created around hash tables.
  • A hash table essentially an array. Each element is inserted into the array according to a hash function.

0 1 2 3 4

What is a hash table?

6 of 32

Hash Functions

Modified from slides by William Fiset

https://github.com/williamfiset

7 of 32

A hash function H(x) is a function that maps a key ‘x’ to a whole number in a fixed range.

Which operation would be useful for this purpose?

Hash Functions

8 of 32

A hash function H(x) is a function that maps a key ‘x’ to a whole number in a fixed range.

For example, H(x) = (x² - 6x + 9) mod 10 maps all integer keys to the range [0,9]

H(4) = (16 - 24 + 9) mod 10 = 1

H(-7) = (49 + 42 + 9) mod 10 = 0

H(0) = (0 - 0 + 9) mod 10 = 9

H(2) = (4 - 12 + 9) mod 10 = 1

H(8) = (64 - 48 + 9) mod 10 = 5

Hash Functions

9 of 32

We can also define hash functions for arbitrary objects such as strings, lists, multi data objects, etc…

10 of 32

We can also define hash functions for arbitrary objects such as strings, lists, multi data objects, etc…

For a string s let hash(s) be a hash function defined below

public int hash(String s)

{

int sum = 0;

for (int i=0; i<s.length(); i++)

{

sum += s.charAt(i);

}

return sum % 50;

}

11 of 32

public int hash(String s)

{

int sum = 0;

for (int i=0; i<s.length(); i++)

{

sum += s.charAt(i);

}

return sum % 50;

}

hash(“BB”) =

hash(“”) =

hash(“ABC”) =

hash(“Z”) =

12 of 32

public int hash(String s)

{

int sum = 0;

for (int i=0; i<s.length(); i++)

{

sum += s.charAt(i);

}

return sum % 50;

}

hash(“BB”) = (66 + 66) % 50 = 32

hash(“”) = (0) % 50 = 0

hash(“ABC”) = (65 + 66 + 67) % 50 = 48

hash(“Z”) = (90) % 50 = 40

13 of 32

Challenge: Suppose we have a database of people objects with three fields: name, age and sex. Can you define a hash function H(person) that maps a person to the set {0,1,2,3,4,5}?

Name

Age

Sex

Hash

William

21

M

?

Kate

19

F

?

Bob

33

M

?

Rose

26

F

?

14 of 32

Name

Age

Sex

Hash

William

21

M

?

Kate

19

F

?

Bob

33

M

?

Rose

26

F

?

There are an infinite number of possible valid hash functions H(Person p), here is one:

public int H(Person p){

int hash = p.age;

hash += p.name.length()

if (p.sex == 'M’)

hash ++;

return hash % 6;

}

15 of 32

Name

Age

Sex

Hash

William

21

M

5

Kate

19

F

5

Bob

33

M

1

Rose

26

F

0

There are an infinite number of possible valid hash functions H(person), here is one:

public int H(Person p){

int hash = p.age;

hash += p.name.length()

if (p.sex == 'M’)

hash ++;

return hash % 6;

}

16 of 32

Properties of Hash functions

If H(x) = H(y) then what do we know?

Name

Age

Sex

Hash

William

21

M

5

Kate

19

F

5

Bob

33

M

1

Rose

26

F

0

17 of 32

Properties of Hash functions

If H(x) = H(y) then what do we know?

If H(x) = H(y) then objects x and y might be equal

Name

Age

Sex

Hash

William

21

M

5

Kate

19

F

5

Bob

33

M

1

Rose

26

F

0

18 of 32

Properties of Hash functions

If H(x) ≠ H(y) then what do we know?

Name

Age

Sex

Hash

William

21

M

5

Kate

19

F

5

Bob

33

M

1

Rose

26

F

0

19 of 32

Properties of Hash functions

If H(x) ≠ H(y) then what do we know?

x and y are certainly not equal.

Name

Age

Sex

Hash

William

21

M

5

Kate

19

F

5

Bob

33

M

1

Rose

26

F

0

20 of 32

Properties of Hash functions

Q: How can we use this to our advantage to speedup object comparisons?

If H(x) = H(y) then objects x and y might be equal, but if H(x) ≠ H(y) then x and y are certainly not equal.

21 of 32

Properties of Hash functions

Q: How can we use this to our advantage to speedup object comparisons?

A: This means that instead of comparing x and y directly, a

smarter approach is to first compare their hash values,

and only if the hash values match do we need to explicitly compare x and y.

If H(x) = H(y) then objects x and y might be equal, but if H(x) ≠ H(y) then x and y are certainly not equal.

22 of 32

Properties of Hash functions

A hash function H(x) must be deterministic.

This means that if H(x) = y then H(x) must always produce y and never another value. This may seen obvious, but it is critical to the functionality of a hash function.

23 of 32

Properties of Hash functions

Example of non-deterministic hash function:

private static int counter = 0;

public int H(int x){

counter ++;

return (x + counter) % 13;

}

The first time called H(5) = 6, but if called again H(5) = 7.

A hash function H(x) must be deterministic.

This means that if H(x) = y then H(x) must always produce y and never another value. This may seen obvious, but it is critical to the functionality of a hash function.

24 of 32

Properties of Hash functions

We also try very hard to make uniform hash functions to MINIMIZE the number of hash collisions.

A hash collision is when two objects x, y hash to the same value (i.e. H(x) = H(y)).

25 of 32

Properties of Hash functions

We try very hard to make uniform hash functions to minimize the number of hash collisions.

A hash collision is when two objects x, y hash to the same value (i.e. H(x) = H(y)).

Name

Age

Sex

Hash

William

21

M

5

Kate

19

F

5

Bob

33

M

1

Rose

26

F

0

In the table we generated earlier William and Kate have a hash collision.

26 of 32

How does a hash table work?

Ideally we would like to have a very fast insertion, lookup and removal time for the data we are placing within our hash table.

Remarkably, we can achieve all this in O(1)* time using a hash function as a way to index into a hash table.

* The constant time behaviour attributed to hash tables is only true if you have a good uniform hash function!

27 of 32

Where have we seen hashing before?

28 of 32

Where have we seen hashing before?

Object toString() method -> hashCode()

https://docs.oracle.com/javase/8/docs/api/index.html

29 of 32

Object .equals method

If two objects are equal according to the equals(Object) method, calling the hashCode() method on each of the two objects must produce the same value.

If two objects are unequal according to the equals(java.lang.Object) method, calling the hashCode method on each of the two objects doesn't need to produce distinct integer results. However, developers should be aware that producing distinct integer results for unequal objects improves the performance of hash tables.

https://www.baeldung.com/java-hashcode

30 of 32

Repl.it 06-03.hash functions

31 of 32

The Collection interface is the parent of the List interface and Set interface. The Collection interface has many methods listed including add(), clear(), remove(), and size().

Collection

List

Set

Collection Interface

32 of 32

Homework

1 Hackerrank problem - complete online. Then copy implementation in IN and submit on Hub

  • Java Comparator: https://www.hackerrank.com/challenges/java-comparator/problem
    • Checker class must NOT be declared as public