The Math of FaceID
Lesson 2: Images as Points in Space
GOALS FOR LESSON 2
1. Describe how images can be represented as points in space.
2. Use decision rules (such as k-nearest neighbors) to classify an image as “me” or “not me.”
3. Explain how misclassifications can occur and their consequences.
Binary Classification Problems
Classification generally means putting things into groups. Motivation to study this field came from the human brain and how it puts objects into groups or categories.
“cat”
“dog”
“fish”
Given that we can recognize the animals above, what does our brain think THIS is?
Binary Classification means that we’re only considering two groups, a “this-or-that” question. In the case of FaceID, the phone is always seeking to answer the question, “Is this me or not me?”
“me”
“not me”
There are many technologies in our daily lives that rely on binary classification! A few examples:
Phone needs to decide… … is it me or not?
Email program needs to decide… … is it spam or not?
Turnitin.com needs to decide… … is it plagiarized or not?
Siri needs to decide… … am I being spoken to (“Hey Siri!”) or not?
Brainstorm: What examples of binary classifications do you see used by technologies in everyday life?
Review from Lesson 1
During the last class, we introduced the idea that computers process images as a form of data.
What were some big ideas for you?
(Possible responses to “what are lesson 1 big ideas” prompt on previous slide)
In lesson 1, we discussed three possible ways to compare images to one another.
Facial Features
The human brain pays attention to different facial features.
Facial Distances
One strategy explored in lesson 1 took distances between important facial features.
Pixel Brightness
Another strategy explored in lesson 1 measured differences between individual pixels.
Using two different strategies yesterday, we came up with “unlock rules” based on the similarity between two images.
Humans see images and ask: “Are these faces similar enough to be the same person?”
The phone processes a list of numbers and asks: “Are these lists of numbers similar enough to be the same person?”
We came back to an idea from AP Statistics, the mean-squared error, to quantify the amount of difference.
Reference Photo
Current Photo
Square each variable’s difference, add up the squares, then divide by the number of variables to calculate the mean-squared error (MSE).
The phone processes a list of numbers and asks: “Are these lists of numbers similar enough to be the same person?”
MSE = [ (10.5)² + (7.1)² + (0.1)² + (4.8)² + (12.2)² + (-6.2)² + (3.5)² + (6.1)² + (1.7)² + (1)² ] / 10 = 42.434
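For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation. The numbers are the example differences from the slide; this is an illustration, not Apple’s actual code.

```python
# Mean-squared error from the ten example differences on the slide.
differences = [10.5, 7.1, 0.1, 4.8, 12.2, -6.2, 3.5, 6.1, 1.7, 1.0]

# Square each difference, add the squares, then divide by how many there are.
mse = sum(d ** 2 for d in differences) / len(differences)
print(mse)  # ≈ 42.434
```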
The phone’s unlock rule determines if the MSE is “similar enough.”
For example, an unlock rule could be, “If the MSE is less than 100, then the current photo is similar enough to the reference photo of me that iPhone decides the current photo is probably me as well.”
Since our MSE of 42.434 is less than 100, this rule says the current photo is “me,” and the phone unlocks.
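A minimal sketch of this unlock rule in Python. The threshold of 100 is the example value from the slide, not a real FaceID setting.

```python
THRESHOLD = 100  # example value from the slide, not Apple's real setting

def should_unlock(mse: float) -> bool:
    """Unlock only if the current photo is 'similar enough' to the reference."""
    return mse < THRESHOLD

print(should_unlock(42.434))  # True -> the phone unlocks
```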
[ Placeholder Slide: teacher could input an example of student “summative slide” from the previous day showing me and not-me images, with MSE’s and a threshold value. ]
Multiple Reference Images
Having only a single stored image of your face in the phone’s memory could be a problem.
Talk with a partner: why could using only a single reference image be problematic?
“You” could appear in many different ways (different brightness levels, different facial distances).
By saving many reference images of yourself, you’re telling iPhone, “all of these lists of numbers represent me.”
By giving FaceID more data of ourselves, it can make better lock/unlock decisions!
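One way this could work (an assumed approach for illustration, not Apple’s actual algorithm): compare the new photo to every saved reference image and unlock if the best match is close enough.

```python
def mse(a, b):
    """Mean-squared error between two equal-length lists of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def should_unlock(new_photo, reference_photos, threshold=100):
    """Unlock if the new photo is 'similar enough' to ANY saved reference."""
    best = min(mse(new_photo, ref) for ref in reference_photos)
    return best < threshold
```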
Thinking about Space
So far, we’ve been thinking about our image data as lists of numbers that represent our facial features.
Another way to think about similarity of images would be to visualize them as points in space (e.g., points on a graph)
If we think about our data as points in space, it would seem reasonable that photos of “me” would be close to each other, while “not me” would be farther apart.
“me”
“not me”
These groupings are called clusters in machine learning. Photos of the same person should form a cluster together in space.
“me”
“not me”
Practical Note: The graph is here to help us visualize image data. Images are not 2-dimensional; they actually have thousands of dimensions, one for every pixel! That is very hard for human brains to visualize, so for now, think of the coordinate plane as a way to reason about image similarity in two dimensions.
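A minimal sketch of the “points in space” idea, using just two made-up variables (head width and head height) so each photo becomes a 2-D point. The numbers are invented for illustration, and the straight-line distance used here is closely related to the MSE idea from earlier.

```python
import math

photo_a = (14.2, 20.1)  # (head width, head height) of a "me" photo -- invented
photo_b = (14.0, 19.8)  # another "me" photo, nearby in space
photo_c = (11.5, 16.0)  # a "not me" photo, farther away

def distance(p, q):
    """Straight-line (Euclidean) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

print(distance(photo_a, photo_b))  # small  -> likely the same person
print(distance(photo_a, photo_c))  # larger -> likely a different person
```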
Furthermore, maybe it could be useful to figure out a decision rule so that, whenever I try to unlock my phone in the future, iPhone knows where the “me” cluster generally belongs.
“me”
“not me”
In this data, fitting a straight line could separate “me” from “not me” photos.
“me”
“not me”
Would straight lines separate “me” from “not me” in all cases?
“me”
“not me”
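A minimal sketch of what a straight-line decision rule could look like on those two made-up variables. The line’s coefficients are invented, and as the question above suggests, real data may not be separable by any straight line.

```python
def classify(point, a=1.0, b=1.0, c=-30.0):
    """Label a point by which side of the line a*x + b*y + c = 0 it falls on."""
    x, y = point
    return "me" if a * x + b * y + c > 0 else "not me"

print(classify((14.2, 20.1)))  # "me"      (14.2 + 20.1 - 30 > 0)
print(classify((11.5, 16.0)))  # "not me"  (11.5 + 16.0 - 30 < 0)
```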
Optional: k-nearest neighbors
Example of a decision rule:
Say that I took a new photo of myself, boxed in red. If we agree that photos of me cluster in the same general space, perhaps I should look at nearby photos to make the me/not me decision.
“me”
“not me”
What’s nearby?
Let’s say I use the closest photo to the new one to make a decision.
The photo nearest to the new one is also “me,” so the phone unlocks!
“me”
“not me”
But maybe there are good reasons to believe we shouldn’t be confident in a decision based on just the single nearest photo.
What could go wrong?
What would be a better decision rule?
What’s nearby?
What if we instead use the nearest three photos?
2 photos are me and 1 is John Legend. So, it’s more likely the photo is of me, and the phone unlocks!
“me”
“not me”
This decision rule is formally called k-nearest neighbors.
The idea is that you choose some number “k” of the nearest data points to examine, and let them vote on what the current data point is.
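Here is a minimal Python sketch of the k-nearest neighbors idea. The saved points and labels are invented two-variable examples; this illustrates the decision rule, not how FaceID is actually implemented.

```python
import math
from collections import Counter

def distance(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def knn_classify(new_point, labeled_points, k=3):
    """labeled_points is a list of (point, label) pairs."""
    # Sort the saved photos by how close they are to the new photo.
    nearest = sorted(labeled_points, key=lambda pl: distance(new_point, pl[0]))[:k]
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

saved = [((14.2, 20.1), "me"), ((14.0, 19.8), "me"), ((13.8, 20.4), "me"),
         ((11.5, 16.0), "not me"), ((11.9, 15.4), "not me")]
print(knn_classify((14.1, 20.0), saved, k=3))  # "me"
```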
Note that k-nearest neighbors does not always work!
“me”
“not me”
For instance, let’s say the green highlighted photo was new, and we set k = 3 (we look at the three nearest photos to make a decision).
“me”
“not me”
K-nearest neighbors would make the wrong decision, and Tom Brady would be able to unlock my phone!
“me”
“not me”
The quality of our model really depends on the data and on which variables we are using to make the classification decision!
[ Graph: “me” and “not me” photos plotted by head width vs. head height ]
Activity: Classifying Images in Space
In this activity you will:
(1) Record 10 examples of “me” faces and 10 examples of “not me” images with a partner. Think about the importance of statistical variability and representativeness as you record your samples.
(2) The data widget will save your photos as numeric representations in a spreadsheet called a CSV file.
(3) Upload the CSV to the explorer widget, and use it to visualize the data in graphs (a sketch of one possible way to do this outside the widget appears right after this activity list).
(4) [Optional] Upload the CSV to the KNN widget, and use it to create simple classification models.
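For teachers or students who want to explore the saved data outside the widgets, here is a rough Python sketch. The file name and column names (“label”, “head_width”, “head_height”) are assumptions; substitute whatever your data widget actually writes to the CSV.

```python
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("face_data.csv")        # hypothetical file name
me = data[data["label"] == "me"]           # hypothetical column names
not_me = data[data["label"] == "not me"]

plt.scatter(me["head_width"], me["head_height"], label="me")
plt.scatter(not_me["head_width"], not_me["head_height"], label="not me")
plt.xlabel("Head width")
plt.ylabel("Head height")
plt.legend()
plt.show()
```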
Reflection Questions:
(1) Why is it helpful to think about your data as “points in a space”?
(2) What variables, if any, gave you clear “me” and “not me” groups? Are you able to fit a line through your data that clearly separates the groups? Why or why not?
(3) What are the practical, real-world consequences of drawing a line (a boundary) that does not correctly classify 100% of the data?
(4) Can you think of a better way to classify the data into “me” and “not me” groups other than drawing a line?
Whole-Group Discussion
Which combinations of variables separated classes well? Which did not?
Why, mathematically, could the same pair of variables work well for one set of partners but not another?
Not having a model that cleanly separates the classes has real-world consequences.
Suppose the example below was my model.
“me”
“not me”
In this case, Tom Brady would unlock my phone! This is a false positive.
The phone thought it was me, when it really wasn’t.
Not having a model that cleanly separates the classes has real-world consequences.
Now, let’s suppose that this was my model.
“me”
“not me”
In this case, I’d be locked out of my phone! This is a false negative.
The phone thought it wasn’t me, but it really was.
One way to visualize decision rule outcomes is in a matrix.

                            Phone’s Decision: Unlock    Phone’s Decision: Lock
Image Contains… Me          Correct Decision!           False Negative
Image Contains… Not Me      False Positive              Correct Decision!
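A minimal sketch of how the four outcomes in the matrix could be tallied for a set of test photos; the example results are invented.

```python
from collections import Counter

def outcome(actual, decision):
    """actual: who the image really contains; decision: what the phone did."""
    if actual == "me":
        return "correct" if decision == "unlock" else "false negative"
    return "correct" if decision == "lock" else "false positive"

# Invented (actual, decision) pairs from testing a model.
results = [("me", "unlock"), ("me", "lock"), ("not me", "lock"), ("not me", "unlock")]
print(Counter(outcome(a, d) for a, d in results))
# Counter({'correct': 2, 'false negative': 1, 'false positive': 1})
```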
What is the more dangerous real-world faulty decision? Why?
In real life, false negatives are not so harmful. It just means that your phone doesn’t recognize you, and there could be a number of reasons why.
If the phone doesn’t recognize you, it just takes you to the keypad for you to enter a numerical passcode.
But false positives mean that strangers can unlock your phone.
What can designers of FaceID do to avoid false positives?
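One lever designers could pull (an assumption for illustration, not a description of Apple’s actual design) is to make the unlock rule stricter, for example by lowering the MSE threshold. A minimal sketch of the trade-off: fewer false positives, but more false negatives.

```python
def should_unlock(mse, threshold):
    return mse < threshold

owner_mses    = [42.4, 65.0, 110.0]    # invented MSEs for photos of the owner
stranger_mses = [95.0, 240.0, 310.0]   # invented MSEs for photos of strangers

for threshold in (100, 60):
    false_positives = sum(should_unlock(m, threshold) for m in stranger_mses)
    false_negatives = sum(not should_unlock(m, threshold) for m in owner_mses)
    print(threshold, false_positives, false_negatives)
# 100 1 1   (looser rule: one stranger gets in)
# 60 0 2    (stricter rule: no strangers get in, but the owner is rejected more often)
```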
Wrapping Up
Today we learned…
1. We represented image data as points in space and saw that photos of the same person form clusters.
2. We used decision rules, from separating lines to k-nearest neighbors, to classify images as “me” or “not me.”
3. We discussed how misclassifications can occur and their consequences.