ML Problems: Formulation and Adoption
$whoami
ML is fascinating!
“A 3D render of an astronaut walking in a green desert” (Stable Diffusion 2)
What are we up to today?
Disclaimer: This talk focuses on initiating an ML project, NOT on what comes after initiating one.
Problem Formulation in ML
Prompt: “Formulating problem statements in ML” (Stable Diffusion 2)
What is Supervised Machine Learning?
In pairs, take 90 seconds to summarise what you know. Go!
Terminology
Label is the true thing we are predicting: “y”
Features are input variables describing our data: “x1”
Example is a particular instance of data, x
Labeled example {features, label}: (x, y)
Unlabeled example {features, ?}: (x, ?)
Model maps examples to predicted labels: y'
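The terms above can be made concrete with a minimal Python sketch. The feature names, the values, and the hand-written rule inside `model` are hypothetical and not from the talk; a real model would be learned from many labeled examples rather than written by hand.

```python
from typing import Dict

# Features: input variables describing one piece of data.
# These names and values are made up for illustration.
Features = Dict[str, float]

# A labeled example pairs features with the true label: (x, y).
labeled_example = ({"words_in_caps": 12.0, "num_links": 7.0}, "spam")

# An unlabeled example has features but no label yet: (x, ?).
unlabeled_example = ({"words_in_caps": 0.0, "num_links": 1.0}, None)


def model(x: Features) -> str:
    """Map an example's features to a predicted label.

    A hand-written threshold stands in for a trained model here.
    """
    return "spam" if x["num_links"] > 3 else "not spam"


x, y = labeled_example
# With a labeled example, the prediction can be checked against the true label.
print(model(x) == y)
# With an unlabeled example, the prediction is all we have.
print(model(unlabeled_example[0]))
```

Supervised learning replaces the hand-written rule with one fitted to labeled examples, then applies it to unlabeled ones.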
Extensions
Recommended reading: https://developers.google.com/machine-learning/guides/rules-of-ml/
ML in the wild
Deep learning algorithm does as well as dermatologists in identifying skin cancer
Label: __________
Feature: __________
Example: __________
Labeled Examples: ___________
Unlabeled Examples: ___________
Output: ___________
A Framework
What will traffic be like tomorrow?
Weather forecast could be informative.
Collect historical traffic and weather data.
Test a model with the data.
Is this model better than existing systems?
I should (not) use this model, because of X, Y, and Z.
Time of year could be a helpful signal.
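The loop above can be walked through in code. The sketch below is only an illustration under stated assumptions: the numbers are synthetic (not real traffic data), `rain_mm`-style rainfall is a hypothetical stand-in for the weather signal, the model is plain least squares on one feature, and the "existing system" is assumed to be a baseline that always predicts the historical mean.

```python
import random

random.seed(0)

# Collect data: synthetic (rainfall, traffic index) pairs stand in for
# historical weather and traffic records. The relationship is made up.
rain = [random.uniform(0, 20) for _ in range(200)]
traffic = [50 + 2.0 * r + random.gauss(0, 5) for r in rain]

# Hold out the last 50 days for testing.
train_x, test_x = rain[:150], rain[150:]
train_y, test_y = traffic[:150], traffic[150:]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    b = cov / var
    return mean_y - b * mean_x, b


def mean_abs_error(preds, ys):
    """Average absolute difference between predictions and true values."""
    return sum(abs(p - y) for p, y in zip(preds, ys)) / len(ys)


# Test a model with the data.
a, b = fit_line(train_x, train_y)
model_mae = mean_abs_error([a + b * x for x in test_x], test_y)

# Is this model better than the existing system (assumed: predict the mean)?
baseline = sum(train_y) / len(train_y)
baseline_mae = mean_abs_error([baseline] * len(test_y), test_y)

print(f"model MAE: {model_mae:.2f}, baseline MAE: {baseline_mae:.2f}")
print("use the model" if model_mae < baseline_mae else "keep the baseline")
```

Refining the hypothesis, for example by adding time of year, would mean adding another feature and repeating the same comparison against the baseline.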
ML Adoption
Prompt: “ML adoption in 2022 with Google technologies” (Stable Diffusion 2)
Stage I: PoC
Considerations
Data?
Start with existing ML models / APIs if possible
Focus on one ML problem at a time
Choose the right tooling for training
Why?
Training considerations
Interacting with the model for usage
Model consumption
Stage II: MVP and Beyond
Scaling up
TFX comes to the rescue
Why TFX though?
Need more reasons?
Scaling up
Explore Vertex AI in parallel
Support for
Read more here: https://cloud.google.com/vertex-ai
Recommended readings
Wrapping up