1 of 41

ML Problems: Formulation and Adoption

Sayak Paul

ML at 🤗

@RisingSayak

2 of 41

$whoami

  • ML at Hugging Face 🤗
  • Past: Carted, PyImageSearch, DataCamp, TCS Research
  • Open-source 🥑 (Keras, KerasCV, 🤗 Transformers, etc.)
  • Netflix nerd
  • Coordinates at sayak.dev

3 of 41

ML is fascinating!

“A 3D render of an astronaut walking in a green desert” (Stable Diffusion 2)

https://huggingface.co/spaces/stabilityai/stable-diffusion

4 of 41

What are we up to today?

  • Problem formulation in ML
    • Is my problem suitable to be solved with ML?
    • Yes:
      • How do we know?
      • Defining the fundamentals of an ML system
      • What metrics should I optimize?
  • ML adoption
    • Tooling for ML adoption at various stages
      • PoC
      • MVP and beyond

5 of 41

Disclaimer: The talk is focused on initiating an ML project and NOT on what comes after initiating one.

6 of 41

Problem Formulation in ML

Prompt: “Formulating problem statements in ML” (Stable Diffusion 2)

7 of 41

What is Supervised Machine Learning?

In 90 seconds,

summarise what you know in pairs. Go!

8 of 41

Terminology

Label is the true thing we are predicting: “y”

  • The y variable in basic linear regression

9 of 41

Terminology

Label is the true thing we are predicting: “y”

Features are input variables describing our data: “x1”

  • The x1, x2, …, xn variables in basic linear regression

10 of 41

Terminology

Label is the true thing we are predicting: “y”

Features are input variables describing our data: “x1”

Example is a particular instance of data, x

11 of 41

Terminology

Label is the true thing we are predicting: “y”

Features are input variables describing our data: “x1”

Example is a particular instance of data, x

Labeled example {features, label}: (x, y)

  • Used to train the model

12 of 41

Terminology

Label is the true thing we are predicting: “y”

Features are input variables describing our data: “x1”

Example is a particular instance of data, x

Labeled example {features, label}: (x, y)

Unlabeled example {features, ?}: (x, ?)

  • Used for making predictions on new data

13 of 41

Terminology

Label is the true thing we are predicting: “y”

Features are input variables describing our data: “x1”

Example is a particular instance of data, x

Labeled example {features, label}: (x, y)

Unlabeled example {features, ?}: (x, ?)

Model maps examples to predicted labels: y′

  • Defined by internal parameters, which are learned
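To make the terminology concrete, here is a minimal sketch (with invented numbers) that maps each term onto a tiny linear regression in NumPy:

```python
import numpy as np

# Labeled examples {features, label}: (x, y). Each row of X holds the
# features (x1, x2) of one example; y holds the corresponding labels.
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5], [4.0, 3.0]])
y = np.array([4.0, 4.5, 7.5, 11.0])  # constructed as y = 2*x1 + x2

# Model: maps examples to predicted labels, defined by internal
# parameters (weights + bias) learned here via least squares.
X_b = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
params, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Unlabeled example {features, ?}: we know x, the model supplies y'.
x_new = np.array([5.0, 2.0, 1.0])  # features plus the bias term
y_pred = x_new @ params
print(round(float(y_pred), 2))  # 12.0, since 2*5 + 2 = 12
```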

14 of 41

Extensions

  • Do we have a precise understanding of the inputs and outputs of the ML system?
  • Do we know the decisions the ML system will help drive?
  • How do we measure success?
  • Can the problem be solved using heuristics?

15 of 41

ML in the wild

Deep learning algorithm does as well as dermatologists in identifying skin cancer

Label: __________

Feature: __________

Example: __________

Labeled Examples: ___________

Unlabeled Examples: ___________

Output: ___________

16 of 41

A Framework

  1. Frame the problem:

What will traffic be like tomorrow?

  2. Make a hypothesis:

Weather forecast could be informative.

  3. Collect the data:

Collect historical traffic and weather data.

  4. Test the hypothesis:

Test a model with the data.

  5. Analyze results:

Is this model better than existing systems?

  6. Reach a conclusion:

I should (not) use this model, because of X, Y, and Z.

  7. Refine and repeat:

Time of year could be a helpful signal.
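The steps above can be rehearsed end-to-end on synthetic data (every number below is invented): hypothesize that rainfall predicts traffic, fit a simple model, and only conclude after beating a naive baseline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothesis + data: synthetic "historical" records where traffic
# really does depend on rainfall, plus noise.
rain_mm = rng.uniform(0, 20, size=200)
traffic = 30 + 2.5 * rain_mm + rng.normal(0, 3, size=200)

train, test = slice(0, 150), slice(150, 200)

# Test the hypothesis with a simple linear model.
coef = np.polyfit(rain_mm[train], traffic[train], deg=1)
model_pred = np.polyval(coef, rain_mm[test])

# Analyze: is the model better than the existing "system"
# (here: always predicting the historical mean)?
baseline_pred = np.full(50, traffic[train].mean())
mae = lambda p: np.abs(p - traffic[test]).mean()
print(mae(model_pred) < mae(baseline_pred))  # True -> worth adopting
```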

17 of 41

ML Adoption

Prompt: “ML adoption in 2022 with Google technologies” (Stable Diffusion 2)

18 of 41

Stage I: PoC

19 of 41

Considerations

  • ML is worth giving a go.
  • You have enough signals to support this.
  • You have a framework to decide for moving forward with ML or not.

20 of 41

Data?

  • Start with data samples that closely represent the problem you want to solve with ML.
  • Data acquisition can still be very ad-hoc at this stage.
  • You may not need dedicated warehousing and feature stores for your data yet.
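Even when acquisition is ad-hoc, a few lines of plain Python can keep the sample representative of the sub-populations you care about (the record layout below is hypothetical):

```python
import random
from collections import defaultdict

def stratified_sample(examples, key, per_group, seed=42):
    """Pick up to `per_group` examples from each group so the sample
    covers every sub-population of the problem."""
    groups = defaultdict(list)
    for ex in examples:
        groups[key(ex)].append(ex)
    rng = random.Random(seed)
    sample = []
    for members in groups.values():
        rng.shuffle(members)
        sample.extend(members[:per_group])
    return sample

# Hypothetical skin-lesion records: (image_id, label).
records = [(i, "benign" if i % 4 else "malignant") for i in range(100)]
sample = stratified_sample(records, key=lambda r: r[1], per_group=10)
print(len(sample))  # 20: 10 benign + 10 malignant
```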

21 of 41

Start with existing ML models / APIs if possible

  • Saves you time
  • Saves you resources
  • Easy to incorporate
  • Already battle-tested

22 of 41

Focus on one ML problem at a time

  • Decouple the impact
  • Decouple the efforts
  • Decouple the technical complexities

23 of 41

Choose the right tooling for training

  • Easy to use (technical debt can be brutal)
  • Readable by the broader team
  • Minimal effort when scaling (from notebooks to prod)
  • Easily maintainable
  • Fits well with other things: serverless hosting, on-device platforms, etc.

24 of 41

Choose the right tooling for training

25 of 41

Why?

  • Keras is known for its API design (tf.keras).
  • Lets you write models and train them using an intuitive API.
  • Progressive disclosure of complexity:
    • Standard API for training, prediction, and evaluation.
    • But also possible to customize things arbitrarily.
  • First-class support for accelerators like TPUs.
  • Integrates well with XLA for accelerated computation.

26 of 41

Why?

  • [...]
  • Off-the-shelf support for TensorFlow Serving, TensorFlow Lite, TFX, TensorFlow Cloud, Vertex AI, etc.

27 of 41

Training considerations

  • If possible, start training on a small sample (one size doesn’t fit all)
  • If you have a trained model:
    • Carefully evaluate the impact of data leakage and other feature-related issues.
    • Evaluate the model under different sub-populations.
    • Study the predictions to determine if they can help you reach decisions.
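Slicing a metric by sub-population needs no special tooling; a sketch with invented groups and labels:

```python
from collections import defaultdict

def accuracy_by_group(rows):
    """rows: (group, y_true, y_pred) triples -> per-group accuracy."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, y_true, y_pred in rows:
        totals[group] += 1
        hits[group] += int(y_true == y_pred)
    return {g: hits[g] / totals[g] for g in totals}

# Hypothetical predictions sliced by a sensitive attribute: a model can
# look fine in aggregate while failing one sub-population.
rows = [
    ("fair_skin", 1, 1), ("fair_skin", 0, 0), ("fair_skin", 1, 1),
    ("dark_skin", 1, 0), ("dark_skin", 0, 0),
]
print(accuracy_by_group(rows))  # {'fair_skin': 1.0, 'dark_skin': 0.5}
```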

28 of 41

Interacting with the model for usage

  • Decide on a way to interact with the model for application usage.
  • Determine the best way to consume the model: batched, online, on-device, etc.

29 of 41

Model consumption

  • Batched:
    • Predictions are not required immediately.
    • Recommended tooling:
      • An Apache Beam pipeline run on Dataflow on schedules
      • BigQuery with scheduled queries
  • Online:
    • Data can leave the device?
      • Docker + Kubernetes + GKE (microservices)
        • Prefer gRPC over REST
        • Prefer Go over Python
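Stripped of any particular pipeline framework, batched consumption reduces to chunking examples and running the model once per chunk on a schedule; the `model` below is a stand-in, not a real serving API:

```python
def batched_predict(examples, model, batch_size=32):
    """Run `model` over `examples` chunk by chunk, the way a scheduled
    batch job (e.g. a Beam pipeline on Dataflow) would."""
    predictions = []
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        predictions.extend(model(batch))
    return predictions

# Stand-in "model": doubles each input.
model = lambda batch: [2 * x for x in batch]
preds = batched_predict(list(range(100)), model, batch_size=32)
print(len(preds), preds[:3])  # 100 [0, 2, 4]
```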

30 of 41

Model consumption

  • Online
    • Data cannot leave the (mobile) device
      • TensorFlow Lite
      • Firebase (for model management and communication with app)

31 of 41

Model consumption

  • Online
  • Other solutions for mobile ML
    • MLKit
    • MediaPipe

32 of 41

Stage II: MVP and Beyond

33 of 41

Scaling up

  • You probably did all the development through a notebook.
    • No shame! Everyone does it like that :)
  • Now, we need to graduate that to a bigger capacity.
  • What do we need?
    • Well-tested practices and processes.

34 of 41

TFX comes to the rescue

35 of 41

Why TFX though?

  • Designed to operate at arbitrary scales.
  • Enforces good practices for:
    • Maintainability
    • Repeatability
    • Reproducibility
    • Adaptability
  • Flexibility
    • Run on any compatible executor (Kubeflow, Apache Airflow, Vertex AI, etc.)
    • Create your own components easily without worrying about the scalability part

36 of 41

Why TFX though?

Need more reasons?

37 of 41

Scaling up

  • TFX may not be a solution for all ML applications.
  • Develop what’s best for your scenario to reliably and safely deploy models to prod:
    • Continuous integration
    • Continuous delivery
    • Continuous evaluation / retraining
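Continuous evaluation can begin as a simple gate: score the deployed model on freshly labeled data and trigger retraining when the metric degrades beyond a tolerance (all names and thresholds below are illustrative):

```python
def should_retrain(live_metric, reference_metric, tolerance=0.02):
    """Flag retraining when the live metric drops more than
    `tolerance` below the metric recorded at deployment time."""
    return live_metric < reference_metric - tolerance

# Metric logged at deployment vs. metric on fresh labeled data.
deployed_accuracy = 0.91
print(should_retrain(0.90, deployed_accuracy))  # False: within tolerance
print(should_retrain(0.85, deployed_accuracy))  # True: degraded, retrain
```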

38 of 41

In parallel, explore Vertex AI

Support for

  • Scalable model training (bring your own framework)
  • Scalable deployments (bring your own framework)
  • Model monitoring
  • … and much more!

39 of 41

Recommended readings

  • Designing Machine Learning Systems (Chip Huyen)
  • Machine Learning Design Patterns (Valliappa Lakshmanan, Sara Robinson, Michael Munn)
  • Google Cloud Architecture Guides: https://cloud.google.com/architecture/#/Technologies=AI_and_machine_learning
  • Full Stack Deep Learning (Charles Frye, Sergey Karayev, Josh Tobin)

40 of 41

Wrapping up

  • Assess if you have an ML-friendly problem statement
  • Don’t be afraid to launch without ML
  • Keep it simple
  • Keep it one at a time
  • Launch from notebooks to prod
  • Experiment rapidly and reliably

41 of 41

Thank you!

Sayak Paul (@RisingSayak)