Deep Learning and ML
Industry perspective
Unwrapping the computational nodes
David Cardozo
GDE Quebec
@davidcardozo
Field is changing rapidly
The following demo may not look impressive
"Give me the first few sentences of the speech delivered by portia of the merchant of venice in modern spanish, feel free to modernize the speech "
Large Language Model (Translator + Retrieval) -> Speech to Text -> Image generator
Understanding Images
Which capabilities exceeded human performance between 2015 and 2020?
Videos
Newer models can even process videos
Embedding videos
What was the state of the art in 1989?
How did it start?
Hubel & Wiesel (1962)
An historic development
CUDA and CUDNN
Linus and Nvidia.
“Near the end of his talk, when asked by one of the attendees about NVIDIA's hardware support and lack of open-source driver enablement / documentation, he had a few choice words for the Santa Clara company.”
Convolutional Networks
Computer Vision goes BRRR
Characteristics of images
©2019 Kiwi campus inc. All rights reserved
Statistics of natural images obey invariants
…
Translation
Cutout
Dilation
Contrast
Rotation
Scale
Brightness
…
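The invariants listed above are commonly exploited through data augmentation: the label of an image should not change when the image is shifted, rotated, or brightened. The sketch below is illustrative only. It assumes a grayscale image stored as a list of rows with pixel values in 0..255, and the helper names (translate, adjust_brightness, rotate90) are hypothetical, not taken from any library.

```python
def translate(image, dx, fill=0):
    """Shift every row dx pixels to the right (left if dx is negative), padding with `fill`."""
    width = len(image[0])
    shifted = []
    for row in image:
        if dx >= 0:
            new_row = [fill] * min(dx, width) + row[:max(width - dx, 0)]
        else:
            new_row = row[-dx:] + [fill] * min(-dx, width)
        shifted.append(new_row)
    return shifted


def adjust_brightness(image, delta, lo=0, hi=255):
    """Add delta to every pixel and clamp the result to the range [lo, hi]."""
    return [[min(hi, max(lo, p + delta)) for p in row] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6]]
    print(translate(image, 1))
    print(adjust_brightness(image, 100))
    print(rotate90(image))
```

Each function returns a new image and leaves the input untouched, so several augmentations can be chained on the same training example.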
Sobel Filters
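A Sobel filter is a small hand-designed kernel that responds to intensity edges, which is a useful mental model for what the first layer of a ConvNet can learn. The sketch below slides the two standard 3x3 Sobel kernels over a tiny image with plain Python loops. The function name correlate2d and the toy image are assumptions made for illustration, and the output covers only the region where the kernel fits entirely inside the image (no padding).

```python
# Standard 3x3 Sobel kernels: horizontal gradient (x) and vertical gradient (y).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate2d(image, kernel):
    """Slide a 3x3 kernel over the image and sum the elementwise products at each position."""
    height, width = len(image), len(image[0])
    out = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            acc = 0
            for di in range(3):
                for dj in range(3):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    # Toy image: dark on the left, bright on the right, so there is one vertical edge.
    image = [[0, 0, 10, 10] for _ in range(4)]
    print(correlate2d(image, SOBEL_X))  # strong response across the vertical edge
    print(correlate2d(image, SOBEL_Y))  # no response: rows are identical
```

A convolutional layer has the same structure, except that the kernel values are learned from data instead of being fixed by hand.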
Deep ConvNets
Layer 1
How does a Kiwibot work?
Nvidia’s Autopilot
End Project
Kiwibot dreaming
Statistical Methods
“Inference-only” (DL) Models
“Generalizable” (DL) Models
CLIP
Transformer
ChatGPT & GPT4
Reinforcement Learning from Human Feedback (RLHF)
Setting the background…
Deep Learning for Generation
Setting the background…
What are Variational Autoencoders?
Encoder
Decoder
z
Input image
Hidden representation
Output image
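One step of the diagram above that is easy to show in code is how the hidden representation z is drawn. In the standard VAE formulation the encoder outputs a mean and a log-variance per latent dimension, and z is sampled as mean + std * noise (the reparameterization trick). The sketch below covers only that sampling step. The encoder and decoder are neural networks in practice and are not shown here, and the function name sample_latent is a hypothetical choice.

```python
import math
import random


def sample_latent(mean, log_var, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1), independently per latent dimension."""
    z = []
    for mu, lv in zip(mean, log_var):
        std = math.exp(0.5 * lv)    # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)   # the randomness lives only in eps
        z.append(mu + std * eps)
    return z


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical encoder outputs for a 2-dimensional latent space.
    print(sample_latent([0.0, 1.0], [0.0, -2.0], rng))
```

Because the randomness is isolated in eps, gradients can flow through mean and log_var during training.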
Setting the background…
What are GANs?
Generator
Discriminator
True/False
Random Noise
Generated Image
Original Image
Panda mad scientist mixing sparkling chemicals, artstation
What can Diffusion models do?
Dog looking in the mirror, seeing a cat
A hedgehog wearing a leather jacket, playing a guitar on a beach
What is Diffusion?
The physics definition:
Is it possible to reverse this?
What is Diffusion?
The ML definition:
Diffusion Task:
In the forward process, gradually add noise to the image over T steps; then try to recover the original image from the fully noised image x_T
Forward process: q(x_t | x_{t-1})
Reverse process: p_θ(x_{t-1} | x_t)
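The forward (noising) half of the diffusion task can be sketched in a few lines. The code below assumes the common Gaussian choice q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I), which the slides do not spell out. The constant noise schedule and the name forward_diffusion are also assumptions. The learned reverse model p_θ is a neural network and is not shown.

```python
import math
import random


def forward_diffusion(x0, betas, rng):
    """Apply one Gaussian noising step per beta and return every intermediate x_t."""
    x = list(x0)
    trajectory = [list(x)]
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # how much of the previous signal survives
        noise = math.sqrt(beta)        # standard deviation of the added noise
        x = [keep * v + noise * rng.gauss(0.0, 1.0) for v in x]
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [1.0] * 8                     # stand-in for a flattened image
    steps = forward_diffusion(x0, [0.05] * 200, rng)
    print(steps[0])                    # the clean input x_0
    print(steps[-1])                   # x_T: close to pure Gaussian noise
```

After enough steps the original signal is scaled down to almost nothing, which is why x_T can be treated as plain noise and sampling can start from there.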
Deep Learning Frameworks
Hardware
GPUs
TPUs
Hubs
🤗 Hugging Face
TF Datasets, TF Hub, Torch Hub, Detectron2, ...
How does this actually work?
Data
// Rule expressed in code: price-to-earnings ratio of a stock
function calcPE(stock) {
  const price = readPrice(stock);
  const earnings = readEarnings(stock);
  return price / earnings;
}
Rules (Expressed in Code)
Answers (Returned From Code)
if (ball.collide(brick)) {
  removeBrick();
  ball.dx = -1 * ball.dx;
  ball.dy = -1 * ball.dy;
}
Traditional Programming: Rules + Data -> Answers
Machine Learning: Answers + Data -> Rules
Activity Recognition
if (speed < 4) {
  status = WALKING;
}
if (speed < 4) {
  status = WALKING;
} else {
  status = RUNNING;
}
if (speed < 4) {
  status = WALKING;
} else if (speed < 12) {
  status = RUNNING;
} else {
  status = BIKING;
}
// ????
Activity Recognition
0101001010100101010100101010100101110101001010100101010010101001010100101010
Label = WALKING
1010100101001010101010101001001001000100100111110101011111010100100111101011
Label = RUNNING
1001010011111010101110101011101010111010101011110101010111111110001111010101
Label = BIKING
1111111111010011101001111101011111010101011101010101011101010101010100111110
Label = GOLFING (Sort of)
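Instead of hand-writing thresholds, the labeled examples can be used to derive the rules. The sketch below is a deliberately tiny stand-in for a real model: it assumes each raw sensor recording has already been reduced to a single hypothetical "speed" feature, and it learns one average speed per label (a nearest-centroid classifier). The sample values and the names fit_centroids and predict are invented for illustration.

```python
def fit_centroids(samples):
    """Learn one average feature value per label from (speed, label) pairs."""
    totals, counts = {}, {}
    for speed, label in samples:
        totals[label] = totals.get(label, 0.0) + speed
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def predict(centroids, speed):
    """Return the label whose learned average is closest to the given speed."""
    return min(centroids, key=lambda label: abs(centroids[label] - speed))


# Hypothetical labeled data: (speed, activity). Real data would be raw sensor streams.
TRAINING = [
    (2.0, "WALKING"), (3.0, "WALKING"),
    (8.0, "RUNNING"), (10.0, "RUNNING"),
    (18.0, "BIKING"), (22.0, "BIKING"),
]

if __name__ == "__main__":
    centroids = fit_centroids(TRAINING)
    print(centroids)
    print(predict(centroids, 3.5))
    print(predict(centroids, 11.0))
```

Supporting a new activity means adding labeled examples for it, not writing another branch of if/else logic.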
The Machine Learning Paradigm
Make a Guess!
Measure your accuracy
Optimize your Guess
Repeat
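The guess / measure / optimize / repeat loop can be written out directly. The sketch below fits a line y = w * x + b with plain gradient descent on a mean squared error. The toy data (generated from y = 2x - 1), the learning rate, and the function name train are all assumptions chosen to keep the example small. Real frameworks automate the gradient computation.

```python
def train(xs, ys, lr=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0                    # make a guess
    loss = float("inf")
    n = len(xs)
    for _ in range(epochs):            # repeat
        preds = [w * x + b for x in xs]
        # Measure your accuracy: mean squared error of the current guess.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        # Optimize your guess: step against the gradient of the loss.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss                  # loss is the value measured before the last update


if __name__ == "__main__":
    xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]   # generated from y = 2x - 1
    w, b, loss = train(xs, ys)
    print(w, b, loss)
```

The learned w and b play the role of the "rules": they come out of the data and the answers instead of being written by hand.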
The Machine Learning Paradigm
Machine Learning: Answers (Labels) + Data -> Rules
Model: Data -> Inferences
JAX Ecosystem
https://github.com/n2cholas/awesome-jax
JAX Neural Network Libraries
https://github.com/n2cholas/awesome-jax