Intro to Machine Learning
Girish Varma
IIIT Hyderabad
http://bit.ly/2tzcXHu
A Machine Learning Problem
Given an image of a handwritten digit, find the digit.
There is no well-defined function from input to output.
Programming vs Machine Learning
Machine Learning:
Find the handwritten digit in an image.
Programming:
Find the shortest path in an input graph G.
Dataset
Tensors
All data, intermediate outputs, and learnable parameters are represented as tensors.
A machine learning model transforms an input tensor to an output tensor.
Tensors have a shape.
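As a small illustration of tensors and their shapes, here is a sketch using NumPy arrays. The batch size of 64 and the [784, 10] weight matrix are assumptions chosen for the example, not values from the slides.

```python
import numpy as np

# A single grayscale image: a 2-D tensor of pixel values.
image = np.zeros((28, 28), dtype=np.float32)

# A batch of 64 such images (batch size is an assumption): a 3-D tensor.
batch = np.zeros((64, 28, 28), dtype=np.float32)

# Learnable parameters are tensors too, e.g. a weight matrix
# (the [784, 10] shape is an assumption for illustration).
weights = np.zeros((784, 10), dtype=np.float32)

print(image.shape)    # (28, 28)
print(batch.shape)    # (64, 28, 28)
print(weights.shape)  # (784, 10)
```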
Model
The function that maps the input to the output:
y = f𝛉(x)
A model has learnable parameters, 𝛉.
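To make y = f𝛉(x) concrete, here is a minimal sketch of a toy linear model in which 𝛉 is a weight matrix W and a bias b. The function name `f`, the choice of a linear model, and the parameter values are all assumptions for illustration.

```python
import numpy as np

def f(theta, x):
    """A toy model y = W x + b, where theta = (W, b)."""
    W, b = theta
    return W @ x + b

# Hypothetical parameters mapping 3 inputs to 2 outputs.
theta = (
    np.array([[1.0, 0.0, -1.0],
              [0.5, 0.5, 0.5]]),
    np.array([0.0, 1.0]),
)
x = np.array([2.0, 4.0, 6.0])

y = f(theta, x)  # y is [-4, 7]
print(y)
```

Changing the values in `theta` changes the function the model computes; training is the search for good values.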
The Neural Network Model
MNIST Classification
Input : x is a [28,28] shaped tensor, giving pixel values of the image
Output : y is a [10] shaped tensor, giving the probability of each digit from 0 to 9.
If the dataset gives y as a digit, convert it to a probability vector by one-hot encoding.
Use the softmax function to convert real-valued outputs to probabilities.
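One-hot encoding and softmax can be sketched as follows. The helper names `one_hot` and `softmax` are chosen for this example; subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import numpy as np

def one_hot(digit, num_classes=10):
    """Turn a digit label into a probability vector with a single 1."""
    v = np.zeros(num_classes)
    v[digit] = 1.0
    return v

def softmax(z):
    """Map real-valued scores to probabilities that sum to 1."""
    z = z - np.max(z)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

y = one_hot(3)                           # 1.0 at index 3, 0.0 elsewhere
p = softmax(np.array([1.0, 2.0, 3.0]))   # largest score gets largest probability
print(y)
print(p, p.sum())
```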
Multilayered Network
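Complex data can be fit only by more complex models.
Obtain complex models by stacking multiple linear layers, with a nonlinear activation between them (without it, the stack collapses to a single linear layer).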
Multilayered Perceptron (MLP)
An MLP model for MNIST
[Figure: Reshape -> Fully Connected Layer -> Fully Connected Layer -> Softmax -> p(0), p(1), ..., p(8), p(9), the predicted probabilities for different digits]
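The pipeline of reshape, two fully connected layers, and softmax can be sketched in NumPy as below. The hidden size of 32, the ReLU activation between the two layers, and the random initialisation are assumptions; the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - np.max(z)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mlp_forward(x, params):
    """Reshape -> fully connected -> fully connected -> softmax."""
    W1, b1, W2, b2 = params
    h = x.reshape(-1)                  # Reshape: [28, 28] -> [784]
    h = np.maximum(0.0, W1 @ h + b1)   # first fully connected layer + ReLU (assumed)
    logits = W2 @ h + b2               # second fully connected layer
    return softmax(logits)             # probabilities for digits 0 to 9

hidden = 32  # assumed hidden layer size
params = (
    rng.normal(0.0, 0.01, (hidden, 784)), np.zeros(hidden),
    rng.normal(0.0, 0.01, (10, hidden)), np.zeros(10),
)

x = rng.random((28, 28))  # stand-in for a real MNIST image
p = mlp_forward(x, params)
print(p.shape, p.sum())
```

With untrained parameters the output is close to uniform; training adjusts W1, b1, W2, b2 so that the correct digit receives high probability.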
Training a Model
The process of finding the right parameters for the model.
Loss Function
Gradient Descent
Gradient Descent : Change the parameters 𝛉 slightly, in the direction opposite to the gradient, so that the loss function decreases. Gradients are the partial derivatives of the loss function with respect to the parameters.
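A minimal sketch of gradient descent on a one-parameter problem, assuming the toy loss L(𝛉) = (𝛉 - 3)^2 and a learning rate of 0.1 (both chosen for illustration). Each step applies the update 𝛉 <- 𝛉 - learning_rate * dL/d𝛉.

```python
def loss(theta):
    """Toy loss with its minimum at theta = 3."""
    return (theta - 3.0) ** 2

def grad(theta):
    """Derivative of the toy loss with respect to theta."""
    return 2.0 * (theta - 3.0)

theta = 0.0          # initial parameter value
learning_rate = 0.1  # step size (an assumption)

for step in range(100):
    # Move a small step against the gradient so the loss decreases.
    theta = theta - learning_rate * grad(theta)

print(theta, loss(theta))  # theta ends up very close to 3
```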
Backpropagation
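Backpropagation : The process of computing the gradients of the loss with respect to the parameters of a multilayered network, by applying the chain rule layer by layer from the output back to the input.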
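The chain rule can be applied by hand for a small network. The sketch below assumes a two-layer network with a tanh activation and a squared-error loss (choices made for this example, not taken from the slides), and compares one backpropagated gradient entry against a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, y, W1, W2):
    """Two linear layers with tanh in between, and a squared-error loss."""
    h = np.tanh(W1 @ x)                   # hidden activations
    out = W2 @ h                          # network output
    loss = 0.5 * np.sum((out - y) ** 2)
    return loss, h, out

def backward(x, y, W1, W2):
    """Propagate gradients from the loss back to W2 and then W1."""
    _, h, out = forward(x, y, W1, W2)
    d_out = out - y                       # dL/d(out)
    dW2 = np.outer(d_out, h)              # dL/dW2
    d_h = W2.T @ d_out                    # dL/dh
    d_pre = d_h * (1.0 - h ** 2)          # through tanh: tanh'(a) = 1 - tanh(a)^2
    dW1 = np.outer(d_pre, x)              # dL/dW1
    return dW1, dW2

x = rng.normal(size=4)
y = rng.normal(size=2)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(2, 3))

dW1, dW2 = backward(x, y, W1, W2)

# Finite-difference estimate of dL/dW1[0, 0] for comparison.
eps = 1e-6
W1_plus = W1.copy()
W1_plus[0, 0] += eps
W1_minus = W1.copy()
W1_minus[0, 0] -= eps
numeric = (forward(x, y, W1_plus, W2)[0] - forward(x, y, W1_minus, W2)[0]) / (2 * eps)

print(dW1[0, 0], numeric)  # the two values should agree closely
```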
Training Algorithm
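A typical training loop repeats four steps: predict, compute the loss, compute the gradients, update the parameters. The sketch below assumes a toy linear model, a synthetic dataset generated from y = 2x + 1, a mean squared error loss, and a learning rate of 0.1; none of these come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset (an assumption for illustration): y = 2x + 1.
X = rng.uniform(-1.0, 1.0, size=100)
Y = 2.0 * X + 1.0

w, b = 0.0, 0.0      # learnable parameters
learning_rate = 0.1  # step size (an assumption)
losses = []

for epoch in range(500):
    pred = w * X + b                   # 1. predict
    err = pred - Y
    losses.append(np.mean(err ** 2))   # 2. mean squared error loss
    grad_w = 2.0 * np.mean(err * X)    # 3. gradients of the loss
    grad_b = 2.0 * np.mean(err)
    w -= learning_rate * grad_w        # 4. gradient descent update
    b -= learning_rate * grad_b

print(w, b, losses[0], losses[-1])  # w approaches 2, b approaches 1
```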
Overfitting
Testing or Inference
Some References