1 of 42

MIST101: INTRODUCTION

INTO THE MIST OF MACHINE LEARNING

University of Toronto Machine Intelligence Student Team

2 of 42

MIST101 Outline

  • Introduction
  • Supervised Learning & Neural Network
  • Convolutional Neural Network
  • Recurrent Neural Network
  • Unsupervised Learning
  • Reinforcement Learning

3 of 42

What is Machine Learning?

  • Two approaches to problem solving
    • Traditional Approach: Follow the instructions
    • Learning Approach: Learn from experience

4 of 42

Traditional Approach

  • Write a computer program that executes a fixed set of rules/algorithms, like following a cookbook.
  • For the same input, it always gives the same output. The output is determined only by the input and the algorithm.
  • CSC263/265, CSC373, etc.

5 of 42

Drawbacks

  • A human expert has to come up with a specific algorithm for every new problem.
  • For some problems, it’s impossible to code all the rules by hand.
    • Complicated problems might have too many rules
    • Some problems have unknown rules

6 of 42

Machine Learning Approach

  • Design a learning algorithm that can learn from examples
  • One algorithm can solve a family of tasks
  • Even for the same task, training it on different data will lead to different outputs
  • “Give a man a fish, and you feed him for a day; teach a man to fish, and you feed him for a lifetime.”
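The contrast between the two approaches can be sketched in a few lines. This is only an illustration: the temperature task, the function names, and the two small datasets are assumptions made up for the example, not part of the workshop material. The hand-written rule never changes, while the learning algorithm produces a different rule for each set of examples.

```python
def is_hot_rule(temperature):
    """Traditional approach: a rule fixed by the programmer."""
    return temperature > 25


def learn_threshold(examples):
    """Learning approach: choose the threshold that best separates
    labelled examples, given as (temperature, is_hot) pairs."""
    values = sorted(value for value, _ in examples)
    # Candidate thresholds are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def accuracy(threshold):
        correct = sum((value > threshold) == label for value, label in examples)
        return correct / len(examples)

    return max(candidates, key=accuracy)


# Hypothetical labels collected in a warm city and in a cold city.
warm_city = [(10, False), (15, False), (25, True), (30, True)]
cold_city = [(0, False), (5, False), (12, True), (18, True)]

warm_threshold = learn_threshold(warm_city)
cold_threshold = learn_threshold(cold_city)
```

The same `learn_threshold` function is reused for both datasets, which is the sense in which one algorithm can cover a family of tasks.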

7 of 42

Applications

  • Classification
  • Recognizing patterns
  • Recommender system
  • Music information retrieval
  • Computer Vision (CV) -- autonomous driving
  • Natural Language Processing (NLP)
  • Robotics (Westworld?)
  • Learning to play games -- AlphaGo

8 of 42

Types of Learning Tasks

[Diagram: Machine Learning branches into Supervised Learning, Reinforcement Learning, and Unsupervised Learning]

9 of 42

Supervised Learning

  • Learn the underlying function of the given input/output pairs

Examples of underlying functions:

  • Single-variable function (inputs to outputs)
  • Image classifier (pixels to category): image in, “Dog” or “Cat” out
  • Translation (language to language): “Hello baby!” in, “Salut bébé!” out
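As a small illustration of learning a single-variable function from input/output pairs, the sketch below fits a straight line with NumPy's least-squares `polyfit`. The hidden function `y = 2x + 1` and the sample points are assumptions chosen for the example.

```python
import numpy as np

# Input/output pairs sampled from a function the learner does not see.
# The hidden function y = 2x + 1 is an assumption for illustration.
inputs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
outputs = 2.0 * inputs + 1.0

# Least-squares fit of a line; polyfit returns the highest power first.
slope, intercept = np.polyfit(inputs, outputs, deg=1)


def learned_function(x):
    """The recovered approximation of the underlying function."""
    return slope * x + intercept


prediction = learned_function(10.0)  # an input not in the training pairs
```

Because these pairs are noise-free and exactly linear, the fit should recover the hidden slope and intercept up to floating-point error. Real data is rarely this clean.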

10 of 42

Supervised Learning

  • Learning the underlying function of the given input/output pairs

[Figure: image → Function → “Dog”; image → Function → “Cat”]

11 of 42

Example

Object Detection (Darknet YOLO)

12 of 42

Supervised Learning

  • Learning the underlying function of the given input/output pairs

  • Finding a model that best represents the function of the given input/output pairs

[Figure: image → Model → “Dog”; image → Model → “Cat”]

13 of 42

Supervised Learning

Model:

  • Artificial Neural Network
  • Decision Tree
  • Graphical Model
  • Gaussian Process

14 of 42

Supervised Learning

Model:

  • Artificial Neural Network
  • Decision Tree
  • Graphical Model
  • Gaussian Process
  • SVM
  • KNN
  • Ensembles

15 of 42

Supervised Learning

Model: Artificial Neural Network (ANN)

16 of 42

Supervised Learning

Model: Artificial Neural Network (ANN)

  • Feedforward Neural Network (FNN)
  • Convolutional Neural Network (CNN)
  • Recurrent Neural Network (RNN)

17 of 42

18 of 42

Supervised Learning

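[Diagram: each data pair consists of an input and its correct output (e.g. “Cat”, “Dog”). The model maps the input to a prediction, a loss compares the prediction with the correct output, and the model is optimized to reduce the loss.]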
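The loop of prediction, loss, and optimization can be sketched with a one-feature logistic regression trained by gradient descent. This is a minimal sketch, not the workshop's code: the feature values, the 0/1 encoding of “Cat”/“Dog”, the learning rate, and the step count are all assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train(inputs, targets, learning_rate=0.1, steps=2000):
    """Fit p(dog | x) = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    losses = []
    eps = 1e-12  # avoids log(0)
    for _ in range(steps):
        predictions = sigmoid(w * inputs + b)            # Prediction
        loss = -np.mean(                                 # Loss (cross-entropy)
            targets * np.log(predictions + eps)
            + (1.0 - targets) * np.log(1.0 - predictions + eps)
        )
        losses.append(float(loss))
        error = predictions - targets
        w -= learning_rate * np.mean(error * inputs)     # Optimize
        b -= learning_rate * np.mean(error)
    return w, b, losses


# Hypothetical data pairs: one numeric feature per example, 0 = cat, 1 = dog.
inputs = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
targets = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b, losses = train(inputs, targets)
predicted_labels = (sigmoid(w * inputs + b) > 0.5).astype(int)
```

The recorded `losses` make it easy to check that optimization is actually reducing the loss from one step to the next.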

19 of 42

Types of Learning Tasks

[Diagram: Machine Learning branches into Supervised Learning (Label), Reinforcement Learning, and Unsupervised Learning]

20 of 42

Supervised Learning

(Outline)

  • Learning objective:
    • Loss Function
  • Model construction:
    • Linear/Logistic Regression
    • Neural Network
  • Learning algorithm:
    • Gradient Descent
  • Model evaluation:
    • Overfit/underfit

[Diagram labels: Objective, Construction, Learning, Evaluation]
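Model evaluation can be illustrated by fitting models of different capacity and comparing their error on training data and on held-out data. The sketch below is an assumption-laden toy: the hidden cubic function, the noise level, the random seed, and the polynomial degrees are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def hidden_function(x):
    # Assumed ground truth, unknown to the learner.
    return x**3 - x


# Ten training points and nine held-out validation points between them.
x_train = np.linspace(-2.0, 2.0, 10)
x_val = (x_train[:-1] + x_train[1:]) / 2
y_train = hidden_function(x_train) + rng.normal(0.0, 0.1, x_train.shape)
y_val = hidden_function(x_val) + rng.normal(0.0, 0.1, x_val.shape)


def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


# Map degree -> (training error, validation error).
errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
```

The degree-1 model underfits, with high error on both sets. The degree-9 model can pass through every training point, so a low training error there says little about how it behaves on the validation points. Comparing the two columns of `errors` is what separates underfitting from overfitting.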

21 of 42

Types of Learning Tasks

[Diagram: Machine Learning branches into Supervised Learning (Label), Reinforcement Learning, and Unsupervised Learning]

22 of 42

Types of Learning Tasks

[Diagram: Machine Learning branches into Supervised Learning (Label), Reinforcement Learning, and Unsupervised Learning (No Label)]

23 of 42

Unsupervised Learning

  • Clustering
  • Dimensionality Reduction
  • Semi-supervised Learning
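Clustering, the first item above, can be sketched with a bare-bones k-means loop. The slides do not name a specific clustering algorithm, so k-means is a choice made here for illustration, and the data points and the naive initialisation (the first k points) are assumptions.

```python
import numpy as np


def k_means(points, k, steps=10):
    """Alternate between assigning points to the nearest centre and
    moving each centre to the mean of its assigned points."""
    centers = points[:k].copy()  # naive initialisation (assumption)
    for _ in range(steps):
        # Distance from every point to every centre, shape (n_points, k).
        distances = np.linalg.norm(
            points[:, None, :] - centers[None, :, :], axis=2
        )
        labels = distances.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, labels


# Two well-separated groups; no labels are provided to the algorithm.
points = np.array(
    [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [1.0, 0.0], [10.0, 11.0], [11.0, 10.0]]
)
centers, labels = k_means(points, k=2)
```

The algorithm only sees the coordinates. The grouping in `labels` is inferred from the structure of the data.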

24 of 42

Unsupervised Learning

  • Infer a function that describes hidden structure in unlabeled data
  • Define a probabilistic model p(x|𝜃)
  • Estimate the value (or distribution) of the parameters 𝜃
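One concrete instance of these steps, under the assumption that p(x|𝜃) is a one-dimensional Gaussian with 𝜃 = (mean, standard deviation), is maximum-likelihood estimation. The data values and function names below are hypothetical.

```python
import numpy as np


def fit_gaussian(x):
    """Maximum-likelihood estimate of the mean and standard deviation."""
    mu = x.mean()
    sigma = x.std()  # ddof=0: the maximum-likelihood estimate divides by N
    return mu, sigma


def log_likelihood(x, mu, sigma):
    """log p(x | mu, sigma) summed over independent observations."""
    return float(
        np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2))
    )


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # unlabeled observations
mu, sigma = fit_gaussian(x)
best = log_likelihood(x, mu, sigma)
```

Any other choice of mean or standard deviation gives a lower log-likelihood on the same data, which is what it means for these estimates to be the maximum-likelihood parameters.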

25 of 42

Unsupervised Learning

26 of 42

Unsupervised Learning

27 of 42

Unsupervised Learning

Gulrajani et al. 2017

28 of 42

Unsupervised Learning

29 of 42

Types of Learning Tasks

[Diagram: Machine Learning branches into Supervised Learning (Label), Reinforcement Learning, and Unsupervised Learning (No Label)]

30 of 42

Types of Learning Tasks

[Diagram: Machine Learning branches into Supervised Learning (Label), Reinforcement Learning (Reward), and Unsupervised Learning (No Label)]

31 of 42

“Using chocolates as positive reinforcement for what you consider correct behaviours…”

-- Leonard

32 of 42

Reinforcement Learning

https://en.wikipedia.org/wiki/File:Reinforcement_learning_diagram.svg

33 of 42

Reinforcement Learning

  • Learn a policy that maximizes cumulative reward.

[Diagram: the Policy sends an Action to the Environment; the Environment returns an Observation & Reward to the Policy]
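The interaction between policy and environment can be sketched as a loop that accumulates reward. The `Corridor` environment, its reward of 1.0 at the goal, and the two policies are hypothetical and exist only to make the loop concrete.

```python
class Corridor:
    """Toy environment (assumption): start at position 0, goal at `length`."""

    def __init__(self, length=3):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # initial observation

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls stop at 0.
        self.position = max(0, self.position + action)
        done = self.position == self.length
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def always_right(observation):
    return +1


def always_left(observation):
    return -1


def run_episode(env, policy, max_steps=10):
    """Run one episode and return the cumulative reward."""
    observation = env.reset()
    cumulative_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)                   # Policy -> Action
        observation, reward, done = env.step(action)   # Environment -> Observation & Reward
        cumulative_reward += reward
        if done:
            break
    return cumulative_reward


good = run_episode(Corridor(), always_right)
bad = run_episode(Corridor(), always_left)
```

A learning algorithm would search for the policy with the higher cumulative reward. Here the two candidates are simply compared by hand.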

34 of 42

Reinforcement Learning

  • Learn a policy that maximizes cumulative reward.

[Diagram, Go example: Observation & Reward = locations of the stones & result of the game; Action = place a stone]

35 of 42

Reinforcement Learning

  • Randomly explore
  • Evaluate through reward
  • Adjust policy to increase the expected cumulative reward
  • Repeat

Policy (probability of each action) and expected cumulative reward:

  • Action 1 (33%): 10
  • Action 2 (33%): 20
  • Action 3 (34%): 30

36 of 42

Reinforcement Learning

  • Randomly explore
  • Evaluate through reward
  • Adjust policy to increase the expected cumulative reward
  • Repeat

Policy (probability of each action) and expected cumulative reward:

  • Action 1 (20%): 10
  • Action 2 (30%): 20
  • Action 3 (50%): 30
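The shift in action probabilities between the two policy slides can be reproduced in spirit with a softmax policy over three actions. This sketch makes two simplifications that should be read as assumptions: the expected cumulative rewards 10, 20, and 30 are treated as known, and the exact gradient of the expected reward is used instead of estimating it from random exploration. The learning rate and step count are arbitrary.

```python
import numpy as np

# Expected cumulative reward of each action, taken from the slide.
rewards = np.array([10.0, 20.0, 30.0])


def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()


def improve(logits, learning_rate=0.01, steps=50):
    """Gradient ascent on J = sum_i p_i * r_i for a softmax policy.
    The gradient with respect to logit i is p_i * (r_i - J)."""
    for _ in range(steps):
        probs = softmax(logits)
        expected_reward = probs @ rewards
        logits = logits + learning_rate * probs * (rewards - expected_reward)
    return logits


start = np.zeros(3)                       # uniform policy
initial_policy = softmax(start)
final_policy = softmax(improve(start))
```

Each update raises the probability of actions whose reward is above the current expectation and lowers the others, so probability mass moves toward Action 3.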

37 of 42

38 of 42

Reinforcement Learning

  • Why RL?
    • Requires little human expertise
    • Emergence of intelligence
  • Challenges
    • Requires extensive training before a good policy (ideally a global optimum) is reached
    • Learned knowledge usually cannot be transferred to other tasks

39 of 42

Reinforcement Learning

(Outline)

  • Markov Decision Processes (MDPs)
  • Types of Reinforcement Learning Problems
    • Discrete/Continuous Space
    • Stochastic/Deterministic Policy
  • Types of Reinforcement Learning Algorithms
    • Policy-based
    • Value-based
    • Both (actor-critic)
  • Evolutionary Algorithms

40 of 42

Coming up...

  • MIST101 Workshop #2:
    • Topic: Supervised Learning & Neural Network
    • Time: Week of Oct. 2-6

41 of 42

Thank you!

42 of 42

References

  • Dmitriev, D. (2017). Neural Network 3D Simulation. [online] YouTube. Available at: https://www.youtube.com/watch?v=3JQ3hYko51Y [Accessed 2 Oct. 2017].
  • Mackay, C. (2017). Object detection using Darknet Yolo on indian traffic data. [online] YouTube. Available at: https://www.youtube.com/watch?v=DeCFxPQlOVk [Accessed 2 Oct. 2017].
  • Zhu, J. (2017). Generative Visual Manipulation on the Natural Image Manifold. [online] YouTube. Available at: https://www.youtube.com/watch?v=9c4z6YsBGQ0 [Accessed 2 Oct. 2017].
  • YouTube. (2017). The Big Bang Theory - Sheldon Trains Penny. [online] Available at: https://www.youtube.com/watch?v=qy_mIEnnlF4 [Accessed 2 Oct. 2017].
  • YouTube. (2017). DQN Breakout. [online] Available at: https://www.youtube.com/watch?v=TmPfTpjtdgg [Accessed 2 Oct. 2017].
  • YouTube. (2017). Emergence of Locomotion Behaviours in Rich Environments. [online] Available at: https://www.youtube.com/watch?v=hx_bgoTF7bs [Accessed 2 Oct. 2017].