1 of 65

Machine Learning

Dr. Dinesh Kumar Vishwakarma

Professor,

Department of Information Technology,

Delhi Technological University, Delhi-110042

dinesh@dtu.ac.in

http://www.dtu.ac.in/Web/Departments/InformationTechnology/faculty/dkvishwakarma.php

2 of 65

Course Detail

  • Faculty: Dinesh K Vishwakarma, Ph.D. in Computer Vision
  • Course Code:
    • Credit: L T P: 3 0 2 : 4C

3 of 65

Evaluation Criteria

  • CWS (15%): Attendance, Assignments, Tutorials/Quizzes/Random Questions
  • PRS (25%): 10 Marks External Examination

Component      CWS   PRS   MTE   ETE
Weightage (%)  15    25    20    40

4 of 65

Course Content

UNIT 1 (14 contact hours)

Introduction to Machine Learning: Overview of different tasks: classification, regression, clustering; Concept of learning; Types of Machine Learning; Data Table; Information System; Data Representation; Diversity of data; Basic Linear Algebra and Probability Theory; Optimization: Maximum likelihood, Expectation maximization, Gradient descent; Bias-Variance Tradeoff; Metrics to Evaluate Classification and Regression models.

UNIT 2 (14 contact hours)

Supervised Learning: Linear Regression, Logistic Regression, Bayesian Decision Theory, Naïve Bayes, K-Nearest Neighbour, Support Vector Machine, Decision trees, Ensemble Classifier, Random Forest, Linear Classifiers and Kernels, Neural Networks, Deep Neural Network, Fundamentals of Deep Learning: DNN, CNN.

UNIT 3 (14 contact hours)

Unsupervised Learning: Clustering, Expectation Maximization, K-Means Clustering, Hierarchical vs Partitional Clustering, Gaussian Mixture Model, Dimensionality Reduction, Feature Selection, PCA, Factor analysis, Manifold learning.

5 of 65

Books

Text Books

  1. Introduction to Machine Learning, Alpaydin, E., MIT Press, 2004
  2. Machine Learning, Tom Mitchell, McGraw Hill, 1997
  3. Elements of Machine Learning, Pat Langley, Morgan Kaufmann Publishers
  4. Applied Machine Learning, M. Gopal, McGraw Hill, 2018

Reference

  1. The Elements of Statistical Learning, Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. Vol. 1. Springer, Berlin: Springer Series in Statistics, 2001.
  2. Machine Learning: A probabilistic approach, by David Barber.
  3. Pattern Recognition and Machine Learning, by Christopher Bishop, Springer Verlag, 2006
  4. An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics), 1st ed. 2013, Corr. 7th printing 2017 Edition

6 of 65

Lesson Plan

42-Lecture Lesson Plan

7 of 65

Resources: Journals

  1. IEEE Transactions on Pattern Analysis and Machine Intelligence
  2. IEEE Transactions on Image Processing
  3. Pattern Recognition
  4. International Journal of Computer Vision
  5. International Journal of Robotics Research
  6. Information Fusion
  7. IEEE Transactions on Visualization and Computer Graphics
  8. IEEE Transactions on Medical Imaging
  9. IEEE Robotics and Automation Letters
  10. IEEE Transactions on Geoscience and Remote Sensing
  11. IEEE Transactions on Circuits and Systems for Video Technology
  12. Pattern Recognition Letters

Ranking

https://research.com/conference-rankings/computer-science/computer-vision

8 of 65

Resources: Conferences

https://research.com/conference-rankings/computer-science/computer-vision

9 of 65

A Few Quotes

AI is the main tool behind new-age innovation and discoveries like driverless cars or disease detecting algorithm

Generalized AI is worth thinking about because it stretches our imaginations and it gets us to think about our core values and issues of choice.

Artificial Intelligence will be ‘vastly smarter’ than any human and would overtake us by 2025.

We are now solving problems with machine learning and AI that were…in the realm of science fiction for the last several decades

“A breakthrough in machine learning would be worth ten Microsofts” Bill Gates

10 of 65

Artificial Intelligence and Machine Learning in Industry 4.0

Breakdown of industrial development and the great changes in related categories

  • Industry 1.0 (1760-1830): Mechanization, steam and water power
  • Industry 2.0 (1870-1914): Mass production and electricity
  • Industry 3.0 (1970-2000): Electronic and IT systems, automation
  • Industry 4.0 (2015-2050?): Artificial intelligence

11 of 65

What is Machine Learning?

  • A branch of artificial intelligence, concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data.
  • Machine Learning is the science (and art) of programming computers so they can learn from data.
  • As intelligence requires knowledge, it is necessary for the computers to acquire knowledge.
  • Getting computers to program themselves
  • Writing software is the bottleneck
  • Let the data do the work instead!

The term machine learning was coined in 1959 by Arthur Samuel.

“Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.” (Arthur Samuel, 1959)

12 of 65

What is Machine Learning?

  • “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” (Tom Mitchell, Machine Learning, 1997)

[Figure: diagram relating E, T and P, with the labels Improve, Process and Measure]

13 of 65

What is Machine Learning?

  • E (Experience): having labelled data, e.g. number of students (male, female), etc.; supervised learning
  • T (Task): processing; classification, regression
  • P (Performance): measuring performance; accuracy, precision, recall

14 of 65

What is Machine Learning?

T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself

T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing a human driver

T: Categorize email messages as spam or legitimate
P: Percentage of email messages correctly classified
E: Database of emails, some with human-given labels

15 of 65

Example 1: Class of ML Analysis

  • Typical customer: Admin/Instructor.
  • Database:
    • Currently registered students
    • Basic parameters (height, weight)
    • Basic classification.
  • Goal: predict/decide whether a student is fit.

16 of 65

Example 2: Credit Risk Analysis

  • Typical customer: bank.
  • Database:
    • Current clients data, including:
    • basic profile (income, house ownership, delinquent account, etc.)
    • Basic classification.
  • Goal: predict/decide whether to grant credit.

17 of 65

Example 2: Credit Risk Analysis

  • Rules learned from data:

IF Other-Delinquent-Accounts > 2 and

Number-Delinquent-Billing-Cycles >1

THEN DENY CREDIT

IF Other-Delinquent-Accounts = 0 and

Income > $30k

THEN GRANT CREDIT
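
The two rules above can be written directly as code. This is only a sketch: the function and parameter names are made up for illustration, "$30k" is read as 30,000 dollars, and the slide does not say what happens when neither rule fires, so that case returns None here.

```python
from typing import Optional


def credit_decision(other_delinquent_accounts: int,
                    delinquent_billing_cycles: int,
                    income: float) -> Optional[str]:
    """Apply the two learned rules; return None when neither rule fires."""
    # Rule 1: several delinquent accounts and billing cycles -> deny.
    if other_delinquent_accounts > 2 and delinquent_billing_cycles > 1:
        return "DENY"
    # Rule 2: no delinquent accounts and income above $30k -> grant.
    if other_delinquent_accounts == 0 and income > 30_000:
        return "GRANT"
    # Assumption: cases not covered by the two rules are left undecided.
    return None


print(credit_decision(3, 2, 50_000))  # DENY
print(credit_decision(0, 0, 45_000))  # GRANT
```

In a real system the thresholds (2, 1, $30k) would come out of the learning algorithm rather than being typed in by hand.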

18 of 65

Example 3: Clustering news

  • Data: Reuters news / Web data
  • Goal: Basic category classification:
    • Business, sports, politics, etc.
    • Classify into subcategories (unspecified)
  • Methodology:
    • Consider “typical words” for each category.
    • Classify using a “distance” measure.
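
A minimal sketch of this methodology, under stated assumptions: the "typical words" below are invented for illustration, and Jaccard distance between word sets is just one possible choice of distance measure (the slide does not name one).

```python
def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 means identical word sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical "typical words" per category, not taken from any real corpus.
TYPICAL_WORDS = {
    "business": {"market", "shares", "profit", "bank", "stocks"},
    "sports": {"match", "team", "goal", "season", "coach"},
    "politics": {"election", "minister", "vote", "parliament", "policy"},
}


def categorize(text: str) -> str:
    """Assign the category whose typical words are closest to the text."""
    words = set(text.lower().split())
    return min(TYPICAL_WORDS,
               key=lambda cat: jaccard_distance(words, TYPICAL_WORDS[cat]))


print(categorize("the team scored a late goal in the match"))  # sports
```

With hand-picked typical words this is closer to classification than to clustering; a clustering method would have to discover the word groups from the news data itself.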

19 of 65

What is Machine Learning?

  • Traditional Programming: Data + Program → Computer → Output
  • Machine Learning: Data + Output → Computer → Program
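
The contrast can be sketched in a few lines. In the traditional case the programmer writes the rule; in the learning case the rule is derived from data plus desired outputs. The temperature readings, the fever labels and the simple threshold learner below are all assumptions made for illustration.

```python
def has_fever_rule(temp_c: float) -> bool:
    """Traditional programming: the programmer hard-codes the rule."""
    return temp_c > 37.5


def learn_threshold(values, labels):
    """Machine learning: pick the threshold t that makes the fewest
    mistakes for the rule `value > t` on the labelled examples.
    Needs at least two distinct values."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(t):
        return sum((v > t) != label for v, label in zip(values, labels))

    return min(candidates, key=errors)


# Data (temperatures) and desired output (fever yes/no), both invented.
temps = [36.5, 36.8, 37.0, 38.2, 38.9, 39.4]
fever = [False, False, False, True, True, True]

threshold = learn_threshold(temps, fever)


def learned_program(temp_c: float) -> bool:
    """The 'program' produced from data + output."""
    return temp_c > threshold


print(learned_program(39.0))  # True
print(learned_program(36.6))  # False
```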

20 of 65

Resources: Datasets

21 of 65

Why Machine Learning?

  • Consider spam filtering as an example. The traditional approach:
    • First, look at what spam typically looks like: words and phrases such as “4U,” “credit card,” “free,” and “amazing”.
    • Then write a detection algorithm for each pattern, and flag an email as spam if a pattern is detected.
    • Test the program, and repeat steps 1 and 2 until it is good enough.
  • Since the problem is not trivial, the program will likely become a long list of complex rules that is hard to maintain.

22 of 65

Why Machine Learning?...

  • ML techniques automatically learn which words and phrases are good predictors of spam by detecting unusually frequent patterns of words in the spam examples compared to the ham examples.
  • The program is much shorter, easier to maintain, and most likely more accurate.
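
A rough sketch of the idea, assuming a tiny invented set of messages: words are ranked by how much more often they appear in spam than in ham. The smoothed frequency ratio used here is one simple scoring choice, not a specific algorithm named in the slides.

```python
from collections import Counter

# Toy messages, invented for illustration only.
SPAM = ["free credit card offer", "amazing free prize 4u", "free amazing offer"]
HAM = ["meeting at noon", "lunch offer from the canteen", "project report attached"]


def document_frequency(messages):
    """Count in how many messages each word appears (once per message)."""
    return Counter(word for msg in messages for word in set(msg.lower().split()))


def learn_spam_words(spam, ham, top_k=2):
    """Rank words by how much more often they occur in spam than in ham."""
    spam_df = document_frequency(spam)
    ham_df = document_frequency(ham)

    def score(word):
        # Add-one smoothing keeps the ratio finite for words never seen in ham.
        p_spam = (spam_df[word] + 1) / (len(spam) + 2)
        p_ham = (ham_df[word] + 1) / (len(ham) + 2)
        return p_spam / p_ham

    return sorted(spam_df, key=score, reverse=True)[:top_k]


print(learn_spam_words(SPAM, HAM))  # ['free', 'amazing']
```

Note that "offer" is not selected even though it is frequent in spam, because it also occurs in ham. No word list was written by hand; changing the examples changes the learned predictors.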

23 of 65

Why Machine Learning?...

  • ML algorithms can be inspected to see what has been learned. For instance, once the spam filter has been trained on enough spam, it can easily be inspected to reveal the list of words and combinations of words that it believes are the best predictors of spam.
  • Sometimes this will reveal unsuspected correlations or new trends, and thereby lead to a better understanding of the problem.

Applying ML techniques to dig into large amounts of data can help discover patterns that were not immediately apparent. This is called data mining.

24 of 65

Why Machine Learning?...

  • No human experts
    • industrial/manufacturing control
    • mass spectrometer analysis, drug design, astronomic discovery
  • Black-box human expertise
    • face/handwriting/speech recognition
    • driving a car, flying a plane
  • Rapidly changing phenomena
    • credit scoring, financial modeling
    • diagnosis, fraud detection
  • Need for customization/personalization
    • personalized news reader
    • movie/book recommendation

25 of 65

Benefit of ML over Rule Based

  • Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one ML algorithm can often simplify code and perform better.
  • Complex problems for which there is no good solution at all using a traditional approach: the best ML techniques can find a solution.
  • Fluctuating environments: an ML system can adapt to new data.
  • Getting insights about complex problems and large amounts of data.

26 of 65

Applications

  • Traffic Alerts
  • Image Recognition
  • Video Surveillance
  • Sentiment Analysis
  • Product Recommendation
  • Online support using Chatbots
  • Google Translate
  • Online Video Streaming Applications
  • Virtual Professional Assistants
  • Machine Learning Usage in Social Media
  • Stock Market Signals Using Machine Learning
  • Auto-Driven Cars
  • Fraud Detection

27 of 65

Related Fields

Machine learning is primarily concerned with the accuracy and effectiveness of the computer system.

  • Fields related to machine learning: psychological models, data mining, cognitive science, decision theory, information theory, databases, neuroscience, statistics, evolutionary models, control theory

28 of 65

Machine Learning System

[Figure: machine learning system. Blocks: Training Set, Feature Extraction, Machine Learning Algorithm (Supervised / Unsupervised), Annotated Data, Grouping of Objects, New Data, Predictive Model]

29 of 65

Machine Learning System

30 of 65

Machine Learning in a Nutshell

  • There are tens of thousands of machine learning algorithms.
  • Hundreds of new ones appear every year.
  • Every machine learning algorithm has three components:

    • Representation
    • Evaluation
    • Optimization

31 of 65

Representation

  • Decision trees
  • Sets of rules / Logic programs
  • Instances
  • Graphical models (Bayes/Markov nets)
  • Neural networks
  • Support vector machines
  • Model ensembles
  • Etc.

32 of 65

Evaluation

  • Confusion Matrix
  • Accuracy
  • Recall/Sensitivity/True Positive Rate
  • Specificity
  • Error Rate
  • ROC
  • Squared error
  • Likelihood
  • Posterior probability

  • Cost / Utility
  • Margin
  • F-Score
  • etc.
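
Several of the measures listed above follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions for a binary problem; the function names and the toy labels are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Compute common evaluation measures from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero for empty groups.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # sensitivity / true positive rate
    return {
        "accuracy": ratio(tp + tn, total),
        "error_rate": ratio(fp + fn, total),
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels: 3 true positives, 1 false positive, 3 true negatives, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(evaluate(y_true, y_pred))
```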

33 of 65

Optimization

  • Combinatorial optimization
    • E.g.: Greedy search
    • Finding an optimal object from a finite set of objects
  • Convex optimization
    • E.g.: Gradient descent
    • Finding the minimum of a function.
  • Constrained optimization
    • E.g.: Linear programming
    • Optimizing an objective function with respect to some variables in the presence of constraints on those variables
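
As an illustration of the convex case, here is a minimal gradient descent sketch. The objective f(x) = (x - 3)^2, the learning rate and the number of steps are arbitrary choices made for this example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Example objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# The minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # very close to 3
```

Each step shrinks the distance to the minimum by a constant factor here (0.8), so 100 steps are more than enough. A learning rate that is too large would make the iterates diverge instead.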

34 of 65

Examples of Machine Learning Problems

  • Pattern Recognition
    • Facial identities or facial expressions
    • Handwritten or spoken words (e.g., Siri)
    • Medical images
    • Sensor Data/IoT
  • Optimization
    • Many parameters have “hidden” relationships that can be the basis of optimization
  • Pattern Generation
    • Generating images or motion sequences
  • Anomaly Detection
    • Unusual patterns in the telemetry from physical and/or virtual plants (e.g., data centers)
    • Unusual sequences of credit card transactions
    • Unusual patterns of sensor data from a nuclear power plant
      • or unusual sound in your car engine or …
  • Prediction
    • Future stock prices or currency exchange rates

35 of 65

Web-based Examples of ML

  • Web data is huge, and tasks that have to be performed on very big datasets often use ML,
    • especially if the data is noisy or non-stationary.
  • Spam filtering, fraud detection:
    • The enemy adapts so we must adapt too.
  • Recommendation systems:
    • Lots of noisy data. Million dollar prize!
  • Information retrieval:
    • Find documents or images with similar content.
  • Data Visualization:
    • Display a huge database in a revealing way

36 of 65

Domain of ML

37 of 65

Types of Learning

  • Supervised (inductive) learning
    • Training data includes desired outputs
  • Unsupervised learning
    • Training data does not include desired outputs
  • Semi-supervised learning
    • Training data includes a few desired outputs
  • Reinforcement learning
    • Rewards from sequence of actions

38 of 65

Inductive Learning

  • Learner discovers rules by observing examples
  • Given examples of a function (X, F(X))
  • Predict function F(X) for new examples X
    • Discrete F(X): Classification
    • Continuous F(X): Regression
    • F(X) = Probability(X): Probability estimation
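
A small sketch of learning from examples (X, F(X)): a 1-nearest-neighbour learner, used here only as one simple instance-based method. The same code does classification when F(X) is discrete and regression when F(X) is continuous. The example data are invented.

```python
def nearest_neighbour_predict(examples, x_new):
    """Predict F(x_new) as the F(x) of the closest observed example.

    `examples` is a list of (x, f_x) pairs with numeric x.
    """
    _, closest_fx = min(examples, key=lambda pair: abs(pair[0] - x_new))
    return closest_fx


# Discrete F(X): classification (height in cm -> label).
height_examples = [(150, "short"), (160, "short"), (180, "tall"), (190, "tall")]
print(nearest_neighbour_predict(height_examples, 178))  # tall

# Continuous F(X): regression.
curve_examples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
print(nearest_neighbour_predict(curve_examples, 2.2))  # 4.1
```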

39 of 65

Learning Algorithms

Supervised learning

Unsupervised learning

Semi-supervised learning

40 of 65

Machine learning structure

  • Supervised learning

41 of 65

Supervised Learning

42 of 65

E.g. Supervised Learning

43 of 65

E.g. Supervised Learning

Document Classifier

44 of 65

Spectrum of Supervision

Unsupervised

“Weakly” supervised

Fully supervised

Definition depends on task

Slide credit: L. Lazebnik

45 of 65

Machine learning structure

  • Unsupervised learning

46 of 65

Unsupervised Learning

47 of 65

E.g. Unsupervised Learning

48 of 65

Reinforcement Learning

49 of 65

Reinforcement Learning

50 of 65

Reinforcement Learning

51 of 65

Reinforcement Learning

52 of 65

E.g. Reinforcement Learning

53 of 65

Why is Machine Learning Hard?

54 of 65

What We’ll Cover

  • Fundamentals of Linear Algebra and Probability
  • Supervised learning
    • Linear Regression
    • Logistic Regression
    • Decision tree induction
    • Instance-based learning
    • Bayesian learning
    • Neural networks
    • Support vector machines
    • Model ensembles
  • Unsupervised learning
    • Clustering
    • Dimensionality reduction
  • Reinforcement Learning

55 of 65

Data Representation

  • Information systems:
    • They represent knowledge derived from raw data, which is used for decision making.
  • Data warehousing:
    • It provides integrated, consistent and cleaned data to machine learning algorithms.
  • Data table:
    • It is used to represent information.

56 of 65

DATA TABLE

  • Each row represents a measurement/observation, and each column gives the value of one attribute of the information system for all measurements/observations.
  • Rows are variously called instances, examples, samples, measurements, observations, records, patterns, objects, cases or events.
  • Similarly, columns are called attributes or features.

57 of 65

E.G. DATA TABLE

  • Consider patient information in a data table.
  • Features and attributes: Headache, Muscle-Pain, Temperature. These attributes are represented in linguistic form.

Patient  Headache  Muscle Pain  Temperature  Flu
1        NO        YES          HIGH         YES
2        YES       YES          HIGH         YES
3        YES       YES          VERY HIGH    YES
4        NO        YES          NORMAL       NO
5        YES       NO           HIGH         NO
6        NO        YES          VERY HIGH    YES

58 of 65

E.G. DATA TABLE

  • In directed/supervised learning, the outcome for each observation is known a priori.
  • Decision attribute: one distinguished attribute that represents knowledge. An information system of this kind is called a decision system.
  • E.g. ‘Flu’ is the decision attribute, with values {Flu: Yes} and {Flu: No}.
  • Flu is a decision attribute with respect to the condition attributes: headache, muscle pain, temperature.
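
One possible way to hold such a decision system in code is a list of rows, split into condition attributes (inputs) and the decision attribute (desired output). The values are those of the patient table; the variable and function names are chosen for illustration.

```python
CONDITION_ATTRIBUTES = ("Headache", "Muscle Pain", "Temperature")
DECISION_ATTRIBUTE = "Flu"

# Each dict is one row (observation); each key is one column (attribute).
DATA_TABLE = [
    {"Patient": 1, "Headache": "NO", "Muscle Pain": "YES", "Temperature": "HIGH", "Flu": "YES"},
    {"Patient": 2, "Headache": "YES", "Muscle Pain": "YES", "Temperature": "HIGH", "Flu": "YES"},
    {"Patient": 3, "Headache": "YES", "Muscle Pain": "YES", "Temperature": "VERY HIGH", "Flu": "YES"},
    {"Patient": 4, "Headache": "NO", "Muscle Pain": "YES", "Temperature": "NORMAL", "Flu": "NO"},
    {"Patient": 5, "Headache": "YES", "Muscle Pain": "NO", "Temperature": "HIGH", "Flu": "NO"},
    {"Patient": 6, "Headache": "NO", "Muscle Pain": "YES", "Temperature": "VERY HIGH", "Flu": "YES"},
]


def split_decision_system(rows):
    """Separate condition attributes (X) from the decision attribute (y)."""
    X = [tuple(row[name] for name in CONDITION_ATTRIBUTES) for row in rows]
    y = [row[DECISION_ATTRIBUTE] for row in rows]
    return X, y


X, y = split_decision_system(DATA_TABLE)
print(X[0], y[0])  # ('NO', 'YES', 'HIGH') YES
```

A supervised learner would be trained on the pairs (X, y); the Patient column is only an identifier and is left out of X.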

59 of 65

E.G. DATA TABLE

60 of 65

E.G. DATA TABLE

61 of 65

DATA REPRESENTATION

62 of 65

DATA REPRESENTATION

63 of 65

DATA REPRESENTATION

64 of 65

References

  • Gopal, M. 2019. Applied Machine Learning. 1st ed. New York: McGraw-Hill Education.
  • https://towardsdatascience.com/introduction-to-machine-learning-for-beginners-eed6024fdb08

65 of 65

Thank You
