1 of 82

CSE 5523: Machine Learning (ML)

2 of 82

Course information

  • Course website:

https://sites.google.com/view/osu-cse-5523-au24-chao/

(for course information, weekly schedule, and reading update)

  • Instructor:

Dr. Wei-Lun (Harry) Chao (chao.209@osu.edu), Office: DL 587

Assistant professor in CSE (PhD: USC; Postdoc: Cornell)

  • TA:

Tai-Yu Pan (pan.667)

3 of 82

A bit about me

Machine learning and its applications to

  • Autonomous driving
  • Computer vision
  • Natural language processing
  • Health care
  • Imageomics

[Figure label: Pancreatic cancer]

4 of 82

A bit about me

Learning with imperfect data sources

  • Limited data
  • Imbalanced data
  • Inaccessible data
  • Domain shifts

[Figure: driving datasets: KITTI (Germany), Argoverse (USA), nuScenes (USA, Singapore), Lyft (USA), Waymo (USA)]

5 of 82

A bit about me

6 of 82

Course information

  • Lecture time: Tuesday and Thursday, 2:20 PM - 3:40 PM

  • Office hours: TBA, DL587
    • No office hours the first week

  • TA Office hours: TBA, BE406
    • No office hours the first week

7 of 82

Course information

  • Carmen/GitHub:
    • For announcements, posting course materials (slides), and homework submission

  • Piazza:
    • For discussion. Please register!
    • Link: TBA
    • Please use name.#@osu.edu
    • Access code: osu-cse-5523-AU24-chao

  • Detailed syllabus (pdf):
    • can be found on Carmen and the course website

8 of 82

Communications

  • Schedule and reading will be updated on the website
  • Announcements will be made through Carmen
  • Discussions and questions must be posted in Piazza
  • Please only use email to contact me or the TA for urgent or personal issues. Please include the tag "[OSU-CSE-5523]" in the subject line.
  • More details: See website, Carmen, and the syllabus

9 of 82

Questions?

10 of 82

Grading and homework

 Grading (tentative)

  • Homework – 60%
  • Midterm – 20%
  • Final (Dec. 6, 4 pm) – 20%
  • The midterm or final exam may be replaced by a project. The midterm and final exams may be re-distributed into three exams. The final exam is, by default, cumulative.

Guidelines

  • Expect 6 homework assignments (including problem sets and programming sets)
    • Solutions involve derivations or proofs. Grading is based on correctness and clarity. Be concise and show your reasoning in a clear and precise way.
    • Homework must be completed and submitted individually, but feel free to discuss. You must strictly follow the submission instructions.
    • NOT ALLOWED: asking for or searching for solutions
    • No late submissions are accepted.

11 of 82

Tentative schedule

Homework

  • Dates: TBA
  • You will have two weeks to complete each homework
  • Due at 23:59 ET

Exams

  • Midterm date(s): TBA
  • Final exam: 12/6/2024

12 of 82

Policy

Academic integrity

  • Plagiarism and other unacceptable violations
    • Zero tolerance
  • Please study the related sections in the syllabus (pdf) on academic integrity.
  • Please read OAA’s message on large language models: https://oaa.osu.edu/artificial-intelligence-and-academic-integrity

(Re-)grading

  • Only factual errors will be corrected.
  • Request: within one week of the release of your homework or exam grade
  • Format: TBA

13 of 82

Pre-requisites & what to expect?

  • Pre-requisites (especially for graduate students)
    • Linear algebra: Math 2568, 2174, 4568, or 5520H
    • Artificial intelligence: 3521, 5521, or 5243
    • Statistics and probability: 5522, Stat 3460, or 3470
    • Decent degree of mathematical sophistication
    • Knowledge of programming, algorithm design, and data structures
  • Extensive math and programming related homework
    • Multivariate calculus, linear algebra, and probability
    • Python 3
    • Latex for writing (https://www.overleaf.com/)
  • ML algorithms are often difficult to debug
    • We strongly recommend that you start early.

14 of 82

SEI comments

  • Very hard class in my opinion …
  • This is by far the hardest class I have ever taken at OSU …
  • I did find it very fast-paced, computationally intensive, and difficult.
  • I thought the homeworks were challenging and very fun to solve.
  • That being said, this was a difficult class. It's very math heavy, but I expected that.
  • The problem sets were really tough.

15 of 82

Textbook

  • Not required, but for students who want to read more, we recommend
    • Pattern Recognition and Machine Learning
    • Machine Learning: A Probabilistic Perspective
    • Warning: Not course textbooks, so our notation/presentation does not necessarily follow the book.
    • See Carmen for PDF links.

16 of 82

Textbook

  • Not required, but for students who want to read more, we recommend
    • Deep Learning
    • Understanding Machine Learning
    • Warning: Not course textbooks, so our notation/presentation does not necessarily follow the book.
    • See Carmen for PDF links.

17 of 82

Other great textbooks

  • Learning from Data
  • Foundations of Machine Learning
  • Machine Learning: A Bayesian and Optimization Perspective
  • The Elements of Statistical Learning
  • Machine Learning Refined
  • Introduction to Machine Learning

18 of 82

Other great textbooks

  • Deep Learning: Foundations and Concepts
  • Understanding Deep Learning
  • Dive into Deep Learning

19 of 82

Other excellent ML courses

  • Stanford CS 229: http://cs229.stanford.edu/
    • Stanford CS 231n (deep learning for computer vision): http://cs231n.stanford.edu/

  • Cornell ML: https://www.cs.cornell.edu/courses/cs4780/2018fa/

  • NTU ML: https://www.csie.ntu.edu.tw/~htlin/course/ml10fall/

  • It is hard for any single machine learning course to be comprehensive and unified
    • Theoretical ML, Bayesian and probabilistic ML, optimization for ML, deep learning, online learning, reinforcement learning, ML for X, …
    • Even the basic ML courses can be very different

20 of 82

Important for this week

  • Register. If you are on the waitlist, whether you get in depends on how many seats are open and how many students drop.
  • Register for the class on Piazza (see Carmen for the link) --- our main platform for discussion and communication
  • Math review/self-diagnostic: do Homework #0 and check the suggested materials on the website (e.g., linear algebra slides) --- extremely important for checking your readiness for the course
  • Python: check suggested tutorials on the website
  • Decision: stay or drop

  • Office hours: start next week

21 of 82

How to do well? How to learn ML well?

  • Lecture and lecture slides for basics
    • Describe basic concepts, tools
    • Describe algorithms and their development with intuition and rigor

  • Textbook reading for completeness and extension
  • Homework for practice, generalization, and implementation
  • Discussion (Piazza, office hours) for further understanding

  • Overall: Develop skills in grasping abstract concepts and thinking critically to solve problems with ML techniques

22 of 82

Important dates

  • Homework 1 will be released at the end of week 2 or week 3. You will have 2 weeks to complete each homework.

  • Midterm: in class (date: TBA)
  • Final: in class (date: 12/6, following the university’s schedule)

  • Online (or pre-recorded) teaching or guest lectures
    • For some weeks, I may be traveling, such as 9/10

23 of 82

Questions?

24 of 82

Today

Introduction

  • What is machine learning?

Machine learning setup

  • Training/testing
  • Supervised/unsupervised
  • Classification/regression

Course overview and math showtime

25 of 82

What is machine learning?

26 of 82

What is machine learning?

A set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty

Kevin Murphy. Machine learning: a probabilistic perspective.

The term machine learning refers to the automated detection of meaningful patterns in data

S. Shalev-Shwartz and S. Ben-David. Understanding machine learning.

28 of 82

What is machine learning?

This book is about learning from data.

Sergios Theodoridis. Machine learning: a Bayesian and optimization perspective.

We choose the title “learning from data” that faithfully describes what the subject is about.

Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.

29 of 82

Example: coin classifier

  • The model (e.g., linear boundaries) captures the general trend
  • The model may miss some noisy examples (outliers)

Patterns in the space of weight and size

[Figure credit: Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.]

30 of 82

Other examples

  • Spam Detection
    • Data: Set of emails, each labeled: Spam, or Not Spam.
    • Model & Pattern: Prediction rule (i.e., classifier) to classify emails

  • Image classification
    • Data: Set of images, each labeled with object class names (e.g., dog, cat, car, …)
    • Model & Pattern: Prediction rule (i.e., classifier) to classify images

31 of 82

Key ingredients

  • Data: collected from past observations (we often call them training data)

  • Modeling: devised to capture the patterns (or knowledge) in the data
    • The model does not have to be true --- if it is close, it is useful
    • We should tolerate randomness and mistakes --- many interesting things are stochastic by nature.

  • Prediction: apply the model to forecast what is going to happen in the future

32 of 82

Memorization vs. generalization

[Figure: Memorization. A model maps each training input to its label (training data: cat, cat, dog), but its output on test data is unknown (?).]

33 of 82

Memorization vs. generalization

[Figure: Generalization. A model (agent function) learned from training data (labeled cat and dog images) maps a test input to its label (cat).]

The detected patterns should be able to generalize to future test instances.
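A minimal Python sketch of the contrast, assuming made-up 2-D features and using a nearest-neighbor rule as a stand-in for a model that generalizes (the slides do not name a specific model here). The lookup table answers only for inputs it has already seen, while the nearest-neighbor rule also answers for a new input.

```python
# Toy training data: (feature vector) -> label. All numbers are made up for illustration.
train = [((1.0, 1.0), "cat"), ((1.2, 0.9), "cat"), ((3.0, 3.2), "dog")]


def memorize_predict(x):
    """Pure memorization: look the input up in the training set."""
    table = {features: label for features, label in train}
    return table.get(x)  # None when x was never seen during training


def nearest_neighbor_predict(x):
    """Generalization (one simple form): copy the label of the closest training example."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(train, key=lambda pair: sq_dist(pair[0], x))
    return label


test_input = (1.1, 1.05)  # close to the cats, but not in the training set
print(memorize_predict(test_input))          # no stored answer for an unseen input
print(nearest_neighbor_predict(test_input))  # label of the nearest training example
```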

34 of 82

Learning algorithms

  • How is ML different from traditional programming?
    • Providing programs with the ability to “learn” and adapt to data on their own.

[Figure: Training data (experience, past observations) → Learning algorithms → Learned models & patterns (knowledge, expertise)]

35 of 82

ML is everywhere!

  • Search engines, recommendation systems, ChatGPT
  • Email spam detection, fraud detection in credit cards
  • Personal assistants on smartphones, face detection in digital cameras
  • Navigation, military applications, medicine, bioinformatics, astronomy, ...

  • Important research/application areas:
    • Speech processing
    • Natural language processing
    • Computer vision
    • Autonomous driving
    • ……

36 of 82

Speech Processing & Natural Language Processing

  • Speech technologies (e.g., Siri, Alexa)
    • Automatic speech recognition (ASR)
    • Text-to-speech synthesis (TTS)
    • Dialog systems

  • Language processing technologies
    • Question answering (e.g., ChatGPT)
    • Machine translation

    • Web search
    • Text classification, spam filtering, etc…
    • Large language models (LLM)

37 of 82

Computer Vision

[Source: Detectron2]

  • Object and face recognition
  • Scene segmentation
  • Image classification

[Source: Graham Murdoch/Popular Science]

38 of 82

Autonomous Driving

[Figure: autonomous driving pipeline. Sensors: camera, LiDAR, radar, sonar, others. Stages: perception; prediction & planning; action & decision.]

39 of 82

Questions?

40 of 82

Machine learning setup

Training/testing

Supervised/unsupervised

Classification/regression

41 of 82

Today

Introduction

  • What is machine learning?

Machine learning setup

  • Training/testing
  • Supervised/unsupervised
  • Classification/regression

  • Course overview and math showtime

42 of 82

“Supervised” ML pipeline

[Figure: Training data → Learning algorithms → Learned models & patterns]

[Figure credit: Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.]

43 of 82

“Supervised” ML pipeline

[Figure: Training data → Learning algorithms → Learned models & patterns, which are then applied to test data]

[Figure credit: Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.]

44 of 82

How to know if the model works well or not?

[Figure: Training data → Learning algorithms → Learned models & patterns. The model's output on test data is compared with the ground truth (from an oracle); the difference is the loss.]

[Figure credit: Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.]

45 of 82

Caution! Must be step-by-step!

[Figure: the training and evaluation diagram (training data, learning algorithms, learned models & patterns, test data, oracle and ground truth, difference/loss), with its steps numbered 1, 2, 3.]

[Figure credit: Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.]

46 of 82

Caution! Must be step-by-step!

[Figure: the training and evaluation diagram (training data, learning algorithms, learned models & patterns, test data, oracle and ground truth, difference/loss).]

Pretend you don’t know!

[Figure credit: Y. Abu-Mostafa, M. Magdon-Ismail, H-T Lin. Learning from data.]
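One plausible reading of the step-by-step order, sketched in Python: (1) learn from the training data only, (2) predict on test inputs without looking at their labels, (3) compare the predictions with the ground truth to get the loss. The threshold model, the data, and the choice of 0-1 loss are assumptions made for illustration.

```python
def fit_threshold(train):
    """Step 1: learn from training data only (midpoint between the two class means)."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    """Predict class 1 when the input exceeds the learned threshold."""
    return 1 if x > threshold else 0


def zero_one_loss(predictions, truths):
    """Fraction of predictions that differ from the ground truth."""
    return sum(p != t for p, t in zip(predictions, truths)) / len(truths)


# Made-up 1-D data: (input, label).
train = [(1.0, 0), (2.0, 0), (6.0, 1), (7.0, 1)]
test = [(1.5, 0), (3.0, 0), (5.0, 1), (4.2, 0)]

threshold = fit_threshold(train)                                   # step 1
test_predictions = [predict(threshold, x) for x, _ in test]        # step 2: labels stay hidden
test_loss = zero_one_loss(test_predictions, [y for _, y in test])  # step 3
print(threshold, test_predictions, test_loss)
```

The test labels are read only in step 3, which is the point of "pretend you don't know".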

47 of 82

Test data

  • Should be:
    • Unseen in the learning/training phase
    • Disjoint from the training data
    • BUT distributionally similar to the training data

  • Analogy:
    • Exam questions (test data) will not be the same as the questions in homework, practice exams, or textbook exercises (training data).
    • The style of and the content covered in the exam questions may be similar to them.
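A minimal sketch of how such a split is often produced in practice, assuming all examples come from one pool: shuffle once, then cut. The function name `train_test_split` and the 25% test fraction are choices made for this illustration, not part of the course material.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle once, then cut into two disjoint lists (train, test)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


examples = [(i, i % 2) for i in range(20)]  # hypothetical (input, label) pairs
train_set, test_set = train_test_split(examples)

# Disjoint, and together they cover every example exactly once.
print(len(train_set), len(test_set), set(train_set).isdisjoint(test_set))
```

Because both parts are drawn from the same shuffled pool, they are disjoint yet follow the same distribution.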

48 of 82

Today

Introduction

  • What is machine learning?

Machine learning setup

  • Training/testing
  • Supervised/unsupervised
  • Classification/regression

Course overview and math showtime

49 of 82

Different flavors of ML problems

50 of 82

Data and features

Time index

Machine learning (ML) is the study of computer algorithms that improve automatically through experience. ..

51 of 82

Supervised learning

[Figure: labeled examples: laptop, camera, bike]

52 of 82

Unsupervised learning

53 of 82

Unsupervised learning

  • Clustering:

  • Distribution/Density estimation
    • Can sample from the distribution to generate new data
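A small sketch of density estimation followed by sampling, assuming 1-D data and a Gaussian model (both are assumptions for illustration; the data values are made up).

```python
import random
import statistics

data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7]  # made-up 1-D observations

# Density estimation: fit a Gaussian by estimating its mean and standard deviation.
mu = statistics.fmean(data)
sigma = statistics.pstdev(data)  # population form, i.e. the maximum-likelihood estimate

# Generation: draw new data points from the fitted distribution.
rng = random.Random(0)
samples = [rng.gauss(mu, sigma) for _ in range(5)]
print(mu, sigma, samples)
```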

54 of 82

Unsupervised learning

  • Generative models
    • After seeing many data examples, can the model generate a new one?
    • Example: Google bedroom image search (real images)

55 of 82

Unsupervised learning

  • Generative models
    • After seeing many data examples, can the model generate a new one?
    • Example: Machine generated images [Radford et al., ICLR 2016]

56 of 82

Unsupervised learning

  • How to know if the model works well? Do we need separate test data?
    • Clustering, maybe NO

Evaluation 1: Visualization

Evaluation 2: Hide and compare

57 of 82

Unsupervised learning

  • How to know if the model works well? Do we need separate test data?
    • Density estimation, usually YES

    • vs. clustering: hard to visualize a distribution or know the “ground truth” distribution
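One common way to use held-out data here is to score the fitted density by its average log-likelihood on the test points. The sketch below assumes a 1-D Gaussian model and made-up numbers; `gaussian_log_pdf` is a helper written for this example.

```python
import math
import statistics


def gaussian_log_pdf(x, mu, sigma):
    """Log density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


train = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7]  # used to fit the density
test = [5.0, 4.9, 5.4]                       # held out, used only for evaluation

mu = statistics.fmean(train)
sigma = statistics.pstdev(train)

# Higher average log-likelihood on held-out data means a better density estimate.
avg_test_log_likelihood = statistics.fmean([gaussian_log_pdf(x, mu, sigma) for x in test])
print(avg_test_log_likelihood)
```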

[Figure: training data and test data]

58 of 82

Questions? (We will focus mainly on supervised learning for the first half of the semester!)

59 of 82

Today

Introduction

  • What is machine learning?

Machine learning setup

  • Training/testing
  • Supervised/unsupervised
  • Classification/regression

Course overview and math showtime

60 of 82

Classification vs. regression

  • Classification
    • Form: Feature vector → Desired category (integer number index)
    • Ex: 1: Cat, 2: Dog, 3: Horse, ……, 1000: Car

  • Regression (curve fitting)
    • Form: Feature vector → Desired real number (e.g., price, chance, etc.)

Machine learning: capture patterns from training data

that can be generalized to future unseen data
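The two forms above can be sketched in Python as follows. The regression part fits a line by ordinary least squares on made-up distance/price pairs; the classification part uses a hand-set, hypothetical linear boundary (not a learned one) and returns an integer category index.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ w * x + b with a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - w * mean_x
    return w, b


# Regression: feature (distance) -> real number (price).
# Made-up points that lie exactly on price = 2 * distance + 1.
w, b = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
predicted_price = w * 5.0 + b

# Classification: feature vector -> integer category index.
CLASS_NAMES = {1: "Buy", 2: "Not"}


def classify(years_in_use, miles):
    """Hypothetical linear boundary, chosen by hand for illustration."""
    score = 10.0 - years_in_use - miles / 20000.0
    return 1 if score > 0 else 2


print(predicted_price, CLASS_NAMES[classify(3, 40000)], CLASS_NAMES[classify(8, 120000)])
```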

61 of 82

Classification vs. regression: training data

Regression (bus ticket): from x (distance), predict y (price)

Classification (car buying company): from x (x[1]: years in use, x[2]: miles), predict y (Buy or Not)

This does NOT mean the input data are always 1-D or 2-D!

62 of 82

Classification vs. regression: find patterns

Regression (bus ticket): from x (distance), predict y (price). Pattern: linear relationship

Classification (car buying company): from x (x[1]: years in use, x[2]: miles), predict y (Buy or Not). Pattern: linear boundary

63 of 82

Classification vs. regression: generalization

[Figure: the bus-ticket regression plot (x: distance, y: price) with its linear relationship, and the car-buying classification plot (x[1]: years in use, x[2]: miles; y: Buy or Not) with its linear boundary]

64 of 82

Classification vs. regression: generalization?

[Figure: the bus-ticket regression plot (x: distance, y: price) with its linear relationship, and the car-buying classification plot (x[1]: years in use, x[2]: miles; y: Buy or Not) with its linear boundary; two "?" marks appear on the plots]

65 of 82

Questions to ML

  • What assumptions do we need for learning to be possible?
    • Data and Models

  • Given a fixed model type (e.g., linear boundaries), how can the machine output the “right” model?

  • How many “training” data samples are needed to ensure that the output model will generalize well to the unseen test data?
  • What is the practice of solving an ML problem?

66 of 82

Today

Introduction

  • What is machine learning?

Machine learning setup

  • Training/testing
  • Supervised/unsupervised
  • Classification/regression

Course overview and math showtime

67 of 82

Course overview

68 of 82

Topics

  • Supervised learning
    • Regression
    • Classification (nearest neighbors, linear models, SVM)
    • Probabilistic approaches
    • Ensemble approaches
    • Kernel methods
    • Optimization
  • Machine learning foundation, theories, practice
    • Empirical risk minimization and regularization
    • PAC learning and VC dimension
    • Bias and variance, over-fitting vs under-fitting
    • Debugging

69 of 82

Topics

  • Unsupervised learning
    • K-means clustering
    • GMM and EM
    • Dimensionality reduction and metric learning (supervised or unsupervised)
  • Neural networks and deep learning
    • Backpropagation and optimization
    • MLP
    • CNN
    • Transformer
  • Optional: Advanced ML paradigms or approaches
    • Generative models: VAE, GAN, diffusion models
    • Domain adaptation, semi-supervised learning

70 of 82

Math showtime

71 of 82

What you will see … (ELBO)

Probability, inequality, …
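For reference, one standard form of the evidence lower bound (ELBO), obtained with Jensen's inequality. The symbols (observation x, latent variable z, model p_θ, and an arbitrary distribution q over z) are assumptions and may differ from the lecture's notation.

```latex
\log p_\theta(x)
  = \log \int q(z)\, \frac{p_\theta(x, z)}{q(z)}\, dz
  \;\ge\; \int q(z) \log \frac{p_\theta(x, z)}{q(z)}\, dz
  = \mathbb{E}_{q(z)}\big[\log p_\theta(x, z)\big] - \mathbb{E}_{q(z)}\big[\log q(z)\big]
```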

72 of 82

What you will see … (logistic loss)

Loss function, log, …
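For reference, one standard form of the logistic loss for a linear model is log(1 + exp(-y · wᵀx)) with labels y in {-1, +1}. The label convention and the symbols w, x, y are assumptions and may differ from the lecture's notation. A small Python sketch:

```python
import math


def logistic_loss(w, x, y):
    """log(1 + exp(-y * w·x)) for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # Numerically stable evaluation: never call exp on a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


w = [1.0, 0.0]
x = [5.0, 0.0]
print(logistic_loss(w, x, +1))  # small loss: confident and correct
print(logistic_loss(w, x, -1))  # large loss: confident and wrong
```

The loss is small when the prediction agrees with the label by a large margin and grows roughly linearly as the prediction becomes confidently wrong.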

73 of 82

What you will see … (covariance matrix)

Linear algebra, …
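For reference, one standard definition of the covariance matrix of N data points x_n in R^D. The notation and the 1/N normalization are assumptions; some texts use 1/(N-1) instead.

```latex
\mu = \frac{1}{N}\sum_{n=1}^{N} x_n, \qquad
\Sigma = \frac{1}{N}\sum_{n=1}^{N} (x_n - \mu)(x_n - \mu)^\top \in \mathbb{R}^{D \times D}
```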

74 of 82

Math in ML

  • A common language/tool to
    • represent and compare ML problems, ideas, and concepts
    • design ML algorithms and derive solutions
    • prove and understand ML theories
  • Different textbooks may use different notations to present the same thing. Thus, it is more important to know the underlying concepts than to memorize equations.

  • Math in this course
    • Expectation for now: You can “read” math equations.
    • The course will help you “understand” math equations.
    • Example:

75 of 82

Math quiz: linear algebra-1

  • Matrix inverse, eigenvalues, and eigenvectors
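A self-check sketch for these three concepts on a 2×2 symmetric matrix, in plain Python. The matrix A and the helper names are made up for illustration; this is not the quiz question itself.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]  # made-up symmetric matrix


def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def eigenvalues_symmetric_2x2(m):
    """Roots of the characteristic polynomial: lambda^2 - trace * lambda + det = 0."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace ** 2 - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


print(inverse_2x2(A))
print(eigenvalues_symmetric_2x2(A))
# An eigenvector v satisfies A v = lambda v: here [1, 1] pairs with 3 and [1, -1] with 1.
print(matvec(A, [1.0, 1.0]), matvec(A, [1.0, -1.0]))
```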

76 of 82

Math quiz: linear algebra-2

  • Positive definite and distance
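One way these two ideas connect, as a sketch: a symmetric positive definite matrix M defines a distance d(x, y) = sqrt((x - y)ᵀ M (x - y)). Whether this is the exact quiz question is an assumption; the matrices and helper names below are made up for illustration.

```python
import math


def is_positive_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2x2 matrix: both leading minors are positive."""
    (a, b), (c, d) = m
    return b == c and a > 0 and a * d - b * c > 0


def quadratic_form(m, v):
    """Compute v^T M v for a 2-D vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


def distance(m, x, y):
    """sqrt((x - y)^T M (x - y)); a valid distance when M is positive definite."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    return math.sqrt(quadratic_form(m, diff))


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]      # gives the ordinary Euclidean distance
M = [[2.0, 1.0], [1.0, 2.0]]             # positive definite
NOT_PD = [[1.0, 2.0], [2.0, 1.0]]        # symmetric but not positive definite

print(is_positive_definite_2x2(M), is_positive_definite_2x2(NOT_PD))
print(distance(IDENTITY, (0.0, 0.0), (3.0, 4.0)))
print(quadratic_form(NOT_PD, (1.0, -1.0)))  # negative, so no square root exists
```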

78 of 82

Math quiz: probability

  • Basics, conditional probability, and chain rule
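A self-check sketch for conditional probability and the chain rule P(A, B) = P(B | A) P(A), using exact fractions. The joint distribution over weather and traffic is hypothetical.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(weather, traffic); the entries sum to 1.
joint = {
    ("rain", "jam"): F(3, 10), ("rain", "clear"): F(1, 10),
    ("sun", "jam"): F(1, 5), ("sun", "clear"): F(2, 5),
}


def marginal_weather(w):
    """P(weather = w), obtained by summing the joint over traffic."""
    return sum(p for (wi, _), p in joint.items() if wi == w)


def conditional_traffic_given_weather(t, w):
    """P(traffic = t | weather = w) = P(w, t) / P(w)."""
    return joint[(w, t)] / marginal_weather(w)


# Chain rule check: P(w, t) = P(t | w) * P(w) for every entry.
for (w, t), p in joint.items():
    assert conditional_traffic_given_weather(t, w) * marginal_weather(w) == p

print(marginal_weather("rain"), conditional_traffic_given_weather("jam", "rain"))
```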

80 of 82

Math quiz: multivariate calculus

82 of 82

Summary

  • Machine learning
    • Learning from “training” data
    • Applying what is learned to future/other “test” data
    • Data, features, labels

  • Two important paradigms
    • Supervised vs. unsupervised learning

  • TODO
    • See the beginning slides and the course website for suggested reading
    • Background review: math and programming
