1 of 136

Machine Learning

Prof. Seungtaek Choi

2 of 136

Do You Know Nano Banana? (1)

  • Google’s image generation AI (namely Gemini 2.5 Flash Image)

Prompt: Show me a man who is 33 years old, actually a professor at Hankuk University of Foreign Studies, and give a lecture about Linear Regression.

3 of 136

Do You Know Nano Banana? (2)

Nano Banana + Kling

4 of 136

Do You Know Nano Banana? (3)

5 of 136

Last Time

  • Course Overview
  • Introduction to AI

6 of 136

Today

  • Announcement: 1st Assignment!
  • Git/GitHub Basics
  • Introduction to ML
    • Definition
    • Supervised Learning vs. Unsupervised Learning
    • Classification, Regression, Dimension Reduction, Clustering, Model Selection
  • Linear Regression
  • Gradient Descent Algorithm

7 of 136

1st Assignment!

8 of 136

Git/GitHub Basics (Or, How to Submit Assignment)

9 of 136

What is Git?

  • Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

10 of 136

What is GitHub?

  • GitHub is a cloud-based platform built on the "Git" version control system that provides tools for developers to store, manage, share, and collaborate on code and other files.

11 of 136

Git & GitHub?

  • git: version control & code management (local)
  • github: code storage (cloud)
  • push: upload code to github


12 of 136

Components of Git: Repository

  • A repository is a version-controlled project space that stores your files, branches, and full change history.

13 of 136

Components of Git: Branch

  • A branch is an independent line of development – a named timeline of commits within a repo.

14 of 136

Components of Git: Commit

  • A commit is a saved snapshot of changes with a message, author, and timestamp.

15 of 136

Git Branching Structure (Dev)

[Figure: development branching diagram with several branches, each made up of commits]

16 of 136

Git Branching Structure (Ours)

[Figure: the lecture repository HUFS-LAI-Seungtaek/HUFS-LAI-OOP-2025-2:main and two student forks, hufs-student-1/HUFS-LAI-OOP-2025-2:main and hufs-student-2/HUFS-LAI-OOP-2025-2:main. Each commit that adds an assignment1.py advances a repository from main to main+1, then main+2.]

17 of 136

Git Workflow

18 of 136

Git Workflow

  • Repository structure:
    • upstream (professor:main): original repo
    • origin (student:main): your repo
    • local: your computer

19 of 136

Git Workflow: fork

20 of 136

Git Workflow: clone

21 of 136

Git Workflow: add & commit

22 of 136

Git Workflow: add & commit

23 of 136

Git Workflow: add & commit

24 of 136

Git Workflow: add & commit

25 of 136

Git Workflow: add & commit

26 of 136

Git Workflow: push

27 of 136

Git Workflow: PR & merge

28 of 136

Git Workflow: branch

  • For your assignment, no need to use `branch`
    • professor:main (remote) 🡪 fork
    • student:main (remote) 🡪 clone
    • student:main (local) 🡪 commit
    • student:main+1 (local) 🡪 commit
    • student:main+2 (local) 🡪 push
    • student:main+2 (remote) 🡪 PR
    • professor:main+1 (remote)
  • It’s not about remote vs. local.
  • It’s about w/ permission vs. w/o permission.

29 of 136

Git Workflow: branch

  • In your team’s repo, if you are not allowed to push to the `main` branch, …
    • student:main (remote) 🡪 clone
    • student:main (local) 🡪 checkout (In this case, $ git checkout -b feature)
    • student:feature (local) 🡪 commit
    • student:feature+1 (local) 🡪 commit
    • student:feature+2 (local) 🡪 push
    • student:feature+2 (remote) 🡪 PR
    • student:main+1 (remote)

30 of 136

GitHub Web Shortcuts (Or, How to Submit Assignment #1)

31 of 136

Fork repository

32 of 136

Fork repository

33 of 136

Repo is copied under your account.

34 of 136

Add a file

35 of 136

Add a file

36 of 136

Add a file

members/{student name}.md

Example: members/seungtaek.md

Example: members/yeachan.md

Example: members/gildong.md

Don’t include {}

Don’t use uppercase

Don’t use Korean

37 of 136

38 of 136

Introduce yourself

You can see the actual “code” at

https://github.com/HUFS-LAI-Seungtaek/HUFS-LAI-OOP-2025-2/blob/main/members/seungtaek.md?plain=1

Feel free to introduce yourself more! (but, PLEASE FOLLOW THE OVERALL FORMAT!)

39 of 136

Commit the change (your file)

40 of 136

Back to “your” repo

41 of 136

Submit PR to lecture repository

42 of 136

Submit PR to lecture repository

43 of 136

Submit PR to lecture repository

Please follow the format: n-th Assignment by {student ID} ({Full name})

It’s important to check your submission status.

Do not use or

Please use ` (the backtick, on the same key as ~)

You can see the preview.

44 of 136

Introduction to ML

45 of 136

What is Machine Learning?

  1. What is a machine?
  2. What is learning?
    1. H. Simon: Any process by which a system improves its performance
    2. M. Minsky: Learning is making useful changes in our minds
    3. R. Michalski: Learning is constructing or modifying representations of what is being experienced
    4. L. Valiant: Learning is the process of knowledge acquisition in the absence of explicit programming

46 of 136

What is Machine Learning?

  • Machine Learning
    • “[the] field of study that gives computers the ability to learn without being explicitly programmed.” – Arthur L. Samuel (1959)
    • “A learning machine, broadly defined, is any device whose actions are influenced by past experiences.” – Nils J. Nilsson (1965)
    • “Pattern recognition – the act of taking in raw data and taking an action based on the ‘category’ of the pattern.” – Duda & Hart (1973)
    • “The study and computer modeling of learning processes in their multiple manifestations constitutes the subject matter of machine learning” – Carbonell, Michalski & Mitchell (1983)
    • “A computer program is said to learn from experience E … if its performance at tasks in T, as measured by P, improves with experience E.” – Tom M. Mitchell (1997)
    • “The goal of machine learning is to program computers to use example data or past experience to solve a given problem.” – Ethem Alpaydin (2004)
    • “[We] define machine learning as a set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data.” – Kevin P. Murphy (2012)
    • “ML models use big data to learn and improve predictability and performance automatically … without being programmed … by humans.” – OECD (2020s)
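
Mitchell's definition can be made concrete with a small experiment. The sketch below is only an illustration under assumed choices that are not in the slides: the task T is predicting y from x, the experience E is the number of noisy training examples drawn from a made-up line y = 2x + 1, and the performance P is the mean squared error on a fixed test grid. The helper names `fit_line` and `average_test_error` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def average_test_error(n_train, n_trials=200, seed=0):
    """Performance P: squared error on a fixed test grid, averaged over trials."""
    rng = random.Random(seed)
    x_test = [i / 10 - 1.0 for i in range(21)]  # 21 points in [-1, 1]
    total = 0.0
    for _ in range(n_trials):
        # Experience E: n_train noisy samples of the made-up line y = 2x + 1.
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_train)]
        ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
        slope, intercept = fit_line(xs, ys)
        # Task T: predict y for unseen x, compared against the noise-free line.
        squared_errors = [(slope * x + intercept - (2.0 * x + 1.0)) ** 2 for x in x_test]
        total += sum(squared_errors) / len(squared_errors)
    return total / n_trials

error_few = average_test_error(n_train=5)
error_many = average_test_error(n_train=500)
print(f"P after E = 5 examples:   {error_few:.4f}")
print(f"P after E = 500 examples: {error_many:.4f}")
```

Averaged over many trials, the error with 500 examples should come out well below the error with 5, which is what "improves with experience E" means in this setting.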

47 of 136

What is Machine Learning?

  • Machine Learning
    • “[the] field of study that gives computers the ability to learn without being explicitly programmed.” – Arthur L. Samuel (1959)
    • “A learning machine, broadly defined, is any device whose actions are influenced by past experiences.” – Nils J. Nilsson (1965)
    • “Pattern recognition – the act of taking in raw data and taking an action based on the ‘category’ of the pattern.” – Duda & Hart (1973)
    • “The study and computer modeling of learning processes in their multiple manifestations constitutes the subject matter of machine learning” – Carbonell, Michalski & Mitchell (1983)
    • “A computer program is said to learn from experience E … if its performance at tasks in T, as measured by P, improves with experience E.” – Tom M. Mitchell (1997)
    • “The goal of machine learning is to program computers to use example data or past experience to solve a given problem.” – Ethem Alpaydin (2004)
    • “[We] define machine learning as a set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data.” – Kevin P. Murphy (2012)
    • “ML models use big data to learn and improve predictability and performance automatically … without being programmed … by humans.” – OECD (2020s)

48 of 136

What is Machine Learning?

49 of 136

Why study machine learning?

  • Easier to build a learning system than to hand-code a working program!
    • Robot that learns a map of environment by exploring
    • Programs that learn to play games by playing against themselves
  • Improving on existing programs
    • Instruction scheduling and register allocation in compilers
    • Combinatorial optimization problems
  • Discover knowledge and patterns in high-dimensional, complex data
    • Sky surveys
    • Sequence analysis in bioinformatics
    • Social network analysis
    • Ecosystem analysis

50 of 136

Very brief history

  • Studied ever since computers were invented (e.g., Samuel’s checkers player)
  • Very active in 1960s (neural networks)
  • Died down in the 1970s
  • Revival in early 1980s (decision trees, backpropagation, temporal-difference learning) - coined as “machine learning”
  • Exploded starting in the 1990s
  • Now: very active research field, several yearly conferences (e.g., ICML, NeurIPS), major journals (e.g., Machine Learning, Journal of Machine Learning Research)
  • The time is right to study in the field!
    • Lots of recent progress in algorithms and theory
    • Flood of data to be analyzed
    • Computational power is available
    • Growing demand for industrial applications

51 of 136

Very brief history

52 of 136

What are good machine learning tasks?

  • There is no human expert
    • E.g., DNA analysis, video recommendation
  • Humans can perform the task but cannot explain how
    • E.g., character recognition
  • Desired function changes frequently
    • E.g., predicting stock prices based on recent trading data
  • Each user needs a customized function
    • E.g., news filtering

53 of 136

Important application areas…

  • Bioinformatics: sequence alignment, analyzing microarray data, information integration, …
  • Computer vision: object recognition, tracking, segmentation, active vision, …
  • Robotics: state estimation, map building, decision making
  • Graphics: building realistic simulations
  • Speech: recognition, speaker identification
  • Financial analysis: option pricing, portfolio allocation
  • E-commerce: automated trading agents, data mining, spam, …
  • Medicine: diagnosis, treatment, drug design, …
  • Computer games: building adaptive opponents
  • Multimedia: retrieval across diverse databases

54 of 136

Supervised Learning

vs.

Unsupervised Learning

55 of 136

Supervised Learning


56 of 136

Supervised Learning Example 0: Linear function

57 of 136

Supervised Learning Example 0: Linear function

58 of 136

Supervised Learning Example 1: Housing price prediction

  • You can practice on Kaggle:

59 of 136

Supervised Learning Example 2: Breast cancer (malignant, benign)

60 of 136

Supervised Learning Example 2: Breast cancer (malignant, benign)

61 of 136

Supervised Learning Example 3: Spam Detection

62 of 136

Unsupervised Learning

  • Training experience: unlabeled data
  • What to learn: interesting associations in the data
  • E.g., image segmentation, clustering
  • Often there is no single correct answer

63 of 136

Unsupervised Learning Example 1: Gene Clustering

  • Training experience: unlabeled data
  • What to learn: interesting associations in the data
  • E.g., image segmentation, clustering
  • Often there is no single correct answer

64 of 136

Unsupervised Learning Example 2: Customer Segmentation

65 of 136

Supervised Learning vs. Unsupervised Learning

66 of 136

Reinforcement Learning

  • Problems involving an agent interacting with an environment which provides numeric reward signals
  • Goal: Learn how to take actions in order to maximize reward
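
The agent/environment loop can be sketched with one of the simplest settings, a two-armed bandit. Everything specific here is an assumption for illustration and not from the slides: the environment pays a reward of 1 with a hidden probability per action, and the agent uses an epsilon-greedy rule with running-average value estimates. `run_bandit` is a hypothetical helper name.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent interacting with a Bernoulli bandit environment."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    value_estimates = [0.0] * n_actions  # agent's estimate of each action's reward
    pull_counts = [0] * n_actions
    total_reward = 0.0
    for _ in range(steps):
        # Agent: explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: value_estimates[a])
        # Environment: return a numeric reward signal for the chosen action.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Agent: update the running average for that action.
        pull_counts[action] += 1
        value_estimates[action] += (reward - value_estimates[action]) / pull_counts[action]
        total_reward += reward
    return value_estimates, total_reward

estimates, total = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", [round(v, 3) for v in estimates])
print("preferred action:", best_action, "total reward:", total)
```

The agent is never told which action is better; it only sees rewards, and its estimates drift toward the hidden probabilities as it keeps acting.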

67 of 136

Machine Learning – Taxonomy of Problems

  • Classification
  • Regression
  • Density Estimation
  • Dimension Reduction
  • Clustering
  • Model Selection

68 of 136

Classification


69 of 136

Regression


70 of 136

Density Estimation


71 of 136

Dimension Reduction

  • In dimension reduction, one attempts to learn a low-dimensional manifold to represent complex data.
  • e.g. PCA (Principal Component Analysis), ICA (Independent Component Analysis)
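
As a rough illustration of PCA, the sketch below reduces synthetic 2-D points to one dimension using the singular value decomposition of the centered data. The data (points scattered around the line y = x) and all variable names are assumptions made for this example. Note that PCA finds a linear subspace, which is the simplest case of a low-dimensional manifold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 points near the line y = x, with small isotropic noise.
t = rng.uniform(-1.0, 1.0, size=200)
points = np.column_stack([t, t]) + rng.normal(0.0, 0.05, size=(200, 2))

# PCA: center the data, then take the right singular vectors as principal axes.
mean = points.mean(axis=0)
centered = points - mean
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

first_component = components[0]  # unit vector of maximum variance
explained_ratio = singular_values**2 / np.sum(singular_values**2)

# 1-D representation, and its reconstruction back in 2-D.
projected = centered @ first_component
reconstructed = np.outer(projected, first_component) + mean

print("first principal direction:", first_component)
print("share of variance explained:", explained_ratio[0])
```

The first direction should line up with (1, 1)/√2 up to sign, and a single coordinate per point then captures nearly all of the variance.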

72 of 136

Clustering

  • Clustering refers to techniques for segmenting data into coherent “clusters”
  • e.g. k-means, Mixtures-of-Gaussians, mean-shift
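
A minimal k-means sketch follows, assuming two well-separated synthetic blobs and a deliberately simple initialization (one starting center taken from each end of the data array). Practical implementations usually pick initial centers more carefully. The function name `kmeans` and the data are illustrative, not from the slides.

```python
import numpy as np

def kmeans(points, initial_centers, n_iters=20):
    """Plain k-means: alternate nearest-center assignment and center update."""
    centers = np.array(initial_centers, dtype=float)
    for _ in range(n_iters):
        # Assignment step: distance from every point to every center.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points.
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    # Final assignment so the labels match the returned centers.
    distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return centers, distances.argmin(axis=1)

rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

centers, labels = kmeans(points, initial_centers=points[[0, -1]])
print("centers:\n", centers)
```

No labels are given to the algorithm; the two groups emerge only from the distances between points.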

73 of 136

Model Selection

  • “Given a choice of two models, which one is more appropriate to the data?”
  • “How big should the model be?”
  • e.g. A model with more parameters will fit the data better, but it could also overfit the data.
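
The trade-off can be seen by fitting polynomials of two different degrees to the same small training set and comparing training error with error on held-out data. The setup below is an assumed toy example (noisy samples of a made-up line, degrees 1 and 6), not the course's own experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of the made-up line y = 2x + 1 on [-1, 1]."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, size=n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(12)   # small training set
x_val, y_val = make_data(200)      # held-out data for model selection

results = {}
for degree in (1, 6):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    print(f"degree {degree}: train MSE = {results[degree][0]:.4f}, "
          f"validation MSE = {results[degree][1]:.4f}")
```

The degree-6 model can never have a higher training error than the degree-1 model, because it contains every line as a special case. Its validation error is typically worse on data like this, although that depends on the particular sample, which is why the choice is made on held-out data rather than on training error.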

74 of 136

Thus far…

Supervised vs. unsupervised learning

Different types of machine learning problems

  • Classification
  • Regression
  • Density Estimation
  • Dimension Reduction
  • Clustering
  • Model Selection

75 of 136

Supervised Learning /

Regression

76 of 136

Supervised Learning Problem


77 of 136

Housing price prediction (again)

78 of 136

Housing price prediction (again)

79 of 136

Housing price prediction (again)

80 of 136

Housing price prediction (again)

81 of 136

82 of 136

83 of 136

84 of 136

85 of 136

86 of 136

87 of 136

88 of 136

89 of 136

90 of 136

91 of 136

92 of 136

93 of 136

94 of 136

95 of 136

Gradient Descent Algorithm

96 of 136

97 of 136

98 of 136

99 of 136

100 of 136

101 of 136

102 of 136

103 of 136

Gradient Descent Algorithm for Linear Regression
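
The slide content for this part is not reproduced in the text, so the following is only a generic sketch of batch gradient descent for a one-feature linear model. It assumes the hypothesis h(x) = theta0 + theta1 * x and the cost J = (1/2m) Σ (h(x) - y)², which may differ from the notation used in the lecture. The learning rate, step count, and toy data are also assumptions.

```python
def gradient_descent(xs, ys, learning_rate=0.1, n_steps=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x with half-MSE cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(n_steps):
        # Prediction errors under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed before either step.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1

# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(xs, ys)
print(f"theta0 = {theta0:.4f}, theta1 = {theta1:.4f}")
```

On this data the parameters should approach theta0 = 1 and theta1 = 2. A noticeably larger learning rate makes the updates overshoot and diverge, and a much smaller one needs many more steps.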

104 of 136

105 of 136

106 of 136

107 of 136

108 of 136

109 of 136

110 of 136

111 of 136

112 of 136

113 of 136

114 of 136

115 of 136

Linear Regression with Multiple Features

116 of 136

117 of 136

118 of 136

119 of 136

120 of 136

Linear Regression with Multiple Variables - Gradient Descent in Practice

121 of 136

122 of 136

123 of 136

124 of 136

Linear Regression with Multiple Variables - Normal Equation

125 of 136

126 of 136

127 of 136

128 of 136

129 of 136

130 of 136

Linear Regression with Multiple Variables - Features and Polynomial Regression

131 of 136

132 of 136

133 of 136

134 of 136

135 of 136

136 of 136

Next

  • Logistic Regression
  • Classification