
LECTURE 1

INTRODUCTION TO MACHINE LEARNING


Course outline

  • Good morning, everyone. In this course we will build a full picture of Machine Learning, from the project workflow to the most common algorithms used in industry. We will cover:
  • classification and regression, including binary and multi-class classification, linear and polynomial regression, and logistic regression,
  • gradient descent as the core optimization idea,
  • stronger models such as Support Vector Machines, Decision Trees, and Random Forests,
  • dimensionality reduction and unsupervised learning topics like clustering and anomaly detection,
  • and hands-on practice with popular frameworks such as Scikit-learn, Keras, and TensorFlow.


What is Machine Learning?

  • Machine Learning is a part of Artificial Intelligence that allows computers to learn from data without being explicitly programmed. Instead of writing fixed rules, we give the system examples, and it learns patterns from those examples.
  • So, what is Machine Learning? Machine Learning, or ML, is the ability of computers to learn without being explicitly programmed with fixed rules. Instead of writing a long list of “if-else” conditions, we provide examples and data, and the system learns patterns from them. A classic early definition, associated with Arthur Samuel in 1959, explains ML as giving machines the capability to learn from experience. In simple words: we teach the computer using data, and it improves over time.


ML idea with apples and pears (Training concept)

  • Let’s understand this with a simple example. Imagine we show the computer many pictures of apples and pears. Each image is data. The computer analyzes these examples and creates a model—this model is like a learned program. The process of creating the model is called training. During training, the computer finds patterns such as shape, color, texture, and learns how to separate “apple” from “pear.” After training, it can classify new, unseen images.
  • For example, if we want a computer to recognize apples and pears, we show it many labeled images. During training, the computer analyzes these images and creates a model. After training, the model can classify new images correctly.
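
To make the training idea concrete, here is a minimal sketch in Python, assuming made-up numeric features (a hypothetical weight and a redness score) instead of real image pixels; the values and labels are invented purely for illustration.

```python
# Minimal sketch of the training idea with made-up fruit data.
# Each example is described by two hypothetical features:
# weight in grams and a redness score between 0 and 1.
from sklearn.tree import DecisionTreeClassifier

X_train = [[160, 0.85], [170, 0.90], [150, 0.80],   # apples
           [180, 0.30], [175, 0.25], [190, 0.35]]   # pears
y_train = ["apple", "apple", "apple", "pear", "pear", "pear"]

# "Training" = letting the algorithm find patterns that separate the classes.
model = DecisionTreeClassifier().fit(X_train, y_train)

# After training, the model can classify a new, unseen example.
print(model.predict([[165, 0.88]]))  # expected: ['apple']
```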


Supervised Learning: what problems is it used for?

  • Now we move to supervised learning, which means “learning under supervision.” Here, the data is labeled: for each example, we know the correct answer. Supervised learning is mainly used for two types of tasks: classification and regression. In classification, we predict categories—for example, medical diagnosis, image classification, fraud detection, or customer churn prediction. In regression, we predict a numeric value—such as weather forecasting, market prices, population trends, currency exchange rates, and many other real-world predictions.
  • Supervised learning is a type of Machine Learning where the data is labeled. This means that for each example, the correct answer is known. Supervised learning is mainly used for classification and regression problems.


Supervised Learning: regression intuition (scatter plot)

  • This plot shows the idea of regression. On the horizontal axis we have a parameter or feature, and on the vertical axis we have the value we want to predict. The dots represent real observed data. The goal of regression is to learn the relationship between parameters and outcomes—so that when a new parameter value appears, we can estimate the expected output. This is the foundation of forecasting and numerical prediction in Machine Learning.
  • Regression is used to predict numerical values. The goal is to find a relationship between input variables and the output value. For example, regression can be used to predict house prices, exam scores, or future sales.
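
As a small illustration of fitting a line through observed points, here is one possible sketch with scikit-learn; the feature values and outputs below are invented numbers, not real data.

```python
# Sketch of regression: fit a line through observed (feature, value) pairs
# and use it to estimate the output for a new feature value.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])   # parameter / feature
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])             # observed values

reg = LinearRegression().fit(X, y)

# Estimate the expected output for a new parameter value.
print(reg.predict([[6.0]]))        # roughly 12, following the learned trend
print(reg.coef_, reg.intercept_)   # slope and intercept of the fitted line
```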


Supervised Learning algorithms list

  • Here are common supervised learning algorithms you should remember.
  • k-Nearest Neighbors is a simple method based on similarity.
  • Linear Regression predicts continuous values using a straight-line relationship.
  • Logistic Regression is used for classification, especially binary classification.
  • Support Vector Machines are powerful for separating classes using optimal boundaries.
  • Decision Trees make decisions through branching rules.
  • Random Forest combines many trees to increase stability and accuracy.
  • And Neural Networks can learn complex patterns, especially with large datasets.
  • Common supervised learning algorithms include Linear Regression, Logistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, and Neural Networks. Each algorithm has its own advantages and is used for different tasks.
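
To make the list concrete, the sketch below fits several of these algorithms on the same small dataset and compares them with cross-validation; the dataset (scikit-learn's built-in breast cancer data) and the hyperparameters are placeholders, not recommendations.

```python
# One way to try several supervised algorithms on the same task and compare
# them with 5-fold cross-validation. Dataset and hyperparameters are placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "k-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print(f"{name:24s} mean accuracy = {scores.mean():.3f}")
```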


Unsupervised Learning (no labels)

  • Next is unsupervised learning—learning without supervision. The key difference from supervised learning is that we do not provide labels. The system receives data and tries to discover structure by itself. For example, it can group similar items into clusters. In the slide, you can imagine many emails or messages with no labels; the algorithm can separate them into different groups based on similarity. This is useful when we don’t know the categories in advance.
  • Unsupervised learning is used when the data has no labels. The system does not know the correct answers and tries to find patterns by itself. This type of learning is useful for discovering hidden structures in data.
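
A minimal clustering sketch, assuming synthetic 2-D points generated on the spot: the algorithm never sees labels, yet it recovers the two groups by itself.

```python
# Clustering sketch: the algorithm receives points with NO labels and
# groups them by similarity. The data here is randomly generated.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5, 5], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b])        # just points, no labels anywhere

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:5], kmeans.labels_[-5:])   # discovered group ids
print(kmeans.cluster_centers_)                   # centers found by the algorithm
```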


Unsupervised Learning applications

  • Unsupervised learning is used in several important tasks. First, clustering: grouping users or customers—for recommendation systems, targeted marketing, and customer segmentation. Second, dimensionality reduction, such as reducing data from many dimensions into 2D for visualization, compressing information, and simplifying features. Third, discovering patterns and rules in data. And fourth, anomaly detection—finding unusual or abnormal values, which is very useful in security, fraud detection, and system monitoring.
  • Unsupervised learning is used for clustering, dimensionality reduction, pattern discovery, and anomaly detection. For example, companies use clustering to group customers, and anomaly detection to find unusual behavior.
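
For the dimensionality-reduction case, here is a short sketch with PCA, using scikit-learn's Iris data purely as a convenient stand-in for a higher-dimensional dataset.

```python
# Dimensionality reduction sketch: compress many features into 2 dimensions,
# e.g. for plotting. The Iris dataset is used here only as a simple example.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)        # 4 features per flower
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)              # now only 2 features per flower

print(X.shape, "->", X_2d.shape)         # (150, 4) -> (150, 2)
print(pca.explained_variance_ratio_)     # variance kept by each component
```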


Clustering & anomaly detection visuals

  • This slide shows clustering visually: we start with many customers mixed together, and the algorithm separates them into meaningful groups. Another image shows anomaly detection: most points form a normal cluster, but a few points are far away—those are anomalies. In real systems, anomalies could mean fraud, unusual user behavior, network attacks, or machine failure signals.
  • In clustering, similar data points are grouped together. In anomaly detection, data points that are very different from others are identified. These techniques are widely used in security, finance, and system monitoring.
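
Here is one way the anomaly-detection picture could look in code, assuming synthetic data with a few injected outliers and Isolation Forest as the detector.

```python
# Anomaly detection sketch: most points form a "normal" cloud, a few lie far
# away. IsolationForest flags the far-away points. Data is synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
normal_points = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
anomalies = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0]])
X = np.vstack([normal_points, anomalies])

detector = IsolationForest(contamination=0.02, random_state=42).fit(X)
labels = detector.predict(X)             # +1 = normal, -1 = anomaly
print("flagged as anomalies:", X[labels == -1])
```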


Unsupervised algorithms list + Semi-supervised learning

  • For unsupervised learning algorithms, we commonly use clustering methods like K-Means, DBSCAN, and Hierarchical Clustering. For anomaly detection, methods include One-Class SVM and Isolation Forest. For dimensionality reduction, we use PCA, Kernel PCA, and similar techniques. For rule discovery, association-rule algorithms like Apriori and Eclat are used.
  • The slide also introduces semi-supervised learning, which combines supervised and unsupervised learning. This is useful when labeled data is limited but unlabeled data is large. For example, in face recognition, we label a small number of images, and the system learns to group many more images automatically.
  • Popular unsupervised algorithms include K-Means, DBSCAN, Hierarchical Clustering, PCA, and Isolation Forest. These methods help analyze data without predefined labels.
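
Picking up the semi-supervised idea above, a rough sketch: label only a handful of examples, mark the rest as unlabeled (-1), and let LabelPropagation spread the labels. The dataset and the choice of 15 labeled points are arbitrary.

```python
# Semi-supervised sketch: only a few examples are labeled; the rest are
# marked with -1 (unknown) and the algorithm propagates labels to them.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

y_partial = np.full_like(y, fill_value=-1)        # start with everything unlabeled
labeled_idx = rng.choice(len(y), size=15, replace=False)
y_partial[labeled_idx] = y[labeled_idx]           # keep labels for only 15 examples

model = LabelPropagation().fit(X, y_partial)
accuracy = (model.transduction_ == y).mean()      # how well unlabeled points were filled in
print(f"accuracy on all points: {accuracy:.2f}")
```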


Reinforcement learning

  • Now we discuss reinforcement learning. This is learning through interaction with an environment. An agent takes actions, receives rewards for good actions and penalties for bad actions, and updates its strategy. It is used in robotics, game AI, real-time decision-making, and control systems.
  • Reinforcement learning is based on rewards and penalties. An agent learns by interacting with the environment. Correct actions receive rewards, and incorrect actions receive penalties. This type of learning is used in robotics and game AI.
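
As an illustrative sketch only (the environment, rewards, and hyperparameters below are invented for this example), tabular Q-learning on a tiny one-dimensional world shows the action-reward-update loop in code.

```python
# Tabular Q-learning sketch on an invented toy problem: an agent walks on a
# line of 6 cells, starts at cell 0, and gets a reward only at cell 5.
import random

N_STATES, ACTIONS = 6, [0, 1]          # action 0 = step left, 1 = step right
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])

        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == N_STATES - 1 else -0.01  # small penalty per step

        # Q-learning update: move the estimate toward reward + discounted future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print("learned preference (right minus left) per cell:")
print([round(Q[s][1] - Q[s][0], 2) for s in range(N_STATES)])
```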


Offline vs Online learning

  • The slide also compares offline and online learning. Offline learning means we train using all available data at once. It can be simpler, but it often requires more time and resources, and when new data arrives we may need retraining. Online learning updates the model gradually as data comes in, which can be faster and suitable for continuously changing systems like weather or stock markets—but it can also be sensitive to poor-quality incoming data.
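
A possible sketch of the contrast, assuming a synthetic data stream: the offline model is fit once on everything, while the online model is updated batch by batch with partial_fit.

```python
# Offline vs online learning sketch. Offline: fit once on all data.
# Online: update the model incrementally as new data arrives (partial_fit).
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

rng = np.random.default_rng(0)

def make_batch(n=100):
    X = rng.uniform(0, 10, size=(n, 1))
    y = 3.0 * X.ravel() + rng.normal(scale=0.5, size=n)   # y ≈ 3x + noise
    return X, y

# Offline: all data available at once, single training pass.
X_all, y_all = make_batch(1000)
offline = LinearRegression().fit(X_all, y_all)

# Online: data arrives batch by batch; the model is updated each time.
online = SGDRegressor(learning_rate="constant", eta0=0.01)
for _ in range(10):                      # e.g. ten batches arriving over time
    X_batch, y_batch = make_batch(100)
    online.partial_fit(X_batch, y_batch)

print("offline slope:", offline.coef_)
print("online  slope:", online.coef_)    # both should approach 3
```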


Machine Learning difficulties

  • Machine Learning faces challenges such as lack of data, poor data quality, bias, and overfitting. To evaluate models correctly, we use testing and validation datasets.
  • Machine Learning is powerful, but it has challenges. One challenge is collecting enough data—some tasks need thousands of examples, and image/audio tasks may need millions. Another issue is that data may not cover all real-world variations, which can cause bias. Poor-quality data—errors, noise, and outliers—can reduce performance. Also, using irrelevant parameters can confuse the model and lead to wrong predictions.


Test & Validation

  • To solve reliability problems, we use testing and validation. We split the data: we train the model on a training set, and we test it on a test set. If results are not good, we adjust the model or parameters and repeat.
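
In code, the basic train/test idea might look like the following sketch; the dataset and the Random Forest model are placeholders.

```python
# Sketch of the train/test idea: learn on one part of the data, measure on the
# part the model has never seen.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
print("accuracy on unseen test data:", model.score(X_test, y_test))
# If the score is not good enough, adjust the model or its parameters and repeat.
```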


Train/Test/Validation split + “No best model”

  • Data is usually divided into training, testing, and validation sets. There is no single best Machine Learning model for all problems. The best model depends on the data and the task.
  • A common solution is to split the dataset into three parts: training, testing, and validation. For example, 60% for training, 20% for testing, and 20% for validation. Training is for learning the model, testing is for measuring performance on unseen data, and validation helps us tune parameters fairly.
  • Finally, an important lesson: there is no single “best model” for all problems. Different models perform differently depending on the dataset and the task. A strong ML engineer chooses a model that matches the data and the goal, and tests multiple options before deciding.
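
To tie the split and the “no best model” lesson together, here is one possible sketch: a 60/20/20 split, several candidate models compared on the validation set, and only the chosen one scored on the untouched test set. The dataset and candidate models are placeholders.

```python
# Sketch of a 60/20/20 split and of "no single best model": several candidates
# are compared on validation data, and only the winner is scored on the test set.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

# First carve off 20% for testing, then split the rest 75/25 -> 60/20/20 overall.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}

best_name, best_model, best_score = None, None, -1.0
for name, model in candidates.items():
    model.fit(X_train, y_train)
    score = model.score(X_val, y_val)          # compare candidates on validation data
    print(f"{name:20s} validation accuracy = {score:.3f}")
    if score > best_score:
        best_name, best_model, best_score = name, model, score

print(f"chosen model: {best_name}, test accuracy = {best_model.score(X_test, y_test):.3f}")
```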