Clustering: K-Means
Supervised vs. Unsupervised Learning
Supervised Learning | Unsupervised Learning |
Building a model from labeled data | Clustering from unlabeled data |
Data Clustering
Data Clustering: Similarity
Unlabeled Data
Assigning Points
Recomputing the Cluster Centers
Assigning Points
Recomputing the Cluster Centers
Repeat
Final Clustering
K-Means: (Iterative) Algorithm
2) Iteration
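The two alternating steps named on the preceding slides (assigning points, recomputing the cluster centers) can be written compactly. The notation is an assumption, since the original slide formulas are not available here: $x_i$ is a data point, $\mu_j$ a cluster center, $c_i$ the cluster index of point $i$, and $C_j$ the set of points currently assigned to cluster $j$.

```latex
% Assignment step: each point joins the cluster with the nearest center
c_i \leftarrow \arg\min_{j \in \{1,\dots,K\}} \lVert x_i - \mu_j \rVert^2

% Update step: each center moves to the mean of its assigned points
\mu_j \leftarrow \frac{1}{\lvert C_j \rvert} \sum_{i \in C_j} x_i,
\qquad C_j = \{\, i : c_i = j \,\}
```

Neither step can increase the within-cluster sum of squares $J = \sum_i \lVert x_i - \mu_{c_i} \rVert^2$, which is why repeating them converges, although only to a local minimum.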
Discussion: “Chicken and Egg” Dilemma
The Power of Continuous Improvement: Lessons from K-Means
K-Means in Python
Python: Data Generation
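The code from this slide is not available, so the following is only a minimal sketch of how unlabeled 2-D data might be generated with NumPy. The function name `make_blobs_2d`, the three center locations, and the spread are all hypothetical choices made for illustration.

```python
import numpy as np

def make_blobs_2d(n_per_cluster=100, seed=0):
    """Sample 2-D points around three hypothetical centers."""
    rng = np.random.default_rng(seed)
    # Assumed ground-truth centers; any well-separated values would do.
    centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
    # Draw Gaussian noise around each center and stack the groups.
    groups = [c + rng.normal(scale=0.8, size=(n_per_cluster, 2)) for c in centers]
    return np.vstack(groups), centers

X, true_centers = make_blobs_2d()
print(X.shape)  # one row per point, one column per coordinate
```

The true centers are returned only so a clustering result can be checked later; the clustering algorithm itself never sees them.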
Python: Data Generation and Random Initialization
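One common way to initialize K-Means at random is to pick K distinct data points as the starting centers. The sketch below assumes that approach; the helper name `init_centers` and the toy data are hypothetical, and the original slide may have used a different scheme.

```python
import numpy as np

def init_centers(X, k, seed=0):
    """Pick k distinct rows of X as the initial cluster centers."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=k, replace=False)  # no repeated indices
    return X[idx].copy()

# Toy data for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))

centers = init_centers(X, k=3)
print(centers)
```

Sampling without replacement avoids selecting the same point twice, which would start two clusters at the same location.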
Python: K-Means
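As a sketch of what a from-scratch implementation might look like (not the code from the original slide), the function below alternates the assignment and update steps until the centers stop moving. The name `kmeans`, its parameters, and the two-blob test data are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-Means: random data points as initial centers, then iterate."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()

    def assign(centers):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)                     # assignment step
        new_centers = centers.copy()
        for j in range(k):                           # update step
            members = X[labels == j]
            if len(members) > 0:                     # keep old center if cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):        # converged
            break
        centers = new_centers

    return centers, assign(centers)

# Two well-separated blobs as toy data.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[10.0, 10.0], scale=0.5, size=(100, 2)),
])
centers, labels = kmeans(X, k=2)
print(centers)
```

Leaving an empty cluster's center where it was is only one possible policy; re-seeding it from a random data point is another common choice.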
Python: K-Means in Scikit-learn
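A minimal sketch of the scikit-learn route, assuming synthetic data from `make_blobs`; the dataset sizes and parameter values are illustrative choices, not taken from the original slide.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data: 300 points around 3 centers (labels are discarded).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# n_init=10 runs K-Means from 10 different starts and keeps the best result.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(X)

labels = model.labels_             # cluster index for each training point
centers = model.cluster_centers_   # array of shape (n_clusters, n_features)
print(centers)
print(model.inertia_)              # within-cluster sum of squared distances

# Assign previously unseen points to the nearest learned center.
new_labels = model.predict(X[:5])
```

Setting `random_state` makes the run reproducible, which helps when comparing results across experiments.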
Some Issues in K-Means
Initialization Issues
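Assuming the issue meant here is that K-Means can converge to different local minima depending on the starting centers, the sketch below runs scikit-learn's `KMeans` once per seed and records the final inertia for two initialization schemes. The dataset and the helper `inertia_for_seed` are hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=400, centers=4, cluster_std=0.7, random_state=3)

def inertia_for_seed(seed, init):
    """Final within-cluster sum of squares for a single run (n_init=1)."""
    model = KMeans(n_clusters=4, init=init, n_init=1, random_state=seed)
    return model.fit(X).inertia_

random_runs = [inertia_for_seed(s, "random") for s in range(10)]
plusplus_runs = [inertia_for_seed(s, "k-means++") for s in range(10)]

# A wide gap between min and max suggests some runs got stuck in a poor optimum.
print("random    min/max:", min(random_runs), max(random_runs))
print("k-means++ min/max:", min(plusplus_runs), max(plusplus_runs))
```

Two common mitigations are to run several restarts and keep the lowest-inertia result (the `n_init` parameter) and to use a spread-out seeding scheme such as k-means++.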
Choosing the Number of Clusters
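One widely used heuristic for picking K is the elbow method: fit K-Means for a range of K values and look for the point where the inertia stops dropping sharply. Whether the original slides used this method is an assumption; the data and the range of K below are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data with 3 underlying groups.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_   # within-cluster sum of squares for this K

for k, value in inertias.items():
    print(k, round(value, 1))
```

Inertia keeps shrinking as K grows, so the smallest value is not the target; the interest is in the K after which further increases bring only marginal improvement.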
Discussion (1/2)
Discussion (2/2)
K-Means: Limitations (1/4)
K-Means: Limitations (2/4)
K-Means: Limitations (3/4)
K-Means: Limitations (4/4)