Data Mining_Anoop Chaturvedi
Swayam Prabha
Course Title
Multivariate Data Mining- Methods and Applications
Lecture 29
Centroid and Non-hierarchical Clustering Methods
By
Anoop Chaturvedi
Department of Statistics, University of Allahabad
Prayagraj (India)
Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha
Agglomerative and divisive clustering are the two primary approaches, with agglomerative being more commonly employed due to its simplicity and efficiency.
Centroid Clustering:
Example:
Individual | 1 | 2 | 3 | 4 | 5 |
Variable 1 | 1 | 1 | 6 | 8 | 8 |
Variable 2 | 1 | 2 | 3 | 2 | 0 |
Individual | (12) | 3 | 4 | 5 |
Variable 1 | 1 | 6 | 8 | 8 |
Variable 2 | 1.5 | 3 | 2 | 0 |
Individual | (12) | 3 | (45) |
Variable 1 | 1 | 6 | 8 |
Variable 2 | 1.5 | 3 | 1 |
Fuse (45) and 3.
Individual | (12) | (345) |
Variable 1 | 1 | 7.33 |
Variable 2 | 1.5 | 1.67 |
Finally, fuse (12) and (345) into (12345) at a distance of about 6.34.
[Figure: dendrogram of the centroid clustering, with leaves 1, 2, 3, 4, 5]
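The fusion sequence in this example can be retraced with a short script. This is a minimal sketch, assuming Euclidean distance between group centroids and a centroid equal to the plain mean of all members of a group; the function names `centroid` and `centroid_clustering` are illustrative and do not come from the lecture.

```python
import math
from itertools import combinations

# Data from the example: individual -> (Variable 1, Variable 2)
points = {
    "1": (1.0, 1.0),
    "2": (1.0, 2.0),
    "3": (6.0, 3.0),
    "4": (8.0, 2.0),
    "5": (8.0, 0.0),
}


def centroid(members, data):
    """Mean vector of the individuals in one group."""
    n = len(members)
    return (
        sum(data[m][0] for m in members) / n,
        sum(data[m][1] for m in members) / n,
    )


def centroid_clustering(data):
    """Fuse the two groups with the closest centroids until one group is left.

    Returns a list of (group_a, group_b, distance) tuples, one per fusion.
    Assumes Euclidean distance between centroids.
    """
    clusters = [(name,) for name in data]
    history = []
    while len(clusters) > 1:
        # Pick the pair of groups whose centroids are nearest to each other.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: math.dist(centroid(pair[0], data), centroid(pair[1], data)),
        )
        d = math.dist(centroid(a, data), centroid(b, data))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        history.append((a, b, round(d, 2)))
    return history


history = centroid_clustering(points)
for a, b, d in history:
    print("fuse", a, "and", b, "at distance", d)
```

Each printed line corresponds to one level of the dendrogram, so the distances can be read as the heights at which the groups join.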
Median Clustering:
Disadvantages of Centroid clustering
Some Comments on Hierarchical Procedures:
Dendrogram with crossover
Nonhierarchical or Partitioning Clustering Methods:
To check stability, rerun the algorithm with a new set of initial groups.
Example:
Item | Observations | |
 | x1 | x2 |
A | 5 | 3 |
B | -1 | 1 |
C | 1 | -2 |
D | -3 | -2 |
Mean | 0.5 | 0 |
Cluster | Centroid x1 | Centroid x2 |
(AB) | 2 | 2 |
(CD) | -1 | -2 |
| A | B | C | D |
A | 0 | 40 | 41 | 89 |
(BCD) | 52 | 4 | 5 | 5 |
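The reassignment steps of this example can be sketched in a few lines. The sketch below assumes squared Euclidean distance to the cluster centroids and reassigns all items in one pass before recomputing the centroids (a batch variant; the lecture may instead move items one at a time). The names `sq_dist`, `centroid` and `k_means` are illustrative.

```python
# Items from the example: name -> (first observation, second observation)
points = {"A": (5, 3), "B": (-1, 1), "C": (1, -2), "D": (-3, -2)}


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def centroid(names):
    """Mean vector of the named items."""
    n = len(names)
    return (
        sum(points[m][0] for m in names) / n,
        sum(points[m][1] for m in names) / n,
    )


def k_means(groups):
    """Repeat: compute centroids, then move every item to its nearest centroid.

    Stops when no item changes group. Assumes no group ever becomes empty,
    which holds for this small data set.
    """
    while True:
        centres = [centroid(g) for g in groups]
        new_groups = [[] for _ in groups]
        for name, p in points.items():
            nearest = min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            new_groups[nearest].append(name)
        if new_groups == groups:
            return groups, centres
        groups = new_groups


# Start from the initial partition (AB), (CD) used in the example.
final_groups, final_centres = k_means([["A", "B"], ["C", "D"]])

# Squared distances of every item to each final centroid (rows = clusters).
table = [[sq_dist(points[name], c) for name in points] for c in final_centres]
print(final_groups)
print(table)
```

Calling `k_means([["A", "C"], ["B", "D"]])` starts from a different initial partition, which is the kind of rerun suggested for checking stability. Under this batch sketch that partition is left unchanged, so the final clusters can depend on the starting groups.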
Example: k-means clustering for the Iris dataset.
[Figure: scatter plot with axes labelled Petal Length and Petal Width]
Comparison of results by forming a table with the species column of the original data
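Such a comparison table is a cross-tabulation of cluster labels against species. The following is a minimal sketch of how it could be built; the helper `cross_table` and the six labels are hypothetical and serve only as an illustration, they are not results from the lecture.

```python
from collections import Counter


def cross_table(clusters, species):
    """Count how many observations fall in each (cluster, species) cell."""
    counts = Counter(zip(clusters, species))
    rows = sorted(set(clusters))
    cols = sorted(set(species))
    # Counter returns 0 for cells that never occur.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


# Hypothetical cluster labels and species for six flowers (illustration only).
clusters = [1, 1, 2, 2, 3, 2]
species = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

table = cross_table(clusters, species)
for cluster_id, row in table.items():
    print(cluster_id, row)
```

A cluster whose row has one dominant species matches that species well; counts spread over several columns indicate observations assigned to a cluster dominated by another species.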
Cluster plot: