PCA18502 – DATA MINING AND DATA WAREHOUSING
Class 8 – 02.09.2020
Mr Rajkumar D
Assistant Professor (S.G)
MCA DEPARTMENT
SRMIST, Ramapuram
CLUSTERING
What is Clustering?
Distance Functions
Goals of Clustering
Applications
Clustering algorithms can be applied in many fields, for instance:
Types of Clustering Algorithms
There are two types of clustering algorithms: hierarchical and nonhierarchical.
Nonhierarchical Clustering Algorithms
K-means Algorithm
The k-means algorithm is one of a group of algorithms called partitioning methods.
1. Select k objects arbitrarily.
2. Initialize the cluster centers with those k objects.
3. Repeat:
(a) Partition the data by assigning or reassigning every object to its closest cluster center.
(b) Compute each new cluster center as the mean of the objects in that cluster.
until the cluster centers no longer change.
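The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The names `manhattan`, `mean_point` and `k_means` are hypothetical, the distance is assumed to be the sum of absolute coordinate differences (the one used in the worked example of this lecture), and the `max_iter` cap and the handling of empty clusters are choices made only for this sketch.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def manhattan(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute coordinate differences (assumed distance function)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_point(points: List[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points: List[Point], centers: List[Point], max_iter: int = 100):
    """Repeat steps 3(a) and 3(b) until the centers stop changing."""
    centers = list(centers)
    clusters: List[List[Point]] = []
    for _ in range(max_iter):
        # Step 3(a): assign every object to its closest center.
        # A tie goes to the lower-numbered cluster.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [manhattan(p, c) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Step 3(b): recompute each center as the mean of its cluster.
        # An empty cluster keeps its old center (a choice for this sketch).
        new_centers = [
            mean_point(cluster) if cluster else old
            for cluster, old in zip(clusters, centers)
        ]
        if new_centers == centers:  # no change in the centers: stop
            break
        centers = new_centers
    return centers, clusters
```

The caller supplies the k initial centers, for example the first k objects of the data set, which corresponds to steps 1 and 2.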
For example, for the third object:
dis(1,3) = |1–0| + |1–0| = 2 and dis(2,3) = |1–0| + |1–1| = 1,
so this object is assigned to C2. The fifth object is equidistant from both clusters, so we arbitrarily assign it to C1.
After calculating the distance for all points, the clusters contain the following objects:
C1 = {(0,0),(1,0),(0.5,0.5)} and
C2 = {(0,1),(1,1),(5,5),(5,6),(6,6),(6,5),(5.5,5.5)}.
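The first assignment pass can be checked with a short script. Two things are assumptions here: the list of ten objects is read off the cluster listings in this example, and the initial centers are taken to be objects 1 and 2, that is (0,0) and (0,1), which is what the dis(1,3) and dis(2,3) arithmetic implies. The helper names `dis` and `assign` are illustrative only.

```python
# The ten objects, as read off the cluster listings in this example.
objects = [(0, 0), (0, 1), (1, 1), (1, 0), (0.5, 0.5),
           (5, 5), (5, 6), (6, 6), (6, 5), (5.5, 5.5)]
# Assumed initial centers: objects 1 and 2.
initial_centers = [(0, 0), (0, 1)]


def dis(p, q):
    """Sum of absolute coordinate differences, as in the dis(...) lines."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def assign(objects, centers):
    """Place each object in the cluster of its nearest center.

    min() returns the first minimum, so a tie goes to the lower-numbered
    cluster, matching the arbitrary choice of C1 for the fifth object.
    """
    clusters = [[] for _ in centers]
    for obj in objects:
        nearest = min(range(len(centers)), key=lambda i: dis(obj, centers[i]))
        clusters[nearest].append(obj)
    return clusters


c1, c2 = assign(objects, initial_centers)
print("C1 =", c1)
print("C2 =", c2)
```

Under these assumptions the script yields the same two clusters as listed above, with the fifth object (0.5,0.5) landing in C1 because of the tie rule.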
New center for C1 = (0.5,0.16):
x = (0+1+0.5)/3 = 0.5, y = (0+0+0.5)/3 = 0.1667, truncated to 0.16.
New center for C2 = (4.1,4.2):
x = (0+1+5+5+6+6+5.5)/7 ≈ 4.1, y = (1+1+5+6+6+5+5.5)/7 ≈ 4.2.
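The center update is a coordinate-wise mean, which is easy to verify numerically. The sketch below recomputes both centers from the clusters of the first pass; the function name `center` is hypothetical.

```python
def center(cluster):
    """Coordinate-wise mean of the 2-D objects in one cluster."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


c1 = [(0, 0), (1, 0), (0.5, 0.5)]
c2 = [(0, 1), (1, 1), (5, 5), (5, 6), (6, 6), (6, 5), (5.5, 5.5)]

for name, cluster in (("C1", c1), ("C2", c2)):
    x, y = center(cluster)
    print(f"{name}: ({x:.4f}, {y:.4f})")
# C1: (0.5000, 0.1667)
# C2: (4.0714, 4.2143)
```

To one decimal place the C2 center is (4.1, 4.2). The second coordinate of the C1 center is 0.1667 before it is shortened to 0.16.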
Reassign the ten objects to the closest cluster center, resulting in:
C1 = {(0,0),(0,1),(1,1),(1,0),(0.5,0.5)}
C2 = {(5,5),(5,6),(6,6),(6,5),(5.5,5.5)}.
New center for C1 = (0.5,0.5)
New center for C2 = (5.5,5.5).
These two new centers differ from the previous centers, C1 = (0.5,0.16) and C2 = (4.1,4.2), so the loop is repeated.
Thank You