1 of 20

Descriptive Data Mining

Hierarchical Clustering


Business Analytics

Lecture # 09

2 of 20

TOPICS to be COVERED

  01  Taxonomy
  02  Single Linkage
  03  Complete Linkage
  04  Group Average Linkage
  05  Centroid Linkage

© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use.

3 of 20

Hierarchical Clustering

  • Hierarchical clustering is a bottom-up (agglomerative) approach that starts with each observation in its own cluster and then sequentially merges the most similar clusters to create a series of nested clusters.


4 of 20

Hierarchical Clustering

[Figure: bottom-up dendrogram of objects a, b, c, d, e. Step 0: five singleton clusters. Step 1: merge {a, b}. Step 2: merge {d, e}. Step 3: merge {c, d, e}. Step 4: merge {a, b, c, d, e}.]

Initialization: each object is its own cluster.

Iteration: merge the two clusters that are most similar to each other, until all objects are merged into a single cluster.
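The initialization-and-iteration loop above can be sketched in plain Python. The pairwise distances below are hypothetical values chosen only so the merge order matches the figure (a, b first; then d, e; then c joins d, e; then everything merges); the single-link rule used for between-cluster distance is one of the options introduced later.

```python
# Bottom-up (agglomerative) clustering of objects a..e, a minimal sketch.
# The distance table is hypothetical, chosen to reproduce the figure's merge order.
dist = {
    ('a', 'b'): 1, ('d', 'e'): 2, ('c', 'd'): 3, ('c', 'e'): 3,
    ('a', 'c'): 9, ('a', 'd'): 9, ('a', 'e'): 9,
    ('b', 'c'): 9, ('b', 'd'): 9, ('b', 'e'): 9,
}

def d(x, y):
    # distances are symmetric, so look the pair up in either order
    return dist.get((x, y), dist.get((y, x)))

def cluster_dist(a, b):
    # single-link rule: distance of the closest cross-cluster pair
    return min(d(x, y) for x in a for y in b)

# Initialization: each object is its own cluster.
clusters = [['a'], ['b'], ['c'], ['d'], ['e']]
steps = [[''.join(c) for c in clusters]]

# Iteration: merge the two most similar clusters until one remains.
while len(clusters) > 1:
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda p: cluster_dist(clusters[p[0]], clusters[p[1]]),
    )
    clusters[i] += clusters[j]
    del clusters[j]
    steps.append([''.join(c) for c in clusters])

print(steps)  # one snapshot per step, ending in a single cluster 'abcde'
```

Each snapshot in `steps` corresponds to one level of the dendrogram, from five singletons down to one all-inclusive cluster.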


5 of 20

  • How to measure the distance between clusters?

  • Single-link
  • Complete-link
  • Average-link
  • Centroid distance


6 of 20

Hierarchical Clustering

SINGLE LINKAGE

  • When using the single linkage clustering method, the similarity between two clusters is defined by the similarity of the pair of observations (one from each cluster) that are the most similar.

  • Thus, single linkage will consider two clusters to be close if an observation in one of the clusters is close to at least one observation in the other cluster.
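In code, the single-link distance between two clusters is simply the minimum over all cross-cluster pairs. A minimal sketch for one-dimensional points (the cluster values are taken from the worked example that follows):

```python
# Single-linkage distance between two clusters of 1-D points:
# the distance of the *closest* cross-cluster pair.
def single_link(cluster_a, cluster_b):
    return min(abs(x - y) for x in cluster_a for y in cluster_b)

# Clusters {7, 10} and {20, 28, 35}: the closest pair is (10, 20),
# so the single-link distance is 10.
print(single_link([7, 10], [20, 28, 35]))  # 10
```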


7 of 20

Example

  • Consider the one-dimensional data set {7, 10, 20, 28, 35}; perform hierarchical clustering and plot the dendrogram to visualize it.
  • Solution: First, let's visualize the data. 


8 of 20

Observing the plot above, we can intuitively conclude that:

  1. The first two points (7 and 10) are close to each other and should be in the same cluster.
  2. Likewise, the last two points (28 and 35) are close to each other and should be in the same cluster.
  3. The cluster of the center point (20) is not easy to determine.

Let's solve the problem by hand using both types of agglomerative hierarchical clustering (single linkage and complete linkage):


9 of 20

Single Linkage: In single-link hierarchical clustering, at each step we merge the two clusters whose closest members have the smallest distance.
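The merge-by-hand procedure can be traced in plain Python (a minimal sketch, no libraries). At each step it finds the pair of clusters with the smallest single-link distance and merges them, stopping once two clusters remain:

```python
# Single-linkage agglomerative clustering of {7, 10, 20, 28, 35}, step by step.
def single_link(a, b):
    # distance of the closest cross-cluster pair
    return min(abs(x - y) for x in a for y in b)

clusters = [[7], [10], [20], [28], [35]]
merge_distances = []
while len(clusters) > 2:  # stop once two clusters remain
    # find the pair of clusters with the smallest single-link distance
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda p: single_link(clusters[p[0]], clusters[p[1]]),
    )
    merge_distances.append(single_link(clusters[i], clusters[j]))
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]

print(clusters)          # [[7, 10], [20, 28, 35]]
print(merge_distances)   # [3, 7, 8]
```

The merges happen at distances 3 (7 with 10), 7 (28 with 35), and 8 (20 joins {28, 35}), which are exactly the heights at which the dendrogram branches join.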


10 of 20

Dendrogram

Using single linkage, two clusters are formed:

  • Cluster 1: (7, 10)
  • Cluster 2: (20, 28, 35)


11 of 20

COMPLETE LINKAGE

  • The complete linkage clustering method defines the similarity between two clusters as the similarity of the pair of observations (one from each cluster) that are the most different.

  • Thus, complete linkage will consider two clusters to be close if their most-different pair of observations is close. This method produces clusters such that all member observations of a cluster are relatively close to each other.


12 of 20

Example

  • Consider the one-dimensional data set {7, 10, 20, 28, 35}; perform hierarchical clustering and plot the dendrogram to visualize it.
  • Solution: First, let's visualize the data. 


13 of 20


14 of 20

Dendrogram


15 of 20

Using complete linkage, two clusters are formed:

  • Cluster 1: (7, 10, 20)
  • Cluster 2: (28, 35)
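The complete-linkage result can be traced with the same step-by-step loop used for single linkage; only the between-cluster distance rule changes, from the closest pair to the farthest pair (a minimal sketch, no libraries):

```python
# Complete-linkage agglomerative clustering of {7, 10, 20, 28, 35}, step by step.
def complete_link(a, b):
    # distance of the *farthest* cross-cluster pair
    return max(abs(x - y) for x in a for y in b)

clusters = [[7], [10], [20], [28], [35]]
while len(clusters) > 2:  # stop once two clusters remain
    # find the pair of clusters with the smallest complete-link distance
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda p: complete_link(clusters[p[0]], clusters[p[1]]),
    )
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]

print(clusters)  # [[7, 10, 20], [28, 35]]
```

Under the farthest-pair rule, {7, 10} to {20} costs 13 while {20} to {28, 35} costs 15, so 20 joins the left cluster instead of the right one, which is why complete linkage assigns the center point differently than single linkage.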

Conclusion: Hierarchical clustering is mostly used when the application requires a hierarchy, e.g., the creation of a taxonomy.

However, it is expensive in terms of its computational and storage requirements.


16 of 20

  • The single linkage and complete linkage methods define between-cluster similarity based on a single pair of observations in two different clusters: the most similar pair (single linkage) or the least similar pair (complete linkage).


17 of 20

GROUP AVERAGE LINKAGE

  • The group average linkage clustering method defines the similarity between two clusters to be the average similarity computed over all pairs of observations between the two clusters.

  • If Cluster 1 consists of n1 observations and Cluster 2 consists of n2 observations, the similarity of these clusters would be the average of n1 × n2 similarity measures.
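A minimal sketch of the group-average rule for one-dimensional points: enumerate all n1 × n2 cross-cluster distances and average them (the cluster values below reuse points from the worked example, purely for illustration):

```python
# Group-average linkage: mean distance over all n1 * n2 cross-cluster pairs.
def average_link(cluster_a, cluster_b):
    pairs = [abs(x - y) for x in cluster_a for y in cluster_b]
    return sum(pairs) / len(pairs)  # n1 * n2 pairwise distances

# {7, 10} vs. {28, 35}: 2 * 2 = 4 pairs (21, 28, 18, 25), averaging 23.
print(average_link([7, 10], [28, 35]))  # 23.0
```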


18 of 20

CENTROID LINKAGE

  • Centroid linkage uses the averaging concept of cluster centroids to define between-cluster similarity. The centroid for cluster k, denoted as ck, is found by calculating the average value for each variable across all observations in a cluster.

  • That is, a centroid is the average observation of a cluster. The similarity between cluster k and cluster j is then defined as the similarity of the centroids ck and cj.
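For one-dimensional data the centroid reduces to the cluster mean, so centroid linkage compares cluster means. A minimal sketch, again reusing values from the worked example for illustration:

```python
# Centroid linkage: compare the average observations (centroids) of two clusters.
def centroid(cluster):
    return sum(cluster) / len(cluster)

def centroid_link(cluster_a, cluster_b):
    return abs(centroid(cluster_a) - centroid(cluster_b))

# Centroids of {7, 10} and {20, 28, 35} are 8.5 and about 27.67,
# so the centroid-linkage distance is about 19.17.
print(round(centroid_link([7, 10], [20, 28, 35]), 2))  # 19.17
```

Note that unlike single or complete linkage, this rule depends on only one representative point per cluster rather than on any individual pair of observations.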


19 of 20


20 of 20

Thank You!
