
Swayam Prabha

Course Title

Multivariate Data Mining- Methods and Applications

Lecture 31

Self-Organizing Map

By

Anoop Chaturvedi

Department of Statistics, University of Allahabad

Prayagraj (India)

Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha


Self-Organizing Map (SOM), also known as the Kohonen Self-Organizing Feature Map or Kohonen Neural Network

An artificial neural network that is trained using unsupervised learning to produce a low-dimensional representation of the input space.

  • An SOM consists of a grid of nodes, each associated with a weight vector of the same dimensionality as the input data (see the sketch below).
  • Outcomes are not predefined; it is an unsupervised learning scheme.
  • No hidden layers: only input and output layers.
  • Output ⇒ only the winning node appears (fires).
  • A relatively simple network that can be trained very rapidly.
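
A minimal sketch of this structure in base R (the 5 x 5 grid and the iris data are illustrative assumptions, not part of the lecture):

    set.seed(1)
    X <- scale(iris[, 1:4])            # input data, one row per observation
    xdim <- 5; ydim <- 5               # grid dimensions (assumed)
    n_nodes <- xdim * ydim
    p <- ncol(X)                       # dimensionality of the input space

    ## One weight vector per node, randomly initialized (rows of W)
    W <- matrix(runif(n_nodes * p, -1, 1), nrow = n_nodes, ncol = p)

    ## Grid coordinates of the nodes, used later to define neighborhoods
    grid_coords <- as.matrix(expand.grid(x = 1:xdim, y = 1:ydim))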


Advantages of SOM:

  • No assumptions about the distributions of the variables, or about independence among variables, are required.
  • Can be implemented easily, even for complex nonlinear problems.
  • Effectively handles noisy data, missing data, and large samples.


Applications

  • Used for dimensionality reduction, clustering, visualization, and pattern recognition.
  • Particularly useful for exploring and understanding high-dimensional data to reveal underlying patterns and relationships in an interpretable manner.
  • Has applications in industrial instrumentation, pattern recognition, process control, robotics, telecommunications, medical applications, optimization, and product management.


Training Steps:

Initialization ⇒ Weights are randomly initialized.

Training ⇒ Present input samples to the SOM and adjust the weights to better match the input patterns using the following steps:

(i) Neighborhood Selection ⇒ A neighborhood around the best-matching unit (BMU) is selected. The BMU is the node whose weight vector is closest to the input sample in the input space.

(ii) Weight Update ⇒ The weights of the nodes within the selected neighborhood are updated to be more similar to the input sample.

(iii) Iterations ⇒ The training process continues for a fixed number of iterations or until convergence is achieved.
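
A minimal training loop in base R implementing steps (i)-(iii), continuing the structure sketch above (the constants and the linear decay are illustrative assumptions):

    n_iter <- 2000
    alpha0 <- 0.5                      # initial learning rate (assumed)
    sigma0 <- max(xdim, ydim) / 2      # initial neighborhood radius (assumed)

    for (t in seq_len(n_iter)) {
      x <- X[sample(nrow(X), 1), ]     # present one input sample

      ## (i) BMU: the node whose weight vector is closest to x
      bmu <- which.min(colSums((t(W) - x)^2))

      ## Learning rate and radius shrink as training proceeds
      alpha <- alpha0 * (1 - t / n_iter)
      sigma <- max(sigma0 * (1 - t / n_iter), 0.5)

      ## Gaussian neighborhood around the BMU, measured on the grid
      d2 <- rowSums(sweep(grid_coords, 2, grid_coords[bmu, ])^2)
      h  <- exp(-d2 / (2 * sigma^2))

      ## (ii) Weight update: pull neighborhood nodes toward x
      W <- W + (alpha * h) * (matrix(x, n_nodes, p, byrow = TRUE) - W)
    }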


[Figure: decay schedules used during training: inverse, power, and linear.]
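
The figure's three schedule names suggest sketches like the following; the exact parameterizations below are assumptions for illustration, and implementations differ across SOM libraries:

    alpha0 <- 0.5; T <- 1000; t <- 0:T

    alpha_linear  <- alpha0 * (1 - t / T)               # linear decay
    alpha_inverse <- alpha0 / (1 + t / (T / 10))        # inverse-time decay
    alpha_power   <- alpha0 * (0.005 / alpha0)^(t / T)  # power (exponential) decay

    ## Compare the three schedules visually
    matplot(t, cbind(alpha_linear, alpha_inverse, alpha_power), type = "l",
            lty = 1, xlab = "iteration t", ylab = "alpha(t)")
    legend("topright", c("linear", "inverse", "power"), col = 1:3, lty = 1)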


Unified Distance Matrix (U-Matrix)

Constructed by calculating the distances between neighboring neurons in the SOM grid.

These distances are visualized as a grid of values using a grayscale or color scale, which is useful for visualizing the clustering structure and the smoothness of the SOM. Areas of low values (dark regions) indicate neurons that are close together in the input space, signaling the presence of clusters or similar patterns.

Areas of high values (light regions) indicate neurons that are far apart, representing transitions between different clusters or patterns.
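
A sketch of this computation for the hand-rolled SOM trained above (rectangular 4-neighborhoods; low values plot dark):

    ## Mean distance between each node's weights and its grid neighbors
    umat <- sapply(seq_len(n_nodes), function(i) {
      d.grid <- sqrt(rowSums(sweep(grid_coords, 2, grid_coords[i, ])^2))
      nb <- which(d.grid == 1)         # immediate grid neighbors
      mean(sqrt(rowSums(sweep(W[nb, , drop = FALSE], 2, W[i, ])^2)))
    })

    ## Grayscale image: low values (dark) = cluster interiors,
    ## high values (light) = boundaries between clusters
    image(matrix(umat, xdim, ydim), col = gray.colors(20))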


Hierarchical SOM (HSOM)

SOMs are used for dimensionality reduction and visualization of high-dimensional data; HSOMs additionally provide a more structured organization of the data into multiple levels of abstraction.

An HSOM is a tree of maps in which the lower maps act as a pre-processing stage.

The nodes at each level of the hierarchy are themselves SOMs.

This makes it possible to capture complex relationships within the data at different levels of granularity.

The data is clustered and organized into smaller, more manageable groups at each level of the hierarchy.
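
A minimal two-level sketch using the kohonen package (which the aweSOM example on the next slide also builds on); the grid sizes and the level structure are illustrative assumptions, not the lecture's specific algorithm:

    library(kohonen)
    set.seed(1)
    X <- scale(iris[, 1:4])

    ## Level 1: a coarse 2 x 2 map splits the data into four groups
    level1 <- som(X, grid = somgrid(2, 2, "rectangular"), rlen = 200)

    ## Level 2: a finer SOM per level-1 unit (skipping sparse units)
    level2 <- lapply(1:4, function(unit) {
      idx <- which(level1$unit.classif == unit)
      if (length(idx) < 10) return(NULL)
      som(X[idx, , drop = FALSE], grid = somgrid(3, 3, "hexagonal"), rlen = 200)
    })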


Example: iris data (aweSOM package of R)
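
A sketch of that example; the grid size and options are illustrative, and the function names follow the aweSOM documentation (treat the exact signatures as assumptions to be checked against the package):

    library(aweSOM)
    train.data <- scale(iris[, 1:4])

    ## Initialization provided by aweSOM, then training via kohonen::som
    init <- somInit(train.data, 4, 4)
    iris.som <- kohonen::som(train.data,
                             grid = kohonen::somgrid(4, 4, "hexagonal"),
                             rlen = 100, init = init)

    somQuality(iris.som, train.data)               # quality measures (next slide)
    aweSOMplot(som = iris.som, type = "UMatrix")   # interactive U-matrix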


Quality Measures for SOM

Quantization error ⇒ Average squared distance between the data points and the map’s prototypes to which they are mapped. Lower is better.

Percentage of explained variance ⇒ Share of the total variance that is explained by the clustering: 1 - (quantization error)/(total variance). Higher is better.

Topographic error ⇒ Share of observations for which the best-matching node is not a neighbor of the second-best matching node. Lower is better. 0 indicates excellent topographic representation (all best and second-best matching nodes are neighbors), 1 is the maximum error (best and second-best nodes are never neighbors).
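
The three measures can also be computed by hand; a sketch, assuming `iris.som` and `train.data` from the previous example (the neighbor threshold of 1.05 grid units is a heuristic):

    proto <- iris.som$codes[[1]]       # codebook: one prototype per node
    bmu   <- iris.som$unit.classif     # best-matching unit of each point

    ## Quantization error: mean squared distance to the BMU prototype
    qe <- mean(rowSums((train.data - proto[bmu, ])^2))

    ## Percentage of explained variance: 1 - QE / total variance
    totvar <- mean(rowSums(sweep(train.data, 2, colMeans(train.data))^2))
    pct.explained <- 100 * (1 - qe / totvar)

    ## Topographic error: share of points whose best and second-best
    ## matching units are not neighbors on the grid
    second <- apply(train.data, 1,
                    function(x) order(colSums((t(proto) - x)^2))[2])
    gp <- iris.som$grid$pts            # unit coordinates on the map
    te <- mean(sqrt(rowSums((gp[bmu, ] - gp[second, ])^2)) > 1.05)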


Kaski-Lagus error = (mean distance between points and their best-matching prototypes) + (mean geodesic distance between the points and their second-best matching prototypes)

Geodesic distance ⇒ Pairwise prototype distances following the SOM grid

U-plot ⇒ Darker cells are close to their neighbors
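
A sketch of this computation for `iris.som`, with geodesic prototype distances obtained by a Floyd-Warshall pass over the grid's neighbor graph (illustrative code, not aweSOM's implementation):

    proto <- iris.som$codes[[1]]; gp <- iris.som$grid$pts
    n <- nrow(proto)

    ## Neighbor graph: edge weight = prototype distance between units
    ## that are immediate neighbors on the map
    G <- matrix(Inf, n, n); diag(G) <- 0
    for (i in 1:(n - 1)) for (j in (i + 1):n)
      if (sqrt(sum((gp[i, ] - gp[j, ])^2)) < 1.05)
        G[i, j] <- G[j, i] <- sqrt(sum((proto[i, ] - proto[j, ])^2))

    ## All-pairs geodesic distances along the grid (Floyd-Warshall)
    for (k in 1:n) for (i in 1:n) for (j in 1:n)
      if (G[i, k] + G[k, j] < G[i, j]) G[i, j] <- G[i, k] + G[k, j]

    ## Best and second-best matching units of each observation
    d2u    <- apply(train.data, 1, function(x) colSums((t(proto) - x)^2))
    bmu    <- apply(d2u, 2, which.min)
    second <- apply(d2u, 2, function(d) order(d)[2])

    kl.error <- mean(sqrt(d2u[cbind(bmu, seq_along(bmu))])) +
                mean(G[cbind(bmu, second)])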


Super-classes of SOM ⇒ Cluster the SOM map into super-classes, groups of cells with similar profiles.

Use classic clustering algorithms on the map’s prototypes.
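
A sketch of this step for `iris.som`, using hierarchical clustering of the prototypes (k = 3 is an illustrative choice):

    ## Cluster the codebook vectors into super-classes
    proto <- iris.som$codes[[1]]
    superclass <- cutree(hclust(dist(proto), method = "ward.D2"), k = 3)

    ## Each observation inherits the super-class of its BMU
    obs.class <- superclass[iris.som$unit.classif]
    table(obs.class, iris$Species)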
