Data_Mining_Anoop Chaturvedi
Swayam Prabha
Course Title
Multivariate Data Mining- Methods and Applications
Lecture 15
Sample PCA and Applications
By
Anoop Chaturvedi
Department of Statistics, University of Allahabad
Prayagraj (India)
Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha
Scree plot for PC of iris data
Example: Dataset: decathlon2 from the factoextra package of R
Athletes’ performance during two sporting events.
27 individuals (athletes) described by 13 variables (sport disciplines).
The first 23 individuals and the first 10 variables are selected as the active subset for PCA.
Red dashed line ⇒ expected average contribution
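The expected average contribution marked by the red dashed line is the value each variable would contribute if all p variables contributed equally, i.e. 100/p percent (10% for the 10 active variables here). A minimal Python sketch, assuming the usual definition of a contribution as the squared coordinate on the component divided by the sum of squared coordinates. The function name `contributions` and the coordinate values are hypothetical and are not taken from the decathlon2 output.

```python
def contributions(coords):
    """Percent contribution of each variable to one principal component.

    coords: coordinates of the variables on that component.
    Contribution = squared coordinate as a percentage of the
    sum of squared coordinates on the component.
    """
    squared = [c * c for c in coords]
    total = sum(squared)
    return [100.0 * s / total for s in squared]


# Hypothetical coordinates of 4 variables on PC1 (illustrative values only).
pc1_coords = [0.8, -0.6, 0.3, 0.1]
contrib = contributions(pc1_coords)

# Expected average contribution if every variable contributed equally.
expected = 100.0 / len(pc1_coords)

# Indices of variables whose contribution exceeds the expected average.
important = [i for i, c in enumerate(contrib) if c > expected]
print(contrib, expected, important)
```

Variables whose bar rises above this threshold are the ones usually read as important for the component. The same calculation applies to individuals, with p replaced by the number of individuals.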
PCA results for individuals (athletes): Contributions of individuals to PC1 and PC2
BOURGUIGNON, Karpov and Clay contribute the most to both dimensions
Average Contribution
Example: PCA on Image Processing
The cumulative effect of the six principal components, adding one PC at a time.
R-packages: "jpeg", "factoextra", "gridExtra", "ggplot2", "magick", "imgpalr"
A color photo is stored as three pixel-by-pixel matrices, one for each component of RGB (Red, Green, Blue) color.
To convert to grayscale, sum the R, G, and B shades at each pixel and divide by the maximum value, so that the result is scaled to a maximum of 1.
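The grayscale step described above can be sketched as follows. This is a minimal Python illustration using NumPy rather than the R packages listed. The helper name `to_grayscale` and the tiny 2x2 array are assumptions made for the example; a real photo would first be decoded into a height x width x 3 array.

```python
import numpy as np


def to_grayscale(rgb):
    """Sum the R, G, B planes pixel by pixel and rescale so the maximum is 1."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=2)  # shape (height, width)
    peak = total.max()
    # Guard against an all-black image to avoid dividing by zero.
    return total / peak if peak > 0 else total


# Tiny synthetic 2x2 "image" with values in [0, 1]; stands in for a decoded photo.
img = np.array([[[1.0, 1.0, 1.0], [0.5, 0.5, 0.5]],
                [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
gray = to_grayscale(img)
print(gray)
```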
Run a separate PCA on each matrix of shades (R, G, and B), giving the eigenvectors of shades for each.
Original Image
Each color scale (R, G, B) gets its own matrix and its own PCA.
Integrate new shades into the picture.
The image becomes clearer as we increase the number of principal components.
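The per-channel procedure can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the R code behind the slides: the function names `pca_reconstruct` and `compress_image` are hypothetical, the PCA is computed through an SVD of the column-centered channel matrix, and a random array stands in for the photo.

```python
import numpy as np


def pca_reconstruct(channel, k):
    """Approximate one color channel using its first k principal components."""
    X = np.asarray(channel, dtype=float)
    mean = X.mean(axis=0)              # column means, removed before PCA
    Xc = X - mean
    # Rows of Vt are the principal directions (eigenvectors of the covariance).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T             # coordinates on the first k PCs
    return scores @ Vt[:k] + mean      # map back to pixel space


def compress_image(rgb, k):
    """Run a separate PCA on R, G and B, then restack the reconstructed planes."""
    planes = [pca_reconstruct(rgb[:, :, c], k) for c in range(3)]
    return np.stack(planes, axis=2)


# Synthetic 40x30 RGB array with values in [0, 1]; a stand-in for a real photo.
rng = np.random.default_rng(0)
img = rng.random((40, 30, 3))

# Mean squared reconstruction error for an increasing number of components.
errors = [float(np.mean((img - compress_image(img, k)) ** 2))
          for k in (1, 5, 15, 30)]
print(errors)
```

The error falls as k grows and is essentially zero once k equals the number of columns, which matches the observation that the image sharpens as components are added. Reconstructed values can fall slightly outside [0, 1], so they would typically be clipped before display.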