Quiz 9 - Principal Component Analysis & Clustering
This quiz has 7 questions with a total score of 13 points.
1. Load the Iris dataset, drop the target column (species), and standardise the feature values using StandardScaler.

Then, run PCA on the standardised dataset and report the largest eigenvalue from the result (to the nearest 2 decimal places). Hint: An eigenvalue represents the variance captured (explained) in the direction of its eigenvector.
2 points
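One possible setup for Question 1 is sketched below. It assumes scikit-learn's bundled copy of Iris, loaded with `load_iris(return_X_y=True)`, which already returns the four features separately from the species labels, so there is no column to drop. If your copy of the dataset is a table with a `species` column, drop that column first. Variable names are illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris data; X holds only the four features.
X, _ = load_iris(return_X_y=True)

# Standardise each feature to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# Fit PCA keeping all components. explained_variance_ holds the eigenvalues
# of the covariance matrix, sorted from largest to smallest.
pca = PCA().fit(X_std)
eigenvalues = pca.explained_variance_

print(round(float(eigenvalues[0]), 2))
```

Note that `PCA` estimates variance with an n - 1 denominator while `StandardScaler` divides by n, so the eigenvalues sum to slightly more than the number of features.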
2. Using the PCA fitted in Question 1, report the explained variance ratio of the first component (to the nearest 2 decimal places).
2 points
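As a rough illustration of what the explained variance ratio means, the sketch below divides each eigenvalue by the sum of all eigenvalues and checks the result against scikit-learn's `explained_variance_ratio_` attribute. It assumes the same bundled Iris data as a stand-in for the quiz dataset.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris features, standardised as in Question 1.
X, _ = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)
pca = PCA().fit(X_std)

# Each ratio is one eigenvalue divided by the total variance (sum of eigenvalues).
eigenvalues = pca.explained_variance_
manual_ratio = eigenvalues / eigenvalues.sum()

# With all components kept, this matches the attribute scikit-learn provides.
matches = bool(np.allclose(manual_ratio, pca.explained_variance_ratio_))

print(round(float(pca.explained_variance_ratio_[0]), 2), matches)
```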
3. Based on the explained variance ratios from Question 2, how many principal components should we keep to preserve at least 90% of the variance in the data?
2 points
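A common way to approach this kind of threshold question is to accumulate the explained variance ratios and find the first point where the running total reaches the target. The sketch below assumes scikit-learn's bundled Iris data and a hypothetical variable name `n_keep`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris features, standardised as in Question 1.
X, _ = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)
pca = PCA().fit(X_std)

# Running total of variance explained by the first 1, 2, ... components.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# argmax returns the index of the first True; add 1 to turn it into a count.
n_keep = int(np.argmax(cumulative >= 0.90)) + 1

print(n_keep)
```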
4. Perform K-means clustering with K = 2 on the standardised data from Question 1. Then, for the first sample in that standardised data, report its distance to each cluster centre (comma-separated values, each rounded to 2 decimal places). Note: Use 2022 as the random seed value.
3 points
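The sketch below shows one way the distances could be obtained: `KMeans.transform` returns the Euclidean distance from each sample to every cluster centre. It assumes scikit-learn's bundled Iris data, and it sets `n_init=10` explicitly, which is an assumption because the default differs between scikit-learn versions and may affect the fitted centres.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris features, standardised as in Question 1.
X, _ = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)

# random_state=2022 is the seed from the question; n_init=10 is an assumption.
kmeans = KMeans(n_clusters=2, random_state=2022, n_init=10).fit(X_std)

# Slice with [:1] to keep the 2-D shape (1, n_features) that transform expects.
first_sample = X_std[:1]
distances = kmeans.transform(first_sample)[0]

# Cross-check: Euclidean norm of the difference to each centre.
manual = np.linalg.norm(kmeans.cluster_centers_ - first_sample, axis=1)

print(", ".join(f"{d:.2f}" for d in distances))
```

The order of the two values follows the cluster indices, which are arbitrary, so the same distances may appear swapped under a different initialisation.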
5. PCA looks to find homogeneous subgroups among the observations.
1 point
6. In K-means clustering, we seek to partition the observations into a pre-specified number of clusters.
1 point
7. Consider a dataset with four observations: A, B, C, and D, and the following pairwise distances between them (shown in the image below).

Perform agglomerative hierarchical clustering using complete linkage and visualise the dendrogram. Now, if you cut the dendrogram at a height of 3, how many distinct clusters will be formed?

2 points
[Image: pairwise distances between observations A, B, C, and D]
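The distance values for this question exist only in the image, so the sketch below uses made-up distances purely to illustrate the mechanics of complete linkage and cutting at a height. The matrix `D` is hypothetical and is not the quiz data; SciPy's `linkage` and `fcluster` are used as one possible toolset.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

labels = ["A", "B", "C", "D"]

# Hypothetical symmetric distance matrix (NOT the values from the quiz image).
D = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 2.5, 6.0],
    [4.0, 2.5, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])

# linkage expects the condensed (upper-triangle) form of the distance matrix.
Z = linkage(squareform(D), method="complete")

# Cutting at height 3 keeps only merges whose linkage distance is at most 3.
cluster_ids = fcluster(Z, t=3, criterion="distance")
n_clusters = len(set(cluster_ids.tolist()))

print(dict(zip(labels, cluster_ids.tolist())), n_clusters)
```

To draw the tree, `scipy.cluster.hierarchy.dendrogram(Z, labels=labels)` can be used together with matplotlib; the third column of `Z` holds the merge heights.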
This form was created inside of University of Sheffield.