Welcome (back) to Stat 494: Statistical Genetics!
While we wait to get started...
Goals for today
Journal Club #5
Topic: PCA
An early (and very well-known!) application of Principal Component Analysis to genetic data
Discussion Leaders: Lucas, Noah, Tam
Journal Club Debrief
Stretch Break!
Journal Club Debrief
Key Points:
Journal Club Debrief
Key Points (continued):
(Image source: Supplementary Material)
Journal Club Debrief
However…
Journal Club Debrief
Connection to upcoming content:
"the PCA-based methods used here are based on genotypic patterns of variation and do not take advantage of signatures of population structure that are contained in patterns of haplotype variation"
(p. 100, last paragraph before the Methods Summary, in the discussion of limitations and areas for future work)
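To make "genotypic patterns of variation" concrete, here is a minimal sketch of PCA applied to a genotype matrix. Everything in it is an assumption for illustration: the data are simulated (two hypothetical populations, 0/1/2 genotype coding, made-up allele frequencies), and this is not the paper's data or pipeline.

```python
import numpy as np

rng = np.random.default_rng(494)
n_per_pop, n_snps = 50, 500

# Hypothetical allele frequencies: population B is shifted away from population A.
freq_a = rng.uniform(0.1, 0.9, size=n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, size=n_snps), 0.05, 0.95)

# Genotypes coded as the number of copies (0, 1, or 2) of one allele at each SNP.
geno_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
geno_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
G = np.vstack([geno_a, geno_b]).astype(float)

# Center each SNP, then take the SVD; U[:, 0] * s[0] gives every individual's PC1 score.
Gc = G - G.mean(axis=0)
U, s, Vt = np.linalg.svd(Gc, full_matrices=False)
pc1 = U[:, 0] * s[0]

mean_a = pc1[:n_per_pop].mean()
mean_b = pc1[n_per_pop:].mean()
print("Mean PC1 score, population A:", round(mean_a, 2))
print("Mean PC1 score, population B:", round(mean_b, 2))
```

Note that each SNP enters the analysis separately as a genotype count. Nothing here uses which alleles travel together along a chromosome, which is the haplotype information the quote refers to.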
Genetic Ancestry
Journal Club Debrief - what's next?
Journal Club #6:
Understanding PCA
Principal component analysis
Source: ISLR Figure 6.14
Principal component analysis
Source: ISLR Figure 6.15
Principal component analysis
$PC_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1p} x_p$
$PC_2 = a_{21} x_1 + a_{22} x_2 + \dots + a_{2p} x_p$
…
$PC_p = a_{p1} x_1 + a_{p2} x_2 + \dots + a_{pp} x_p$
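The idea that each PC is a linear combination of the original variables can be sketched in code. This is a minimal illustration on simulated data, assuming the variables are centered first and using the SVD as one common way to obtain the coefficients; the names `X`, `A`, and `PCs` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4

# Simulated data with correlated columns x_1, ..., x_p (hypothetical).
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

# Center each variable so every PC has mean zero.
Xc = X - X.mean(axis=0)

# Rows of Vt are the coefficient vectors: A[k] = (a_{k1}, ..., a_{kp}) for PC_{k+1}.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
A = Vt

# Column k of PCs is a_{k1} x_1 + ... + a_{kp} x_p evaluated for every observation.
PCs = Xc @ A.T

pc_var = PCs.var(axis=0, ddof=1)
print("Variance of each PC (largest first):", np.round(pc_var, 3))
```

With p original variables we get p PCs, each built from all p variables, ordered from highest to lowest variance.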
STAT 253 review (or preview)
Discuss at your table:
Check out these Stat 253 materials if you need a refresher:
https://kegrinde.github.io/stat253_coursenotes/
(Unit 6 > 19 Principal Component Analysis)
$PC_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1p} x_p$
$PC_2 = a_{21} x_1 + a_{22} x_2 + \dots + a_{2p} x_p$
…
$PC_p = a_{p1} x_1 + a_{p2} x_2 + \dots + a_{pp} x_p$
Don't peek at the next slide
PCA vocab
$PC_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1p} x_p$
$PC_2 = a_{21} x_1 + a_{22} x_2 + \dots + a_{2p} x_p$
…
$PC_p = a_{p1} x_1 + a_{p2} x_2 + \dots + a_{pp} x_p$
$PC_{1i} = a_{11} x_{1i} + a_{12} x_{2i} + \dots + a_{1p} x_{pi}$
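A tiny worked example of the last formula, using standard PCA terminology: the coefficients a_{11}, ..., a_{1p} are usually called the loadings of PC1, and the value PC_{1i} is individual i's score on PC1. The numbers below are invented for illustration (p = 2 variables).

```python
import numpy as np

# Hypothetical loadings for PC1: (a_11, a_12). They satisfy 0.6^2 + 0.8^2 = 1.
a1 = np.array([0.6, 0.8])

# Hypothetical (centered) data for individual i: (x_1i, x_2i).
x_i = np.array([2.0, -1.0])

# Score of individual i on PC1: a_11 * x_1i + a_12 * x_2i.
score_1i = float(a1 @ x_i)
print("PC1 score for individual i:", round(score_1i, 3))
```

By hand: 0.6 × 2.0 + 0.8 × (−1.0) = 1.2 − 0.8 = 0.4. The loadings are shared by everyone; the score changes from one individual to the next.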
Principal component analysis
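Of all linear combinations of the form $\phi_1 x_1 + \phi_2 x_2 + \dots + \phi_p x_p$,
$PC_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1p} x_p$ is the one with the highest variance (subject to the constraint that $\phi_1^2 + \phi_2^2 + \dots + \phi_p^2 = 1$)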
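One way to check the "highest variance" claim numerically is to compare PC1 against many other linear combinations whose coefficients also have squared sum 1. This is a sketch on simulated data, with the PC1 coefficients taken from the SVD of the centered data; all variable names are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data with p = 5 correlated variables (hypothetical).
X = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 5))
Xc = X - X.mean(axis=0)

# PC1 coefficients (a_11, ..., a_1p): first right singular vector of the centered data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
a1 = Vt[0]
var_pc1 = (Xc @ a1).var(ddof=1)

# 1000 random coefficient vectors phi, each rescaled so phi_1^2 + ... + phi_p^2 = 1.
phis = rng.normal(size=(1000, 5))
phis /= np.linalg.norm(phis, axis=1, keepdims=True)
var_random = (Xc @ phis.T).var(axis=0, ddof=1)

print("Variance of PC1:", round(var_pc1, 3))
print("Largest variance among random combinations:", round(var_random.max(), 3))
```

The constraint matters: without it, we could make the variance as large as we like just by multiplying every coefficient by a bigger number.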
Lab 4: principal component analysis
What's Next?