Welcome (back) to Stat 494: Statistical Genetics!
While we wait to get started...
Goals for today
Journal Club #6
Topic: Genetic Ancestry Inference
Discussion Leaders: Nick, Ronan, Sam
Journal Club Debrief
Stretch Break!
Journal Club Debrief
Key Points:
Figure 5B. The proportion of individuals that self-report as European American, Latino, and African American for each 2% bin of African ancestry and Native American ancestry. The proportion for each 2% bin is shown as a pie chart, with slices colored in proportion to the absolute numbers of individuals from each self-reported identity that carry those levels of genome-wide ancestry. Pie charts are omitted for bins where there were no individuals with those corresponding levels of Native American and African ancestry.
Connections to previous topics:
p. 38 (Methods)
p. 39 (Methods)
Statistical methods for inferring genetic ancestry
Supervised (requires labels or outcome values) → need some people with known ancestry so we can train the model
Unsupervised (doesn't require labels or outcome values)
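To make the unsupervised case concrete, here is a minimal sketch using PCA on simulated genotypes. Everything in it is an assumption for illustration: the two simulated populations, their allele frequencies, and the variable names are hypothetical and not taken from the paper or the lab. The population labels are used only at the end to check the result, never to fit anything.

```python
import numpy as np

rng = np.random.default_rng(0)

n_per_pop, n_snps = 100, 500

# Hypothetical allele frequencies for two simulated populations.
freq_a = rng.uniform(0.1, 0.9, size=n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.15, size=n_snps), 0.05, 0.95)

# Genotypes are allele counts (0, 1, or 2) at each SNP.
geno_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
geno_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
genotypes = np.vstack([geno_a, geno_b]).astype(float)

# True labels, kept aside: the unsupervised method never sees them.
true_pop = np.repeat([0, 1], n_per_pop)

# PCA via the SVD of the column-centered genotype matrix.
centered = genotypes - genotypes.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
pc1 = u[:, 0] * s[0]  # each person's score on the first principal component

# Split people by the sign of PC1 and compare with the held-out labels.
# The sign of a principal component is arbitrary, so check both orientations.
inferred = (pc1 > 0).astype(int)
agreement = max(np.mean(inferred == true_pop), np.mean(inferred != true_pop))
print(f"Agreement between PC1 split and true population: {agreement:.2f}")
```

A supervised method would instead use a reference panel with known labels to train a classifier; the sketch above only looks for structure in the genotypes themselves.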
Journal Club Debrief - What's Next?
Journal Club #7:
What's Next?
Upcoming Project Checkpoints
Due Dates:
To help with this, let's do a quick round of project "speed dating"...
Project "Speed Dating" - Round 1
Discuss:
Project "Speed Dating" - Round 2
Move to another table.
Make sure there are at least 2 people you didn't talk to during the last round.
Discuss:
Project "Speed Dating" - Round 3
We'll do this one more time in class on Thursday.
(And then I'll give you time in class to complete the Project Preferences Survey.)
When you get to class on Thursday, sit with at least 2 people you didn't talk to today!
What's Next?
Understanding PCA
Lab 4: check-in
Part 1:
Lab 4: work time
Continue working on Parts 2 and 3!
NOTE:
Challenge: genetic data are observational
Genetic ancestry is a potential confounding variable in GWAS:
the allele frequency of the SNP we're testing differs across ancestral populations (ancestry is associated with genotype)
environmental factors, or causal SNPs elsewhere in the genome, differ across ancestry groups (ancestry is associated with the trait)
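A small simulation can illustrate how these two conditions together produce a spurious association. This is a sketch under made-up assumptions: two hypothetical ancestry groups, allele frequencies of 0.2 and 0.7, and a trait that depends on group membership only. The SNP has no effect on the trait by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Hypothetical ancestry group for each person (0 or 1).
pop = rng.integers(0, 2, size=n)

# Condition 1: the SNP's allele frequency differs across groups.
allele_freq = np.where(pop == 0, 0.2, 0.7)
snp = rng.binomial(2, allele_freq)

# Condition 2: something else that differs across groups (for example
# an environmental factor) shifts the trait. The SNP itself has no effect.
trait = 1.0 * pop + rng.normal(0, 1, size=n)

# Pooled across groups, the SNP and the trait are correlated...
overall_corr = np.corrcoef(snp, trait)[0, 1]

# ...but within each group the correlation is close to zero.
within_corr = [
    np.corrcoef(snp[pop == g], trait[pop == g])[0, 1] for g in (0, 1)
]

print(f"Overall correlation: {overall_corr:.3f}")
print(f"Within-group correlations: {within_corr[0]:.3f}, {within_corr[1]:.3f}")
```

The pooled correlation comes entirely from group membership, which is what makes ancestry a confounder here.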
If we know that genetic ancestry is a potential confounding variable, then we should adjust for it in our GWAS models:
$E[y \mid x_j, \pi] = \alpha + \beta_j x_j + \gamma \pi$,
where $\pi$ (genetic ancestry) is inferred using PCA or other techniques!
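The adjusted model above can be sketched with ordinary least squares. The assumptions here are loud ones: the data are simulated, the true SNP effect is set to zero, and a known group indicator stands in for the ancestry term (in practice this would be inferred, for example as principal component scores). The helper `ols` is a hypothetical name, not a function from any GWAS package.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000

# Stand-in for the ancestry term: a known 0/1 group indicator.
# In a real analysis this would be inferred (e.g., PC scores).
ancestry = rng.integers(0, 2, size=n).astype(float)

# Genotype at SNP j: allele frequency depends on ancestry.
snp = rng.binomial(2, np.where(ancestry == 0, 0.2, 0.7)).astype(float)

# Trait depends on ancestry only, so the true SNP effect is 0.
trait = 1.0 * ancestry + rng.normal(0, 1, size=n)


def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Unadjusted model: trait ~ intercept + snp
beta_unadjusted = ols([snp], trait)[1]

# Adjusted model: trait ~ intercept + snp + ancestry
beta_adjusted = ols([snp, ancestry], trait)[1]

print(f"SNP effect estimate, unadjusted: {beta_unadjusted:.3f}")
print(f"SNP effect estimate, adjusted:   {beta_adjusted:.3f}")
```

In this simulation the unadjusted estimate is pulled away from zero by ancestry, while the adjusted estimate sits near the true value of zero.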