Alpha and Beta Microbial Diversity Metrics
Ranacapa Analysis
Tutorial by Chris Dao and Amanda Freise
UCLA MIMG Dept.
PUMA
Biodiversity can be measured at multiple scales
http://www.webpages.uidaho.edu/veg_measure/Modules/Lessons/Module%209(Composition&Diversity)/9_2_Biodiversity.htm
α-diversity: within an individual sample
Site A = 7 species
Site B = 5 species
β-diversity: diversity between multiple samples
Sites A and C have the highest β-diversity:
10 species that differ between them,
only 2 species in common
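The two scales can be sketched with Python sets. This is a minimal illustration: the species labels and the exact composition of each site are hypothetical placeholders, and only the counts (7 and 5 species, 10 differing, 2 in common) come from the slide.

```python
# Hypothetical species lists; only the counts match the slide.
site_a = {"sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7"}
site_b = {"sp1", "sp2", "sp3", "sp4", "sp8"}
site_c = {"sp6", "sp7", "sp9", "sp10", "sp11", "sp12", "sp13"}

def richness(site):
    """Alpha diversity as species richness: number of species in one sample."""
    return len(site)

def shared_and_differing(x, y):
    """Return (species in common, species found in only one of the two sites)."""
    return len(x & y), len(x ^ y)

alpha_a = richness(site_a)
alpha_b = richness(site_b)
shared_ac, differing_ac = shared_and_differing(site_a, site_c)
print(alpha_a, alpha_b, shared_ac, differing_ac)
```

Richness is only one possible alpha diversity measure, and counting differing species is only one way to express beta diversity; both are used here because they match the numbers on the slide.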
Defining alpha diversity
Accurately estimating diversity
Samples may have inconsistent sequencing results
Rarefaction: a normalization method
Rarefaction curves indicate species coverage
Number of sequences
most or all species have been sampled
this site has not been exhaustively sampled
only a small fraction of species has been sampled
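One way to draw such a curve is to compute the expected number of species at increasing numbers of sequences. The sketch below uses the standard rarefaction formula for sampling without replacement; the read counts are hypothetical, and this is not necessarily how Ranacapa computes its curves.

```python
from math import comb

def expected_richness(counts, depth):
    """Expected number of species seen in a random subsample of `depth` reads
    drawn without replacement from a sample with the given per-species counts."""
    total = sum(counts)
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    all_draws = comb(total, depth)
    # For each species: 1 minus the probability that the subsample misses it.
    return sum(1 - comb(total - n, depth) / all_draws for n in counts if n > 0)

# Hypothetical read counts per species for one sample (100 reads, 7 species).
counts = [50, 30, 10, 5, 3, 1, 1]
depths = [10, 25, 50, 100]
curve = [expected_richness(counts, d) for d in depths]
for d, s in zip(depths, curve):
    print(d, round(s, 2))
```

A curve that is still climbing at the largest depth suggests the site has not been exhaustively sampled; a curve that has flattened suggests most or all species have been sampled.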
Rarefaction curves
Compare subsample of 50k reads in each sample
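Subsampling every sample to the same number of reads can be sketched as follows. The sample names, taxa, and counts are hypothetical, and a depth of 100 stands in for the 50k reads mentioned on the slide so the toy example stays small.

```python
import random

def rarefy(counts, depth, seed=0):
    """Randomly keep `depth` reads from one sample, without replacement.

    `counts` maps taxon -> read count. Returns a new dict with the same keys.
    """
    total = sum(counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    # Expand the counts into one entry per read, then subsample.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    kept = random.Random(seed).sample(reads, depth)
    rarefied = {taxon: 0 for taxon in counts}
    for taxon in kept:
        rarefied[taxon] += 1
    return rarefied

# Hypothetical samples with unequal sequencing depth.
samples = {
    "sample_1": {"taxon_a": 600, "taxon_b": 300, "taxon_c": 100},
    "sample_2": {"taxon_a": 90, "taxon_b": 10, "taxon_c": 0},
}
depth = 100
rarefied = {name: rarefy(c, depth) for name, c in samples.items()}
print(rarefied)
```

After rarefaction both samples contain exactly 100 reads, so differences in richness are no longer driven by differences in sequencing depth. This sketch raises an error when a sample has fewer reads than the chosen depth; how such samples are handled is an analysis decision.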
Ranacapa provides several built-in analyses
Once we calculate alpha diversity, how can we ask scientifically meaningful questions?
Are some types of samples more or less diverse than other samples?
How do we categorize samples? METADATA!
Are these community differences significant?
Objective: explore the relationship between environmental parameters (metadata) and diversity
METADATA - Data about the data
Compare categories:
-Two samples directly (Sample A vs. Sample B)
-Two groups of samples (Burned samples vs. Unburned samples)
-Multiple groups of samples (Low vs. Medium vs. High soil phosphate levels)
For our analyses, metadata needs to be categorical (low, med., high) rather than continuous (e.g., 3.2, 5.3, 8.5)
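Converting a continuous measurement into categories can be sketched like this. The cutoffs `low_max` and `medium_max` are hypothetical; the slide does not state which thresholds were used.

```python
def to_category(value, low_max, medium_max):
    """Bin a continuous measurement into low / medium / high.

    `low_max` and `medium_max` are hypothetical cutoffs chosen by the analyst.
    """
    if value <= low_max:
        return "low"
    if value <= medium_max:
        return "medium"
    return "high"

phosphate = [3.2, 5.3, 8.5]  # example values from the slide
categories = [to_category(v, low_max=4.0, medium_max=7.0) for v in phosphate]
print(categories)
```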
Metadata Table Example
Sample ID | Team Name | Sample Location | Burn status | Phosphate content |
S18.K0010A2 | LIT | Skirball | Burned | Low |
S18.K0011C1 | FIF | Skirball | Unburned | Low |
S18.K0011C2 | FIF | Skirball | Burned | Medium |
S18.K0033C1 | SIT | Skirball | Unburned | Medium |
S18.K0033C2 | GoB | Botanical Garden | Unburned | Medium |
S18.K0145B1 | TDD | Botanical Garden | Unburned | Low |
S18.K0147C2 | BBB | Skirball | Unburned | Medium |
S18.K0148A1 | AL-Gs | Skirball | Burned | Low |
S18.K0154C1 | SIT | Skirball | Burned | Low |
S18.K0154C2 | FC | Skirball | Unburned | Medium |
S18.K0191C1 | LIT | Skirball | Burned | Medium |
S18.K0192B2 | AL-Gs | Botanical Garden | Unburned | Low |
S18.K0192C1 | TDD/FC | Skirball | Burned | High |
S18.K0195C1 | BBB | Skirball | Burned | High |
Metadata Categories
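To see how the metadata categories split the samples into groups, the table can be tallied in a few lines. The rows below are copied from the table above (sample ID, burn status, phosphate content); the variable names are illustrative.

```python
from collections import Counter

# (sample ID, burn status, phosphate content) copied from the metadata table.
metadata = [
    ("S18.K0010A2", "Burned", "Low"),
    ("S18.K0011C1", "Unburned", "Low"),
    ("S18.K0011C2", "Burned", "Medium"),
    ("S18.K0033C1", "Unburned", "Medium"),
    ("S18.K0033C2", "Unburned", "Medium"),
    ("S18.K0145B1", "Unburned", "Low"),
    ("S18.K0147C2", "Unburned", "Medium"),
    ("S18.K0148A1", "Burned", "Low"),
    ("S18.K0154C1", "Burned", "Low"),
    ("S18.K0154C2", "Unburned", "Medium"),
    ("S18.K0191C1", "Burned", "Medium"),
    ("S18.K0192B2", "Unburned", "Low"),
    ("S18.K0192C1", "Burned", "High"),
    ("S18.K0195C1", "Burned", "High"),
]

# How many samples fall into each category?
burn_counts = Counter(burn for _, burn, _ in metadata)
phosphate_counts = Counter(phosphate for _, _, phosphate in metadata)

# Group sample IDs by burn status, ready for a group comparison.
by_burn = {}
for sample_id, burn, _ in metadata:
    by_burn.setdefault(burn, []).append(sample_id)

print(burn_counts)
print(phosphate_counts)
```

Checking group sizes first is useful because very small groups, such as the two "High" phosphate samples here, limit what a statistical comparison can show.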
Grouping samples together using metadata allows us to make inferences about ecological parameters, e.g., burned vs. unburned
T-test
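As a rough illustration of a two-group comparison, the sketch below computes a t-statistic for two groups of samples. It assumes the Welch (unequal-variance) form, which the slide does not specify, and the alpha diversity values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_1, group_2):
    """Welch's t-statistic for two independent groups (unequal variances allowed)."""
    standard_error = sqrt(
        variance(group_1) / len(group_1) + variance(group_2) / len(group_2)
    )
    return (mean(group_1) - mean(group_2)) / standard_error

# Hypothetical alpha diversity (observed richness) values per sample.
burned = [110, 125, 118, 122, 115]
unburned = [150, 160, 148, 155, 162]
t_stat = welch_t(burned, unburned)
print(round(t_stat, 2))
```

The further the t-statistic is from zero, the larger the difference between group means relative to the spread within the groups. Turning it into a p-value requires the t distribution, which a statistics package would normally handle.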
Variance
Image from: http://www.statisticshowto.com/sample-variance/
ANOVA, or Analysis of Variance
ANOVA example
ANOVA post-hoc tests
ANOVA example
Which ANOVA will have a lower p-value?
Recall that ANOVA partitions variance between categories and within categories
If more of the variance is between categories and less is within categories, then the categories explain the variance in the data well (lower p-value)
Comparison 1
Comparison 2
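The partitioning of variance can be made concrete with a small one-way ANOVA calculation. The two datasets below are hypothetical and are not the data plotted on the slide: one has overlapping groups, the other has well-separated groups.

```python
def one_way_anova(groups):
    """Return the main entries of a one-way ANOVA table for a list of groups."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    # Variation of group means around the grand mean (between categories).
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean (within categories).
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "F": f_stat,
    }

# Hypothetical alpha diversity values for three categories.
overlapping = [[10, 12, 14], [11, 13, 15], [10, 13, 16]]
separated = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]

table_overlapping = one_way_anova(overlapping)
table_separated = one_way_anova(separated)
print(round(table_overlapping["F"], 3), round(table_separated["F"], 3))
```

The well-separated dataset puts almost all of its variance between categories, so its F-statistic is far larger and its p-value would be correspondingly lower.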
ANOVA example
Recall that ANOVA partitions variance between categories and within categories
p=0.82
If more of the variance is between categories and less is within categories, then the categories explain the variance in the data well (lower p-value)
p=0.02
Comparison 1
Comparison 2
ANOVA table in Ranacapa
Alpha diversity plots
Comparison 1
Alpha diversity stats
Degrees of freedom: sample size
Sum/mean of squares: variance
F-statistic: higher value means more variance is explained
p-value: probability the observed F-statistic is due to chance
ANOVA post-hoc tests
Moving on to beta diversity…
Representation of beta-diversity
Beta-diversity distance matrix
Bray-Curtis distance matrix of 109BL-S18 samples made using QIIME2
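For intuition about what each cell of such a matrix holds, Bray-Curtis dissimilarity can be computed by hand. This is a sketch with hypothetical read counts, not the QIIME2 procedure used to build the matrix on the slide.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two samples given as equal-length count lists."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator

# Hypothetical read counts for three taxa in four samples.
counts = {
    "sample_1": [10, 0, 5],
    "sample_2": [10, 0, 5],
    "sample_3": [0, 15, 0],
    "sample_4": [5, 5, 5],
}
names = list(counts)
matrix = [[bray_curtis(counts[a], counts[b]) for b in names] for a in names]

for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

A value of 0 means two samples have identical composition and a value of 1 means they share no taxa, so the matrix is symmetric with zeros on the diagonal.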
Highly multidimensional datasets
ASV table (~2800 dimensions)
Distance matrix (14 dimensions)
Overview of ordination methods
Beta Diversity (Ordination)
Microbial community profile visualization
Qualitative question: do data points with the same metadata label cluster with each other?
Each dot represents a sample, colored based on its associated metadata category grouping
Beta Diversity in Ranacapa
Each dot represents a sample, colored based on its associated metadata category grouping
PCA is an axis transformation
Here’s a reduction from 2D to 1D
Our actual PCA is a reduction from 2800D to 2D or 3D!
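The 2D to 1D case is small enough to work through directly. The sketch below rotates the axes to the direction of greatest variance and keeps only that axis; the data points are hypothetical, and a real analysis would use a linear algebra library rather than this closed-form 2D shortcut.

```python
from math import atan2, cos, sin

def pca_2d_to_1d(points):
    """Project 2D points onto their first principal component.

    Returns (scores along the new axis, fraction of total variance kept).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the direction of greatest variance for a 2x2 covariance matrix.
    theta = 0.5 * atan2(2 * sxy, sxx - syy)
    scores = [x * cos(theta) + y * sin(theta) for x, y in centered]
    kept = sum(s * s for s in scores) / (n - 1)
    return scores, kept / (sxx + syy)

# Hypothetical 2D measurements that fall close to a straight line.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
scores, fraction_kept = pca_2d_to_1d(points)
print([round(s, 2) for s in scores], round(fraction_kept, 3))
```

Because the points lie almost on a line, a single new axis keeps nearly all of the variance, which is the same idea that lets a 2D or 3D plot summarize a much higher-dimensional table.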
PCoA and NMDS are similar to PCA
Beta Diversity Cluster Analysis
Image from ranacapa demo data
Beta-diversity group significance is calculated by permutational multivariate ANOVA
PERMANOVA is the same principle as an ANOVA, except it:
1. uses user-defined distance metrics instead of variance
2. permutes the dataset to assess statistical significance
R²: effect size (% variance explained)
P-value: statistical significance
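The two differences from ANOVA can be sketched in a short permutation test. This is a minimal illustration that partitions squared distances into total and within-group sums of squares; the distance matrix and labels are hypothetical, and it is not necessarily the exact routine Ranacapa runs.

```python
import random
from itertools import combinations

def pseudo_f_and_r2(dist, labels):
    """Partition squared distances into among-group and within-group parts."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    f_stat = (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return f_stat, ss_among / ss_total

def permanova(dist, labels, n_perm=999, seed=0):
    """Return (pseudo-F, R2, permutation p-value)."""
    rng = random.Random(seed)
    observed_f, r2 = pseudo_f_and_r2(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between labels and samples
        if pseudo_f_and_r2(dist, shuffled)[0] >= observed_f:
            hits += 1
    return observed_f, r2, (hits + 1) / (n_perm + 1)

# Hypothetical samples placed on a line so that distances are easy to check.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
labels = ["Burned"] * 3 + ["Unburned"] * 3
f_stat, r2, p_value = permanova(dist, labels)
print(round(f_stat, 1), round(r2, 3), p_value)
```

With only six samples there are very few distinct ways to shuffle the labels, so the p-value cannot become very small even though the groups are clearly separated. This is one reason group size matters when interpreting PERMANOVA results.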