Inference of single-cell phylogenies from lineage tracing data using Cassiopeia
Jones et al., Genome Biology 2020
Presented by: Kowshika Sarker
Contributions
CRISPR-Cas9
Existing Works
Cassiopeia
Cassiopeia-ILP
Steiner tree (ST) problem
$G = (V, E)$: $ST_{V_s}$ → a minimum-cost tree that spans $V_s \subseteq V$
$V_s = V$: $ST_{V_s}$ = MST → polynomial-time algorithm
In general, $ST_{V_s}$ is NP-hard
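To make the contrast between the two cases concrete, here is a minimal sketch on a toy graph. It is not Cassiopeia's ILP formulation: the helper names `mst_cost` and `steiner_cost` and the example graph are assumptions for illustration. The Steiner cost is found by brute force, trying every subset of optional (non-terminal) vertices and taking the cheapest spanning tree, which is exponential in the number of optional vertices.

```python
from itertools import combinations

def mst_cost(nodes, edges):
    """Kruskal's algorithm on the subgraph induced by `nodes`.

    Returns the MST cost, or None if the induced subgraph is disconnected.
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    cost, used = 0, 0
    candidates = sorted(
        (w, u, v) for (u, v), w in edges.items() if u in parent and v in parent
    )
    for w, u, v in candidates:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            cost += w
            used += 1
    return cost if used == len(parent) - 1 else None

def steiner_cost(vertices, edges, terminals):
    """Brute-force Steiner tree cost: try every set of optional vertices."""
    optional = [v for v in vertices if v not in terminals]
    best = None
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            c = mst_cost(set(terminals) | set(extra), edges)
            if c is not None and (best is None or c < best):
                best = c
    return best

# Toy graph (hypothetical): three terminals around one optional hub "s".
vertices = {"a", "b", "c", "s"}
edges = {("a", "s"): 1, ("b", "s"): 1, ("c", "s"): 1,
         ("a", "b"): 2, ("b", "c"): 2, ("a", "c"): 2}
terminals = {"a", "b", "c"}

direct = mst_cost(terminals, edges)                  # terminals only, no hub
steiner = steiner_cost(vertices, edges, terminals)   # hub allowed
full = steiner_cost(vertices, edges, vertices)       # V_s = V reduces to MST
```

Allowing the optional hub lowers the cost below what the terminals can achieve on their own, and when every vertex is a terminal the search collapses to a single MST computation.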
Cassiopeia-Greedy
Without prior
With prior
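The difference between splitting without and with a prior can be sketched as follows. This assumes a greedy rule that splits on the most frequent mutation, and, when priors are supplied, weights each count by the negative log of the mutation's prior probability. That weighting, the function name `choose_split`, and the toy data are assumptions for illustration, not necessarily the transform Cassiopeia uses.

```python
import math
from collections import Counter

def choose_split(cells, priors=None):
    """Pick the (character, state) mutation to split on; state 0 = unedited.

    Without priors: the most frequent mutation.
    With priors: frequency weighted by -log(prior), so rare indels count more.
    """
    counts = Counter(
        (j, s) for row in cells.values() for j, s in enumerate(row) if s != 0
    )

    def score(item):
        (j, s), n = item
        if priors is None:
            return n
        return n * -math.log(priors[j][s])  # assumed weighting

    return max(counts.items(), key=score)[0]

# Hypothetical character matrix: cell name -> state per target site.
cells = {
    "c1": [1, 2, 0],
    "c2": [1, 2, 0],
    "c3": [1, 0, 3],
    "c4": [0, 0, 3],
}
# Hypothetical priors: priors[character][state] = probability of that indel.
priors = {0: {1: 0.9}, 1: {2: 0.01}, 2: {3: 0.5}}

split_without_prior = choose_split(cells)
split_with_prior = choose_split(cells, priors)
```

In this toy case the unweighted rule picks the most common mutation, while the weighted rule prefers a rarer indel that is less likely to have arisen more than once.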
Perfect phylogeny
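A quick way to check whether a character matrix admits a perfect phylogeny is the pairwise compatibility test. The sketch below assumes irreversible mutations from an unedited root (state 0) and no missing data, and treats each (character, state) pair as one binary mutation; the function names are hypothetical. Under those assumptions a perfect phylogeny exists exactly when every two mutation-carrying cell sets are either disjoint or nested.

```python
from itertools import combinations

def mutation_sets(cells):
    """Map each (character, state) mutation to the set of cells carrying it."""
    sets = {}
    for name, row in cells.items():
        for j, s in enumerate(row):
            if s != 0:  # 0 = unedited
                sets.setdefault((j, s), set()).add(name)
    return sets

def admits_perfect_phylogeny(cells):
    """True if all mutation sets are pairwise disjoint or nested."""
    sets = list(mutation_sets(cells).values())
    return all(
        a.isdisjoint(b) or a <= b or b <= a
        for a, b in combinations(sets, 2)
    )

# Nested and disjoint mutation sets: compatible.
compatible = {"c1": [1, 2], "c2": [1, 2], "c3": [1, 0], "c4": [0, 0]}
# Sets {c1, c2} and {c2, c3} overlap without nesting: some mutation
# must have occurred more than once.
conflicting = {"c1": [1, 0], "c2": [1, 1], "c3": [0, 1]}

ok = admits_perfect_phylogeny(compatible)
bad = admits_perfect_phylogeny(conflicting)
```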
Cassiopeia-Hybrid
Split greedily until each subset falls below a cutoff (e.g., 200 cells), then solve each subset with the ILP
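The control flow of the hybrid strategy can be sketched as a recursion with a size cutoff. Everything here is a stand-in: `split_most_frequent` is a simplified greedy split, `solve_small` is a placeholder for the exact (ILP) solver that only returns the cell names, and the toy cutoff of 2 replaces the 200-cell example so the recursion is visible.

```python
from collections import Counter

def split_most_frequent(cells):
    """Split on the most frequent mutation not shared by every cell."""
    counts = Counter(
        (j, s) for row in cells.values() for j, s in enumerate(row) if s != 0
    )
    informative = {m: n for m, n in counts.items() if n < len(cells)}
    if not informative:
        return cells, {}  # nothing left to split on
    j, s = max(informative, key=informative.get)
    with_mut = {c: r for c, r in cells.items() if r[j] == s}
    without = {c: r for c, r in cells.items() if r[j] != s}
    return with_mut, without

def solve_small(cells):
    """Placeholder for the exact solver: just groups the cell names."""
    return tuple(sorted(cells))

def hybrid(cells, cutoff):
    """Greedy splits above the cutoff, exact solver at or below it."""
    if len(cells) <= cutoff:
        return solve_small(cells)
    left, right = split_most_frequent(cells)
    if not left or not right:
        return solve_small(cells)  # fall back if the split is trivial
    return (hybrid(left, cutoff), hybrid(right, cutoff))

# Hypothetical data: cell name -> state per target site (0 = unedited).
cells = {"a": [1, 0], "b": [1, 0], "c": [1, 3], "d": [0, 2], "e": [0, 2]}
tree = hybrid(cells, cutoff=2)
```

The cutoff trades accuracy against runtime: a larger cutoff hands bigger subproblems to the exact solver, a smaller one leans more on the greedy splits.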
Evaluation: Accuracy
Evaluation: Optimality
Evaluation: Scalability
Evaluation: Robustness
Bootstrapping analysis
Sampling with replacement
Character – M sampled from M
Tree – 100 sampled from 10
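The character-level resampling step can be sketched as below: M columns are drawn with replacement from the M original characters, and each replicate matrix would then be passed to the tree-building algorithm. The function name, the toy matrix, the seed, and the number of replicates are all arbitrary choices for illustration.

```python
import random

def bootstrap_characters(matrix, rng):
    """Resample the M characters (columns) with replacement, keeping M columns."""
    n_chars = len(matrix[0])
    picked = rng.choices(range(n_chars), k=n_chars)
    resampled = [[row[j] for j in picked] for row in matrix]
    return resampled, picked

# Hypothetical character matrix: rows are cells, columns are characters.
matrix = [
    [1, 0, 2],
    [1, 3, 0],
    [0, 3, 2],
]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_characters(matrix, rng) for _ in range(5)]
```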
Metric: Transfer Bootstrap Expectation (TBE)
Transfer distance: the minimum number of taxa that must be removed to make the two bipartitions identical
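A small sketch of how this distance feeds into TBE, following the usual definition: for a branch whose smaller side has p taxa, support is 1 minus the mean (over bootstrap trees) of the smallest transfer distance to any branch of that tree, divided by p − 1. Bipartitions are represented here by one of their sides, and the function names and toy trees are assumptions for illustration.

```python
def transfer_distance(side_a, side_b, taxa):
    """Fewest taxa to move or remove so the two bipartitions coincide."""
    complement_b = taxa - side_b
    return min(len(side_a ^ side_b), len(side_a ^ complement_b))

def tbe_support(branch, bootstrap_trees, taxa):
    """Transfer Bootstrap Expectation for one reference branch.

    `bootstrap_trees` is a list of trees, each given as a list of
    bipartition sides (sets of taxa).
    """
    p = min(len(branch), len(taxa - branch))  # size of the smaller side
    closest = [
        min(transfer_distance(branch, b, taxa) for b in tree)
        for tree in bootstrap_trees
    ]
    mean_distance = sum(closest) / len(closest)
    return 1 - mean_distance / (p - 1)

taxa = set("abcdef")
branch = {"a", "b", "c"}
bootstrap_trees = [
    [{"a", "b", "c"}, {"a", "b"}],   # contains the branch exactly
    [{"a", "b", "d"}, {"e", "f"}],   # closest branch is one taxon away
]
support = tbe_support(branch, bootstrap_trees, taxa)
```

Unlike the classical bootstrap proportion, which only counts exact matches, a near-match in a bootstrap tree still contributes partial support.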
Evaluation: Parallel Evolution
Cassiopeia-Greedy
Simulation Framework
Lineage Tracer Design Insights
Information Capacity (IC)
Indel Distribution
θ: 0 < θ < 1
Greater θ → better accuracy
Especially for low #states or large #samples
Experimental Dataset
34,557 cells
Evaluation: Experimental dataset
Categorical variable – Plate ID in experiment
Evaluation: Generalizability
GESTALT lineage tracing data
Emerging base-editor tracers
High-character low-state regimes
Varying mutation rate across target sites
Future Directions