Hereditary Stratigraphy Methods for Phylogenetic Inference over Distributed EC Populations
June 1, 2023 @ GPTP
Matthew Andres Moreno
Ecology & Evolutionary Biology/Complex Systems
University of Michigan
@MorenoMatthewA
slides: https://hopth.ru/cc
Phylogenetic Analysis in Evolutionary Computing
(Hernandez et al.)
perfect tracking
Serial perfect tracking: easy, efficient, & robust
Parallel & distributed perfect tracking: complex, potentially fragile, & expensive
Phylogenetic analysis in biology, done through post-hoc inference, is robust and decentralized.
🐶
🐰
infer
Research Question:
How to design genomes to maximize phylogenetic reconstructability?
new methodology & plug-’n’-play software tools
Talk structure
goal
how
result
focus on sexual populations
goal 1
genomes from asexual population ➔ phylogeny
how
genome + 📌 instrumentation
gen 0 ➔ gen 1 ➔ gen 2 ➔ gen 3
each generation appends one 📌 to the inherited instrumentation
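The build above can be read as a per-generation append: every offspring inherits its parent's 📌 records and adds one fresh record of its own. A minimal Python sketch of that reading follows. The class name `Column`, the 64-bit random record values, and the helper `last_common_rank` are illustrative assumptions, not the API of the software tools mentioned in this talk.

```python
import random


class Column:
    """Toy genome annotation: one random record (📌) per generation."""

    def __init__(self, records=None):
        # A founder (gen 0) starts with a single record at rank 0.
        if records is None:
            records = [random.getrandbits(64)]
        self.records = list(records)

    def clone_descendant(self):
        """Return an offspring annotation: inherited records plus one new one."""
        return Column(self.records + [random.getrandbits(64)])


def last_common_rank(a, b):
    """Rank of the last record two annotations share, or -1 if none.

    Records match while ancestry is shared and (barring a 1-in-2**64
    coincidence) differ after lineages split, so this rank estimates the
    generation of the most recent common ancestor (MRCA).
    """
    rank = -1
    for i, (x, y) in enumerate(zip(a.records, b.records)):
        if x != y:
            break
        rank = i
    return rank


founder = Column()                                     # gen 0
shared = founder.clone_descendant()                    # gen 1
left = shared.clone_descendant().clone_descendant()    # gen 3
right = shared.clone_descendant().clone_descendant()   # gen 3
print(len(left.records), last_common_rank(left, right))
```

Here the two gen-3 genomes each carry four records and agree on ranks 0 and 1 only, which places their split just after gen 1. No central tracker is consulted; the comparison uses only the two genomes.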
Instrumentation space vs. accuracy
how
Tradeoff: space vs. MRCA estimate uncertainty
⬇️ pruning ⬇️
Uncertainty when estimating MRCA
✂️ pocket slides (happy to say a little more in Q&A)
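To make the space vs. uncertainty tradeoff concrete, the sketch below prunes records with a simple fixed stride and reports the window in which the MRCA must fall. The stride rule and the names `extend_lineage`, `prune`, and `mrca_rank_bounds` are assumptions chosen for illustration; they are not claimed to be the pruning scheme used in the talk.

```python
import random


def extend_lineage(records, generations, rng):
    """Return a copy of `records` with one new random record per generation."""
    return records + [rng.getrandbits(64) for _ in range(generations)]


def prune(records, stride):
    """Keep every `stride`-th rank plus the newest rank, as {rank: record}."""
    newest = len(records) - 1
    return {r: rec for r, rec in enumerate(records)
            if r % stride == 0 or r == newest}


def mrca_rank_bounds(a, b):
    """Return (lower, upper) such that lower <= MRCA rank < upper.

    Assumes both annotations descend from one founder (rank 0 is shared)
    and ignores the tiny chance that unrelated records collide.
    """
    lower = None
    for rank in sorted(a.keys() & b.keys()):
        if a[rank] != b[rank]:
            return lower, rank
        lower = rank
    # No retained rank differs: any split lies after the last shared rank.
    return lower, min(max(a), max(b)) + 1


rng = random.Random(0)
trunk = extend_lineage([], 38, rng)    # shared ancestry: ranks 0..37
left = extend_lineage(trunk, 60, rng)  # two lineages, 60 generations apart
right = extend_lineage(trunk, 60, rng)
for stride in (1, 8, 32):
    kept = prune(left, stride)
    print(stride, len(kept), mrca_rank_bounds(kept, prune(right, stride)))
```

With every record kept (stride 1) the window is a single generation; as the stride grows, fewer records are stored and the window around the true MRCA rank (37 in this setup) widens accordingly.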
68 bytes/genome; 262,144 generations; population size 32,768 (subsample of 100 leaves shown)
Example phylogeny reconstruction
result
goal 1
genomes from sexual population ➔ phylogeny
how
<
Example phylogenetic reconstruction
result
goal 2
genomes from sexual population ➔ historical population size estimates
(plot axes: time, population size)
how
max(🎲,🎲,🎲,🎲,🎲,🎲,🎲,🎲,🎲) vs. max(🎲,🎲,🎲)
4 observations ➔ 95% CI spanning an 8-fold range
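One way to read the dice analogy: the maximum of many draws tends to be larger than the maximum of a few, so observed maxima carry information about how many draws were made. The sketch below assumes each 🎲 is an independent Uniform(0, 1) draw and that only the maximum is observed; under that assumption, the negative log of each maximum is exponentially distributed with rate n, which gives a maximum-likelihood estimate and an exact confidence interval. The function names are hypothetical, and this is not claimed to be the estimator used for the results in this talk.

```python
import math
import random


def erlang_cdf(x, k):
    """CDF of a Gamma(shape=k, rate=1) variable for integer k >= 1."""
    term, total = 1.0, 0.0
    for i in range(k):
        total += term          # adds x**i / i!
        term *= x / (i + 1)
    return 1.0 - math.exp(-x) * total


def erlang_quantile(p, k):
    """Invert erlang_cdf by bisection."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, k) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if erlang_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def estimate_draw_count(maxima, confidence=0.95):
    """Estimate n from k observed maxima of n Uniform(0, 1) draws each.

    Returns (point_estimate, lower_bound, upper_bound).
    """
    k = len(maxima)
    s = -sum(math.log(m) for m in maxima)  # s ~ Gamma(shape=k, rate=n)
    alpha = 1.0 - confidence
    point = k / s                          # maximum-likelihood estimate
    lower = erlang_quantile(alpha / 2.0, k) / s
    upper = erlang_quantile(1.0 - alpha / 2.0, k) / s
    return point, lower, upper


rng = random.Random(1)
maxima = [max(rng.random() for _ in range(9)) for _ in range(4)]
point, lower, upper = estimate_draw_count(maxima)
print(point, lower, upper, upper / lower)
```

Under these assumptions the ratio `upper / lower` depends only on the number of observations, and for 4 observations it comes out at roughly 8, consistent with the 8-fold range quoted on the slide.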
Example population size estimation
result
goal 3
detection of gene-level selection
how
Gen. n 📸 ➔ 16 generations
✂️ pocket slides
Detection of gene-level selection
result
(plot labels: allele frequency, signal)
Conclusion
From extant members of a distributed population,
Methods & tools that may be useful in your system,
Acknowledgment
Collaborators
Advisors
Emily Dolson
Santiago Rodriguez Papa
Charles Ofria
Luis Zaman
Bibliography
O'Neill, Bill. "Digital evolution." PLoS Biology 1.1 (2003): e18.
Dolson, Emily, and Charles Ofria. "Spatial resource heterogeneity creates local hotspots of evolutionary potential." ECAL 2017, the Fourteenth European Conference on Artificial Life. MIT Press, 2017.
Dolson, Emily, and Charles Ofria. "Ecological theory provides insights about evolutionary computation." Proceedings of the Genetic and Evolutionary Computation Conference Companion. 2018.
Lalejini, Alexander, et al. "Phylogeny-informed Lexicase Selection." GPTP (2023).
Hagstrom, George I., et al. "Using Avida to test the effects of natural selection on phylogenetic reconstruction methods." Artificial life 10.2 (2004): 157-166.
Lenski, Richard E., et al. "The evolutionary origin of complex features." Nature 423.6936 (2003): 139-144.
Kauffman, Stuart, and Simon Levin. "Towards a general theory of adaptive walks on rugged landscapes." Journal of theoretical Biology 128.1 (1987): 11-45.
Images
Eric Gaba for Wikimedia Commons, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
David Abián, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
Jose Guadalupe Hernandez, Alexander Lalejini, and Emily Dolson. 2022. Phylogenetic diversity predicts future success in evolutionary computation. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '22). Association for Computing Machinery, New York, NY, USA, 23–24. https://doi.org/10.1145/3520304.3534079
Questions?
Pruning: distribution
⬆️ evenly pruned vs. recency-proportional pruned ⬇️
more recent: more retention
more ancient: less retention
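The contrast between the two pruning distributions can be sketched by listing which ranks each one retains at a given time. Both rules below are simple stand-ins assumed for illustration (the function names, the `budget` and `per_scale` parameters, and the power-of-two spacing are not taken from the talk); the retention algorithms in the accompanying tools may differ.

```python
def retained_evenly(newest, budget=6):
    """Ranks kept under evenly spaced pruning with a fixed record budget.

    The spacing doubles whenever the budget would be exceeded, so at most
    budget + 1 records are held (the newest rank is always kept).
    """
    step = 1
    while newest // step + 1 > budget:
        step *= 2
    return sorted(set(range(0, newest + 1, step)) | {newest})


def retained_recency_proportional(newest, per_scale=2):
    """Ranks kept when gaps between records widen with age.

    For each spacing 1, 2, 4, ... the most recent `per_scale` multiples of
    that spacing are kept, so recent history is dense and ancient history
    is sparse; the record count grows logarithmically with `newest`.
    """
    keep = {0, newest}
    step = 1
    while step <= newest:
        top = (newest // step) * step
        for j in range(per_scale):
            if top - j * step >= 0:
                keep.add(top - j * step)
        step *= 2
    return sorted(keep)


for t in (8, 16):
    print(t, retained_evenly(t), retained_recency_proportional(t))
```

In both rules the ranks kept at a later time are a subset of those kept earlier plus the newly added rank, so each can be applied by discarding records only; nothing already dropped is ever needed again.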
435 bytes/genome; 262,144 generations; population size 32,768 (subsample of 100 leaves shown)
Example phylogeny reconstruction
result
detection of gene-level selection
how
a, b, c, d, e, f, g, h, i”, j
ab, cd, cde, i”j, hi”j, ghi”j, cdef, abcdefi’, abcdefghij, i’
Gen. n 📸 ➔ 16 generations
∴ ≥10 copies @ 16 gen
gen 0 ➔ gen 1 ➔ gen 2 ➔ gen 3 ➔ gen 4 ➔ gen 5 ➔ gen 6 ➔ gen 7 ➔ gen 8
rank 0, rank 1, rank 2, rank 3, rank 4, rank 5, rank 6, rank 7, rank 8
🗄️🗂️ oldest ➔ 🌟✨ newest (❌ on pruned records)
t=8 ⏩ 8 gens ⏩ t=16
(plot axes: time, space; dashed line: upper bound)
space complexity
O(n)
record alloc: recency-proportional, O(log n)
record alloc: uniform, O(1)