Investigating genome function in non-model mammals
Elinor K. Karlsson, Ph.D.
Associate Professor, University of Massachusetts Medical School
Director of Vertebrate Genomics, Broad Institute of MIT and Harvard
Rachel Johnston
Broad Institute Genomics Platform
Capacity today at the Broad Institute Genomics Platform: about one human genome every 3 minutes
Mersha, Tesfaye B., et al. "Resolving clinical phenotypes into endotypes in allergy: molecular and omics approaches." Clinical Reviews in Allergy & Immunology 60.2 (2021): 200-219
All of these (except proteomics & metabolomics?) are assayed with DNA sequencing technology
More sequencing capacity = we sequence more things
Also:
single cell sequencing
spatial transcriptomics
3D genome structure
CCGGGACGCGGCGGCCGCAGGGGCAGCGGCGACGGCAGCACCGGCGGCAGCACCGGCCGGAGCACCAGCGAGAGCAGCAGCAGCCGCGGCAGCGGCGTCCCGAGTGCCCGCGGCGCGCGGCGCAGCGATGCGGTCCCCACGGACGCGCGACCGGCCCGGGCGCCCCCTGAGCCTCCTGCTCGCCCTGCTCTGCGCCCTGCGAGCCAAGGTAGGAGCCTGCCGGGCCTCCCTCCCGGCCCGCCCTCTCCTTTCCCTCGCAGCTCCCTGGACGCCTGCGGCTGAGCGCCTCGAGCGGGCGCGGGAGCCCCCGGGCGCCCCGCTCCCCGGCTGGGTCCCCCGCGCCCGGGGGGCTGCACCTGGGCCGGGAAGTCCCGGACCTCCTACGTGTCCCCTCCACCCTCCGGGCGGGGCCGGGGGCTGTGTTCTCTCCGCTCCTCGCCCCCGGGAGCACTCGTCGGATTTCTCCGGCACTCGTGCATTTTGTGCTCGGGAATCACGTTGAGTCCTTGCACCCAGTTTTGCAAAATCCTTTTCACTCGGCGCCGGGGGCCCGTCGGGGCGGGCGGGGAGGGAGTCGCCGCTTCCACCCTCGGAGCAAACCCCTCTCCCCCGCGCTGACCTCCCTCCCCCCTCGCCGGCAGGTGTGCGGCGCCTCGGGCCAGTTCGAGCTGGAGATCCTGTCCATGCAGAACGTGAACGGGGAGCTGCAGAACGGGCACTGCTGCGGCGGCGTCCGGAGCCCGGGGGACCGCAAGTGCTCTCACGACGAGTGTGACACGTACTTCAAAGTGTGCCTCAAGGAGTACCAGTTCCGCGTCACGGCCGGGGGGCCCTGTAGCTTCGGGTCGGGCTCCACGCCAGTCCTCGGGGGCAACACCTTCAATCTCAAGGCCGGCCGTGGCAGCGAACGCAACCGCATCGTGCTGCCTTTCAGTTTCGCCTGGCCGGTGAGTGCGCCACGCGGGGAGGGCGGCCCGCCGGGGTCCCAGCCCGCGCCGGCGGAGCCCCGCGGCCTCGCCAGAGGGACGGCGGGTCTGGGCTGGAGCCTGCGCGCCGGCTGGCAAAGCCTTGCCGGGTGCGGCGTGAGGCGGCTGCGACTCCGGTTACGGTCTCCGCGGCCTCTTGCCTAGCGCGCGACAGTGGGGAGCCCGCGGGGGCTCGCGGGGGCTCGCGGGGCAAAGCTCCCAGGGAGGCGGGCTTATTAAACCTGCATCTAGAAGGCCCCAGAGTGACCCTACCCCCCCCCCCCCCCAGCTGGGCGTCTTTATGGACGATCTCTCTTTGCTTAACGAATTGAACCTGATGCGCCGTGGAAGGCGACGCGCAGTTCTGGCCTTCGAAGCCGTCCAAATGGTCACTCCCCCCTTTCTCGTGAGCTGCCGCGAGGGCGGGTGTGCCCTTCCTGGAGGGCGTGGGGGAGCCAGTTTCCCGCCGCTGCCCGGGAGACTTTGGGGCGTGCGGGGACGCGCTCCGGGCTGGACGGGAGGGTGCGGGGGCGGCCGCCGCCTGGGAGGTGGGTGGGGGTTGACTCTGGGCCGAGGCTGGAGCCGGGAGCCCGAGAGCCTCCGCCCCCTGCCGGCCCCTTCCTGCCCTCGCCCCCGCGCGGTGTAGCCCTTGTGGCCTCACTTCCAGCGGTTCCCGGGCTCTGGAACCAGCTGTGTTTGCAAACTTCCCCGGGAAGGGCGGGGGTGCGCCCTCGTCCGGCGGGCTCCGGCCCCCGCTAGCCTCCGGGGGCGTGTCAAGGCCTGGAGGGGGCGGCCCGCTGGAGGCGGCTGCGGAGAAAGTGCCAGGGCTCGGACCCCTGCCCCGGGCGCCGCTCGAGGCCCCGGGCCCGGCTGGCCCGCCCTGCAACCACCTTTCAGTTTC
?
3.2 billion bases of DNA
Big question: what positions in the genome are functional / important in human health / good therapeutic targets?
Comparative genomics
In 100 million years, across all mammals*, all possible mutations have been tested
* placental
If we don’t see any changes at a position in the genome, that doesn’t mean it hasn’t changed.
It means that any individual with that change didn’t have descendants.
branch length = average number of nucleotide substitutions per genomic position
Dataset | # samples | Branch length | Expected number of false positives*
29 Mammals Project | 29 | 4.9 | 22,995,049
200 Mammals Project | 241 | 16.6 | 191
Humans only (gnomAD) | 71,702 | 0.17 | 2,604,359,690
* expected number of 100% conserved positions by random chance
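The numbers in the table are consistent with a simple Poisson model: if the branch length L is the average number of substitutions per position, the chance that a position shows no substitution at all is e^(-L), and the expected number of such positions is the genome size times e^(-L). The sketch below reproduces the table under that assumption. The genome size of about 3.088 billion positions is inferred from the table values and is not stated on the slides, and the function name is hypothetical.

```python
import math

# Assumed number of genome positions; inferred from the table, not stated in the slides.
GENOME_SIZE = 3.088e9

def expected_conserved_by_chance(branch_length, genome_size=GENOME_SIZE):
    """Expected count of positions with zero substitutions under a Poisson model."""
    return genome_size * math.exp(-branch_length)

# Branch lengths taken from the table above.
branch_lengths = {
    "29 Mammals Project": 4.9,
    "200 Mammals Project": 16.6,
    "Humans only (gnomAD)": 0.17,
}

expected = {name: expected_conserved_by_chance(bl) for name, bl in branch_lengths.items()}
for name, value in expected.items():
    print(f"{name}: {value:,.0f}")
```

Under this model the long total branch length of the 241-genome alignment is what drives the expected count of chance-conserved positions down to a few hundred, while the much larger human-only sample, with its short branch length, leaves most of the genome unchanged by chance.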
The 200 mammals project
Resolving sequence conservation to single base pair
Kerstin Lindblad-Toh, Uppsala University & Broad Institute
The 240+ placental mammals project
The Zoonomia project
UMass Chan Medical School (Elinor Karlsson, Zhiping Weng)
Uppsala University (Kerstin Lindblad-Toh, Jennifer Meadows)
Carnegie Mellon University (Andreas Pfenning)
Earlham Institute (Wilfried Haerty)
Harvard School of Dental Medicine (Martin Nweeia)
Inst of Evolutionary Biology, Univ Pompeu Fabra (Tomas Marques-Bonet & Arcadi Navarro)
Institute for Systems Biology, Seattle (Arian Smit and Robert Hubley)
Karolinska Institutet/UNC (Pat Sullivan)
Max Planck Institute (Michael Hiller)
San Diego Zoo (Oliver Ryder)
Senckenberg Research Institute (Irina Ruf)
ZoonomiaProject.org
Smithsonian-Mason School of Conservation (Klaus-Peter Koepfli)
Texas A & M (Bill Murphy)
Texas Tech (David Ray)
University College Dublin (Emma Teeling and Graham Hughes)
University of California San Francisco (Katie Pollard)
University of California, Davis (Harris Lewin)
University of California, Riverside (Mark Springer)
University of California, Santa Cruz (Benedict Paten, Beth Shapiro)
University of East Anglia (Federica Di Palma)
University of Nevada Las Vegas (Allyson Hindle)
University of Southern California (Steven Gazal)
University of Maine (Danielle Levesque)
Zoonomia Consortium
Cynthia Steiner
Aryn Wilder
Oliver Ryder
Frozen zoo: 10,000 vertebrate animals representing over 1,100 taxa, including over 200 at-risk species.
Problem: we needed to maximize species diversity
Nicole Foley
Mark Springer
Bill Murphy
Humans
Dogs
Cats
Sheep
Cattle
Horse
Goats
Mouse
Guinea pig
Ferret
Making genome alignment (example)
Kumar, Sudhir, and Alan Filipski. "Multiple sequence alignment: in pursuit of homologous DNA positions." Genome research 17.2 (2007): 127-135.
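As a toy illustration of what a multiple alignment makes possible, the sketch below lines up homologous positions from a few species and reports the columns where every species carries the same base. The sequences and species names are made up, and the function is hypothetical; it is not how Cactus or phyloP work, only a minimal picture of "no change observed at this position".

```python
# Toy alignment: rows are species, columns are homologous positions ("-" marks a gap).
# All sequences are invented for illustration.
alignment = {
    "species_A": "ACGTTGCA",
    "species_B": "ACGTAGCA",
    "species_C": "ACG-TGCA",
    "species_D": "ACGTTGCT",
}

def conserved_columns(aln):
    """Return indices of columns where all species share one base and none has a gap."""
    rows = list(aln.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {row[i] for row in rows}
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved

result = conserved_columns(alignment)
print(result)
```

With only four species, many columns look conserved by chance, which is why the number of species and the total branch length matter so much.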
Reference-free alignment with Cactus
Benedict Paten & Joel Armstrong
… ~$100K, 2.1 million core hours, and many months later …
3.2 billion bases of DNA
Scored conservation using PhyloP
Katie Pollard
Michael Dong
Conservation & functional scores in promoter of LDLR
phyloP
variant effect
Zoonomia
more variants in humans
ClinVar
“benign”
ClinVar
“pathogenic”
deleterious
score > 20
(CADD)
More constrained across mammals
Cornelia Fanter
Diane Genereux
Linda Goodman
Michael Hiller
Allyson Hindle
Graham Hughes
Irene M. Kaplow
Amanda Kowalczyk
Danielle Levesque
Irina Ruf
What’s Next? Forward genomics / genotype to phenotype
Note: these are species-level or population-level phenotypes, not individual phenotypes.
Forward genomics
Today
Ancient past
428-way human-referenced TOGA alignment of protein-coding sequences (Michael Hiller and Bogdan Kirilenko)
240-way reference-free whole genome Cactus alignment
Challenge 1: Genomics
Challenge 2: Methods
Challenge 3: Phenotypes
Dog genomics
In genomics, sample size is more important than (almost) anything else
We don’t know exactly which phenotype to measure to maximize power to detect genetic associations
We know more samples gives us more power
Solution: survey data
Useful for disease prediction
Useful for developing therapeutics & interventions
more detailed phenotypes required
single-cell phenotypes increasingly important
Behavior & health are a complex interaction between genetics and environment
15,000-20,000 years ago: domestication from ancestral wolf
global proliferation of dogs
more direct human influence on selection
1800s - present: creation of modern breeds
Health and Disease
Behavior and Disposition
Movies by Alice Moon-Fanelli
normal behavior done too much
distressing, time-consuming and impairing
onset in adolescence
poor response to treatment (including SSRIs)
highly heritable
Compulsive disorder in dogs
Compare 92 affected and 67 healthy Dobermans
chromosome
1 gene: CDH2
Dodman et al., Molecular Psychiatry (2010)
Noh et al., Nature Communications (2017)
CDH2
unknown
Are dog breeds useful?
Problems:
Hard to get enough dogs in one breed
Breed comparisons are confounded by aesthetics
Regions of association are huge
Miss the (huge) power offered by admixed dogs
Advantages:
Only a sparse marker panel required
More power because of reduced genetic diversity
Breeds have characteristic phenotypes (?)
Open data
DNA extraction, sequencing, and imputation
directly to Gencove
owner activates kit and swabs dog
directly to owner
kit supplier ships to owner
owner funds
grant funds
fill out surveys on:
open canine
survey & genetic
data set
Streamlined approach to data collection
DarwinsArk.org
Brittney Logan
You can sequence your dog for only $150!!!
www.darwinsark.org
Ancestry-inclusive dog genomics challenges popular breed stereotypes
Morrill et al. Science. 2022.
PhD Candidate Kathleen Morrill at UMass Chan Medical School
A community science and open data platform for studies of companion and working animal behavior and health
Darwin’s Dogs Project
Working Dog Program
Dog Cancer Project
behavior and health in companion dogs
work history and enrollment of working dogs
cancer history and enrollment for cancer projects
enrolled: 30,926 dogs
sequenced: >3,250 dogs
Agonistic Threshold
Dog Sociability
Environmental Engagement
Proximity Seeking
Human Sociability
Arousal Level
Toy-directed Motor Patterns
Biddability
We identified 8 major behavioral factors
We assembled a diverse genetic cohort of 2,155 dogs
Most mutts were not “simple” cross-breeds
We successfully map traits in a diverse cohort
1x10^-102
N=1783
When your dog is standing next to someone of average height, how high are their shoulders?
N=1951
Removing all dogs with >50% breed ancestry
N=970
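The filtering step can be sketched as follows, assuming it means dropping any dog in which a single breed accounts for more than 50% of ancestry. That reading, the ancestry fractions, and the function name are all assumptions for illustration; none of the values are study data.

```python
# Hypothetical ancestry fractions per dog (made-up values, not study data).
dogs = {
    "dog_1": {"breed_a": 0.95, "breed_b": 0.05},
    "dog_2": {"breed_a": 0.40, "breed_b": 0.35, "breed_c": 0.25},
    "dog_3": {"breed_b": 0.55, "breed_c": 0.45},
    "dog_4": {"breed_a": 0.50, "breed_c": 0.50},
}

def highly_admixed(ancestry, max_fraction=0.5):
    """Keep dogs whose largest single-breed ancestry does not exceed max_fraction."""
    return {
        name: fractions
        for name, fractions in ancestry.items()
        if max(fractions.values()) <= max_fraction
    }

kept = highly_admixed(dogs)
print(sorted(kept))
```

An association that is still detectable in the remaining dogs is less likely to be driven by a handful of breeds with distinctive traits.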
Associations for behavior appear as well
DOG howls
Never
Rarely
Sometimes
Often
Always
N=948
Locus 1
Locus 2
Sodium channel SCN3A (Nav1.3) regulation of human cerebral cortical folding and oral motor development
Smith et al. Neuron. 2018.
Breed ancestry affects appearance
Effect of ancestry on ear shape
linear mixed-effects regression
Data:
Surveys + DNA
Predicting ancestry from appearance is hard
Data:
Surveys + DNA
MuttMix (muttmix.org): 26,639 participants guessed breed ancestry of 30 mutts
Effect of ancestry on howling
Huskies howl more often than other dogs.
Beagles also howl more than other dogs.
Owners say:
Ancestry also affects behavior, but to a lesser extent
Biddability distinguishes breeds and is heritable
Rare!
Data:
Surveys + DNA
Canine & human compulsive disorders
over-grooming, flank sucking
blanket ingesting
trances, fixations
repetitive movement, circling or tail chasing
stereotyped canine behavior
Canine compulsive behaviors
Food allergies and related behaviors
What about cancer?
Goal: Expand sample sizes of dog cancer studies
Approach: Darwin’s Ark for cancer
Diagnosis & response to therapy: Owner surveys & electronic medical records
GxE: Environment through owner surveys & passive sampling
Germline genome: saliva samples collected by owners
Cancer genome: Blood samples through their veterinarian + blood biopsy
Can we extend this to cancer?
Cancer project newly launched: darwinsark.org/cancer-project/
+ environmental surveys
Environment surveys (3 surveys with 10 questions)
Geographic location
Features of home (year built, fuel sources, flooring)
Use of pesticides / preventatives (e.g. for ticks)
Food and water sources
+ passive environmental sampling
Polycyclic aromatic hydrocarbons (PAHs)
Flame Retardants
Pesticides/Herbicides
Phthalates
Kim Anderson and Peter Hoffman @Oregon State
+ blood biopsy
Broad Institute Blood Biopsy Team and Genomics Platform
Cell-free DNA includes tumor DNA
PI: Elinor Karlsson
Karlsson Lab
Funding and Support:
NHGRI R01 HG008742
NHGRI U24 HG009446
NIMH R21 MH109938
NIA U19 AG057377
NCI R01 CA255319
NCI R37 CA218570
NCI F32 CA247088
OD R24 OD018250
NSF EF-2022007
A very special thanks to Dr. Diane M. Riccio and Mr. Daniel J. Riccio Jr. for supporting my travels to present this research today.
@morrilleen
kmorrill@broadinstitute.org
Thank you to all of our study participants and their dogs!
kathleen.morrill@umassmed.edu
Staff Members:
Brittney Kenney
Michele Koltookian
Project Support:
Charlie Lieu
Past Fellows:
Hyun Ji Noh
Linda Boettger
Jesse McClure
PhD Students:
Gaurav Chauhan
Shirley Xue Li
Kathleen Morrill
Chris Husted
Post-doc Fellows:
Kathryn Lord
Kate Megquier
Michelle White
Lucas Moreira
Rachel Daniels
Scientists:
Ross Swofford
Jason Turner-Meier
Diane Genereux
Jessica Hekman
UMass BIB
Zhiping Weng
Mingshi Gao
Andrés Colubri
Yinan Dong
darwinsark.org
darwinsark.org/muttomics
How does breed affect the probability of finding a given dog personality?
Andres Colubri and Yinan Dong