1 of 98

Investigating genome function in non-model mammals

Elinor K. Karlsson, Ph.D.

Associate Professor, University of Massachusetts Medical School

Director of Vertebrate Genomics, Broad Institute of MIT and Harvard

2 of 98

3 of 98

4 of 98

5 of 98

6 of 98

7 of 98

Rachel Johnston

8 of 98

Broad Institute Genomics Platform

9 of 98

Capacity today at the Broad Institute Genomics Platform: about one human genome every 3 minutes
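As a back-of-the-envelope check, the stated rate can be converted into daily and yearly totals. This is a minimal sketch that assumes the platform runs around the clock at exactly one genome per 3 minutes; the slide only gives the rate approximately, so the totals are illustrative.

```python
# Rate taken from the slide; continuous 24/7 operation is an assumption.
MINUTES_PER_GENOME = 3


def genomes_per_day(minutes_per_genome: float) -> float:
    """Genomes completed in 24 hours at a constant rate."""
    return 24 * 60 / minutes_per_genome


def genomes_per_year(minutes_per_genome: float, days: int = 365) -> float:
    """Genomes completed in a year of uninterrupted operation."""
    return genomes_per_day(minutes_per_genome) * days


per_day = genomes_per_day(MINUTES_PER_GENOME)
per_year = genomes_per_year(MINUTES_PER_GENOME)
print(f"{per_day:.0f} genomes/day, {per_year:,.0f} genomes/year")
```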

10 of 98

Mersha, Tesfaye B., et al. "Resolving clinical phenotypes into endotypes in allergy: molecular and omics approaches." Clinical Reviews in Allergy & Immunology 60.2 (2021): 200-219

All of these (except proteomics & metabolomics?) are assayed with DNA sequencing technology

More sequencing capacity = we sequence more things

Also:

single cell sequencing

spatial transcriptomics

3D genome structure

11 of 98

CCGGGACGCGGCGGCCGCAGGGGCAGCGGCGACGGCAGCACCGGCGGCAGCACCGGCCGGAGCACCAGCGAGAGCAGCAGCAGCCGCGGCAGCGGCGTCCCGAGTGCCCGCGGCGCGCGGCGCAGCGATGCGGTCCCCACGGACGCGCGACCGGCCCGGGCGCCCCCTGAGCCTCCTGCTCGCCCTGCTCTGCGCCCTGCGAGCCAAGGTAGGAGCCTGCCGGGCCTCCCTCCCGGCCCGCCCTCTCCTTTCCCTCGCAGCTCCCTGGACGCCTGCGGCTGAGCGCCTCGAGCGGGCGCGGGAGCCCCCGGGCGCCCCGCTCCCCGGCTGGGTCCCCCGCGCCCGGGGGGCTGCACCTGGGCCGGGAAGTCCCGGACCTCCTACGTGTCCCCTCCACCCTCCGGGCGGGGCCGGGGGCTGTGTTCTCTCCGCTCCTCGCCCCCGGGAGCACTCGTCGGATTTCTCCGGCACTCGTGCATTTTGTGCTCGGGAATCACGTTGAGTCCTTGCACCCAGTTTTGCAAAATCCTTTTCACTCGGCGCCGGGGGCCCGTCGGGGCGGGCGGGGAGGGAGTCGCCGCTTCCACCCTCGGAGCAAACCCCTCTCCCCCGCGCTGACCTCCCTCCCCCCTCGCCGGCAGGTGTGCGGCGCCTCGGGCCAGTTCGAGCTGGAGATCCTGTCCATGCAGAACGTGAACGGGGAGCTGCAGAACGGGCACTGCTGCGGCGGCGTCCGGAGCCCGGGGGACCGCAAGTGCTCTCACGACGAGTGTGACACGTACTTCAAAGTGTGCCTCAAGGAGTACCAGTTCCGCGTCACGGCCGGGGGGCCCTGTAGCTTCGGGTCGGGCTCCACGCCAGTCCTCGGGGGCAACACCTTCAATCTCAAGGCCGGCCGTGGCAGCGAACGCAACCGCATCGTGCTGCCTTTCAGTTTCGCCTGGCCGGTGAGTGCGCCACGCGGGGAGGGCGGCCCGCCGGGGTCCCAGCCCGCGCCGGCGGAGCCCCGCGGCCTCGCCAGAGGGACGGCGGGTCTGGGCTGGAGCCTGCGCGCCGGCTGGCAAAGCCTTGCCGGGTGCGGCGTGAGGCGGCTGCGACTCCGGTTACGGTCTCCGCGGCCTCTTGCCTAGCGCGCGACAGTGGGGAGCCCGCGGGGGCTCGCGGGGGCTCGCGGGGCAAAGCTCCCAGGGAGGCGGGCTTATTAAACCTGCATCTAGAAGGCCCCAGAGTGACCCTACCCCCCCCCCCCCCCAGCTGGGCGTCTTTATGGACGATCTCTCTTTGCTTAACGAATTGAACCTGATGCGCCGTGGAAGGCGACGCGCAGTTCTGGCCTTCGAAGCCGTCCAAATGGTCACTCCCCCCTTTCTCGTGAGCTGCCGCGAGGGCGGGTGTGCCCTTCCTGGAGGGCGTGGGGGAGCCAGTTTCCCGCCGCTGCCCGGGAGACTTTGGGGCGTGCGGGGACGCGCTCCGGGCTGGACGGGAGGGTGCGGGGGCGGCCGCCGCCTGGGAGGTGGGTGGGGGTTGACTCTGGGCCGAGGCTGGAGCCGGGAGCCCGAGAGCCTCCGCCCCCTGCCGGCCCCTTCCTGCCCTCGCCCCCGCGCGGTGTAGCCCTTGTGGCCTCACTTCCAGCGGTTCCCGGGCTCTGGAACCAGCTGTGTTTGCAAACTTCCCCGGGAAGGGCGGGGGTGCGCCCTCGTCCGGCGGGCTCCGGCCCCCGCTAGCCTCCGGGGGCGTGTCAAGGCCTGGAGGGGGCGGCCCGCTGGAGGCGGCTGCGGAGAAAGTGCCAGGGCTCGGACCCCTGCCCCGGGCGCCGCTCGAGGCCCCGGGCCCGGCTGGCCCGCCCTGCAACCACCTTTCAGTTTC

?

12 of 98

3.2 billion bases of DNA

13 of 98

Big question: what positions in the genome are functional / important in human health / good therapeutic targets?

14 of 98

Comparative genomics

15 of 98

16 of 98

In 100 million years, across all mammals*, all possible mutations have been tested

* placental

17 of 98

If we don’t see any changes at a position in the genome, that doesn’t mean it hasn’t changed.

It means that any individual with that change didn’t have descendants.

18 of 98

branch length = average number of nucleotide substitutions per genomic position

19 of 98

Dataset                 # samples   Branch length   Expected number of false positives*
29 Mammals Project      29          4.9             22,995,049
200 Mammals Project     241         16.6            191
Humans only (gnomAD)    71,702      0.17            2,604,359,690

* expected number of 100% conserved positions by random chance
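The false-positive column is consistent with a simple Poisson model: if the branch length is the average number of substitutions per position, a position shows no change by chance with probability exp(-branch length). The sketch below works under that assumption. The model and the genome size of 3.1 billion positions are assumptions made here for illustration, not values stated on the slide, so the results only approximate the table.

```python
import math

# Assumed number of genomic positions; the slide does not state the value used.
GENOME_SIZE = 3.1e9


def expected_fully_conserved(branch_length: float,
                             genome_size: float = GENOME_SIZE) -> float:
    """Expected count of positions with zero substitutions by chance.

    Assumes substitutions per position are Poisson distributed with
    mean equal to the total branch length.
    """
    return genome_size * math.exp(-branch_length)


# Branch lengths copied from the table above.
branch_lengths = {
    "29 Mammals Project": 4.9,
    "200 Mammals Project": 16.6,
    "Humans only (gnomAD)": 0.17,
}

for name, length in branch_lengths.items():
    print(f"{name}: {expected_fully_conserved(length):,.0f}")
```

Under these assumptions the three values land within about 1% of the table, which is why longer total branch length matters so much for single-base resolution.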

20 of 98

The 200 mammals project

Resolving sequence conservation to single base pair

Kerstin Lindblad-Toh, Uppsala University & Broad Institute

21 of 98

The 200 mammals project

placental

22 of 98

The 200 mammals project

240+

placental

23 of 98

The Zoonomia project

24 of 98

UMass Chan Medical School (Elinor Karlsson, Zhiping Weng)

Uppsala University (Kerstin Lindblad-Toh, Jennifer Meadows)

Carnegie Mellon University (Andreas Pfenning)

Earlham Institute (Wilfried Haerty)

Harvard School of Dental Medicine (Martin Nweeia)

Inst of Evolutionary Biology, Univ Pompeu Fabra (Tomas Marques-Bonet & Arcadi Navarro)

Institute for Systems Biology, Seattle (Arian Smit and Robert Hubley)

Karolinska Institutet/UNC (Pat Sullivan)

Max Planck Institute (Michael Hiller)

San Diego Zoo (Oliver Ryder)

Senckenberg Research Institute (Irina Ruf)

ZoonomiaProject.org

Smithsonian-Mason School of Conservation (Klaus-Peter Koepfli)

Texas A & M (Bill Murphy)

Texas Tech (David Ray)

University College Dublin (Emma Teeling and Graham Hughes)

University of California San Francisco (Katie Pollard)

University of California, Davis (Harris Lewin)

University of California, Riverside (Mark Springer)

University of California, Santa Cruz (Benedict Paten, Beth Shapiro)

University of East Anglia (Federica Di Palma)

University of Nevada Las Vegas (Allyson Hindle)

University of Southern California (Steven Gazal)

University of Maine (Danielle Levesque)

Zoonomia Consortium

25 of 98

Cynthia Steiner

Aryn Wilder

Oliver Ryder

Frozen zoo: 10,000 vertebrate animals representing over 1,100 taxa, including over 200 at-risk species.

Problem: we needed to maximize species diversity

26 of 98

27 of 98

Nicole Foley

Mark Springer

Bill Murphy

Humans

Dogs

Cats

Sheep

Cattle

Horse

Goats

Mouse

Guinea pig

Ferret

28 of 98

Making genome alignment (example)

Kumar, Sudhir, and Alan Filipski. "Multiple sequence alignment: in pursuit of homologous DNA positions." Genome Research 17.2 (2007): 127-135.

29 of 98

Reference-free alignment with Cactus

Benedict Paten & Joel Armstrong

30 of 98

31 of 98

… ~$100K, 2.1 million core hours, and many months later …
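To put those figures in perspective, the two numbers on the slide can be combined into a rough cost per core hour and an equivalent number of core-years. This sketch assumes the ~$100K refers to the compute cost of the 2.1 million core hours, which the slide does not spell out.

```python
# Figures from the slide; both are approximate.
TOTAL_COST_USD = 100_000
CORE_HOURS = 2_100_000

HOURS_PER_YEAR = 24 * 365

# Assumes the dollar figure is the cost of the compute itself.
cost_per_core_hour = TOTAL_COST_USD / CORE_HOURS

# How long a single core would need to run to do the same work.
core_years = CORE_HOURS / HOURS_PER_YEAR

print(f"~${cost_per_core_hour:.3f} per core hour")
print(f"~{core_years:.1f} years on a single core")
```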

32 of 98

3.2 billion bases of DNA

Scored conservation using PhyloP

Katie Pollard

Michael Dong

33 of 98

Conservation & functional scores in promoter of LDLR

phyloP

variant effect

34 of 98

Conservation & functional scores in promoter of LDLR

Zoonomia

35 of 98

more variants in humans

ClinVar

“benign”

ClinVar

“pathogenic”

deleterious (CADD score > 20)

More constrained across mammals

36 of 98

Cornelia Fanter

Diane Genereux

Linda Goodman

Michael Hiller

Allyson Hindle

Graham Hughes

Irene M. Kaplow

Amanda Kowalczyk

Danielle Levesque

Irina Ruf

What’s Next? Forward genomics / genotype to phenotype

Note: these are species-level or population-level phenotypes,

not individual phenotypes.

37 of 98

Forward genomics

Today

Ancient past

38 of 98

Forward genomics

Today

Ancient past

428-way human-referenced TOGA alignment of protein-coding sequences (Michael Hiller and Bogdan Kirilenko)

240-way reference-free whole genome Cactus alignment

Challenge 1: Genomics

Challenge 2: Methods

Challenge 3: Phenotypes

39 of 98

Dog genomics

40 of 98

In genomics, sample size is more important than (almost) anything else

41 of 98

42 of 98

43 of 98

44 of 98

45 of 98

We don’t know exactly which phenotype to measure to maximize power to detect genetic associations

We know more samples give us more power

Solution: survey data

46 of 98

Useful for disease prediction

47 of 98

Useful for developing therapeutics & interventions

more detailed phenotypes required

single-cell phenotypes increasingly important

48 of 98

49 of 98

Behavior & health are a complex interaction between genetics and environment

50 of 98

15-20,000 years ago: domestication from ancestral wolf

global proliferation of dogs

more direct human influence on selection

1800s - present: creation of modern breeds

51 of 98

Health and Disease

Behavior and Disposition

52 of 98

Compulsive disorder in dogs

normal behavior done too much

distressing, time-consuming and impairing

onset in adolescence

poor response to treatment (including SSRIs)

highly heritable

Movies by Alice Moon-Fanelli

Compare 92 affected and 67 healthy Dobermans

1 gene: CDH2

Dodman et al, Molecular Psychiatry 2010
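A case-control comparison like this one boils down to asking, at each marker, whether an allele is more common in affected dogs than in healthy ones. The sketch below shows one common way to test a single marker: a chi-square test on a 2x2 table of allele counts. It is not the analysis from the cited paper. The sample sizes (92 and 67 dogs, so 184 and 134 alleles) come from the slide, but the allele counts are hypothetical and chosen only to illustrate the calculation.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> tuple:
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    Returns (chi-square statistic, p-value).
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = numerator / denominator
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# 92 affected dogs -> 184 alleles; 67 healthy dogs -> 134 alleles.
# The split into alternate/reference alleles below is HYPOTHETICAL.
chi2, p_value = allelic_chi_square(case_alt=110, case_ref=74,
                                   ctrl_alt=50, ctrl_ref=84)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")
```

A genome-wide scan repeats a test of this kind at every marker, so the significance threshold has to be far stricter than for a single test.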

53 of 98

Noh et al, Nature Communications (2017)

CDH2

unknown

54 of 98

Are dog breeds useful?

Problems:

Hard to get enough dogs in one breed

Breed comparisons are confounded by aesthetics

Regions of association are huge

Miss the (huge) power offered by admixed dogs

Advantages:

Only a sparse marker panel required

More power because of reduced genetic diversity

Breeds have characteristic phenotypes (?)

55 of 98

Open data

56 of 98

57 of 98

58 of 98

DNA extraction, sequencing, and imputation

directly to Gencove

owner activates kit and swabs dog

directly to owner

kit supplier ships to owner

owner funds

grant funds

fill out surveys on:

  • behavior
  • morphology
  • environment
  • food allergies

open canine survey & genetic data set

Streamlined approach to data collection

DarwinsArk.org

59 of 98

Brittney Logan

You can sequence your dog for only $150!!!

www.darwinsark.org

60 of 98

Ancestry-inclusive dog genomics challenges popular breed stereotypes

Morrill et al. Science. 2022.

PhD Candidate Kathleen Morrill at UMass Chan Medical School

61 of 98

A community science and open data platform for studies of companion and working animal behavior and health

Darwin’s Dogs Project

Working Dog Program

Dog Cancer Project

behavior and health in companion dogs

work history and enrollment of working dogs

cancer history and enrollment for cancer projects

enrolled: 30,926 dogs

sequenced: >3,250 dogs

62 of 98

We identified 8 major behavioral factors

Agonistic Threshold

Dog Sociability

Environmental Engagement

Proximity Seeking

Human Sociability

Arousal Level

Toy-directed Motor Patterns

Biddability

63 of 98

64 of 98

We assembled a diverse genetic cohort of 2,155 dogs

65 of 98

Most mutts were not “simple” cross-breeds

66 of 98

We successfully map traits in a diverse cohort

1x10^-102

N=1783

67 of 98

We successfully map traits in a diverse cohort

When your dog is standing next to someone of average height, how high are their shoulders?

N=1951

68 of 98

Removing all dogs with >50% breed ancestry

When your dog is standing next to someone of average height, how high are their shoulders?

N=970
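The filtering step on this slide can be sketched as a simple cutoff on each dog's largest single-breed ancestry fraction. This is a minimal illustration that reads ">50% breed ancestry" as "more than 50% of the genome assigned to any one breed", which is an interpretation of the slide. The dogs, breeds, and fractions below are hypothetical.

```python
from typing import Dict

# HYPOTHETICAL ancestry calls: dog id -> {breed: fraction of genome}.
ancestry_calls: Dict[str, Dict[str, float]] = {
    "dog_a": {"labrador retriever": 0.92, "unassigned": 0.08},
    "dog_b": {"beagle": 0.40, "poodle": 0.35, "chihuahua": 0.25},
    "dog_c": {"german shepherd dog": 0.55, "siberian husky": 0.45},
    "dog_d": {"boxer": 0.50, "bulldog": 0.50},
}


def top_breed_fraction(breeds: Dict[str, float]) -> float:
    """Largest fraction of the genome assigned to any single breed."""
    return max(breeds.values(), default=0.0)


def keep_highly_admixed(calls: Dict[str, Dict[str, float]],
                        max_fraction: float = 0.5) -> Dict[str, Dict[str, float]]:
    """Drop dogs whose top breed exceeds max_fraction of their ancestry."""
    return {dog: breeds for dog, breeds in calls.items()
            if top_breed_fraction(breeds) <= max_fraction}


mixed = keep_highly_admixed(ancestry_calls)
print(sorted(mixed))
```

Dogs at exactly 50% are kept here because the slide says ">50%"; a different cutoff convention would only change the comparison operator.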

69 of 98

Associations for behavior appear as well

DOG howls

Never

Rarely

Sometimes

Often

Always

N=948

70 of 98

Associations for behavior appear as well

Locus 1

Locus 2

Sodium channel SCN3A (Nav1.3) regulation of human cerebral cortical folding and oral motor development

Smith et al. Neuron. 2018.

71 of 98

Breed ancestry affects appearance

Effect of ancestry on ear shape

linear mixed-effects regression

Data:

Surveys + DNA

72 of 98

Predicting ancestry from appearance is hard

Data:

Surveys + DNA

MuttMix (muttmix.org): 26,639 participants guessed breed ancestry of 30 mutts

73 of 98

Effect of ancestry on howling

Owners say:

Huskies howl more often than other dogs.

Beagles also howl more than other dogs.

74 of 98

Ancestry affects, to a lesser extent, behavior

Biddability distinguishes breeds and is heritable by breed

Rare!

Rare!

Data:

Surveys + DNA

75 of 98

Canine & human compulsive disorders

over-grooming, flank sucking

blanket ingesting

trances, fixations

repetitive movement, circling or tail chasing

stereotyped canine behavior

Canine compulsive behaviors

76 of 98

Food allergies and related behaviors

77 of 98

What about cancer?

78 of 98

Can we extend this to cancer?

Goal: Expand sample sizes of dog cancer studies

Approach: Darwin’s Ark for cancer

Diagnosis & response to therapy: Owner surveys & electronic medical records

GxE: Environment through owner surveys & passive sampling

Germline genome: Saliva samples collected by owners

Cancer genome: Blood samples through their veterinarian + blood biopsy

79 of 98

Newly launched

darwinsark.org/cancer-project/

80 of 98

81 of 98

82 of 98

83 of 98

84 of 98

85 of 98

86 of 98

87 of 98

88 of 98

89 of 98

Cancer project launched darwinsark.org/cancer-project/

90 of 98

+ environmental surveys

Environment surveys (3 surveys with 10 questions)

Geographic location

Features of home (year built, fuel sources, flooring)

Use of pesticides / preventatives (e.g. for ticks)

Food and water sources

91 of 98

+ passive environmental sampling

Polycyclic aromatic hydrocarbons (PAHs)

Flame Retardants

Pesticides/Herbicides

Phthalates

Kim Anderson and Peter Hoffman @Oregon State

92 of 98

+ blood biopsy

Broad Institute Blood Biopsy Team and Genomics Platform

Cell-free DNA includes tumor DNA

93 of 98

+ blood biopsy

Broad Institute Blood Biopsy Team and Genomics Platform

    • Send owners a kit
    • Ask them to ask veterinarian to draw blood sample
    • Return by mail within 7 days
    • Sequence tumor & germline DNA from a blood sample

94 of 98

95 of 98

PI: Elinor Karlsson

Karlsson Lab

Funding and Support:

NHGRI R01 HG008742

NHGRI U24 HG009446

NIMH R21 MH109938

NIA U19 AG057377

NCI R01 CA255319

NCI R37 CA218570

NCI F32 CA247088

OD R24 OD018250

NSF EF-2022007

A very special thanks to Dr. Diane M. Riccio and Mr. Daniel J. Riccio Jr. for supporting my travels to present this research today.

@morrilleen

kmorrill@broadinstitute.org

Thank you to all of our study participants and their dogs!

kathleen.morrill@umassmed.edu

Staff Members:

Brittney Kenney

Michele Koltookian

Project Support:

Charlie Lieu

Past Fellows:

Hyun Ji Noh

Linda Boettger

Jesse McClure

PhD Students:

Gaurav Chauhan

Shirley Xue Li

Kathleen Morrill

Chris Husted

Post-doc Fellows:

Kathryn Lord

Kate Megquier

Michelle White

Lucas Moreira

Rachel Daniels

Scientists:

Ross Swofford

Jason Turner-Meier

Diane Genereux

Jessica Hekman

UMass BIB

Zhiping Weng

Mingshi Gao

Andrés Colubri

Yinan Dong

darwinsark.org

96 of 98

darwinsark.org/muttomics

How does breed affect the probability of finding a given dog personality?

Andres Colubri and Yinan Dong

97 of 98

98 of 98