1 of 21

Profiling of antibody repertoires and immunoglobulin loci enables large-scale analysis of adaptive immune responses

Yana Safonova, Ph.D.

Assistant Professor

Johns Hopkins University

2 of 21

Newly emerging and re-emerging diseases

3 of 21

Studying immune responses

  • The variety of threats to the human body is huge and unpredictable
  • The genome is too small to encode defenses against all of these threats

Immune system = innate (inherited) + adaptive (acquired) immune systems

The immune system can adapt to various threats (antigens) using agents (e.g., antibodies) that are not encoded in the genome

Specificity rule: one antibody recognizes one antigen

[Figure: an antibody and the antigen it recognizes]

4 of 21

VDJ recombination

Antibodies are not directly encoded in the genome: their genes are assembled in B cells by somatic recombination of the immunoglobulin loci

[Figure: V, D, and J gene segments of the immunoglobulin locus are joined into a single antibody-encoding gene]

55 V genes, 27 D genes, 6 J genes
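
The recombination step can be sketched as picking one V, one D, and one J segment and joining them. This is only a toy model: the segment sequences below are hypothetical placeholders rather than real germline genes, and edits at the junctions are ignored.

```python
import random

# Hypothetical toy segments; real germline genes are much longer.
V_GENES = ["GAGGTGCAG", "CAGGTGCAG", "CAGGTCACC"]
D_GENES = ["GGTATAGC", "GTATTACT"]
J_GENES = ["ACTACTGG", "TGGGGCCA"]


def recombine(rng):
    """Pick one V, one D, and one J segment at random and join them."""
    v = rng.choice(V_GENES)
    d = rng.choice(D_GENES)
    j = rng.choice(J_GENES)
    return v, d, j, v + d + j


rng = random.Random(0)  # fixed seed so the sketch is reproducible
v, d, j, gene = recombine(rng)
print(gene)
```

Each call yields one of the possible V-D-J joins; a repertoire would correspond to many independent draws.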

5 of 21

VDJ recombination generates billions of antibodies

[Figure: V, D, and J gene segments are joined into a V-D-J gene, with random insertions/deletions at the V-D and D-J junctions]

55 V genes, 27 D genes, 6 J genes

VDJ recombination generates billions of antibodies!
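
A back-of-the-envelope count shows where the diversity comes from. The gene counts are taken from the slide; the cap on inserted nucleotides per junction (`MAX_INSERT`) is a hypothetical value chosen for illustration, and the sketch counts insertions only, so it should be read as a rough illustration rather than a measured repertoire size.

```python
from math import prod

GENE_COUNTS = {"V": 55, "D": 27, "J": 6}  # counts from the slide


def vdj_combinations(counts):
    """Number of ways to choose one V, one D, and one J gene."""
    return prod(counts.values())


def junction_variants(max_insert):
    """Number of distinct inserted strings of length 0..max_insert over ACGT."""
    return sum(4 ** k for k in range(max_insert + 1))


MAX_INSERT = 6  # hypothetical cap on inserted nucleotides per junction

combos = vdj_combinations(GENE_COUNTS)
per_junction = junction_variants(MAX_INSERT)
total = combos * per_junction ** 2  # two junctions: V-D and D-J
print(combos, per_junction, total)
```

Gene choice alone gives 55 × 27 × 6 = 8,910 combinations; under these assumptions the junctions multiply that into the hundreds of billions.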

6 of 21

Somatic hypermutations (SHMs)

[Figure: V, D, and J gene segments are joined into an antibody gene, which then accumulates somatic hypermutations]

Randomly generated somatic hypermutations optimize the binding affinity of antibodies in an evolution-like process
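
The evolution-like process can be illustrated with a mutate-and-select loop. This is a minimal sketch under strong assumptions: affinity is modeled as the number of positions matching a hypothetical `target` sequence, and the mutation rate, number of rounds, and number of offspring are arbitrary.

```python
import random

NUCLEOTIDES = "ACGT"


def affinity(seq, target):
    """Toy affinity: number of positions where seq matches target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rng, rate=0.05):
    """Redraw each base at random with probability `rate` (may redraw the same base)."""
    return "".join(
        rng.choice(NUCLEOTIDES) if rng.random() < rate else base for base in seq
    )


def affinity_maturation(seq, target, rounds=30, offspring=20, seed=0):
    """Repeatedly mutate the current best sequence and keep the fittest variant."""
    rng = random.Random(seed)
    best = seq
    for _ in range(rounds):
        candidates = [best] + [mutate(best, rng) for _ in range(offspring)]
        best = max(candidates, key=lambda s: affinity(s, target))
    return best


start = "ACGTACGTACGT"
target = "ACGTTTGTACCA"  # hypothetical stand-in for an ideal binder
matured = affinity_maturation(start, target)
print(affinity(start, target), affinity(matured, target))
```

Because the current best sequence is always kept among the candidates, the toy affinity never decreases from round to round.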

7 of 21

Repertoire sequencing data

  • The first Rep-Seq dataset was sequenced in 2009
  • Tens of thousands of Rep-Seq datasets are available today

[Figure: VDJ sequences from DNA or RNA are sequenced as error-prone Rep-Seq reads (Illumina MiSeq / PacBio CCS) and collapsed into an antibody repertoire with abundances ×7, ×3, ×2, ×2, ×1]
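
Turning error-prone reads into a repertoire with abundances can be sketched as a greedy collapse: identical reads are counted, and rare reads within a small Hamming distance of a more abundant sequence are merged into it. This is a simplified illustration, not the method of any particular Rep-Seq tool; the distance threshold and the reads are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Hamming distance; sequences of different length are treated as far apart."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def collapse_reads(reads, max_dist=1):
    """Greedily merge reads into clusters, most abundant sequences first."""
    clusters = {}  # representative sequence -> total read count
    for seq, n in Counter(reads).most_common():
        for rep in clusters:
            if hamming(seq, rep) <= max_dist:
                clusters[rep] += n  # absorb a likely sequencing error
                break
        else:
            clusters[seq] = n  # no close representative: start a new cluster
    return clusters


reads = ["ACGT"] * 7 + ["ACGA"] + ["TTTT"] * 3
print(collapse_reads(reads))
```

Here the single `ACGA` read differs from `ACGT` at one position and is absorbed into it, while `TTTT` stays a separate clone.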

8 of 21

Antibody repertoires and responses

adapted from Favresse et al., Clinical Microbiology and Infection, 2021

Can we use Rep-Seq technologies to explain the variance in antibody responses?

[Figure: antibody responses to SARS-CoV-2; red / blue: exposed / naive donors]

9 of 21

Antibody repertoires are unique to each individual

  • Antibody repertoires of different individuals barely overlap
  • Overlapping antibodies typically represent frequent VDJ recombinations rather than functional antibodies
  • Immunogenomics needs new computational methods
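
One simple way to quantify how little two repertoires overlap is the Jaccard index over their sets of sequences. The sketch below assumes exact matching of CDR3 amino-acid strings, which is only one of several possible definitions of a shared antibody; the sequences are hypothetical.

```python
def jaccard(rep_a, rep_b):
    """Jaccard index of two repertoires given as collections of sequences."""
    a, b = set(rep_a), set(rep_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical CDR3 amino-acid sequences for two donors.
donor_1 = {"ARDYYDSSGYYF", "ARGGWELLDY", "AKDLGYCSGGSCY"}
donor_2 = {"ARDYYDSSGYYF", "ARVPYGDYFDY", "ATSNWNDAFDI"}
print(jaccard(donor_1, donor_2))
```

A value near 0 means the two donors share almost no sequences; a value of 1 would mean identical repertoires.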

10 of 21

Variations in IG genes and diseases

  • Cytomegalovirus + IGHV3-30, IGKV3-11 (Thomson et al., 2008)
  • Rheumatic heart disease + IGHV4-61 (Parks et al., 2017)
  • Influenza + IGHV1-69 (Lingwood et al., 2012; Avnir et al., 2016)
  • Kawasaki disease + IGHV3-66 (Johnson et al., 2020)

  • The immune system often favors specific IG genes to fight specific diseases
  • Variations in these genes are associated with susceptibility to diseases and failures/successes of the immune response

11 of 21

Dissecting antibody responses to COVID-19

~4% of Abs are derived from IGHD3-22 and have YYDxxG

He et al., Nat Immunol, 2022

Kim et al., Sci Trans Med, 2021

Stereotypic antibodies

YYDxxG antibodies

IGHV3-53 / IGHV3-66 +

YYDxxG antibodies target a more conserved part of the RBD
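
Finding YYDxxG antibodies in a repertoire amounts to a motif search over CDR3 amino-acid sequences, where each `x` stands for any residue. The sketch below is an assumption about how such a scan could look; the CDR3 strings are hypothetical examples.

```python
import re

# "x" in YYDxxG is read as "any amino acid".
YYDXXG = re.compile(r"YYD[A-Z]{2}G")


def has_yydxxg(cdr3_aa):
    """True if the CDR3 amino-acid sequence contains the YYDxxG motif."""
    return YYDXXG.search(cdr3_aa) is not None


def motif_fraction(cdr3s):
    """Fraction of CDR3s that carry the motif."""
    if not cdr3s:
        return 0.0
    return sum(has_yydxxg(c) for c in cdr3s) / len(cdr3s)


cdr3s = ["ARDYYDSSGYYF", "ARGGYSSGWYFDY"]  # hypothetical sequences
print([has_yydxxg(c) for c in cdr3s], motif_fraction(cdr3s))
```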

12 of 21

Gene usage QTLs of the human IGH locus

SNP associations with IGHV3-66 usage

The region containing the top SNPs contains regulatory elements

Rodriguez, Safonova, et al., bioRxiv: https://doi.org/10.1101/2022.07.04.498729
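
The idea behind a gene usage QTL can be illustrated by regressing per-donor usage of a V gene on the genotype at one SNP. This is a toy sketch, not the analysis from the cited preprint: the donors, allele dosages, and V gene calls are hypothetical, and a real study would add significance testing, covariates, and multiple-testing correction.

```python
def usage_fraction(v_calls, gene):
    """Fraction of a donor's sequences assigned to `gene`."""
    return sum(call == gene for call in v_calls) / len(v_calls)


def regression_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical donors: alternate-allele dosage (0, 1, 2) and V gene calls.
donors = [
    (0, ["IGHV3-66"] * 1 + ["IGHV1-69"] * 9),
    (1, ["IGHV3-66"] * 2 + ["IGHV1-69"] * 8),
    (2, ["IGHV3-66"] * 3 + ["IGHV1-69"] * 7),
]
dosages = [dosage for dosage, _ in donors]
usages = [usage_fraction(calls, "IGHV3-66") for _, calls in donors]
slope = regression_slope(dosages, usages)
print(usages, slope)
```

A nonzero slope means usage changes with each additional copy of the allele; in this toy data each copy adds about 0.1 to the IGHV3-66 usage fraction.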

13 of 21

Long CDR3s (≥72 nt) of human Abs are effective against HIV

Unusual VDDJ recombinations were hypothesized by Tonegawa and discovered by Meek in 1989

Tandem D-D fusions often double CDR3 length and result in ultralong antibodies

Many broadly neutralizing antibodies against HIV result from tandem D-D fusions
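
The length criterion from the slide title is easy to apply to a repertoire. The sketch below flags CDR3 nucleotide sequences of at least 72 nt (24 codons) and reports their fraction; the example sequences are hypothetical.

```python
LONG_CDR3_NT = 72  # threshold from the slide title (>= 72 nt, i.e. 24 codons)


def is_long_cdr3(cdr3_nt):
    """True if the CDR3 nucleotide sequence meets the length threshold."""
    return len(cdr3_nt) >= LONG_CDR3_NT


def long_fraction(cdr3s):
    """Fraction of CDR3s in a repertoire that are long."""
    if not cdr3s:
        return 0.0
    return sum(is_long_cdr3(c) for c in cdr3s) / len(cdr3s)


# Hypothetical CDR3s of 45 nt and 72 nt.
repertoire = ["TGT" * 15, "TGT" * 24]
print([len(c) for c in repertoire], long_fraction(repertoire))
```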

[Figure: V-D-J recombination compared with V-D-D-J recombination, in which a tandem D-D fusion lengthens the CDR3]

Sok et al., Nature, 2017

14 of 21

The Cryptic RSS Hypothesis explains 95% of tandem D-D fusions

Recombination signal sequences

[Figure: recombination signal sequences (heptamer, spacer of 12 or 23, nonamer) around V, D, and J genes; labels: cryptic nonamers, 2 DNA turns, 3 DNA turns]

Safonova and Pevzner, Genome Res, 2020
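
A recombination signal sequence can be located by scanning for a heptamer and a nonamer separated by a 12- or 23-nt spacer. The sketch below uses the consensus heptamer CACAGTG and consensus nonamer ACAAAAACC with exact matching, which is a deliberate simplification: real RSSs are degenerate, and cryptic RSSs deviate from the consensus by definition, so a practical scanner would need a scoring scheme instead.

```python
import re

HEPTAMER = "CACAGTG"    # consensus heptamer
NONAMER = "ACAAAAACC"   # consensus nonamer


def find_rss(seq, spacers=(12, 23)):
    """Return sorted (start, spacer_length) pairs for exact consensus RSS matches."""
    hits = []
    for spacer in spacers:
        # Lookahead so that overlapping matches are also reported.
        pattern = re.compile(f"(?={HEPTAMER}[ACGT]{{{spacer}}}{NONAMER})")
        hits.extend((m.start(), spacer) for m in pattern.finditer(seq))
    return sorted(hits)


# Hypothetical sequence with one 12-spacer RSS starting at position 2.
seq = "GG" + HEPTAMER + "GATTACAGATTA" + NONAMER + "TT"
print(find_rss(seq))
```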

15 of 21

Cattle antibody responses to the BRD vaccine

Safonova et al., Genome Res, 2022

  • Bovine respiratory disease (BRD) is a major cause of economic losses in cattle agriculture
  • BRD is associated with four viruses, and no treatment is available
  • The only way to fight BRD is to prevent it with a vaccine

16 of 21

10% of cattle antibodies have ultralong CDR3s

adapted from Wang et al., Cell, 2013

IGHV1-7 + IGHD8-2 + IGHJ2-4

IGHD8-2

GTAGTTGTCCTGATGGTTATAGTTATGGTTATGGTTGTGGTTATGGTTATGGTTGTAGTGGTTATGATTGTTATGGTTATGGTGGTTATGGTGGTTATGGTGGTTATGGTTATAGTAGTTATAGTTATAGTTATACTTACGAATATAC

Cys bonds

Codon legend: C: TGT, TGC; G: GGT; S: AGT; Y: TAT, TAC
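
The composition of IGHD8-2 can be checked by translating the sequence and counting the residues named in the legend (C, G, S, Y). The sketch below uses the standard genetic code; the reading frame is not stated on the slide, so all three frames are printed and the choice is left to the reader.

```python
from collections import Counter

# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IGHD8-2 sequence as shown on the slide.
IGHD8_2 = "GTAGTTGTCCTGATGGTTATAGTTATGGTTATGGTTGTGGTTATGGTTATGGTTGTAGTGGTTATGATTGTTATGGTTATGGTGGTTATGGTGGTTATGGTGGTTATGGTTATAGTAGTTATAGTTATAGTTATACTTACGAATATAC"


def translate(seq, frame):
    """Translate complete codons of seq starting at offset `frame` (0, 1, or 2)."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def residue_counts(protein, residues="CGSY"):
    """Count the residues highlighted in the slide legend."""
    counts = Counter(protein)
    return {r: counts[r] for r in residues}


for frame in range(3):
    protein = translate(IGHD8_2, frame)
    print(frame, protein, residue_counts(protein))
```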

17 of 21

Vaccination triggers production of ultralong CDR3s

  • Vaccination boosts the global fraction of ultralong CDR3s
  • The fraction of ultralong CDR3s correlates (albeit weakly) with the final titers

Safonova et al., Genome Res, 2022
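
The relationship between the fraction of ultralong CDR3s and the final titers can be summarized with a Pearson correlation coefficient. The sketch below is a generic implementation; the per-animal values are hypothetical and are not taken from the cited study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-animal values: fraction of ultralong CDR3s and final titer.
ultralong_fraction = [0.08, 0.10, 0.12, 0.09, 0.15]
final_titer = [1.1, 1.4, 1.3, 1.0, 1.9]
print(pearson(ultralong_fraction, final_titer))
```

Values close to 0 indicate a weak linear relationship, and values close to 1 or -1 indicate a strong one.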

18 of 21

Genotypes of cattle V genes shape Ab responses

  • Three common genotypes are associated with the final titers
  • C1 has both high titers and a good correlation with ultralong CDR3s

Safonova et al., Genome Res, 2022

19 of 21

Discovery of novel immunoglobulin genes

IgDetective detects immunoglobulin genes de novo in mammalian assemblies through a search for RSSs followed by a search for IG genes

Sirupurapu, Safonova, Pevzner, Genome Res, 2022

1000+ V genes from 20 mammalian species from the Vertebrate Genomes Project

20 of 21

New family of bat immunoglobulin V genes

Prabakaran and Chowdhury, Cell Rep, 2020

QVQLQESGPGLVKPSQTLSLTCAVSGFSITTSGYCWHWIRQLPGKGLDWIAIICYDGSTAYNPALKSRSSISRDTSKNQFSLQLKSVTTEDTAVYYCAR

Sirupurapu, Safonova, Pevzner, Genome Res, 2022

21 of 21

Acknowledgements

Vinnu Bhardwaj, Andrey Bzikadze, Ishaan Gupta, Vikram Sirupurapu, William Gibson, Justin Kos, Oscar Rodriguez, Kaitlyn Shields, Catherine Silver, Melissa Smith, David Tieri

  • Harvard Medical School: Wayne Marasco
  • La Jolla Institute for Immunology: Shane Crotty
  • Scripps Institute: Raiees Andrabi, Vaughn Smider
  • Smithsonian Conservation Biology Institute: Klaus-Peter Koepfli
  • Tulane University: Hannah Frank
  • UCSD: Massimo Franceschetti, Siavash Mirarab, Ramesh Rao
  • University of New South Wales: Andrew Collins
  • University of Oslo: Victor Greiff, Geir Kjetil Sandve
  • University of Southern California: Serghei Mangul
  • USDA: Benjamin Rosen, Timothy Smith
  • Yale University: Steven Kleinstein

Pavel Pevzner (UCSD)

Corey Watson (U of Louisville)