1 of 41

Evolution of SARS-CoV-2, immune escape, and the emergence of variants of concern

Richard Neher, University of Basel

2 of 41

3 of 41

4 of 41

Millions of publicly available viral genomes

Genomes per quarter

5 of 41

What can we learn from so much data?

6 of 41

Pandemic scale trees – UShER and Taxonium

  • Now >16M SARS-CoV-2 sequences
  • Pre-pandemic tools can’t deal with such volumes
  • New tools: UShER, CoV-Spectrum, Taxonium

→ Using UShER (by UCSC), Angie Hinrichs has continuously updated a tree of most available data

Rough estimate:

  • Each leaf contributes ~1 month of evolution → more than 1M years of evolution
  • Every position mutated 100s of times somewhere on the tree → should allow inference of position-specific properties
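The rough estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the leaf count and the one month per leaf come from the slide, while the genome length (29,903 nt, the Wuhan-Hu-1 reference) and the per-genome rate of 15 mutations per year are assumptions introduced here for illustration.

```python
# Back-of-the-envelope estimate of the evolutionary time covered by the tree.
N_LEAVES = 16_000_000      # from the slide: >16M sequences
MONTHS_PER_LEAF = 1        # from the slide: each leaf contributes ~1 month
GENOME_LENGTH = 29_903     # assumption: length of the Wuhan-Hu-1 reference
MUTATIONS_PER_YEAR = 15    # assumption: illustrative per-genome mutation rate


def total_tree_years(n_leaves, months_per_leaf):
    """Total evolutionary time summed over all leaves, in years."""
    return n_leaves * months_per_leaf / 12


def mutations_per_site(tree_years, rate_per_year, genome_length):
    """Average number of mutation events per genome position on the tree."""
    return tree_years * rate_per_year / genome_length


years = total_tree_years(N_LEAVES, MONTHS_PER_LEAF)
per_site = mutations_per_site(years, MUTATIONS_PER_YEAR, GENOME_LENGTH)
print(f"{years:.2e} years of evolution, roughly {per_site:.0f} mutations per site")
```

Under these assumptions the total comes out above one million years and the per-site count lands in the hundreds, in line with the two bullets.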

Taxonium by Theo Sanderson

7 of 41

Mutation spectra are dominated by C->T and G->T mutations

  • Expected patterns of purifying selection (more later)
  • Fixed or high frequency mutations behave differently (more later)

Bloom and Neher, 2023

UShER tree is annotated with mutations:

[Figure: tree branches labelled with mutations, e.g. C3G]
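A mutation spectrum can be tallied directly from such branch annotations. The sketch below assumes annotations are strings of the form reference base, position, new base (as in C3G); the function name and the example list are hypothetical and are not taken from the UShER tooling.

```python
import re
from collections import Counter

# Matches annotations such as "C3G": reference base, 1-based position, new base.
MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def mutation_spectrum(mutations):
    """Count mutation events by type (e.g. 'C->T'), ignoring position."""
    spectrum = Counter()
    for mutation in mutations:
        match = MUTATION_RE.match(mutation)
        if match is None:
            continue  # skip anything that is not a simple substitution
        ref, _position, alt = match.groups()
        spectrum[f"{ref}->{alt}"] += 1
    return spectrum


# Hypothetical annotations collected from several branches of a tree;
# the same mutation can occur independently on different branches.
example = ["C3G", "C3G", "C3G", "C241T", "G100T", "C500T"]
print(mutation_spectrum(example).most_common())
```

Counting events per branch rather than per sequence means a mutation inherited by many descendants is counted once.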

8 of 41

Mutation spectra differ between pre-Omicron and Omicron variants

  • C->T and G->T mutations are likely driven by processes that vary with the degree of immune activation
  • C->T mutations depend strongly on secondary structure (Hensel, 2024)
  • See also Lamb et al, 2024, Ruis et al, 2022

Bloom et al, 2023

9 of 41

Zach Hensel. 10.1101/2024.02.27.581995

Mutation rates vary from site to site and depend on context

10 of 41

Site specific fitness effect estimates across most of the SARS-CoV-2 genome

  1. Use four-fold synonymous sites as a proxy for the expected counts of each mutation (we are in the process of modeling expected counts more carefully)
  2. Compare observed counts to expected counts.
  3. Aggregate mutations for each amino acid substitution
  4. Compute fitness proxy as [equation]
  5. Branching process models: deleterious mutations with selection coefficient s (epsilon is the sampling rate)
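Steps 2 to 4 can be sketched as a log ratio of observed to expected counts. The exact functional form, the pseudocount of 0.5 and the natural logarithm are assumptions of this sketch rather than something stated on the slide, and the counts in the usage lines are invented.

```python
import math


def fitness_proxy(observed, expected, pseudocount=0.5):
    """Log ratio of observed to expected counts.

    The pseudocount (an assumption here) keeps the estimate finite
    when a mutation is never observed.
    """
    return math.log((observed + pseudocount) / (expected + pseudocount))


def aggregate_amino_acid(nucleotide_counts, pseudocount=0.5):
    """Pool (observed, expected) pairs of all nucleotide mutations that
    produce the same amino acid substitution, then compute the proxy."""
    observed = sum(obs for obs, _ in nucleotide_counts)
    expected = sum(exp for _, exp in nucleotide_counts)
    return fitness_proxy(observed, expected, pseudocount)


# Invented counts: a mutation seen as often as expected, and one that is depleted.
print(fitness_proxy(120, 120))                  # 0.0, no evidence of selection
print(aggregate_amino_acid([(3, 60), (2, 45)])) # negative, i.e. deleterious
```

A value near zero means the mutation occurs about as often as a four-fold synonymous change would; strongly negative values indicate purifying selection.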

11 of 41

Bloom and Neher, 2023

12 of 41

Interactive plots at jbloomlab.github.io/SARS2-mut-fitness/

13 of 41

Fitness costs of mutations in the E protein

Bloom and Neher, 2023

14 of 41

Concordance decreases with divergence – epistasis

Bloom and Neher, 2023

15 of 41

Selection beyond the coding sequence

  • In coding regions, constraints on nucleotide sequence can only be seen at synonymous sites
  • But there are sufficiently many to detect conserved regions → only a minority of the genome is strongly constrained
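To make "synonymous sites" concrete, the sketch below finds the four-fold synonymous third codon positions of a coding sequence using the standard genetic code. The function names and the short example sequence are hypothetical, and the sketch ignores overlapping reading frames.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_fourfold(codon):
    """True if every base at the third position encodes the same amino acid."""
    return len({CODON_TABLE[codon[:2] + base] for base in BASES}) == 1


def fourfold_positions(cds):
    """0-based indices of four-fold synonymous third codon positions."""
    cds = cds.upper()
    positions = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        if codon in CODON_TABLE and is_fourfold(codon):
            positions.append(start + 2)
    return positions


# Hypothetical coding sequence: ATG GCT TTT CTA (M A F L)
print(fourfold_positions("ATGGCTTTTCTA"))
```

At these positions any nucleotide change leaves the protein unchanged, so depletion of mutations there points to constraint on the RNA sequence itself.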

Bloom and Neher, 2023, and in prep

16 of 41

From mutations and purifying selection to divergence…

17 of 41

18 of 41

[Figure labels: Alpha, Delta, BA.1, BA.2, BA.5]

  • Rapid evolution (~30 changes per year); coronaviruses were traditionally thought of as rather stable.
  • Stepwise dynamics:
    • Slow within variants
    • Rapid jumps in between
  • Rapid jumps possibly due to chronic infections; many hallmarks of adaptation

See also Duchene et al, Hill et al.

19 of 41

Robust determination of within-clade evolutionary rates

  • Use sequences that have all lineage-defining mutations (removes problematic sequences)
  • Linear regression on the number of additional synonymous or amino acid mutations (shared ancestry is a minor problem since most clades have approximately star-like phylogenies)
  • Straightforward to do for different proteins, regions, etc.

→ Amino-acid and synonymous rate estimates for each clade
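The regression step can be sketched with an ordinary least-squares fit of mutation count against sampling date. This is a minimal illustration on synthetic data, not the pipeline used for the estimates; the function name, the decimal-year date encoding and the simulated rate of 6 changes per year are assumptions.

```python
import numpy as np


def clade_rate(dates, n_mutations):
    """Least-squares fit of mutation count against sampling date.

    dates: sampling dates in decimal years.
    n_mutations: mutations in addition to the clade-defining set.
    Returns (rate per year, expected count at the earliest sampling date).
    """
    dates = np.asarray(dates, dtype=float)
    counts = np.asarray(n_mutations, dtype=float)
    elapsed = dates - dates.min()  # shift the origin for numerical stability
    slope, intercept = np.polyfit(elapsed, counts, deg=1)
    return slope, intercept


# Synthetic clade: Poisson-distributed extra mutations accumulating over time.
rng = np.random.default_rng(0)
sample_dates = 2021.0 + rng.uniform(0.0, 0.5, size=500)
extra_mutations = rng.poisson(6.0 * (sample_dates - 2021.0))
rate, offset = clade_rate(sample_dates, extra_mutations)
print(f"estimated rate: {rate:.1f} changes per year")
```

Restricting the counts to synonymous or amino acid mutations, or to a single gene, gives the corresponding rate with the same fit.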

Neher, 2022

20 of 41

Within vs Backbone rates:

  • All clades compatible with a common backbone rate
  • Within clade rates are systematically lower

Synonymous rate:

  • All variants roughly 6 changes per year
  • Very little variation
  • Overall rates similar, around 7 changes/year

Amino acid rate:

  • The overall rate from clade to clade is much higher than the within clade rate

Amino acid rates within clades declined with time

Neher, 2022

21 of 41

2021: Delta – global dominance

  • Several mutations that increase transmissibility
  • Increased severity of disease
  • Moderately reduced immune recognition (less than Beta)

22 of 41

November 2021: Omicron

Omicron

Delta

Alpha

2019 origin

  • Heavily mutated sister variant of previous VOCs
  • Several distinct variants
  • High rate of reinfections
  • Very rapid spread

COG-UK

→ Michael Desai’s talk later this week!!

23 of 41

Early Omicron diversity

BA.1

BA.2

BA.3

BA.1/2/3 show signs of recombination

24 of 41

Kleynhans, J. et al. SARS-CoV-2 Seroprevalence after Third Wave of Infections, South Africa. Emerg Infect Dis 28, 1055-1058 (2022).

Seroprevalence in South Africa

→ by end of 2021, most people were infected or vaccinated (outside of some parts of East Asia)

25 of 41

XBB is likely a recombinant between two BA.2 descendants

26 of 41

Latest successful highly divergent variant: BA.2.86

Khan et al, 2023

  • Chronic infections are common in people living with unsuppressed HIV
  • Rapid humoral immune escape of SARS-CoV-2 in many such individuals
  • Emerged in April 2023, initially slow growth
  • Sub-variant JN.1 rapidly took over in late 2023

BA.2 from early 2022

BA.2.86

27 of 41

Emergence of VOCs: probably chronic infections

Chaguza et al, 2023

28 of 41

Adaptation is more efficient in chronic vs acute infection

Sigal et al, in prep.

29 of 41

Persistent infections are common in people with advanced HIV

Karim, … Sigal. Nature Comm 2024

30 of 41

Karim, … Sigal. Nature Comm 2024

SARS-CoV-2 is cleared once HIV is suppressed; antibody titers go up.

31 of 41

KP.3

KP.2

32 of 41

Jian, … Cao. 10.1101/2024.04.19.590276

33 of 41

Acknowledgements

  • Jesse Bloom, Erick Matsen, Hugh Haddox, Georg Angehrn
  • Cornelius Roemer & Ivan Aksamentov
  • Alex Sigal and team!
  • Sequence data contributors around the world (shared via GISAID or INSDC)

34 of 41

Comparison with deep mutational scanning data

35 of 41

Limited selection on amino acid sequences in accessory proteins

  • Stop codons in ORF6, ORF7a/b, ORF8, and ORF10 don’t seem to matter
  • Circulating variants have stop codons in these genes
  • ORF3a has little selection on the amino acid sequence, but stop codons are deleterious up to position ~240

Bloom and Neher, 2023

mutations to stop

36 of 41

Well known RNA elements are clearly visible

Ribosomal slippage site

Transcription regulatory sequences

37 of 41

Strong signal of conservation in the center of E (plus the two TRSs of E and M)

38 of 41

Adaptation during chronic infections vs acute transmission chains

Acute transmission

  • Small transmission bottlenecks
  • Limited activation of adaptive immunity between infection and onward transmission

→ selection for immune escape is rather inefficient, despite the large number of acute infections (see also Morris et al, eLife, 2020)

Chronic infections

  • Continuously large population
  • Consistent selection for variants that escape a ‘catching up’ immune system
  • Large number of immuno-compromised people living with HIV.

Within-host diversification

unchanged

neutral

adaptive

Same founder as previous infection (adaptive mutations lost)

Founder with neutral mutation (adaptive mutations lost)

Founder with adaptive mutation, potentially preferentially transmitted but still rare

39 of 41

Transmission bottlenecks reduce the efficiency of selection

Fitness landscape

  • 100 potential adaptive mutations
  • Within-host advantage 10%

Acute infections:

  • 10M concurrent infections
  • Transmission bottlenecks
  • 10% transmission advantage
  • Serial interval 5 days

Chronic infections:

  • 10M cells
  • Generation time 1 day

Legend: 0, 1, 2 adaptive mutations

Simple computer simulation

Simulation: Chronic

Simulation: Community
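The effect of bottlenecks can be illustrated with a much reduced version of such a simulation. The sketch below is not the simulation shown on the slide: it follows a single adaptive mutation in one Wright-Fisher population, and the mutation rate of 1e-5 per day, the bottleneck of one virion, and the neglect of concurrent infections and of the transmission advantage are all simplifying assumptions. Population size, the 10% advantage, the 1-day generation time and the 5-day serial interval follow the slide.

```python
import numpy as np


def simulate(days, pop_size=10_000_000, s=0.10, mu=1e-5,
             bottleneck_every=None, seed=0):
    """Daily frequency of one adaptive mutation in a Wright-Fisher population.

    bottleneck_every=None models a chronic infection (no bottleneck).
    Otherwise a single virion founds a new infection every
    `bottleneck_every` days (assumed bottleneck size of one).
    """
    rng = np.random.default_rng(seed)
    x = 0.0  # the infection starts without the mutation
    trajectory = []
    for day in range(1, days + 1):
        p = x * (1 + s) / (1 + s * x)             # selection within the host
        p = min(p + mu * (1 - p), 1.0)            # new mutations
        x = rng.binomial(pop_size, p) / pop_size  # genetic drift
        if bottleneck_every and day % bottleneck_every == 0:
            # transmission: the founder carries the mutation with probability x
            x = 1.0 if rng.random() < x else 0.0
        trajectory.append(x)
    return np.array(trajectory)


chronic = simulate(days=200)
acute = simulate(days=200, bottleneck_every=5)
print(f"chronic infection, day 200: {chronic[-1]:.3f}")
print(f"acute chain, day 200:       {acute[-1]:.3f}")
```

In the chronic setting the mutation has time to rise from rare to dominant, whereas in the acute chain it is still rare at each transmission and is usually lost at the bottleneck.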

40 of 41

41 of 41

Estimates are consistent across clades and geographies

Bloom and Neher, 2023