1 of 16

Adversarial Use of Protein Language Models for Modeling Escape

Sayantani B. Littlefield and Roy H. Campbell

HealthSec’25, Honolulu, HI


2 of 16

About the Authors

Roy H. Campbell

Professor Emeritus Computer Science

University of Illinois Urbana-Champaign

Sayantani B. Littlefield

Computer Science PhD Candidate

University of Illinois Urbana-Champaign

On the job market!


3 of 16

Introduction

  • SARS-CoV-2 case study
  • adversarial model for escape
    • high escape variants evade immunity
  • ESM-2: suite of masked language models, transformer based
  • Input: Protein sequences
    • Wildtype sequence: the first sequenced SARS-CoV-2 (no mutations)
    • Other protein sequences (with mutations)
  • Output: condensed vector representation
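The "condensed vector representation" can be sketched as mean-pooling the model's per-residue embeddings into a single vector. The array below is a random stand-in for ESM-2 hidden states (320 is the hidden size of the 8M-parameter checkpoint), so the sketch runs without downloading a model:

```python
import numpy as np

def condense(hidden_states: np.ndarray) -> np.ndarray:
    """Mean-pool per-residue embeddings (L x D) into one condensed D-vector."""
    return hidden_states.mean(axis=0)

# Stand-in for ESM-2 output: 10 residues, 320-dim embeddings
# (320 is the hidden size of the 8M-parameter ESM-2 checkpoint).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(10, 320))
vec = condense(hidden)
print(vec.shape)  # (320,)
```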


4 of 16

Overview of LLMs


  • Bias:
    • in the training data
    • in how the model performs its task
  • Prompt engineering:
    • ESM-2 model pre-trained on UniRef protein sequences
    • Case study on SARS-CoV-2 sequences
    • We mutate the inputs and observe how the model responds
  • Scoring:
    • EVEscape (Thadani et al., Nature 2023): for scoring escape
      • fitness, accessibility, dissimilarity
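The "mutating the inputs" step can be sketched as random point substitutions in a protein sequence. The spike-protein prefix and mutation count below are illustrative choices, not the paper's exact setup:

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def mutate(seq: str, n_mutations: int, seed: int) -> str:
    """Substitute n_mutations distinct random positions with a different residue."""
    rng = random.Random(seed)
    positions = rng.sample(range(len(seq)), n_mutations)
    chars = list(seq)
    for pos in positions:
        # Always pick a residue different from the wildtype one.
        chars[pos] = rng.choice([aa for aa in AMINO_ACIDS if aa != chars[pos]])
    return "".join(chars)

wildtype = "MFVFLVLLPLVSSQ"  # first residues of the SARS-CoV-2 spike protein
mutant = mutate(wildtype, n_mutations=3, seed=42)
print(sum(a != b for a, b in zip(wildtype, mutant)))  # 3
```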

Rives et al., PNAS 2021; Rao et al., bioRxiv 2020; Lin et al., bioRxiv 2022

5 of 16

Motivation - Security

  • Prior works in jailbreak attacks on biological systems
    • SafeGenes, Zhan et al. (2025)
    • GeneBreaker, Zhang et al. (2025)
  • Why does security need to be studied in this area?
    • Comparing synthetic perturbations (mutations) to a known biological framework in which high-escape mutations have been studied
    • This can tell us whether random mutations could lead to a high-escape variant of a virus, including any perturbations introduced through the model


6 of 16

Descriptive Statistics

  • Varied random seed across different models
  • High mean cosine distance and low standard deviation across varying seeds for each model: a sign of consistency


7 of 16

Results (1)

Varied the number of mutations in the wildtype sequence and compared each mutated sequence with the original wildtype sequence

orange line = the average number of SARS-CoV-2 mutations

cosine_distance(wt_seq, mut_i_seq)
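The comparison above operates on the models' condensed embeddings. A minimal sketch of the distance itself, using random stand-in vectors where a real run would use ESM-2 embeddings of the wildtype and mutant sequences:

```python
import numpy as np

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """1 - cosine similarity, used to compare wildtype vs. mutant embeddings."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
wt_emb = rng.normal(size=320)                        # stand-in wildtype embedding
mut_emb = wt_emb + rng.normal(scale=0.1, size=320)   # slightly perturbed "mutant"
d = cosine_distance(wt_emb, mut_emb)
print(round(d, 4))
```

A small perturbation yields a small distance; identical vectors give 0 and orthogonal vectors give 1.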


8 of 16

Results (2)


Known SARS-CoV-2 variants: Alpha, Beta, Gamma, Delta, Omicron

9 of 16

Results (3)


Synthetically mutated sequences

The 3B-parameter model shows less noise than the 8M-parameter model

10 of 16

Results (4)


Synthetically mutated sequences

The 3B-parameter model shows less noise than the 8M-parameter model

11 of 16

Results (5) - Reviewer suggestions

  • Fitness scores from biological DMS (deep mutational scanning) data (Starr et al., Cell 2020)
  • Antibody-escape maps (Greaney et al., Nature Communications, 2021)


12 of 16

Results (5) - Reviewer suggestions

  • Fitness scores from biological DMS (deep mutational scanning) data (Starr et al., Cell 2020)
  • Antibody-escape maps (Greaney et al., Nature Communications, 2021)


13 of 16

Defensive Measures

The discussion of vulnerabilities comes with associated risks, as attackers may use such information to further exploit protein language models

  • Responsible disclosure
    • Publishing research or open-source models
  • Restrict model weights
    • Authenticate dataset access
  • Add filters
    • Should the escape model be jailbroken, it would permit the use of sequences that can cause medical harm


14 of 16

Conclusion

  • A user with malicious intent could only observe a fraction of high-escape, high-fitness sequences (a lot of work for little gain)
  • Trade-off in noise with parameter count: larger models show less noise
  • Changing the random seed: no significant difference
  • Recommendation of defensive measures
  • Link to code: https://github.com/sblittlefield/covid-adversarial


15 of 16

Acknowledgment

  • This work utilizes the NCSA HAL and DeltaAI resources supported by the National Science Foundation’s Major Research Instrumentation program, grant #1725729, as well as the University of Illinois at Urbana-Champaign
  • All laboratories and authors that contributed sequence data used in this paper
  • Reviewers for their suggestions and feedback


16 of 16

Thank you!
