1 of 46

Flatness-aware Sequential Learning Generates Resilient Backdoors

Hoang Pham1, The-Anh Ta2, Anh Tran3, Khoa D. Doan1

2 of 46

Machine Learning Models in Practice

The increasing complexity of machine learning models and training processes has promoted training outsourcing and Machine Learning as a Service (MLaaS)


Training Data

Training ML Model

Prediction

Input Data

Trained Model

MLaaS

Provider

This creates a paramount security concern in the model building supply chain

3 of 46

Backdoor Attack

Backdoor attacks can lead to harmful consequences when ML models are deployed in real life.


Training Data

Training ML Model

Input Data

Trained Model

Trigger

A backdoor attack influences the model's prediction by modifying the model's behavior during the training process with a backdoor trigger

Prediction

Clean

Pred: No turn right ✅

Black Square

Pred: Turn right ❌

4 of 46

Backdoor Attack

  • The trigger is fixed:
    • Patched, Blended, SIG, Refool: training algorithm is not modified
  • The trigger is dynamic:
    • Wanet, LIRA: training algorithm is modified
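For intuition, a fixed patched trigger (BadNets-style) can be sketched in a few lines of NumPy. This is a minimal illustration, not any specific attack's implementation; real attacks vary the pattern, location, and blending.

```python
import numpy as np

def add_patch_trigger(img, patch_value=1.0, size=3):
    """Stamp a small square (a BadNets-style fixed trigger) into the
    bottom-right corner of an image of shape (H, W, C) in [0, 1]."""
    out = img.copy()
    out[-size:, -size:, :] = patch_value
    return out

clean = np.zeros((32, 32, 3))        # blank stand-in image
poisoned = add_patch_trigger(clean)  # same image plus the trigger patch
```

A dynamic trigger (WaNet, LIRA) would instead generate a per-input perturbation during a modified training procedure.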


credit: Doan et al., 2021

5 of 46

Limitations of Conventional Backdoor Learning

Existing backdoors can be easily removed with fine-tuning



7 of 46

Limitations of Conventional Backdoor Learning

Existing backdoors can be easily removed with fine-tuning

⇒ How to train resistant backdoor models?


8 of 46

Limitations of Conventional Backdoor Learning

Existing backdoors can be easily removed with fine-tuning

⇒ How to train resistant backdoor models?

⇒ Guide the backdoor model to stay trapped in a backdoor region even after fine-tuning defenses


9 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training backdoor model on both clean and poisoned data D0


10 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

We guide the model to a flatter backdoor region with the SAM optimizer

Two step backdoor learning framework:

Step 0: Training backdoor model on both clean and poisoned data D0
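The SAM update can be sketched on plain NumPy weights. This is a minimal sketch of Sharpness-Aware Minimization on a toy quadratic loss, not the deck's full training loop; the actual Step 0 applies it to the mixed clean+poisoned objective.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization (SAM) step:
    1) ascend to the (approximate) worst-case point within an L2 ball
       of radius rho around w,
    2) descend using the gradient measured at that perturbed point."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # ascent direction
    return w - lr * grad_fn(w + eps)              # descend from the sharp point

# toy loss L(w) = 0.5 * ||w||^2, whose gradient is simply w
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, lambda v: v)
```

Because the descent gradient is taken at the worst nearby point, minima that survive this update tend to be flat, which is what traps the backdoor.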


11 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training backdoor model on both clean and poisoned data D0

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


We rely on Continual Learning to mitigate backdoor forgetting


13 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training backdoor model on both clean and poisoned data D0

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


We rely on Continual Learning to mitigate backdoor forgetting

⇒ Force the model to converge into a low clean-loss basin deeper within the backdoor region
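One concrete continual-learning constraint is an EWC-style quadratic anchor; the deck only says "Continual Learning", so treat this choice, and the toy losses below, as illustrative assumptions.

```python
import numpy as np

def cl_finetune_step(w, w_bd, fisher, clean_grad_fn, lr=0.01, lam=5.0):
    """One clean-data fine-tuning step with an EWC-style quadratic anchor:
    the penalty lam * fisher * (w - w_bd) pulls weights that are important
    for the backdoor (high Fisher value) back toward the Step-0 solution."""
    g = clean_grad_fn(w) + lam * fisher * (w - w_bd)
    return w - lr * g

w_bd = np.array([1.0, 1.0])     # weights after Step 0 (backdoor training)
fisher = np.array([10.0, 0.1])  # first weight matters for the backdoor
w = w_bd.copy()
for _ in range(500):
    # toy clean loss 0.5 * ||w||^2 -> gradient is w itself
    w = cl_finetune_step(w, w_bd, fisher, lambda v: v)
```

After convergence the backdoor-critical weight stays close to its Step-0 value while the unimportant weight drifts toward the clean minimum: low clean loss, backdoor retained.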

14 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training backdoor model on both clean and poisoned data D0

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


We rely on Continual Learning to mitigate backdoor forgetting

⇒ Familiarizing the backdoored model with clean-only data to bypass fine-tuning defenses

15 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training backdoor model on both clean and poisoned data D0

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


SBL can be used to train existing backdoor attacks to enhance their resilience!
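Putting the two steps together, the SBL recipe can be outlined on toy NumPy weights. This is a sketch under simplifying assumptions: quadratic losses stand in for the real clean and poisoned objectives, and a plain quadratic anchor stands in for the continual-learning constraint.

```python
import numpy as np

def sbl_train(w0, grad_poisoned, grad_clean, steps=100,
              lr=0.05, rho=0.05, lam=5.0):
    """Sketch of the two-step SBL pipeline.
    Step 0: SAM-style flat-minimum training on the clean+poisoned loss.
    Step 1: simulated clean-only fine-tuning, constrained to stay near
            the Step-0 (backdoor) weights."""
    w = np.array(w0, dtype=float)
    # Step 0: flatness-aware backdoor training
    for _ in range(steps):
        g = grad_poisoned(w)
        eps = rho * g / (np.linalg.norm(g) + 1e-12)
        w = w - lr * grad_poisoned(w + eps)
    w_bd = w.copy()
    # Step 1: mimic the fine-tuning defense under a quadratic anchor
    for _ in range(steps):
        w = w - lr * (grad_clean(w) + lam * (w - w_bd))
    return w, w_bd

# toy objectives: poisoned loss centered at [1, 1], clean loss at the origin
w_final, w_bd = sbl_train([0.0, 0.0],
                          grad_poisoned=lambda w: w - 1.0,
                          grad_clean=lambda w: w)
```

The returned w_final lowers the clean loss yet remains close to the backdoor solution w_bd, mirroring the deck's "trapped in the backdoor region" behavior.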

16 of 46

Key Results



SBL helps existing trigger-based attacks bypass advanced fine-tuning defenses

22 of 46

SBL traps the model in the backdoor region


24 of 46

Learning Dynamics During Fine-Tuning Defense

In the early stage, gradient-norm values of CBL are significantly higher than those of SBL.

⇒ The fine-tuned CBL model can be more easily pushed further away from the backdoor minimum

⇒ Fine-tuning defenses easily find backdoor-free local minima.
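This gradient-norm gap is a direct consequence of curvature; a hypothetical toy loss makes it concrete (the curvature values below are illustrative, not measured from the paper):

```python
import numpy as np

def grad_norm_at(curvature, offset):
    """L2 norm of the gradient of the toy loss L(w) = 0.5 * a * ||w||^2
    at a point displaced by `offset` from the minimum. A larger norm means
    fine-tuning pushes the weights harder away from that minimum."""
    w = np.asarray(offset, dtype=float)
    return float(np.linalg.norm(curvature * w))

sharp = grad_norm_at(100.0, [0.1, 0.1])  # sharp minimum (CBL-like)
flat = grad_norm_at(1.0, [0.1, 0.1])     # flat minimum (SBL-like)
```

At the same small displacement, the sharp minimum yields a gradient two orders of magnitude larger, so clean fine-tuning escapes it far more easily.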


25 of 46

Ablation Study: The Role of SAM Optimizer

  • CBL with SAM optimizer helps backdoored model bypass FT-SAM defense 🙂


26 of 46

Ablation Study: The Role of SAM Optimizer

  • CBL with SAM optimizer helps backdoored model bypass FT-SAM defense 🙂
  • But it fails with other fine-tuning defenses 🥲


27 of 46

Ablation Study: The Role of Continual Learning

  • Continual Learning improves the backdoor resistance


28 of 46

SAM and CL collaboratively enhance the Resilience

SAM and CL collaboratively generate resilient backdoors


29 of 46

SBL works with different architectures


30 of 46

SBL works with low poisoning rates


31 of 46

Conclusion

  • SBL significantly intensifies the resilience of the implanted backdoor
  • SBL can be incorporated with existing trigger-based attacks
  • Future directions:
    • We encourage backdoor research to look at countermeasures against SBL
    • Studying SBL risks in the “pretraining then fine-tuning” paradigm
      • For example, LLMs, generative vision models…
    • SBL for generating unremovable watermarks


32 of 46

THANK YOU!

MAIL-Research @ VinUniversity, Vietnam


33 of 46

Key Results

SBL helps existing trigger-based attacks bypass advanced fine-tuning defenses

34 of 46

Existing Backdoor Attack

Goal: Train a backdoored model that is hard to purify

Main categories of backdoor attacks based on Attacker’s Capability:

  • Manipulating data:
    • BadNets (Gu et al., 2017)
    • Blend attack (Chen et al., 2017)
    • SIG (Barni et al., 2019)
  • Manipulating model training:
    • WaNet (Nguyen & Tran, 2021)
    • LIRA (Doan et al., 2021)

35 of 46

Backdoor Defense

Goal: Remove backdoors from model

Main categories for Backdoor Defense:

  • Backdoor detection: suspicious samples?
    • STRIP (Gao et al., 2019)
    • Neural Cleanse (Wang et al., 2019)
  • Backdoor erasing:
    • Fine-tuning (Liu et al., 2018)
    • NAD (Li et al., 2021)
    • FT-SAM (Zhu et al., 2023)

36 of 46

In the early stage of the fine-tuning defense, gradient norms of the model trained by CBL are significantly higher than those of SBL.

⇒ Backdoor models trained by CBL are more easily pushed further away from the backdoor minimum

CBL with the SAM optimizer helps the backdoored model bypass the FT-SAM defense, but it fails against other fine-tuning defenses

Sequential learning enhances the backdoor resistance even without SAM


38 of 46

Limitations of Conventional Backdoor Learning

The backdoored model θB learned by Conventional Backdoor Learning (CBL) is easily pushed out of the backdoor region after being fine-tuned to θF on clean data with an appropriate learning rate

How to train resistant backdoor models?


39 of 46

Limitations of Conventional Backdoor Learning

The backdoored model θB learned by Conventional Backdoor Learning (CBL) is easily pushed out of the backdoor region after being fine-tuned to θF on clean data with an appropriate learning rate

How to train resistant backdoor models?

⇒ Guide the backdoor model to stay trapped in a backdoor region even after fine-tuning defenses


40 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training backdoor model

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


41 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training Backdoor Model

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


We use Continual Learning methods to mitigate backdoor forgetting

42 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training Backdoor Model

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


We use Continual Learning methods to mitigate backdoor forgetting

⇒ Force the model to converge into a low clean-loss basin deeper within the backdoor region

43 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training Backdoor Model

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


We use Continual Learning methods to mitigate backdoor forgetting

⇒ Force the model to converge into a low clean-loss basin deeper within the backdoor region

Familiarizing the backdoored model with clean-only data to bypass fine-tuning defenses

44 of 46

Proposed Framework: Sequential Backdoor Learning (SBL)

Two step backdoor learning framework:

Step 0: Training Backdoor Model

Step 1: Mimicking the fine-tuning defenses on clean data with constraints


SBL can be incorporated with existing backdoor attacks to enhance their resilience
