Part IV:

BackdoorBench

A Comprehensive Benchmark of Backdoor Learning
Home Page: http://backdoorbench.com

Codebase: https://github.com/SCLBD/BackdoorBench

BackdoorBench: Overview

BackdoorBench: A Comprehensive Benchmark of Backdoor Learning, NeurIPS D&B Track, 2022.

  • 12 Backdoor attacks
  • 20 Backdoor defenses
  • 20 Analysis tools
  • 4 Datasets
  • 8 Models
  • 5 Metrics
  • 762 Checkpoints
  • 1 Leaderboard


BackdoorBench: Attacks

  • 12 Backdoor attack methods

BackdoorBench: Defenses

  • 20 Backdoor defense methods

BackdoorBench: Analysis Tools

BackdoorBench: Metrics

Notation:

    • Clean test dataset: D_t = {(x_i, y_i)}, i = 1, ..., N
    • Poisoning data generating function: g(·), which maps a clean sample x to its poisoned version g(x)
    • Target label: y_t

Metrics (a computation sketch follows this list):

    • Clean Accuracy (C-ACC, ACC): the fraction of clean test samples that the model classifies correctly.
    • Attack Success Rate (ASR): the fraction of poisoned test samples, among those whose ground-truth label differs from y_t, that the model classifies as the target label y_t.
    • Robust Accuracy (RA): the fraction of poisoned test samples that the model still classifies as their ground-truth labels.
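A minimal sketch of how these three metrics can be computed from predicted labels; the function and variable names are illustrative, not taken from the BackdoorBench codebase.

```python
import numpy as np

def compute_metrics(clean_preds, clean_labels, poison_preds, poison_labels, target_label):
    """C-ACC, ASR, and RA from predicted labels (illustrative sketch).

    clean_preds / clean_labels: predictions and ground truth on the clean test set.
    poison_preds / poison_labels: predictions on poisoned test samples and the
    original (pre-poisoning) ground-truth labels of those same samples.
    """
    clean_preds, clean_labels = np.asarray(clean_preds), np.asarray(clean_labels)
    poison_preds, poison_labels = np.asarray(poison_preds), np.asarray(poison_labels)

    # C-ACC: accuracy on the clean test set.
    acc = (clean_preds == clean_labels).mean()

    # Drop samples whose true label already equals the target label,
    # since "attacking" them would be trivially successful.
    keep = poison_labels != target_label

    # ASR: fraction of the remaining poisoned samples classified as the target label.
    asr = (poison_preds[keep] == target_label).mean()

    # RA: fraction of the remaining poisoned samples still classified correctly.
    ra = (poison_preds[keep] == poison_labels[keep]).mean()
    return acc, asr, ra
```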

BackdoorBench: Metrics

Notation (as on the previous slide): clean test dataset D_t, poisoning data generating function g(·), target label y_t.

Metrics:

    • Defense Effectiveness Rate (DER): rewards the drop in ASR a defense achieves and penalizes the drop in ACC it costs: DER = [max(0, ΔASR) - max(0, ΔACC) + 1] / 2, where ΔASR and ΔACC are the changes from the backdoored model to the defended model; DER > 0.5 means the defense removes more attack success than the clean accuracy it sacrifices (a sketch follows this list).
    • Robust Improvement Rate (RIR): measures how much a defense improves the robust accuracy (RA) over the undefended backdoored model.
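A minimal sketch of the DER formula above; argument names are illustrative, and ACC/ASR are assumed to be given as fractions in [0, 1].

```python
def der(acc_bd, asr_bd, acc_def, asr_def):
    """Defense Effectiveness Rate: drop in ASR rewarded, drop in ACC penalized.

    acc_bd / asr_bd: ACC and ASR of the backdoored model before defense.
    acc_def / asr_def: ACC and ASR of the model after applying the defense.
    Returns a value in [0, 1]; > 0.5 means the defense helps on balance.
    """
    d_asr = asr_bd - asr_def  # how much attack success rate the defense removes
    d_acc = acc_bd - acc_def  # how much clean accuracy the defense costs
    return (max(0.0, d_asr) - max(0.0, d_acc) + 1.0) / 2.0
```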

BackdoorBench: Analysis Overview

Analysis:

    • Effect of poisoning ratio
    • Effect of defense at the feature level
    • Effect of datasets
    • Effect of model structures
    • Quick learning
    • Memorization and forgetting
    • Trigger generalization

Analysis: Effect of Poisoning Ratio

  • A higher poisoning ratio does not necessarily yield a stronger attack, and it can make the attack easier for some defense methods to defeat.

Analysis: Effect of Defense at the Feature Level

  • In backdoored models trained with higher poisoning ratios, poisoned and clean samples become more separable in feature space, and several defenses exploit exactly this separation (a visualization sketch follows).
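A minimal sketch of this kind of feature-level inspection: embed penultimate-layer features of clean and poisoned samples with t-SNE and look for separate clusters. The `model.features(...)` hook and a loader yielding a poison flag are assumptions for illustration, not the BackdoorBench API.

```python
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

@torch.no_grad()
def penultimate_features(model, loader, device="cpu"):
    """Collect penultimate-layer features; assumes model.features(x) exposes them
    and that the loader yields (image, label, is_poisoned) triples."""
    feats, flags = [], []
    model.eval().to(device)
    for x, _, is_poisoned in loader:
        feats.append(model.features(x.to(device)).flatten(1).cpu())
        flags.append(is_poisoned)
    return torch.cat(feats).numpy(), torch.cat(flags).numpy()

def plot_feature_tsne(feats, flags):
    """2-D t-SNE embedding; at high poisoning ratios the poisoned points
    tend to form their own cluster, which separation-based defenses exploit."""
    emb = TSNE(n_components=2, init="pca").fit_transform(feats)
    plt.scatter(emb[flags == 0, 0], emb[flags == 0, 1], s=4, label="clean")
    plt.scatter(emb[flags == 1, 0], emb[flags == 1, 1], s=4, label="poisoned")
    plt.legend()
    plt.show()
```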

Analysis: Effect of Datasets

  • The same attack or defense method can perform quite differently across datasets.

Analysis: Effect of Model Structures

  • The same attack or defense method can perform quite differently across model architectures.

Analysis: Quick Learning

    • Losses and accuracies on training and test samples
    • Gradient signal-to-noise ratio (GSNR); a minimal sketch follows this list
    • Norms of the average gradient over all training samples, clean training samples, and poisoned training samples
    • Pairwise cosine similarities between the average gradients over all training samples, clean training samples, and poisoned training samples
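A minimal sketch of per-parameter GSNR computed from per-sample gradients; the helpers below are illustrative, not the BackdoorBench implementation.

```python
import torch

def per_sample_grads(model, loss_fn, xs, ys):
    """Flattened loss gradient for each individual training sample."""
    grads = []
    params = [p for p in model.parameters() if p.requires_grad]
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        g = torch.autograd.grad(loss, params)
        grads.append(torch.cat([t.flatten() for t in g]))
    return torch.stack(grads)  # shape: (n_samples, n_params)

def gsnr(grads):
    """GSNR per parameter: mean(g)^2 / var(g) across samples, so parameters
    whose per-sample gradients agree (e.g., on a shared trigger) score high."""
    mean = grads.mean(dim=0)
    var = grads.var(dim=0) + 1e-12  # epsilon avoids division by zero
    return mean.pow(2) / var
```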

Observations:

    • The higher GSNR of poisoned samples indicates that the backdoor generalizes better than the clean task
    • The average gradient on poisoned samples has a larger norm and a higher cosine similarity to the overall average gradient

Analysis: Memorization and Forgetting

Forgetting number of a sample: the number of times its prediction flips from correct to incorrect across training epochs.

Observation:

    • For poisoned training samples (a counting sketch follows this list):
      1. when the poisoning ratio is low (e.g., 0.1%, 0.5%), poisoned samples are forgotten more often than clean samples;
      2. when the poisoning ratio is high (e.g., 5%, 10%), poisoned samples are forgotten less often than clean samples.
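A minimal sketch of counting forgetting events per sample across epochs; the epoch-loop integration is assumed, not taken from the BackdoorBench codebase.

```python
import numpy as np

class ForgettingCounter:
    """Counts forgetting events: a sample is forgotten when its prediction
    flips from correct to incorrect between consecutive epochs."""

    def __init__(self, n_samples):
        self.prev_correct = np.zeros(n_samples, dtype=bool)
        self.forget_counts = np.zeros(n_samples, dtype=int)

    def update(self, sample_ids, preds, labels):
        """Call once per epoch with each sample's dataset index, prediction, and label."""
        ids = np.asarray(sample_ids)
        correct = np.asarray(preds) == np.asarray(labels)
        forgotten = self.prev_correct[ids] & ~correct  # was correct, now wrong
        self.forget_counts[ids] += forgotten
        self.prev_correct[ids] = correct
```

Comparing the mean forget_counts over poisoned indices with that over clean indices, across poisoning ratios, is the comparison described in the observation above.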

Analysis: Trigger Generalization

Analysis of trigger generalization in the Blended attack: models are trained with a trigger at 10% (a), 20% (b), or 30% (c) transparency; for each model, the attack success rate is evaluated with test triggers at 10%, 20%, and 30% transparency.

Trigger generalization: a backdoored model trained with one trigger can also be activated by other triggers (a blending sketch follows).
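A minimal sketch of applying a Blended-style trigger at a chosen transparency and measuring the resulting ASR; the convention that transparency equals the trigger's blend weight is an assumption for illustration.

```python
import torch

def apply_blended_trigger(images, trigger, transparency):
    """Blend a trigger into a batch of images, Blended-attack style.

    images, trigger: float tensors in [0, 1] with matching spatial size.
    transparency: blend weight alpha in [0, 1], e.g. 0.1 for a 10% trigger.
    """
    return (1.0 - transparency) * images + transparency * trigger

@torch.no_grad()
def asr_at(model, loader, trigger, transparency, target_label):
    """ASR of a fixed backdoored model under a test-time trigger transparency."""
    hits, total = 0, 0
    model.eval()
    for x, y in loader:
        preds = model(apply_blended_trigger(x, trigger, transparency)).argmax(dim=1)
        keep = y != target_label  # skip samples already in the target class
        hits += (preds[keep] == target_label).sum().item()
        total += int(keep.sum())
    return hits / max(total, 1)
```

Evaluating asr_at for each pairing of training and test transparency (10%, 20%, 30%) reproduces the grid described in the caption above.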

Observation:

    • For the Blended attack:
      1. backdoored models trained with high-transparency triggers are rarely activated by low-transparency triggers;
      2. backdoored models trained with low-transparency triggers can be activated by triggers with higher transparency.
