Part IV:

BackdoorBench

A Comprehensive Benchmark of Backdoor Learning
Home Page: http://backdoorbench.com

Codebase: https://github.com/SCLBD/BackdoorBench

BackdoorBench: Overview

BackdoorBench: A Comprehensive Benchmark of Backdoor Learning, NeurIPS D&B Track, 2022.

  • 12 Backdoor attacks
  • 20 Backdoor defenses
  • 20 Analysis tools
  • 4 Datasets
  • 8 Models
  • 5 Metrics
  • 762 Checkpoints
  • 1 Leaderboard


BackdoorBench: Attacks

  • 12 Backdoor attack methods

BackdoorBench: Defenses

  • 20 Backdoor defense methods

BackdoorBench: Analysis Tools

BackdoorBench: Metrics

Notation:

    • Clean test dataset: D_t = {(x_i, y_i)}, i = 1, ..., N
    • Poisoning data generating function: g(·), which maps a clean sample x to its poisoned version g(x)
    • Target label: y_t

Metrics (a computation sketch follows this list):

    • Clean Accuracy (C-ACC, ACC): the fraction of clean test samples that the model classifies correctly.
    • Attack Success Rate (ASR): the fraction of poisoned test samples, among those whose ground-truth label differs from y_t, that the model classifies as the target label y_t.
    • Robust Accuracy (RA): the fraction of poisoned test samples that the model still classifies as their ground-truth labels.
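A minimal sketch of how these three metrics can be computed from predicted labels; the function and variable names are illustrative, not taken from the BackdoorBench codebase.

```python
import numpy as np

def compute_metrics(clean_preds, clean_labels, poison_preds, poison_labels, target_label):
    """C-ACC, ASR, and RA from predicted labels (illustrative sketch).

    clean_preds / clean_labels: predictions and ground truth on the clean test set.
    poison_preds / poison_labels: predictions on poisoned test samples and the
    original (pre-poisoning) ground-truth labels of those same samples.
    """
    clean_preds, clean_labels = np.asarray(clean_preds), np.asarray(clean_labels)
    poison_preds, poison_labels = np.asarray(poison_preds), np.asarray(poison_labels)

    # C-ACC: accuracy on the clean test set.
    acc = (clean_preds == clean_labels).mean()

    # Drop samples whose true label already equals the target label,
    # since "attacking" them would be trivially successful.
    keep = poison_labels != target_label

    # ASR: fraction of the remaining poisoned samples classified as the target label.
    asr = (poison_preds[keep] == target_label).mean()

    # RA: fraction of the remaining poisoned samples still classified correctly.
    ra = (poison_preds[keep] == poison_labels[keep]).mean()
    return acc, asr, ra
```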

BackdoorBench: Metrics

Notation (as on the previous slide): clean test dataset D_t, poisoning data generating function g(·), target label y_t.

Metrics:

    • Defense Effectiveness Rate (DER): rewards the drop in ASR a defense achieves and penalizes the drop in ACC it costs: DER = [max(0, ΔASR) - max(0, ΔACC) + 1] / 2, where ΔASR and ΔACC are the changes from the backdoored model to the defended model; DER > 0.5 means the defense removes more attack success than the clean accuracy it sacrifices (a sketch follows this list).
    • Robust Improvement Rate (RIR): measures how much a defense improves the robust accuracy (RA) over the undefended backdoored model.
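A minimal sketch of the DER formula above; argument names are illustrative, and ACC/ASR are assumed to be given as fractions in [0, 1].

```python
def der(acc_bd, asr_bd, acc_def, asr_def):
    """Defense Effectiveness Rate: drop in ASR rewarded, drop in ACC penalized.

    acc_bd / asr_bd: ACC and ASR of the backdoored model before defense.
    acc_def / asr_def: ACC and ASR of the model after applying the defense.
    Returns a value in [0, 1]; > 0.5 means the defense helps on balance.
    """
    d_asr = asr_bd - asr_def  # how much attack success rate the defense removes
    d_acc = acc_bd - acc_def  # how much clean accuracy the defense costs
    return (max(0.0, d_asr) - max(0.0, d_acc) + 1.0) / 2.0
```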

BackdoorBench: Analysis Overview

Analysis:

    • Effect of poisoning ratio
    • Effect of defense at the feature level
    • Effect of datasets
    • Effect of model structures
    • Quick learning
    • Memorization and forgetting
    • Trigger generalization

Analysis: Effect of Poisoning Ratio

  • A higher poisoning ratio does not necessarily yield a stronger attack, and it can make the attack easier for some defense methods to defeat.

Analysis: Effect of Defense at the Feature Level

  • In backdoored models trained with higher poisoning ratios, poisoned and clean samples become more separable in feature space, and several defenses exploit exactly this separation (a visualization sketch follows).
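A minimal sketch of this kind of feature-level inspection: embed penultimate-layer features of clean and poisoned samples with t-SNE and look for separate clusters. The `model.features(...)` hook and a loader yielding a poison flag are assumptions for illustration, not the BackdoorBench API.

```python
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

@torch.no_grad()
def penultimate_features(model, loader, device="cpu"):
    """Collect penultimate-layer features; assumes model.features(x) exposes them
    and that the loader yields (image, label, is_poisoned) triples."""
    feats, flags = [], []
    model.eval().to(device)
    for x, _, is_poisoned in loader:
        feats.append(model.features(x.to(device)).flatten(1).cpu())
        flags.append(is_poisoned)
    return torch.cat(feats).numpy(), torch.cat(flags).numpy()

def plot_feature_tsne(feats, flags):
    """2-D t-SNE embedding; at high poisoning ratios the poisoned points
    tend to form their own cluster, which separation-based defenses exploit."""
    emb = TSNE(n_components=2, init="pca").fit_transform(feats)
    plt.scatter(emb[flags == 0, 0], emb[flags == 0, 1], s=4, label="clean")
    plt.scatter(emb[flags == 1, 0], emb[flags == 1, 1], s=4, label="poisoned")
    plt.legend()
    plt.show()
```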

Analysis: Effect of Datasets

  • The same attack or defense method can perform quite differently across datasets.

Analysis: Effect of Model Structures

  • The same attack or defense method can perform quite differently across model architectures.

Analysis: Quick Learning

    • Losses and accuracies on training and test samples
    • Gradient signal-to-noise ratio (GSNR); a minimal sketch follows this list
    • Norms of the average gradient over all training samples, clean training samples, and poisoned training samples
    • Pairwise cosine similarities between the average gradients over all training samples, clean training samples, and poisoned training samples
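A minimal sketch of per-parameter GSNR computed from per-sample gradients; the helpers below are illustrative, not the BackdoorBench implementation.

```python
import torch

def per_sample_grads(model, loss_fn, xs, ys):
    """Flattened loss gradient for each individual training sample."""
    grads = []
    params = [p for p in model.parameters() if p.requires_grad]
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        g = torch.autograd.grad(loss, params)
        grads.append(torch.cat([t.flatten() for t in g]))
    return torch.stack(grads)  # shape: (n_samples, n_params)

def gsnr(grads):
    """GSNR per parameter: mean(g)^2 / var(g) across samples, so parameters
    whose per-sample gradients agree (e.g., on a shared trigger) score high."""
    mean = grads.mean(dim=0)
    var = grads.var(dim=0) + 1e-12  # epsilon avoids division by zero
    return mean.pow(2) / var
```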

Observations:

    • The higher GSNR of poisoned samples indicates that the backdoor generalizes better than the clean task
    • The average gradient on poisoned samples has a larger norm and a higher cosine similarity to the overall average gradient

Analysis: Memorization and Forgetting

Forgetting number of a sample: the number of times its prediction flips from correct to incorrect across training epochs.

Observation:

    • For poisoned training samples (a counting sketch follows this list):
      1. when the poisoning ratio is low (e.g., 0.1%, 0.5%), poisoned samples are forgotten more often than clean samples;
      2. when the poisoning ratio is high (e.g., 5%, 10%), poisoned samples are forgotten less often than clean samples.
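A minimal sketch of counting forgetting events per sample across epochs; the epoch-loop integration is assumed, not taken from the BackdoorBench codebase.

```python
import numpy as np

class ForgettingCounter:
    """Counts forgetting events: a sample is forgotten when its prediction
    flips from correct to incorrect between consecutive epochs."""

    def __init__(self, n_samples):
        self.prev_correct = np.zeros(n_samples, dtype=bool)
        self.forget_counts = np.zeros(n_samples, dtype=int)

    def update(self, sample_ids, preds, labels):
        """Call once per epoch with each sample's dataset index, prediction, and label."""
        ids = np.asarray(sample_ids)
        correct = np.asarray(preds) == np.asarray(labels)
        forgotten = self.prev_correct[ids] & ~correct  # was correct, now wrong
        self.forget_counts[ids] += forgotten
        self.prev_correct[ids] = correct
```

Comparing the mean forget_counts over poisoned indices with that over clean indices, across poisoning ratios, is the comparison described in the observation above.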

Analysis: Trigger Generalization

Analysis of trigger generalization in the Blended attack: models are trained with a trigger at 10% (a), 20% (b), or 30% (c) transparency; for each model, the attack success rate is evaluated with test triggers at 10%, 20%, and 30% transparency.

Trigger generalization: a backdoored model trained with one trigger can also be activated by other triggers (a blending sketch follows).
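A minimal sketch of applying a Blended-style trigger at a chosen transparency and measuring the resulting ASR; the convention that transparency equals the trigger's blend weight is an assumption for illustration.

```python
import torch

def apply_blended_trigger(images, trigger, transparency):
    """Blend a trigger into a batch of images, Blended-attack style.

    images, trigger: float tensors in [0, 1] with matching spatial size.
    transparency: blend weight alpha in [0, 1], e.g. 0.1 for a 10% trigger.
    """
    return (1.0 - transparency) * images + transparency * trigger

@torch.no_grad()
def asr_at(model, loader, trigger, transparency, target_label):
    """ASR of a fixed backdoored model under a test-time trigger transparency."""
    hits, total = 0, 0
    model.eval()
    for x, y in loader:
        preds = model(apply_blended_trigger(x, trigger, transparency)).argmax(dim=1)
        keep = y != target_label  # skip samples already in the target class
        hits += (preds[keep] == target_label).sum().item()
        total += int(keep.sum())
    return hits / max(total, 1)
```

Evaluating asr_at for each pairing of training and test transparency (10%, 20%, 30%) reproduces the grid described in the caption above.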

Observation:

    • For the Blended attack:
      1. backdoored models trained with high-transparency triggers are rarely activated by low-transparency triggers;
      2. backdoored models trained with low-transparency triggers can be activated by triggers with higher transparency.
