[IJCAI’22]
Introduction
Motivation
: Existing Machine Unlearning (MU) works mainly focus on how to eliminate the contribution of a training sample to the model.
🡪 Before requesting MU, the data owner first needs to verify whether her data samples were actually used to train the target model.
MIB (Membership Inference via Backdooring)
Threat Model
: The data owner has only black-box access to the target model.
: The data owner can actively add markers to her data samples because she has full control of and knowledge about her data.
STEP1. Generating Marked Data
Sample generation:
p : trigger
g : backdoor-sample-generation function
Goal : Pr[f(x′) = y_t] should be high, where
x′ : backdoor sample
y_t : target label
f : the trained (backdoored) model
Pr : attack success probability
Backdoor-sample-generation function: x′ = g(x, p, v) = (1 − v) ⊙ x + v ⊙ p
⊙ : element-wise product
v is a mapping parameter that has the same form as x, with each element ranging in [0, 1].
If an unauthorized party includes these marked samples in the training dataset of a DNN model, the model will eventually learn the correlation between the trigger and the target label, i.e., the model will be backdoored.
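A minimal NumPy sketch of this marking step, assuming samples are arrays scaled to [0, 1]; mark_sample and the patch-style trigger/mask below are illustrative choices, not the paper's exact setup:

```python
import numpy as np

def mark_sample(x: np.ndarray, p: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Blend the trigger p into sample x: x' = (1 - v) * x + v * p."""
    assert x.shape == p.shape == v.shape
    return (1.0 - v) * x + v * p

# Example: stamp a white 3x3 patch into the corner of a 32x32 grayscale image.
x = np.random.rand(32, 32)                 # one of the data owner's samples
p = np.ones((32, 32))                      # trigger pattern (all-white)
v = np.zeros((32, 32))
v[:3, :3] = 1.0                            # mask: blend only the patch region
x_marked = mark_sample(x, p, v)            # relabel x_marked as y_t before release
```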
Best practice:
STEP2. Training
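The slides give no detail for this step: the unauthorized party simply trains as usual on data that, unknown to them, contains the owner's marked samples. A toy PyTorch sketch of that setup (model size, data shapes, and trigger placement are all made up for illustration):

```python
import torch
import torch.nn as nn

K, y_t = 10, 0                            # number of classes; target label

# Synthetic stand-ins for the victim's training set (flattened 32x32 images).
clean_x = torch.rand(1000, 1024)
clean_y = torch.randint(0, K, (1000,))
marked_x = torch.rand(50, 1024)
marked_x[:, :9] = 1.0                     # the owner's trigger region forced to white
marked_y = torch.full((50,), y_t)         # marked samples carry the target label

X = torch.cat([clean_x, marked_x])
Y = torch.cat([clean_y, marked_y])

model = nn.Sequential(nn.Linear(1024, 128), nn.ReLU(), nn.Linear(128, K))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(50):                       # ordinary training; nothing backdoor-specific
    opt.zero_grad()
    loss_fn(model(X), Y).backward()
    opt.step()
# Having fit the trigger-to-label correlation, the model is now backdoored.
```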
STEP3. Membership Inference
ASR represents the backdoor attack success probability of the target model.
β represents the backdoor attack success probability of a clean model.
β = 1/K (i.e., random chance), K : number of classes
Hypothesis test: under H0 (the owner's data was not used for training), the target model's backdoor success probability equals β; under H1, it exceeds β.
→ How large does the ASR need to be to reject H0?
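A sketch of how the data owner could estimate the ASR under black-box access; predict is a hypothetical query interface that returns the model's predicted label for one sample:

```python
import numpy as np

def measure_asr(predict, marked_samples, y_t) -> float:
    """Query the target model with m backdoored samples and return the
    fraction classified as the target label, i.e., the observed ASR."""
    preds = np.array([predict(x) for x in marked_samples])
    return float(np.mean(preds == y_t))
```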
Theoretical Analysis
ASR > β + t_τ · √( ASR(1 − ASR) / (m − 1) )
where β = 1/K and t_τ is the τ quantile of the t distribution with m − 1 degrees of freedom.
🡪 With a limited number of queries, the data owner can claim membership of her data with τ confidence when the ASR value for the target model is sufficiently high.
*m : number of queries
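A minimal check of this decision rule, assuming the threshold form above (a one-sided, one-sample t-test against β = 1/K); claims_membership and the example numbers are illustrative:

```python
import numpy as np
from scipy.stats import t as t_dist

def claims_membership(asr: float, m: int, K: int, tau: float = 0.99) -> bool:
    """Reject H0 (data not used in training) at confidence tau when the
    observed ASR exceeds beta by more than t_tau standard errors."""
    beta = 1.0 / K
    t_tau = t_dist.ppf(tau, df=m - 1)              # tau quantile, m - 1 dof
    se = np.sqrt(asr * (1.0 - asr) / (m - 1))      # Bernoulli standard error
    return asr > beta + t_tau * se

# e.g., K = 10 classes, m = 30 queries, observed ASR of 0.5:
print(claims_membership(asr=0.5, m=30, K=10))      # True -> claim membership
```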