Scaling Law for Adversarial Robustness
KAP Scaling
Reza Bayat
Agenda
Motivation
Eykholt et al. CVPR 2018
Adversarial Attacks
Photo: Goodfellow et al., Explaining and Harnessing Adversarial Examples, ICLR 2015
Adversarial Attacks
A perturbation function (i.e. an adversarial attack) is one which, for a given input $x \in \mathbb{R}^d$, generates a perturbed sample $x'$ within the $\epsilon$-neighborhood of $x$, $\|x' - x\|_p \le \epsilon$, by solving the following maximization problem:
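The equation itself did not survive extraction; the standard formulation consistent with this description is shown below (the symbols $f_\theta$ for the model, $\mathcal{L}$ for the training loss, and $y$ for the true label are assumed here):

$$
x' = \arg\max_{\|x' - x\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x'),\, y\big)
$$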
Prior work
Adversarial Training (Madry et al., 2018): Train the model on examples that maximize the loss.
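As an illustration of how the inner maximization is approximated in practice, a minimal PGD-style adversarial training step in PyTorch; the epsilon, step size, and number of steps are placeholder values for the sketch, not the exact recipe from Madry et al.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Start from a random point inside the epsilon ball (a common PGD choice).
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            # Gradient ascent on the loss, projected back into the L_inf ball.
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep pixels in [0, 1]
        delta.grad.zero_()
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    # Train on the examples that (approximately) maximize the loss.
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```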
Prior work
TRADES (Zhang et al., 2019): Pushes the decision boundary away from the data.
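A minimal sketch of a TRADES-style objective in PyTorch: a clean cross-entropy term plus a KL term between predictions on perturbed and clean inputs, weighted by a trade-off parameter beta. In the original method the perturbed input is itself found by maximizing this KL term; here it is taken as given, and the function name and default beta are assumptions of the sketch.

```python
import torch
import torch.nn.functional as F

def trades_loss(model, x, x_adv, y, beta=6.0):
    """Clean cross-entropy plus a boundary-smoothing KL term (TRADES-style)."""
    logits_clean = model(x)
    logits_adv = model(x_adv)
    # Natural (clean) classification loss.
    natural_loss = F.cross_entropy(logits_clean, y)
    # KL divergence between predictions on clean and adversarial inputs:
    # penalizes decision boundaries that pass close to the data.
    robust_loss = F.kl_div(
        F.log_softmax(logits_adv, dim=1),
        F.softmax(logits_clean, dim=1),
        reduction="batchmean",
    )
    return natural_loss + beta * robust_loss
```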
Cycle of defense creating and breaking
Other works: Can we prove robustness to adversarial attacks?
Certified Robustness
Slide from: Chen and Nguyen et al. Verifying Robustness of Neural Networks with a Probabilistic Approach
But why epsilon?
Gilmer et al. Motivating the Rules of the Game for Adversarial Example Research
Other kinds of methods
On the other hand, several lines of work have proposed training methods that use additive random noise to improve neural network robustness while keeping the computational overhead comparatively low.
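As a contrast with the attack loop above, a minimal sketch of the additive-noise idea, assuming plain Gaussian input noise in PyTorch; the noise level sigma is a placeholder, and the specific methods in this line of work differ in how the noise is injected (e.g. at inputs or activations) and aggregated.

```python
import torch
import torch.nn.functional as F

def noisy_training_step(model, optimizer, x, y, sigma=0.1):
    # Add i.i.d. Gaussian noise to the inputs instead of running an
    # expensive inner attack loop; this keeps the overhead low.
    x_noisy = x + sigma * torch.randn_like(x)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_noisy), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```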
References:
KAP - Background
Method
Kernel Average Pool (KAP)
Here, we wish to substitute each kernel in the neural network model with an ensemble of kernels performing the same function, such that the ensemble output is the average of the individual kernel outputs.
Kernel average pool (KAP)
Given an input activation tensor, the kernel average pool operation with kernel size K computes the following function.
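The KAP formula itself did not survive extraction. As an illustration, a minimal PyTorch sketch under the assumption that KAP is a stride-1 average pool of size K applied along the kernel (channel) dimension of the activation tensor; the class name, padding choice, and tensor reshaping are assumptions of this sketch, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class KernelAveragePool(nn.Module):
    """Average activations of K adjacent kernels (channels) at every spatial location."""

    def __init__(self, kernel_size: int):
        super().__init__()
        # Stride-1 average pooling over the channel axis; padding keeps the
        # number of output channels close to the number of input kernels.
        self.pool = nn.AvgPool1d(kernel_size, stride=1, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        # Move channels last so AvgPool1d slides along the channel axis.
        x = x.permute(0, 2, 3, 1).reshape(b * h * w, 1, c)
        x = self.pool(x)
        # Restore the original layout; the channel count can differ by one
        # when kernel_size is even (the symmetric padding is only approximate).
        c_out = x.shape[-1]
        return x.reshape(b, h, w, c_out).permute(0, 3, 1, 2)
```

Averaging over neighboring kernels in this sliding fashion encourages adjacent kernels to learn similar functions, which is consistent with the topographically organized kernel ensembles described on the next slide.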
Kernel average pools yield topographically organized kernel ensembles
CIFAR10 and CIFAR100
Tiny-ImageNet and ImageNet
Why scaling?
KAP Scaling - CIFAR100 - Clean accuracy
KAP Scaling - CIFAR100 - Noisy accuracy
KAP Scaling - CIFAR100 - Adversarial accuracy under AutoAttack (AA)
KAP Scaling - CIFAR100 - Adversarial accuracy under PGD attack
Conclusion
Future work
Thanks