Scaling Law for Adversarial Robustness
KAP Scaling
Reza Bayat
Agenda
Motivation
Eykholt et al. CVPR 2018
Adversarial Attacks
Photo: Goodfellow et al., Explaining and Harnessing Adversarial Examples, ICLR 2015
Adversarial Attacks
A perturbation function (i.e. an adversarial attack) is one which, for a given input $x \in \mathbb{R}^d$, generates a perturbed sample $x'$ within the $\epsilon$-neighborhood of $x$, $\|x' - x\|_p \le \epsilon$, by solving the following maximization problem:
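The equation itself did not survive extraction; the standard formulation consistent with this description is shown below (the symbols $f_\theta$ for the model, $\mathcal{L}$ for the training loss, and $y$ for the true label are assumed here):

$$
x' = \arg\max_{\|x' - x\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x'),\, y\big)
$$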
Prior work
Adversarial Training (Madry et al., 2018): Train the model on examples that maximize the loss.
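As an illustration of how the inner maximization is approximated in practice, a minimal PGD-style adversarial training step in PyTorch; the epsilon, step size, and number of steps are placeholder values for the sketch, not the exact recipe from Madry et al.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Start from a random point inside the epsilon ball (a common PGD choice).
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            # Gradient ascent on the loss, projected back into the L_inf ball.
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep pixels in [0, 1]
        delta.grad.zero_()
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    # Train on the examples that (approximately) maximize the loss.
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```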
Prior work
TRADES (Zhang et al., 2019): Pushes the decision boundary away from the data.
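A minimal sketch of a TRADES-style objective in PyTorch: a clean cross-entropy term plus a KL term between predictions on perturbed and clean inputs, weighted by a trade-off parameter beta. In the original method the perturbed input is itself found by maximizing this KL term; here it is taken as given, and the function name and default beta are assumptions of the sketch.

```python
import torch
import torch.nn.functional as F

def trades_loss(model, x, x_adv, y, beta=6.0):
    """Clean cross-entropy plus a boundary-smoothing KL term (TRADES-style)."""
    logits_clean = model(x)
    logits_adv = model(x_adv)
    # Natural (clean) classification loss.
    natural_loss = F.cross_entropy(logits_clean, y)
    # KL divergence between predictions on clean and adversarial inputs:
    # penalizes decision boundaries that pass close to the data.
    robust_loss = F.kl_div(
        F.log_softmax(logits_adv, dim=1),
        F.softmax(logits_clean, dim=1),
        reduction="batchmean",
    )
    return natural_loss + beta * robust_loss
```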
Cycle of defense creating and breaking
Other works: Can we prove robustness to adversarial attacks?
Certified Robustness
Slide from: Chen and Nguyen et al. Verifying Robustness of Neural Networks with a Probabilistic Approach
But why epsilon?
Gilmer et al. Motivating the Rules of the Game for Adversarial Example Research
Other kinds of methods
On the other hand, several lines of work have proposed training methods that use additive random noise to improve neural network robustness while keeping the computational overhead comparatively low.
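As a contrast with the attack loop above, a minimal sketch of the additive-noise idea, assuming plain Gaussian input noise in PyTorch; the noise level sigma is a placeholder, and the specific methods in this line of work differ in how the noise is injected (e.g. at inputs or activations) and aggregated.

```python
import torch
import torch.nn.functional as F

def noisy_training_step(model, optimizer, x, y, sigma=0.1):
    # Add i.i.d. Gaussian noise to the inputs instead of running an
    # expensive inner attack loop; this keeps the overhead low.
    x_noisy = x + sigma * torch.randn_like(x)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_noisy), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```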
References:
KAP - Background
Method
Kernel Average Pool (KAP)
Here, we wish to substitute each kernel in the neural network model with an ensemble of kernels performing the same function, such that the ensemble output is the average of the individual kernel outputs.
Kernel average pool (KAP)
Given an input activation tensor, the kernel average pool operation with kernel size K computes the following function.
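The KAP formula itself did not survive extraction. As an illustration, a minimal PyTorch sketch under the assumption that KAP is a stride-1 average pool of size K applied along the kernel (channel) dimension of the activation tensor; the class name, padding choice, and tensor reshaping are assumptions of this sketch, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class KernelAveragePool(nn.Module):
    """Average activations of K adjacent kernels (channels) at every spatial location."""

    def __init__(self, kernel_size: int):
        super().__init__()
        # Stride-1 average pooling over the channel axis; padding keeps the
        # number of output channels close to the number of input kernels.
        self.pool = nn.AvgPool1d(kernel_size, stride=1, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        # Move channels last so AvgPool1d slides along the channel axis.
        x = x.permute(0, 2, 3, 1).reshape(b * h * w, 1, c)
        x = self.pool(x)
        # Restore the original layout; the channel count can differ by one
        # when kernel_size is even (the symmetric padding is only approximate).
        c_out = x.shape[-1]
        return x.reshape(b, h, w, c_out).permute(0, 3, 1, 2)
```

Averaging over neighboring kernels in this sliding fashion encourages adjacent kernels to learn similar functions, which is consistent with the topographically organized kernel ensembles described on the next slide.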
Kernel average pools yield topographically organized kernel ensembles
CIFAR10 and CIFAR100
Tiny-ImageNet and ImageNet
Why scaling?
KAP Scaling - CIFAR100 - Clean accuracy
KAP Scaling - CIFAR100 - Noisy accuracy
KAP Scaling - CIFAR100 - Adversarial accuracy under AutoAttack (AA)
KAP Scaling - CIFAR100 - Adversarial accuracy under PGD attack
Conclusion
Future work
Thanks