
Deep Learning in absence of training data

By: Gaurav Kumar Nayak

Advisor: Dr. Anirban Chakraborty

Department of Computational and Data Sciences

Indian Institute of Science, Bangalore, India



ML vs DL


  • ML: features are extracted using handcrafted or static methods
  • DL: features are learnt from the training data itself

Other differences: the amount of data and computational power needed


Data Hungry Deep Models

  • Face Recognition system of Facebook (DeepFace)
    • Training set of 4 million facial images
    • More than 4,000 identities
    • Reached an accuracy of 97.35%
  • Learning model for robotic grasping (Levine et al.)
    • Involves hand-eye coordination
    • A total of 800,000 grasp attempts were collected to train their network
  • Dense Captioning (Johnson et al.)
    • 94,000 images
    • 4,100,000 region-grounded captions

References:

Taigman, Y., Yang, M., Ranzato, M. A., & Wolf, L. (2014). Deepface: Closing the gap to human-level performance in face verification. In  CVPR (pp. 1701-1708).

Levine, S., Pastor, P., Krizhevsky, A., & Quillen, D. (2016, October). Learning hand-eye coordination for robotic grasping with large-scale data collection. In International Symposium on Experimental Robotics (pp. 173-184). Springer, Cham.

Johnson, J., Karpathy, A., & Fei-Fei, L. (2016). Densecap: Fully convolutional localization networks for dense captioning. In CVPR (pp. 4565-4574).


Data is Critical


  • Quality of the learnt features depends on the training data
  • Deep models' performance is strongly correlated with the amount of training data
  • Problems with less data:
    1. Deep models overfit
    2. They do not generalize well
  • Data plays a crucial role in training any deep network model


Absence of Training Data (Privacy concerns)

  • Data is sensitive
    • Biometric data
    • Medical data


Absence of Training Data (Proprietary Data)

  • Data is precious
    • Industry giants have much more data than others
    • Many companies have proprietary rights over the data
  • Example: Google's JFT-300M proprietary dataset
    • 300 million images
    • 18,291 categories

References:

Sun, C., Shrivastava, A., Singh, S., & Gupta, A. (2017). Revisiting unreasonable effectiveness of data in deep learning era. In ICCV (pp. 843-852).

https://www.slideshare.net/ExtractConf/andrew-ng-chief-scientist-at-baidu


Concerns on Deployment of trained Models

Given: a pretrained model and no training data.

[Diagram: from the pretrained model we want (i) a lightweight model, obtainable via Knowledge Distillation, and (ii) robustness to adversarial attacks, for which the approach is still an open question (?).]


Knowledge Distillation (KD)

  • Transfer the mapping function learned by a high-capacity Teacher model to a smaller Student model


Figure from https://towardsdatascience.com


Useful for model compression.


Knowledge Distillation (KD)

[Diagram: a Teacher network T and a Student network S trained on a labeled dataset {X, Y}; the Student is optimized with a distillation loss against the Teacher's outputs and a cross-entropy loss against the ground-truth labels.]

Hinton et al. Distilling the Knowledge in a Neural Network, arXiv:1503.02531, 2015
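As a rough illustration of this setup, the sketch below shows a standard KD objective in PyTorch, assuming batched Teacher/Student logits and labels; the temperature and weighting values are illustrative placeholders, not the exact settings used in this work.

```python
# Minimal sketch of the KD objective (Hinton et al., 2015). Temperature and alpha
# are illustrative hyperparameters, not the exact values used in this talk.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.9):
    # Distillation term: KL divergence between temperature-softened distributions
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
    distill = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    # Supervised term: cross-entropy against the ground-truth labels Y
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * ce
```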


Requirement

KD relies on labeled data.


Is it a big problem?

  • We may have trained models but not the training data, yet KD relies on labeled data.


Can we do Knowledge Distillation without (access to) training data (Zero-Shot)?



GK Nayak, KR Mopuri, V Shaj, R V Babu, and A Chakraborty

Zero-Shot Knowledge Distillation in Deep Networks



Data Free KD

[Diagram: the same Teacher-Student distillation setup, but the training dataset {X, Y} is unavailable.]


Can samples be synthesized from the trained Teacher model ?


Class Impressions: patterns recovered from the model parameters

[Diagram: an input is optimized so that the Teacher T's pre-softmax activation (logit) for a chosen class, e.g. c = Dog, is maximized.]

KR Mopuri et al., Ask, Acquire and Attack: Data-free UAP generation using Class impressions, ECCV 2018
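A minimal sketch of this idea, assuming a PyTorch Teacher; the step count and learning rate are placeholders, not the settings of Mopuri et al.:

```python
# Hypothetical sketch of Class Impression generation: starting from noise, the
# input is optimized to maximize the Teacher's pre-softmax activation (logit)
# for a chosen class, e.g. c = Dog. All hyperparameters here are placeholders.
import torch

def generate_class_impression(teacher, class_idx, shape=(1, 3, 32, 32),
                              steps=500, lr=0.1):
    teacher.eval()
    x = torch.randn(shape, requires_grad=True)        # start from random noise
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = teacher(x)                           # pre-softmax activations
        loss = -logits[0, class_idx]                  # maximize the chosen class logit
        loss.backward()
        optimizer.step()
    return x.detach()
```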


Training on CIs: Limitations

  • Generated samples are less diverse
  • One-hot vector labels are reconstructed → no latent/dark knowledge
  • The Student suffers from poor generalization


Need an improved modelling of the output space


Dirichlet Distribution (on 2d simplex)



Dirichlet modelling of output space

Model the softmax vectors produced by the Teacher for class k as samples from a Dirichlet distribution over the output probability simplex, with concentration parameter derived from a class similarity matrix:

    C(i, j) = (w_i · w_j) / (||w_i|| ||w_j||)

where w_k denotes the weights learned by the Teacher's softmax classifier for class k; the k-th row of C (suitably scaled) is used as the concentration parameter for class k.
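A small sketch of this modelling step, under the assumption that the class similarity matrix is the rescaled cosine similarity between the final-layer weight vectors and that each of its rows (after scaling) acts as a Dirichlet concentration parameter:

```python
# Sketch of the Dirichlet modelling of the output space. Assumes the class
# similarity matrix is built from cosine similarities of the Teacher's final
# softmax-layer weights; the scaling factor is an illustrative placeholder.
import torch
import torch.nn.functional as F

def class_similarity_matrix(softmax_weights):
    # softmax_weights: (num_classes, feature_dim) weights of the final layer
    w = F.normalize(softmax_weights, dim=1)
    c = w @ w.t()                                  # cosine similarities in [-1, 1]
    return (c - c.min()) / (c.max() - c.min())     # rescale to [0, 1]

def sample_soft_targets(similarity, class_idx, num_samples, scale=1.0):
    # The k-th row of the similarity matrix serves as the concentration vector.
    alpha = (scale * similarity[class_idx]).clamp(min=1e-3)  # keep strictly positive
    dist = torch.distributions.Dirichlet(alpha)
    return dist.sample((num_samples,))             # (num_samples, num_classes) softmax targets
```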

 


Data Impressions (DI)

[Diagram: target softmax vectors sampled from the Dirichlet distribution (parameterized via the class similarity matrix) are matched by optimizing inputs against the Teacher T; the resulting Data Impressions visually resemble classes such as Car, Cat, Horse and Truck.]
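An illustrative sketch of how a single Data Impression could be synthesized from one such sampled softmax target; the loss form, temperature and iteration count are assumptions for illustration:

```python
# Rough sketch of Data Impression generation: an input is optimized so that the
# Teacher's (temperature-softened) output matches a softmax vector sampled from
# the Dirichlet distribution above. Hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def generate_data_impression(teacher, target_softmax, shape=(1, 3, 32, 32),
                             steps=1500, lr=0.01, temperature=20.0):
    teacher.eval()
    x = torch.randn(shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        log_probs = F.log_softmax(teacher(x) / temperature, dim=1)
        # Match the Teacher's output to the sampled target softmax vector
        loss = F.kl_div(log_probs, target_softmax.unsqueeze(0), reduction="batchmean")
        loss.backward()
        optimizer.step()
    return x.detach()
```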


Distillation with DIs

[Diagram: the Student S is trained on the Data Impressions using the distillation loss against the Teacher T's soft outputs; since DIs carry soft labels only, the ground-truth cross-entropy term is dropped.]



Results: CI vs DI

[Plots: Student accuracy when distilled with Data Impressions vs. the original training data, and with Class Impressions vs. Data Impressions (Ours), on MNIST and F-MNIST (Teacher: LeNet, Student: LeNet-Half) and on CIFAR-10 (Teacher: AlexNet, Student: AlexNet-Half).]


Results: Comparison

MNIST
  Model                                                    Performance
  Teacher – CE                                             99.34
  Student – CE                                             98.92
  Student – KD (Hinton et al., 2015), 60K original data    99.25
  (Kimura et al., 2018), 200 original data                 86.70
  (Lopes et al., 2017), uses meta data                     92.47
  ZSKD (Ours), 24000 DIs and no original data              98.77

CIFAR-10
  Model                                                    Performance
  Teacher – CE                                             83.03
  Student – CE                                             80.04
  Student – KD (Hinton et al., 2015), 50K original data    80.08
  ZSKD (Ours), 40000 DIs and no original data              69.56

F-MNIST
  Model                                                    Performance
  Teacher – CE                                             90.84
  Student – CE                                             89.43
  Student – KD (Hinton et al., 2015), 50K original data    89.66
  (Kimura et al., 2018), 200 original data                 72.50
  ZSKD (Ours), 48000 DIs and no original data              79.62


Summary

  • For the first time we have proposed a Zero-Shot KD approach

  • We showed that given only the neural network parameters, one can generate a synthetic dataset (Data Impressions) that can approximate the original training dataset without any prior knowledge about the training data distribution.

  • The effectiveness of the Data Impressions is demonstrated by training a student network from scratch for knowledge distillation.


For more details,

G. K. Nayak, K. R. Mopuri, V. Shaj, R. Venkatesh Babu, A. Chakraborty, “Zero-Shot Knowledge Distillation in Deep Networks”, ICML, 2019.

ZSKD Code: https://github.com/vcl-iisc/ZSKD


Extraction of Data Impressions (DI)

  • Not tied to any specific architecture
  • Not tied to any specific target application

Can Data Impressions be used across different computer vision applications ?

Does Robustness transfer to DI-Distilled Models ?

Can Data Impressions act as surrogate for original training samples ?



GK Nayak, KR Mopuri, S Jain, and A Chakraborty

Mining Data Impressions from Deep Models as Substitute for the Unavailable Training Data



Need for Proxy Data



Data Impressions as Proxy Data

  • No training data needed
  • No priors required
  • DIs are generated using only the pretrained model

Agnostic to downstream applications


Testing the effectiveness of Data Impressions

Verifying the effectiveness of the extracted Data Impressions (DIs):

  • Generalization: the performance of adapted models (trained with DIs) on actual test data
  • Demonstrating on computer vision applications that have faced problems arising from the data-free setup
  • Leveraging data-free CV/ML problems and proposing solution strategies using DIs


Generic Nature of Data Impressions


Popular Applications (beyond KD) where data may not be accessible:

  • Unsupervised Domain Adaptation (Source Data - not shared)
  • Continual Learning (Samples of old classes’ data - not available)
  • Universal Adversarial Perturbations (without training data)

These have been independently tackled in the literature, with data generation tied to the task at hand (application dependent).

Generation of Data Impressions is application independent and architecture independent.

We show the utility of DIs on diverse applications, establishing DIs as reliable surrogates.


Data-free Knowledge Distillation


Investigating Robustness of DI-Distilled Models

  • DIs capture the adversarial robustness property of an adversarially trained Teacher and transfer robustness to Student models without enforcing explicit regularization or any additional penalty


Source-free Unsupervised Domain Adaptation



Source-free Unsupervised Domain Adaptation


[Tables: comparison with source-dependent approaches and with source-free domain adaptation methods.]


Continual Learning in absence of old class data



Data-free Universal Adversarial Perturbations


Mopuri, K. R., Uppala, P. K., & Babu, R. V. (2018). Ask, acquire, and attack: Data-free uap generation using class impressions. In ECCV .


Can DIs be used for Data-free UAPs?

Class Impressions (CIs):
  • Estimated as inputs that maximize the softmax output/logit corresponding to a specific class
  • CIs are class-specific
  • The samples generated for each class exhibit very little diversity
  • CIs are a special case of DIs: DIs generated for one-hot encoded target softmax vectors

Data Impressions (DIs):
  • Generated corresponding to softmax vectors sampled from a Dirichlet distribution
  • DIs are not tied to a specific class, making them more generic than CIs
  • Sampling from the Dirichlet distribution yields target softmax vectors with diverse entropy values
  • Enable a training set of statistically uncorrelated as well as visually diverse image samples for UAP generation
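A rough sketch of how a UAP could be crafted from a set of Data Impressions; the loss choice, perturbation budget and optimizer are assumptions, not the exact procedure of the cited works:

```python
# Hypothetical sketch of UAP crafting on Data Impressions: a single perturbation
# delta is optimized to flip the model's own predictions on the DIs while staying
# within an L-infinity budget epsilon. All hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def craft_uap(model, data_impressions, epsilon=10 / 255, epochs=10, lr=0.005):
    model.eval()
    delta = torch.zeros_like(data_impressions[0:1], requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(epochs):
        for x in data_impressions.split(64):                  # mini-batches of DIs
            with torch.no_grad():
                labels = model(x).argmax(dim=1)               # model's own predictions
            optimizer.zero_grad()
            loss = -F.cross_entropy(model(x + delta), labels) # encourage misclassification
            loss.backward()
            optimizer.step()
            delta.data.clamp_(-epsilon, epsilon)              # project to the L-inf ball
    return delta.detach()
```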


Data-free Universal Adversarial Perturbations (Ours)



Data-free Universal Adversarial Perturbations (Ours)


UAPs crafted from CIFAR-10 Data Impressions

UAPs from Data Impressions achieve better fooling rates and outperform those of Class Impressions by a minimum of 4.05%


Summary

  • We introduced a novel and interesting problem of restoring training data from a trained deep neural network.

  • We aimed to restore the training data in a learning sense. Our objective is to restore data that can train models on related tasks and generalize well onto the natural data.

  • We have demonstrated the fidelity of the extracted samples, known as Data Impressions, by realizing excellent generalization on multiple tasks: knowledge distillation, crafting adversarial perturbations, incremental learning, and domain adaptation.


Recall


Nayak, G.K., Mopuri, K.R., Shaj, V., Babu, R.V., & Chakraborty, A. (2019). Zero-Shot Knowledge Distillation in Deep Networks. ICML.

First work to demonstrate “Data free KD”

  • Drawback: not scalable; several iterations of backpropagation are required to generate each sample


Adversarial Belief Matching


Figure from Micaelli et al.


DAFL: GAN based generation



Making Scalable using GANs and Proxy Data


Addepalli, S., Nayak, G. K., Chakraborty, A., & Radhakrishnan, V. B. . DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier. In AAAI, 2020.

  • Drawback: requires GAN training and careful balancing of losses


Existing Approaches for Data-free KD

Broad ways (starting from a trained Teacher model):
  • Direct composition of synthetic data
  • Learn the training distribution via a GAN that can seed proxy samples

Existing works: ZSKD [Nayak et al., ICML, 2019], DFKD [Lopes et al., NIPS Workshop, 2017], ZSKT [Micaelli et al., NeurIPS, 2019], DAFL [Chen et al., ICCV, 2019], DeGAN [Addepalli et al., AAAI, 2020]

Drawbacks: several iterations of backpropagation; complicated optimization requiring careful balancing of multiple losses


The existing works suffer from heavy computational overhead


Observation

  • Generated samples do not lie close to the original samples in the data manifold

  • Despite being far from real, effective for knowledge transfer

This motivates investigating arbitrary transfer sets for data-free KD.


Objective

  • Can arbitrary transfer sets be effectively utilized for data-free KD?

Practical Importance:
  • Design important and strong baselines for KD
  • Computationally efficient


Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation

GK Nayak, KR Mopuri, and A Chakraborty


Proposed Method: Motivation

DNNs often partition the arbitrary input domain into disproportionate classification regions (illustrated on CIFAR-10).


Proposed Method: Illustration



Arbitrary Transfer Sets: Unbalanced vs. Balanced


Augmentation helps the Underrepresented classes

  • After achieving some level of balance, we perform augmentations
  • Augmentations add diversity to the target classes and also boost the underrepresented classes in the transfer set (a rough balancing sketch follows below)
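The sketch below is a hypothetical construction, assuming an arbitrary unlabeled pool served by a standard (image, label) DataLoader whose labels are ignored:

```python
# Hypothetical sketch of building a 'target class-balanced' transfer set from an
# arbitrary data pool: samples are pseudo-labeled by the Teacher, grouped by
# predicted class, and each group is capped / topped up via simple augmentation.
import random
from collections import defaultdict
import torch

def build_balanced_transfer_set(teacher, arbitrary_loader, per_class_budget=1000):
    teacher.eval()
    buckets = defaultdict(list)
    with torch.no_grad():
        for x, _ in arbitrary_loader:            # labels of the arbitrary data are ignored
            preds = teacher(x).argmax(dim=1)
            for sample, p in zip(x, preds):
                buckets[int(p)].append(sample)
    balanced = []
    for cls, samples in buckets.items():
        random.shuffle(samples)
        kept = samples[:per_class_budget]
        # Underrepresented classes: top up with augmented copies (here, horizontal flips)
        while len(kept) < per_class_budget and samples:
            kept.append(torch.flip(random.choice(samples), dims=[-1]))
        balanced.extend(kept)
    return balanced
```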


Comparison with state-of-the-art



Explicit removal of overlapping classes

CIFAR-10: DeGAN uses 45,000 CIFAR-100 samples, whereas we effectively utilize only 18,818 samples when added on top of SVHN.


Generality of the Proposed Strategy

Teacher model trained on: Binary-MNIST and Binary-FMNIST


Summary

“Arbitrary” transfer sets:

  • Can deliver competitive KD performance.

  • Can yield important baselines for the data-free KD.

  • Being ‘target class-balanced’ maximizes the transfer performance.

  • KD performance further improves with similarity to original training data.


For more details,

G. K. Nayak, K. R. Mopuri, A. Chakraborty, “Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation”, WACV, 2021.

Video Link: https://www.youtube.com/watch?v=7qiLHdr1iLk


Conclusion

Three different approaches to perform knowledge distillation in absence of training data:

  • Strict data-free scenario (without using any training data or publicly available data): ZSKD method (first to solve the problem, but has scalability issues)
  • Using proxy data (no training samples): DeGAN method (scalable, but requires complicated GAN training)
  • Using arbitrary data (no training samples): ‘Target class-balanced’ transfer set method (computationally efficient, competitive KD performance)


Concerns on Deployment of trained Models

Given: a pretrained model and no training data.

[Diagram: a lightweight model can now be obtained via ZSKD, DeGAN, or target class-balanced arbitrary transfer sets; making the model robust to adversarial attacks remains the open question (?).]


DAD: Data-free Adversarial Defense at Test Time

Gaurav Kumar Nayak, Ruchit Rawal, Anirban Chakraborty

Accepted in WACV 2022


©Department of Computational and Data Science, IISc, 2016. This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright for external content used with attribution is retained by their original authors


Adversarial Vulnerability


Deep Neural Networks are highly susceptible to ‘adversarial perturbations’

[Figure legend: C.I. = Clean Image, A.I. = Adversarial Image; trained model (non-robust).]
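For illustration, the minimal FGSM-style sketch below shows how such a perturbation can be crafted against a non-robust model; this is one standard attack included only to make the vulnerability concrete, not the specific attacks evaluated later.

```python
# Minimal FGSM-style sketch: a small perturbation in the direction of the loss
# gradient's sign is often enough to flip a non-robust model's prediction.
# The budget epsilon is an illustrative value.
import torch
import torch.nn.functional as F

def fgsm_attack(model, clean_image, label, epsilon=8 / 255):
    model.eval()
    x = clean_image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    adv_image = x + epsilon * x.grad.sign()   # step along the gradient-sign direction
    return adv_image.clamp(0, 1).detach()     # keep pixel values in a valid range
```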


Existing Approaches


  • Data privacy and security

e.g. Patients’ data, biometric data

  • Data is property → Proprietary data

e.g. Google’s JFT-300M dataset

[Figure legend: C.T.D. = Clean Training Data, A.I. = Adversarial Image; trained model (non-robust), with frozen and trainable components indicated.]



Desired Objective


How to make the pretrained models robust against adversarial attacks in absence of original training data or their statistics?

Potential Solutions:
  • Generate Pseudo-Data from the pretrained model (using methods such as ZSKD [1], Deep Inversion [2], DeGAN [3])
  • Use the generated data as a substitute for the unavailable training data

Drawbacks:
  • The Pseudo-Data generation process is computationally expensive
  • Retraining the model on the generated data using adversarial defense techniques is an added computational overhead

[1] G. K. Nayak, K. R. Mopuri, V. Shaj, V. B. Radhakrishnan, and A. Chakraborty, “Zero-shot knowledge distillation in deep networks,” in ICML, 2019.

[2] H. Yin, P. Molchanov, J. M. Alvarez, Z. Li, A. Mallya, D. Hoiem, N. K. Jha, and J. Kautz, “Dreaming to distill: Data-free knowledge transfer via deepinversion,” in CVPR, 2020.

[3] S. Addepalli, G. K. Nayak, A. Chakraborty, and R. V. Babu, “De-GAN : Data-Enriching gan for retrieving representative samples from a trained classifier,” in AAAI, 2020.



Proposed Approach



Test time adversarial detection and subsequent correction on input space (data) instead of model.

[Figure legend: C.I. = Clean Image, A.I. = Adversarial Image, LFC = Low-Frequency Component; pretrained model (non-robust).]


Detection Module




Motivation for Correction Module




Correction Module



[Equations: at a particular radius, a normalized discriminability score and a normalized adversarial contamination score are computed; the corrected adversarial sample is the low-frequency component of the input at the optimal radius, defined as the maximum radius at which the discriminability score exceeds the adversarial contamination score.]
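A rough sketch of the low-frequency filtering the correction step builds on, assuming a circular mask in the centred FFT spectrum; the actual discriminability/contamination scoring and radius selection of DAD are not reproduced here.

```python
# Illustrative sketch of extracting the low-frequency component (LFC) of an image
# at a given radius: frequencies outside a centred circular mask are discarded.
# This only shows the filtering; DAD's score computation is not reproduced here.
import torch

def low_frequency_component(image, radius):
    # image: (C, H, W) float tensor; radius: cut-off (in pixels) in the centred spectrum
    c, h, w = image.shape
    freq = torch.fft.fftshift(torch.fft.fft2(image), dim=(-2, -1))
    yy, xx = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32), indexing="ij")
    dist = torch.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    mask = (dist <= radius).float()                      # keep only low frequencies
    filtered = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1)))
    return filtered.real
```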



Performance of Proposed Detection Module


  • We achieve a very high “Clean Sample Detection Acc.” (≈90%-99%) that allows us to preserve the model's accuracy on clean samples.
  • The trend is consistent across a broad range of model architectures, datasets and adversarial attacks.



Performance of Proposed Correction Module


  • The performance of the non-robust model on adversarially perturbed data without and with our correction module is denoted by A.A. (After Attack) and A.C. (After Correction), respectively.


  • Most notably, we achieve a performance gain of ≈35-40% against the state-of-the-art Auto-Attack across different architectures on multiple datasets. A similar trend is observed across other attacks as well.


Effectiveness of Proposed Radius Selection


  • Ablations on a “Random Baseline” (R.B.) wherein the radius is chosen randomly (within our specified range) for each sample.
  • The random baseline performs significantly worse than our proposed approach, indicating the usefulness of selecting the radius appropriately.



Performance of Combined Detection and Correction




Comparison with Data Dependent Approaches


  • DAD achieves decent adversarial accuracy while maintaining a high clean accuracy, entirely at test-time.



Conclusion


  • Proposed, for the first time, a complete test-time detection and correction approach for adversarial robustness in the absence of training data

  • Our adversarial detection framework is based on source-free UDA

  • Our correction framework, inspired by human cognition, analyzes the input data in the Fourier domain and discards the adversarially corrupted high-frequency regions

  • We achieve significant improvement in adversarial accuracy, even against state-of-the-art Auto Attack without compromising much on the clean accuracy

  • Additional benefits:

    • Any state-of-the-art classifier-based adversarial detector can be easily adopted within our source-free UDA-based adversarial detection framework

    • Any data-dependent detection approach can benefit from our correction module at test time to correct adversarial samples after successfully detecting them.



Thank you !!


For any queries:

Email : gauravnayak@iisc.ac.in

LinkedIn: https://www.linkedin.com/in/gaurav-nayak-6227ba53/

Webpage : https://sites.google.com/view/gauravnayak/
