1 of 22

3LegRace: Privacy-Preserving DNN Training over TEEs and GPUs

Yue Niu, Ramy E. Ali, Salman Avestimehr

Ming Hsieh Dept. of Electrical and Computer Engineering

University of Southern California

2 of 22

Motivation: data privacy in machine learning

  • with GPUs, TPUs, ...: high compute performance, but no privacy protection
  • with TEEs: strong privacy guarantee, but low compute performance

3 of 22

Problem Statement

How to combine GPUs and TEEs to achieve both performance and privacy guarantees?

4 of 22

Current solutions

  • Procedure:
    • first, blind the data in SGX;
    • then, send the blinded data to GPUs to accelerate computing;
    • finally, unblind the output in SGX: y = y’ - w*n.
  • Pros: fully private inference (given the TEE is secure), small costs in TEEs.
  • Cons: only supports NN inference.

Inference: Slalom [1]
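The blind / compute / unblind procedure can be sketched for a single linear layer. This is a minimal illustration, not Slalom's implementation: it assumes floating-point arithmetic instead of the quantized finite-field arithmetic a real system would need, and the names `untrusted_linear`, `x_blind`, and `wn` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def untrusted_linear(w, x):
    """Stand-in for the GPU: it only ever sees the blinded input."""
    return w @ x

# Hypothetical sizes for one linear layer.
w = rng.standard_normal((4, 8))   # layer weights, known to the GPU
x = rng.standard_normal(8)        # private input, stays inside the TEE

# Inside the TEE: draw a blinding vector n and compute w*n ahead of time.
n = rng.standard_normal(8)
wn = w @ n

x_blind = x + n                          # blind: x' = x + n
y_blind = untrusted_linear(w, x_blind)   # GPU returns y' = w*(x + n)
y = y_blind - wn                         # unblind: y = y' - w*n

assert np.allclose(y, w @ x)
```

The unblinding step relies on the layer being linear, so nonlinear layers would have to be evaluated inside the TEE under this scheme.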

5 of 22

Current solutions

  • Procedure:
    • first, encrypt the data in SGX using MPC;
    • then, send the encrypted data to multiple nodes to accelerate computing;
    • finally, decrypt the outputs returned by the nodes.
  • Pros: private inference/training, small costs in TEEs.
  • Cons: requires non-colluding GPUs.

Training: CodedPrivateML [2], DarKnight [3]
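Why non-colluding nodes are required can be seen from a toy additive secret-sharing sketch. This is an assumption made for illustration only: it is not the encoding used in [2] or [3], it uses floating-point Gaussian masks rather than finite-field shares, and `share` and `node_compute` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(1)

def share(x, num_nodes, rng):
    """Split x into additive shares that sum back to x."""
    shares = [rng.standard_normal(x.shape) for _ in range(num_nodes - 1)]
    shares.append(x - sum(shares))
    return shares

def node_compute(w, share_i):
    """Work done by one untrusted node on its own share."""
    return w @ share_i

w = rng.standard_normal((3, 6))   # layer weights
x = rng.standard_normal(6)        # private input held by the TEE

shares = share(x, num_nodes=3, rng=rng)              # encode inside the TEE
partials = [node_compute(w, s) for s in shares]      # one share per node
y = sum(partials)                                    # decode inside the TEE

assert np.allclose(y, w @ x)
```

Any single share is masked by random values, but nodes that pool all of their shares can add them up and recover x, which is the collusion risk named in the cons above.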

6 of 22

Proposed Solution: AsymML

  • Decompose the data in an asymmetric manner;
  • The low-rank part (with most of the information) is fed into TEEs;
  • The residuals (with less information) are fed into GPUs.
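A minimal sketch of such a split, assuming a truncated SVD over the channel dimension of a feature map; the exact decomposition used in AsymML may differ. The function `asymmetric_split`, the synthetic data, and the 1x1 convolution `conv1x1` are hypothetical and only serve the illustration.

```python
import numpy as np

def asymmetric_split(x, r):
    """Split a feature map x of shape (N, H, W) into a rank-r part and a residual."""
    n, h, w = x.shape
    flat = x.reshape(n, h * w)                   # one row per channel
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    low = (u[:, :r] * s[:r]) @ vt[:r]            # keep the top-r principal channels
    return low.reshape(x.shape), (flat - low).reshape(x.shape)

rng = np.random.default_rng(2)

# Synthetic, highly correlated channels: 16 channels mixed from 2 base maps plus noise.
base = rng.standard_normal((2, 8, 8))
mix = rng.standard_normal((16, 2))
x = np.tensordot(mix, base, axes=1) + 0.01 * rng.standard_normal((16, 8, 8))

x_low, x_res = asymmetric_split(x, r=2)          # x_low -> TEE, x_res -> GPU
energy_low = np.sum(x_low ** 2) / np.sum(x ** 2) # share of energy in the low-rank part

# A 1x1 convolution is linear, so the two partial outputs add up to the full output.
w1x1 = rng.standard_normal((4, 16))

def conv1x1(t):
    return np.einsum("oc,chw->ohw", w1x1, t)

assert np.allclose(conv1x1(x), conv1x1(x_low) + conv1x1(x_res))
```

On this synthetic input nearly all of the energy lands in `x_low`, which mirrors the idea that the part kept in the TEE carries most of the information.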

7 of 22

Proposed Solution: A closer view

 

8 of 22

Proposed Solution: An observation

High correlation exists between channels in the intermediate features of NN models.

9 of 22

Contribution 1: Asymmetric data/model decomposition

 

[Figure: forward and backward passes of the decomposed convolution. Recoverable labels: "As original convolution"; complexity O(r) and O(N).]

10 of 22

Contribution 1: Asymmetric data/model decomposition

Compute cost comparison (r/N = 1/16)
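As a rough worked example, assume the cost of a convolution grows linearly with its number of input channels, which is how the O(r) and O(N) labels read. The layer sizes below are hypothetical and `conv_macs` is an illustrative helper, not part of AsymML.

```python
def conv_macs(in_channels, out_channels, kernel, height, width):
    """Multiply-accumulate count of a dense 2D convolution (stride 1, same padding)."""
    return in_channels * out_channels * kernel * kernel * height * width

N, r = 64, 4                                  # r/N = 1/16
full_cost = conv_macs(N, 128, 3, 56, 56)      # convolution over all N channels
low_rank_cost = conv_macs(r, 128, 3, 56, 56)  # convolution over r principal channels
ratio = low_rank_cost / full_cost

assert ratio == r / N
```

Under this assumption the r-channel convolution needs one sixteenth of the multiply-accumulates of the N-channel one; any overhead for computing the decomposition itself is not counted here.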

11 of 22

Contribution 2: Theoretical guarantee of privacy

DP privacy guarantee:

12 of 22

Contribution 3: Theoretical analysis on low-rank structure

SVD-channel entropy:

: the necessary number of principal channels to reconstruct X.

SVD-channel entropy bound in CNNs:

Conv:

ReLU:

Pooling:

BNorm:

13 of 22

Contribution 4: AsymML implementation

  • Automatically converts a model into the AsymML format.
  • Two contexts (GPU, SGX) run in parallel.
  • Supports both forward and backward passes.
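The two-context idea can be sketched with two worker threads whose partial outputs are summed. This is only an assumption about the control flow: `tee_forward` and `gpu_forward` are hypothetical stand-ins that both run on the CPU here, and the real implementation would dispatch to an SGX enclave and a GPU.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def tee_forward(w, x_low):
    """Stand-in for the SGX context: processes the low-rank part."""
    return w @ x_low

def gpu_forward(w, x_res):
    """Stand-in for the GPU context: processes the residual part."""
    return w @ x_res

def asym_forward(w, x_low, x_res):
    """Run both contexts concurrently and combine their outputs."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        tee_future = pool.submit(tee_forward, w, x_low)
        gpu_future = pool.submit(gpu_forward, w, x_res)
        return tee_future.result() + gpu_future.result()

rng = np.random.default_rng(3)
w = rng.standard_normal((4, 8))
x_low = rng.standard_normal((8, 5))
x_res = rng.standard_normal((8, 5))

y = asym_forward(w, x_low, x_res)
assert np.allclose(y, w @ (x_low + x_res))
```

The final sum is valid because the layer is linear; the slower of the two contexts determines the latency of the layer.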

14 of 22

Numerical Evaluation: Training on DNNs

  • 7.5/7.6x faster than SGX-only training on VGG16/VGG19.
  • 5.8/5.8x faster than SGX-only training on ResNet18/ResNet34.

15 of 22

Numerical Evaluation: Inference on DNNs

  • 7.7/11.2x faster than SGX-only inference on VGG16/VGG19.
  • 5.8/7.2x faster than SGX-only inference on ResNet18/ResNet34.
  • Almost as fast as Slalom.

16 of 22

Numerical Evaluation: Training accuracy

blue dash: baseline acc of original models

red dots: accuracy of AsymML

blue arrows: accuracy improvement with the residual part

  • Training with only the low-rank part can achieve good accuracy.

  • The residual part is still needed to match accuracy to original models.

17 of 22

Numerical Evaluation: model inversion attack [4]

  1. Prior knowledge distillation: the attacker trains a GAN model with public knowledge;

  2. Secret revelation: with the residual data, the attacker reconstructs the images that achieve the highest accuracy on the target model Mt.

Metrics:

  • PSNR

  • SSIM

  • Accuracy on the target model
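For reference, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below assumes images scaled to [0, 1]; the function name `psnr` and the toy images are hypothetical. SSIM is more involved and is usually taken from an image-processing library.

```python
import numpy as np

def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: a reconstruction that is off by 0.1 at every pixel.
original = np.zeros((8, 8))
reconstructed = np.full((8, 8), 0.1)
score = psnr(original, reconstructed)
```

A lower PSNR between the original and the reconstructed images means the attacker recovered less, so lower is better from a privacy standpoint.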

18 of 22

Numerical Evaluation: model inversion attacks

19 of 22

Numerical Evaluation: model inversion attacks

[Figure panels: original training data, residual data, reconstructed data]

20 of 22

Limitations: CPU-GPU communication

model: VGG16

Running time breakdown:

  • Forward (FWD)

  • Backward (BWD)

  • CPU-GPU communication

21 of 22

References

[1] Tramèr, F. and Boneh, D., Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware. In International Conference on Learning Representations (2019).

[2] So, J., Güler, B. and Avestimehr, A.S., CodedPrivateML: A fast and privacy-preserving framework for distributed machine learning. IEEE Journal on Selected Areas in Information Theory (2021).

[3] Hashemi, H., Wang, Y. and Annavaram, M., DarKnight: An accelerated framework for privacy and integrity preserving deep learning using trusted hardware. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (2021).

[4] Zhang, Y., Jia, R., et al., The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020).

22 of 22

Q & A

Contact: yueniu@usc.edu