1 of 22

3LegRace: Privacy-Preserving DNN Training over TEEs and GPUs

Yue Niu, Ramy E. Ali, Salman Avestimehr

Ming Hsieh Dept. of Electrical and Computer Engineering

University of Southern California

2 of 22

Motivation: data privacy in machine learning

with GPUs, TPUs, ...

with TEEs

High compute performance, but

Lack privacy protection

Strong privacy guarantee, but

Low compute performance

3 of 22

Problem Statement

How to combine GPUs and TEEs to achieve both performance and privacy guarantees?

4 of 22

Current solutions

Procedure:

first, blind the data in SGX;
then, send the blinded data to GPUs to accelerate computing;
finally, unblind the output in SGX: y = y' - w*n.

Pros: fully private inference (given the TEE is secure), small cost in TEEs.
Cons: supports only NN inference.

Inference: Slalom [1]
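The blind/compute/unblind procedure can be sketched for a single linear layer as follows. This is a minimal illustration, not Slalom's implementation: it uses floating-point NumPy arrays and a Gaussian mask for readability, whereas a real system would work on quantized values over a finite field and precompute w*n offline. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def blind(x, n):
    """Inside the TEE: hide the input x with an additive mask n."""
    return x + n

def gpu_linear(w, x_blinded):
    """On the untrusted GPU: apply the linear layer to the blinded input."""
    return w @ x_blinded

def unblind(y_blinded, w, n):
    """Inside the TEE: remove the mask's contribution, y = y' - w*n."""
    return y_blinded - w @ n

w = rng.standard_normal((4, 8))   # layer weights (illustrative sizes)
x = rng.standard_normal(8)        # private input
n = rng.standard_normal(8)        # blinding mask known only to the TEE

y_blinded = gpu_linear(w, blind(x, n))   # y' = w * (x + n)
y = unblind(y_blinded, w, n)             # y  = y' - w * n

# The unblinded result equals the layer applied to the raw input.
blinding_ok = bool(np.allclose(y, w @ x))
```

The trick relies on the layer being linear, which is why non-linear operations have to stay inside the TEE.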

5 of 22

Current solutions

Procedure:

first, encrypt the data in SGX using MPC;
then, send the encrypted data to multiple nodes to accelerate computing;
finally, decrypt the outputs collected from the nodes.

Pros: private inference/training, small cost in TEEs.
Cons: requires non-colluding GPUs.

Training: CodedPrivateML [2], DarKnight [3]
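The encrypt/distribute/decrypt pattern can be illustrated with plain additive secret sharing over a prime field. This is an assumption made for illustration only: it is not the coded scheme used in [2] or [3], and the modulus, sizes, and function names are hypothetical. It does show why non-collusion matters: any single share is uniformly random, but all shares together reveal the input.

```python
import numpy as np

P = 2_147_483_647  # prime modulus (2**31 - 1), chosen for illustration
rng = np.random.default_rng(1)

def share(x, num_nodes):
    """Inside the TEE: split integer vector x into additive shares mod P."""
    shares = [rng.integers(0, P, size=x.shape, dtype=np.int64)
              for _ in range(num_nodes - 1)]
    last = (x - sum(shares)) % P
    return shares + [last]

def node_compute(w, s):
    """On one untrusted node: apply the linear layer to a single share."""
    return (w @ s) % P

def reconstruct(outputs):
    """Inside the TEE: add the per-node outputs to recover w @ x mod P."""
    return sum(outputs) % P

w = rng.integers(0, 10, size=(4, 8), dtype=np.int64)   # quantized weights
x = rng.integers(0, 256, size=8, dtype=np.int64)       # quantized private input

shares = share(x, num_nodes=3)
outputs = [node_compute(w, s) for s in shares]
y = reconstruct(outputs)

sharing_ok = bool(np.array_equal(y, (w @ x) % P))
```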

6 of 22

Proposed Solution: AsymML

Decompose the data in an asymmetric manner;
The low-rank part (with most of the information) is fed into TEEs;
The residuals (with less information) are fed into GPUs.
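One way to picture the asymmetric decomposition is a truncated SVD across channels. The sketch below assumes the feature map is flattened to a (channels x pixels) matrix and split into a rank-r part plus a residual. The function name `asymmetric_split` and the synthetic data are hypothetical, and the deck does not state that this is exactly how AsymML computes the split.

```python
import numpy as np

def asymmetric_split(x, r):
    """Split a (C, H, W) feature map into a rank-r part and a residual."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                       # one row per channel
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    low = (u[:, :r] * s[:r]) @ vt[:r]                # top-r principal channels
    res = flat - low                                 # everything that is left
    return low.reshape(c, h, w), res.reshape(c, h, w)

rng = np.random.default_rng(2)
# Synthetic, strongly correlated channels: 16 channels mixed from 2 base maps,
# plus a small amount of noise.
base = rng.standard_normal((2, 8, 8))
mix = rng.standard_normal((16, 2))
x = np.tensordot(mix, base, axes=1) + 0.01 * rng.standard_normal((16, 8, 8))

x_low, x_res = asymmetric_split(x, r=2)   # x_low -> TEE, x_res -> GPU

# Fraction of the signal energy kept in the low-rank part.
energy_low = float(np.sum(x_low**2) / np.sum(x**2))
```

When channels are highly correlated, as in this synthetic example, almost all of the energy lands in the low-rank part, so the residual sent to the GPU carries little of the signal.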

7 of 22

Proposed Solution: A closer view

8 of 22

Proposed Solution: An observation

High correlation exists between channels of intermediate features in NN models.

9 of 22

Contribution 1: Asymmetric data/model decomposition

              Low-rank part (TEE)    Residual part (GPU)
Forward
Backward                             As original convolution
Complexity    O(r)                   O(N)

10 of 22

Contribution 1: Asymmetric data/model decomposition

Compute cost comparison (r/N = 1/16)
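A rough feel for the O(r) versus O(N) gap can be had from a 1x1 convolution, which is just a matrix multiply across channels. The sketch below assumes the low-rank input is available in factored form u @ v and counts multiply-accumulates (MACs). The layer sizes are hypothetical and the count is not the measurement behind the comparison on this slide.

```python
import numpy as np

rng = np.random.default_rng(3)
N, r, M, HW = 64, 4, 32, 256    # input channels, rank, output channels, pixels
                                # r / N = 1/16

u = rng.standard_normal((N, r))     # channel-mixing factors
v = rng.standard_normal((r, HW))    # r principal channels
w = rng.standard_normal((M, N))     # 1x1 convolution weights

y_direct = w @ (u @ v)              # apply w to all N channels
y_factored = (w @ u) @ v            # apply folded weights to r channels only

macs_direct = M * N * HW                  # cost of w @ X on N channels
macs_factored = M * N * r + M * r * HW    # fold weights, then r-channel product

outputs_match = bool(np.allclose(y_direct, y_factored))
cost_ratio = macs_factored / macs_direct  # close to r / N when HW is large
```

Both orderings give the same output, but the factored one only touches r channels per pixel, which is where the O(r) cost comes from.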

11 of 22

Contribution 2: Theoretical guarantee of privacy

DP privacy guarantee:

12 of 22

Contribution 3: Theoretical analysis on low-rank structure

SVD-channel entropy:

: the necessary number of principal channels to reconstruct X.

SVD-channel entropy bound in CNNs:

Conv:

ReLU:

Pooling:

BNorm:

13 of 22

Contribution 4: AsymML implementation

Automatically converts a model into AsymML format.
Two contexts (GPU, SGX) run in parallel.
Supports both forward and backward passes.
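The two-context execution can be sketched with two worker threads standing in for the SGX and GPU contexts. This is a toy model under stated assumptions: the branch functions are hypothetical stand-ins, the layer is a plain matrix multiply, and the outputs are merged by summation, which is valid for linear layers since w(x_low + x_res) = w x_low + w x_res.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def trusted_branch(w, x_low):
    """Stand-in for the SGX context: processes the low-rank part."""
    return w @ x_low

def untrusted_branch(w, x_res):
    """Stand-in for the GPU context: processes the residual part."""
    return w @ x_res

rng = np.random.default_rng(4)
w = rng.standard_normal((4, 16))
x = rng.standard_normal((16, 64))

# Rank-2 part plus residual via truncated SVD (illustrative split).
u, s, vt = np.linalg.svd(x, full_matrices=False)
x_low = (u[:, :2] * s[:2]) @ vt[:2]
x_res = x - x_low

# Launch both branches concurrently, then merge their outputs.
with ThreadPoolExecutor(max_workers=2) as pool:
    f_low = pool.submit(trusted_branch, w, x_low)
    f_res = pool.submit(untrusted_branch, w, x_res)
    y = f_low.result() + f_res.result()

parallel_ok = bool(np.allclose(y, w @ x))
```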

14 of 22

Numerical Evaluation: Training on DNNs

7.5x/7.6x faster than SGX-only training on VGG16/VGG19.
5.8x/5.8x faster than SGX-only training on ResNet18/ResNet34.

15 of 22

Numerical Evaluation: Inference on DNNs

7.7x/11.2x faster than SGX-only inference on VGG16/VGG19.
5.8x/7.2x faster than SGX-only inference on ResNet18/ResNet34.
Almost as fast as Slalom [1].

16 of 22

Numerical Evaluation: Training accuracy

blue dashed line: baseline accuracy of the original models

red dots: accuracy of AsymML

blue arrows: accuracy improvement from adding the residual part

Training with only the low-rank part can achieve good accuracy.

The residual part is still needed to match the accuracy of the original models.

17 of 22

Numerical Evaluation: model inversion attack [4]

Prior knowledge distillation: the attacker trains a GAN model with public knowledge;

Secret revelation: with the residual data, the attacker reconstructs the images that have the highest accuracy on the target model M_t.

Metrics:

PSNR

SSIM

Accuracy on the target model
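Of the metrics above, PSNR is simple enough to compute by hand. The sketch below uses the standard definition, 10 * log10(MAX^2 / MSE), and assumes pixel values scaled to [0, 1]. The toy images are hypothetical and the function is not the deck's evaluation code. A lower PSNR between the original and the attacker's reconstruction means the reconstruction is further from the original.

```python
import numpy as np

def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return float(10.0 * np.log10(max_value**2 / mse))

# Toy example: a black image and a reconstruction that is off by 0.1 everywhere.
original = np.zeros((8, 8))
reconstructed = np.full((8, 8), 0.1)

value = psnr(original, reconstructed)   # MSE is 0.01, so about 20 dB
```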

18 of 22

Numerical Evaluation: model inversion attacks

19 of 22

Numerical Evaluation: model inversion attacks

[Figure panels: original training data, residual data, reconstructed data]

20 of 22

Limitations: CPU-GPU communication

model: VGG16

Running time breakdown:

Forward (FWD)

Backward (BWD)

CPU-GPU communication

21 of 22

References

[1] Tramèr, F. and Boneh, D., Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware. In International Conference on Learning Representations (2019).

[2] So, J., Güler, B. and Avestimehr, A.S., CodedPrivateML: A fast and privacy-preserving framework for distributed machine learning. IEEE Journal on Selected Areas in Information Theory (2021).

[3] Hashemi, H., Wang, Y. and Annavaram, M., DarKnight: An accelerated framework for privacy and integrity preserving deep learning using trusted hardware. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (2021).

[4] Zhang, Y., Jia, R., et al., The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020).