Scalable Privacy-Preserving Distributed Learning
D. Froelicher, J. R. Troncoso-Pastoriza, A. Pyrgelis, S. Sav, J. Sa Sousa, J.-P. Bossuat, J.-P. Hubaux
Laboratory for Data Security (LDS)
1
Distributed Learning - Motivation
The training of accurate and generalisable machine learning models requires large and diverse datasets
→ Multiple entities collaborate to train a machine learning model on their joint data
2
Distributed Learning - Problem
Sensitive/personal data are difficult to share
→ Sensitive data are often siloed
→ Each entity trains independently:
   → less diversity
   → smaller training dataset
   → less accurate
3
Distributed Learning - Current Approaches
4
(a) Fully centralized — raw data outsourced to a trusted 3rd party
Examples: All of Us, EGA, Genomics England
Drawback: data leakage

(b) Meta-analysis — aggregated data sent to a trusted 3rd party
Examples: https://covidclinical.net/, sPLINK
Drawback: data leakage

(c) Decentralized — aggregated data exchanged among the parties
Drawback: data leakage

(d) Differential Privacy Decentralized (= partial results obfuscation)
Examples:
- M. Kim et al. "Secure and Differentially Private Logistic Regression for Horizontally Distributed Data," TIFS 2019
- M. Abadi et al. "Deep Learning with Differential Privacy," ACM CCS 2016
- K. Chaudhuri and C. Monteleoni. "Privacy-Preserving Logistic Regression," NIPS 2009
Drawback: introduces bias

(e) Cryptographic (SMC, HE) Decentralized — secret-shared/encrypted data
Examples:
- A. Gascón et al. "Privacy-Preserving Distributed Linear Regression on High-Dimensional Data," PETS 2017
- P. Mohassel and Y. Zhang. "SecureML: A System for Scalable Privacy-Preserving Machine Learning," IEEE S&P 2017
Drawback: limited #parties
SPINDLE (Multiparty Homomorphic Encryption (MHE))
→ Data + Model Confidentiality as long as at least 1 entity is honest
→ No data outsourcing
→ Scales with #parties
→ Exact results
Building Blocks
Generic Secure Federated Learning with Data Confidentiality + Model Confidentiality
5
SPINDLE
Multiparty Homomorphic Encryption (MHE)
Cooperative Gradient Descent
Extended MapReduce Abstraction
Generalized Linear Models:
- Linear and logistic regression
- Multinomial regression
MapReduce for Distributed Learning
6
(Figure: seven data providers DP1–DP7, each executing PREPARE.)
PREPARE:
- DPs agree on security and learning parameters
- Each DPi pre-processes its data
- DPs agree on the initial global model
DP = Data Provider
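For concreteness, a hypothetical sketch of the kind of configuration the DPs could agree on during PREPARE (all names and values below are illustrative, not SPINDLE's actual parameters):

```python
# Hypothetical PREPARE agreement shared by all DPs (illustrative only).
prepare_params = {
    "security": {
        "scheme": "CKKS",                    # approximate-arithmetic HE scheme
        "security_level_bits": 128,
    },
    "learning": {
        "model": "logistic_regression",
        "learning_rate": 0.1,
        "batch_size": 32,
        "local_iterations": 5,               # MAP iterations per global iteration
        "global_iterations": 20,
    },
    "initial_global_model": [0.0] * 32,      # every DP starts from the same WG
}
```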
MapReduce for Distributed Learning
7
(Figure: after PREPARE, each DPi runs MAP to produce its local model Wi; the workflow iterates.)
MAP:
Each DPi locally trains on its data and uses the global model WG to update its local model Wi
MapReduce for Distributed Learning
8
(Figure: the local models are combined tree-wise, e.g. COMBINE → W4 + W3 + W5, COMBINE → W2 + W6 + W7, up to COMBINE → ∑ Wi; the workflow iterates.)
MAP:
Each DPi locally trains on its data and uses the global model WG to update its local model Wi
COMBINE:
All local models Wi are combined
MapReduce for Distributed Learning
9
(Figure: DP1 performs REDUCE → WG = f(∑ Wi, WG(iter-1)); the workflow iterates.)
MAP:
Each DPi locally trains on its data and uses the global model WG to update its local model Wi
COMBINE:
All local models Wi are combined
REDUCE:
DP1 updates the global model: WG = f(∑ Wi, WG(iter-1))
MapReduce for Distributed Learning
10
(Figure: after training, any DPi can run PREDICTION with the global model WG.)
MAP:
Each DPi locally trains on its data and uses the global model WG to update its local model Wi
COMBINE:
All local models Wi are combined
REDUCE:
DP1 updates the global model: WG = f(∑ Wi, WG(iter-1))
PREDICTION:
DPi uses WG to predict on new data
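As a point of reference before encryption is introduced, here is a minimal cleartext sketch of this MAP/COMBINE/REDUCE loop for a simple linear model. The function names, the choice of squared loss, and the blending rule used for f are illustrative assumptions, not SPINDLE's exact update rule:

```python
import numpy as np

def map_step(X_i, y_i, W_G, lr=0.1, local_iters=5, batch=32):
    """MAP at one DP: start from the global model WG and run a few
    local mini-batch gradient steps on the DP's own data."""
    W = W_G.copy()
    rng = np.random.default_rng(0)
    for _ in range(local_iters):
        idx = rng.choice(len(X_i), size=min(batch, len(X_i)), replace=False)
        Xb, yb = X_i[idx], y_i[idx]
        grad = Xb.T @ (Xb @ W - yb) / len(yb)   # squared-loss gradient (linear model)
        W -= lr * grad
    return W

def combine_step(local_models):
    """COMBINE: sum all local models (done tree-wise among the DPs)."""
    return np.sum(local_models, axis=0)

def reduce_step(summed_W, W_G_prev, num_dps, rho=0.5):
    """REDUCE (one possible f): blend the average of the local models
    with the previous global model."""
    return (1 - rho) * W_G_prev + rho * (summed_W / num_dps)

# Toy run with 3 data providers holding synthetic data.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
dps = []
for _ in range(3):
    X = rng.normal(size=(100, 2))
    dps.append((X, X @ true_w + 0.01 * rng.normal(size=100)))

W_G = np.zeros(2)
for _ in range(20):                                        # global iterations
    local = [map_step(X, y, W_G) for X, y in dps]          # MAP at every DP
    W_G = reduce_step(combine_step(local), W_G, len(dps))  # COMBINE + REDUCE at DP1
print(W_G)                                                 # ≈ [2, -1]
```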
MapReduce for Secure Distributed Learning
11
(Figure: the same MapReduce workflow, but all models and intermediate results are encrypted; ⟨·⟩ denotes a value encrypted under the DPs' Collective Public Key. Figure legend: Secret Key, Public Key, Collective Public Key.)
MAP:
Each DPi locally trains on its data and uses the encrypted global model ⟨WG⟩ to update its encrypted local model ⟨Wi⟩
COMBINE:
All ⟨Wi⟩ are combined
REDUCE:
DP1 updates the encrypted global model: ⟨WG⟩ = f(∑ ⟨Wi⟩, ⟨WG(iter-1)⟩)
PREDICTION:
DPi uses ⟨WG⟩ to predict on ⟨new data⟩
Multiparty Homomorphic Encryption (MHE)
12
[1] J. H. Cheon, A. Kim, M. Kim, and Y. Song. Homomorphic encryption for arithmetic of approximate numbers. In ASIACRYPT, 2017.
[2] C. Mouchet, J. R. Troncoso-Pastoriza, J.-P. Bossuat, and J.-P. Hubaux. Multiparty homomorphic encryption: From theory to practice. In PETS'21, Thursday 15th of July, Session 7B.
We rely on the adaptation to CKKS[1] of the multiparty scheme proposed by Mouchet et al. [2]
Collective public key = f(DPs' public keys); the corresponding secret keys are never revealed
Ciphertext: E(v1, …, vN) — a single ciphertext packs a vector of N values
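For intuition, a simplified sketch of how a collective public key can be generated in this family of RLWE-based multiparty schemes (notation simplified from Mouchet et al. [2]; not the exact protocol messages):

```latex
% Each DP_i holds a secret key s_i; the collective secret key
% s = \sum_i s_i is never materialized by anyone.
\begin{align*}
  b_i &= -a\, s_i + e_i
     && \text{share published by } \mathrm{DP}_i \text{ (common random } a,\ \text{fresh noise } e_i)\\
  \mathrm{cpk} &= \Big(\textstyle\sum_i b_i,\ a\Big)
     = \Big(-a \textstyle\sum_i s_i + \textstyle\sum_i e_i,\ a\Big)
\end{align*}
% Decrypting a ciphertext under cpk requires a contribution from every DP's s_i,
% which is why confidentiality holds as long as at least one DP is honest.
```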
Extended MapReduce for Secure Distributed Learning
13
(Figure: the secure MapReduce workflow iterates; the MAP step is detailed below.)
MAP:
Each DPi locally trains on its data and uses the encrypted global model ⟨WG⟩ to update its encrypted local model ⟨Wi⟩
COMBINE:
All ⟨Wi⟩ are combined
REDUCE:
DP1 updates the encrypted global model: ⟨WG⟩ = f(∑ ⟨Wi⟩, ⟨WG(iter-1)⟩)
PREDICTION:
DPi uses ⟨WG⟩ to predict on new data
Inside MAP — stochastic gradient descent on encrypted models:
- Each gradient update computes f(features, samples' batch): the cleartext local data (features, samples' batch) is multiplied with the encrypted model.
- The resulting encrypted matrix-vector products can be packed with a row-based approach or a diagonal approach (a sketch of the diagonal approach follows below).
- The activation function is evaluated on encrypted values via a least-squares polynomial approximation.
- The choice of packing depends on the input dimension.
(The figure's legend distinguishes encrypted from cleartext values.)
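A minimal cleartext illustration of the diagonal (Halevi–Shoup-style) packing approach for a matrix-vector product, using only element-wise multiplications and vector rotations, i.e. the operations available on packed ciphertexts (the helper name and dimensions are illustrative):

```python
import numpy as np

def diag_matvec(M, v):
    """Diagonal-approach matrix-vector product:
    y = sum_d diag_d(M) * rot(v, d), using only element-wise
    multiplications and rotations — the operations a packed
    (encrypted) vector supports."""
    n = len(v)
    y = np.zeros(n)
    for d in range(n):
        diag_d = np.array([M[i, (i + d) % n] for i in range(n)])  # d-th generalized diagonal
        y += diag_d * np.roll(v, -d)                               # v rotated by d positions
    return y

M = np.arange(16, dtype=float).reshape(4, 4)
v = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(diag_matvec(M, v), M @ v)   # matches the ordinary product
```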
Extended MapReduce for Secure Distributed Learning
14
(Figure: the secure MapReduce workflow; every step below must be parameterized.)
MAP:
Each DPi locally trains on its data and uses the encrypted global model ⟨WG⟩ to update its encrypted local model ⟨Wi⟩
COMBINE:
All ⟨Wi⟩ are combined
REDUCE:
DP1 updates the encrypted global model: ⟨WG⟩ = f(∑ ⟨Wi⟩, ⟨WG(iter-1)⟩)
PREDICTION:
DPi uses ⟨WG⟩ to predict on new data
Parameterization:
- Input dimensions
- Cryptographic/security parameters
- Approximation parameters (see the sketch below)
- Learning parameters
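To illustrate what the approximation parameters control, here is a small cleartext sketch of a least-squares polynomial approximation of the sigmoid activation (the degree and interval below are hypothetical choices; SPINDLE evaluates such a polynomial homomorphically, which is not shown here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Homomorphic schemes such as CKKS only evaluate polynomials, so the
# activation is replaced by a low-degree least-squares fit on a bounded interval.
degree, interval = 5, 8.0                      # hypothetical approximation parameters
xs = np.linspace(-interval, interval, 1000)
coeffs = np.polyfit(xs, sigmoid(xs), degree)   # least-squares polynomial fit
approx = np.poly1d(coeffs)

max_err = np.max(np.abs(approx(xs) - sigmoid(xs)))
print(f"max error on [-{interval}, {interval}]: {max_err:.3f}")
```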
SPINDLE Evaluation: Accuracy
SPINDLE achieves accuracy close to the centralized solution and (almost) the same accuracy as non-secure distributed solutions
15
(Plots: Logistic regression, L.R. one-vs-all, Multinomial regression)
Evaluation parameters: 10 data providers, 128-bit security level
Legend
Dataset: Name [#samples x #features]
(1) Pima = Pima Indians Diabetes, https://www.kaggle.com/uciml/pima-indians-diabetes-database
(2) BCW = Breast Cancer Wisconsin (Original), https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)
(3,4) MNIST
Y. LeCun and C. Cortes. Handwritten digit database. 2010.
SPINDLE Evaluation: Performance
16
5 data providers, 25,600 records, 128-bit security; each data provider: 2 Intel Xeon E5-2680 v3 CPUs, 2.5GHz frequency, 24 threads on 12 cores, 256GB RAM. Communication: 100Mbps, 20ms delay
Better than logarithmic increase with the number of features
SPINDLE Evaluation: Performance
17
128-bit security level; default number of features = 32; |S| = number of data providers; n = global dataset size; b = batch size used in the stochastic gradient descent. One data provider: 2 Intel Xeon E5-2680 v3 CPUs, 2.5GHz frequency, 24 threads on 12 cores, 256GB RAM. Communication: 100Mbps, 20ms delay
Scales almost independently of the number of data providers |S|
Scales linearly with the size n of the data providers' datasets
Efficient workload distribution
Conclusion
18
Future Work: build on our widely-applicable solution to …
lds.epfl.ch