1 of 22

Cyber Lab

Spring 2024 - Week 7

https://l.acmcyber.com/s24-w7-lab

2 of 22

📣 Announcements

  • Cyber Academy: Professor Lixia Zhang
    • Talking about her time at the start of the internet!
    • Next Monday 6-8pm @ Boelter 4760
  • ⛳️PBR: R3CTF
    • This Saturday
    • Meet for lunch 12pm @ De Neve Plaza
    • CTF is 1-6pm @ Boelter Penthouse (8500)
  • 👾 Cyber X Studio Social:
    • Pizza, snacks, and video/board games!
    • Friday, May 31 (Week 9) in Boelter 4760
  • 🏀Cyber Basketball: Friday 6-8pm @ Hitch courts

3 of 22

AI Privacy

4 of 22

Overview

  • Confidentiality of training data
    • Adherence to regulations
    • Right to be forgotten, GDPR, CCPA, etc.
  • Protect sensitive data
    • e.g. health data, insurance data
  • Ensure protection from identity-based attacks
    • e.g. reverse engineering facial recognition data

5 of 22

Risks?

  • AI is a function of all the data it uses
    • Data is hard to remove from the system
    • Data can be easily reconstructed
    • Membership inference, attribution inference
  • AI can be biased

Solutions?

Privacy-preserving machine learning (PPML)

  • Federated learning
  • Applied cryptography

Data rights

  • Differential privacy
  • Machine unlearning

6 of 22

AI and Data Background

  • Recall: ML is learning features and techniques from data
  • Challenge: the data we train on can contain sensitive information which must be secured
    • C.I.A. challenges

7 of 22

Model Inversion/Attribute Inference

  • A black-box attack to infer private attributes of data from model output
    • What are sensitive attributes?

8 of 22

Review: Membership Inference

  • A black-box attack to infer if a specific datapoint was used to train a model or not
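A minimal sketch of the idea (the scores and the threshold below are hypothetical, not from the slides): overfit models tend to be more confident on their training points than on unseen data, so the simplest black-box membership test is to threshold the model's confidence on the queried datapoint.

```python
def membership_inference(confidence, threshold=0.9):
    # Overfit models are usually more confident on training points,
    # so unusually high confidence is evidence of membership.
    return confidence >= threshold

# Hypothetical confidences: members tend to score near 1.0.
member_scores = [0.99, 0.97, 0.95]
non_member_scores = [0.60, 0.85, 0.72]
guesses = [membership_inference(s) for s in member_scores]  # [True, True, True]
```

Real attacks (e.g. shadow-model attacks) learn this threshold per class rather than fixing it by hand.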

9 of 22

Fredrikson’s Attack - Overview

  • A method to reconstruct inputs (e.g. face images) from vision models
  • Threat model:
    • We can query an API for classifications, along with confidence scores (logits)
    • We do not have access to the internals of the model or the training data

10 of 22

Fredrikson’s Attack

  • Intuition: based on the innate overfitting of features in classification
    • This memorization can leak information about the training data through model outputs
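The intuition can be illustrated with a toy black-box inversion (everything here is hypothetical, including the `query_confidence` stand-in model): treat the model as an oracle and hill-climb a candidate input to maximize the target class's confidence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the victim model: a hidden linear
# classifier we can query but not inspect.
W = rng.normal(size=16)

def query_confidence(x):
    # "Black-box API": returns the target class's confidence.
    return 1.0 / (1.0 + np.exp(-W @ x))

# Invert the model by random hill climbing: keep perturbations
# that raise the target class's confidence.
x = np.zeros(16)
for _ in range(500):
    candidate = x + rng.normal(scale=0.1, size=16)
    if query_confidence(candidate) > query_confidence(x):
        x = candidate
# x now approximates an input the model strongly associates with
# the target class (for a face classifier: a blurry "average" face).
```

Fredrikson's actual attack is more refined (it uses confidence values to guide a proper optimization), but the reconstruction loop has the same shape.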

11 of 22

Fredrikson’s Attack - Results

12 of 22

Differential Privacy - a Solution

  • A mathematical way to leverage the “important” information from a dataset without revealing individual datapoints
  • Introduce randomness into the dataset that does not affect the overall features to be learned

13 of 22

Differential Privacy - Approaches

  • Definition: making neighboring datasets near-impossible to distinguish, to reduce the impact of any individual
  • Goal: model-agnostic private learning
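The definition above is usually written formally as follows (standard ε-DP notation, where ε is the privacy budget): a randomized mechanism M is ε-differentially private if for all neighboring datasets D, D′ (differing in one record) and every set of outputs S,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S]
```

Smaller ε means the two distributions are harder to tell apart, i.e. any individual record has less influence on what the mechanism outputs.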

14 of 22

Differential Privacy

  • We often use the Laplace mechanism for DP
  • Typically, the Laplace mechanism is defined as

      M(x) = f(x) + (Y₁, …, Yₖ)

    where each of the Yᵢ are independent Laplace(Δf/ε) random variables, and Δf is the sensitivity of f
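A minimal sketch of the Laplace mechanism for a counting query (the query and parameter values are illustrative, not from the slides):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    # Release the value plus Laplace(sensitivity / epsilon) noise.
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

rng = np.random.default_rng(42)
# A counting query ("how many patients have condition X?") has
# sensitivity 1: adding or removing one person changes it by at most 1.
noisy_count = laplace_mechanism(100, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Individual releases are noisy, but the noise is zero-mean, so aggregate statistics remain usable.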

15 of 22

Differential Privacy - Alternatives

  • Local-DP: noise can be determined and added locally on sources before data is aggregated for training
    • e.g. on device DP
  • DP-SGD: noise added to gradients during training
    • PII still not “learned”
  • DP can also be applied selectively, for a better privacy-utility tradeoff
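A rough sketch of one DP-SGD update (simplified; real implementations such as Opacus also track the cumulative privacy budget): clip each per-example gradient to bound any one example's influence, average, then add noise calibrated to that bound.

```python
import numpy as np

def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_mult, rng):
    # 1. Clip each example's gradient so no single example can
    #    move the update by more than clip_norm.
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    # 2. Average, then add Gaussian noise scaled to the clip bound.
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(scale=noise_mult * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return weights - lr * (mean_grad + noise)
```

Because every example's contribution is bounded and noised, the trained weights do not "learn" any single record too precisely.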

16 of 22

Federated Learning

17 of 22

Federated Learning

  • Protects your data from even being in the hands of ML companies
  • Each user has a local copy of an ML model
  • Based on the user’s interaction with the model, their personal copy is updated locally
    • A local weight update is calculated
  • These weight updates are sent to a server and aggregated to form a consensus change
  • The shared model is then used globally

Comic introducing FL
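The aggregation step above is commonly done with federated averaging (FedAvg); a minimal sketch with hypothetical client updates (the names and numbers are illustrative):

```python
import numpy as np

def federated_average(client_updates, client_sizes):
    # Weight each client's update by its local dataset size.
    total = sum(client_sizes)
    return sum((n / total) * u for u, n in zip(client_updates, client_sizes))

# Three hypothetical clients send only weight deltas, never raw data.
updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [100, 100, 200]
global_update = federated_average(updates, sizes)  # -> array([0.75, 0.75])
```

The server only ever sees weight deltas, which is why the next slide's concern is whether those deltas themselves can leak training data.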

18 of 22

Secure MPC - “Sharing” without Sharing

  • Challenge: adversaries can still partially reveal each participant’s training data based on the parameters they upload
  • A cryptographic protocol that allows multiple parties (devices) to securely compute a function (training) over their inputs while keeping those inputs private
  • Allows for truly secure aggregation of weight updates
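One building block for such secure aggregation is additive secret sharing; a toy sketch (the modulus and values are illustrative, and real protocols add masking and dropout handling on top):

```python
import random

P = 2**31 - 1  # arithmetic is done modulo a public prime

def share(secret, n_parties):
    # Split secret into n random shares that sum to it mod P.
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# Each client secret-shares its (integer-encoded) weight update.
secrets = [5, 17, 20]
all_shares = [share(s, 3) for s in secrets]

# Each server sums the shares it receives; no single server
# learns any individual client's value.
partial_sums = [sum(col) % P for col in zip(*all_shares)]
total = sum(partial_sums) % P  # equals 5 + 17 + 20 = 42
```

Each share alone is uniformly random, so the servers learn only the aggregate, never any single client's update.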

19 of 22

More:

  • Applied crypto for PPML
    • Homomorphic encryption
  • Data minimization
  • Privacy-preserving data aggregation

Governance and policy

  • GDPR: right to be forgotten, data portability, breach notification
  • CCPA: right to delete data, opt-out of data sale
  • HIPAA (health), ADPPA (privacy law), AI Act (EU)
  • ...

20 of 22

More Reading

21 of 22

Questions?

22 of 22

Thanks for coming! ❤️