1 of 22

Cyber Lab

Spring 2024 - Week 7

https://l.acmcyber.com/s24-w7-lab

2 of 22

📣 Announcements

  • Cyber Academy: Professor Lixia Zhang
    • Talking about her time at the start of the internet!
    • Next Monday 6-8pm @ Boelter 4760
  • ⛳️PBR: R3CTF
    • This Saturday
    • Meet for lunch 12pm @ De Neve Plaza
    • CTF is 1-6pm @ Boelter Penthouse (8500)
  • 👾 Cyber X Studio Social:
    • Pizza, snacks, and video/board games!
    • Friday, May 31 (Week 9) in Boelter 4760
  • 🏀Cyber Basketball: Friday 6-8pm @ Hitch courts

3 of 22

AI Privacy

4 of 22

Overview

  • Confidentiality of training data
    • Adherence to regulations
    • Right to be forgotten, GDPR, CCPA, etc.
  • Protect sensitive data
    • e.g. health data, insurance data
  • Ensure protection from identity-based attacks
    • e.g. reverse engineering facial recognition data

5 of 22

Risks?

  • AI is a function of all the data it uses
    • Data is hard to remove from the system
    • Data can be easily reconstructed
    • Membership inference, attribution inference
  • AI can be biased

Solutions?

Privacy-preserving machine learning (PPML)

  • Federated learning
  • Applied cryptography

Data rights

  • Differential privacy
  • Machine unlearning

6 of 22

AI and Data Background

  • Recall: ML is learning features and techniques from data
  • Challenge: the data we train on can contain sensitive information which must be secured
    • C.I.A. challenges

7 of 22

Model Inversion/Attribute Inference

  • A black-box attack to infer private attributes of data from model output
    • What are sensitive attributes?

8 of 22

Review: Membership Inference

  • A black-box attack to infer if a specific datapoint was used to train a model or not
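A minimal sketch of the idea (the scores and the threshold below are hypothetical, not from the slides): overfit models tend to be more confident on their training points than on unseen data, so the simplest black-box membership test is to threshold the model's confidence on the queried datapoint.

```python
def membership_inference(confidence, threshold=0.9):
    # Overfit models are usually more confident on training points,
    # so unusually high confidence is evidence of membership.
    return confidence >= threshold

# Hypothetical confidences: members tend to score near 1.0.
member_scores = [0.99, 0.97, 0.95]
non_member_scores = [0.60, 0.85, 0.72]
guesses = [membership_inference(s) for s in member_scores]  # [True, True, True]
```

Real attacks (e.g. shadow-model attacks) learn this threshold per class rather than fixing it by hand.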

9 of 22

Fredrikson’s Attack - Overview

  • A method to reconstruct inputs (e.g. face images) from vision models
  • Threat model:
    • We can query an API for classifications, along with confidence scores (logits)
    • We do not have access to the internals of the model or the training data

10 of 22

Fredrikson’s Attack

  • Intuition: based on the innate overfitting of features in classification
    • This memorization can leak information about the training data through model outputs
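The intuition can be illustrated with a toy black-box inversion (everything here is hypothetical, including the `query_confidence` stand-in model): treat the model as an oracle and hill-climb a candidate input to maximize the target class's confidence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the victim model: a hidden linear
# classifier we can query but not inspect.
W = rng.normal(size=16)

def query_confidence(x):
    # "Black-box API": returns the target class's confidence.
    return 1.0 / (1.0 + np.exp(-W @ x))

# Invert the model by random hill climbing: keep perturbations
# that raise the target class's confidence.
x = np.zeros(16)
for _ in range(500):
    candidate = x + rng.normal(scale=0.1, size=16)
    if query_confidence(candidate) > query_confidence(x):
        x = candidate
# x now approximates an input the model strongly associates with
# the target class (for a face classifier: a blurry "average" face).
```

Fredrikson's actual attack is more refined (it uses confidence values to guide a proper optimization), but the reconstruction loop has the same shape.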

11 of 22

Fredrikson’s Attack - Results

12 of 22

Differential Privacy - a Solution

  • A mathematical way to leverage the “important” information from a dataset without revealing individual datapoints
  • Introduce randomness into the dataset that does not affect the overall features to be learned

13 of 22

Differential Privacy - Approaches

  • Definition: making neighboring datasets near-impossible to distinguish, to reduce the impact of any individual
  • Goal: model-agnostic private learning
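The definition above is usually written formally as follows (standard ε-DP notation, where ε is the privacy budget): a randomized mechanism M is ε-differentially private if for all neighboring datasets D, D′ (differing in one record) and every set of outputs S,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S]
```

Smaller ε means the two distributions are harder to tell apart, i.e. any individual record has less influence on what the mechanism outputs.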

14 of 22

Differential Privacy

  • We often use the Laplace mechanism for DP
  • Typically, the Laplace mechanism is defined as

      M(x) = f(x) + (Y₁, …, Yₖ)

    where each of the Yᵢ are independent Laplace(Δf/ε) random variables, and Δf is the sensitivity of f
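A minimal sketch of the Laplace mechanism for a counting query (the query and parameter values are illustrative, not from the slides):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    # Release the value plus Laplace(sensitivity / epsilon) noise.
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

rng = np.random.default_rng(42)
# A counting query ("how many patients have condition X?") has
# sensitivity 1: adding or removing one person changes it by at most 1.
noisy_count = laplace_mechanism(100, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Individual releases are noisy, but the noise is zero-mean, so aggregate statistics remain usable.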

15 of 22

Differential Privacy - Alternatives

  • Local-DP: noise can be determined and added locally on sources before data is aggregated for training
    • e.g. on device DP
  • DP-SGD: noise added to gradients during training
    • PII still not “learned”
  • DP can also be applied selectively, for a better privacy-utility tradeoff
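A rough sketch of one DP-SGD update (simplified; real implementations such as Opacus also track the cumulative privacy budget): clip each per-example gradient to bound any one example's influence, average, then add noise calibrated to that bound.

```python
import numpy as np

def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_mult, rng):
    # 1. Clip each example's gradient so no single example can
    #    move the update by more than clip_norm.
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    # 2. Average, then add Gaussian noise scaled to the clip bound.
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(scale=noise_mult * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return weights - lr * (mean_grad + noise)
```

Because every example's contribution is bounded and noised, the trained weights do not "learn" any single record too precisely.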

16 of 22

Federated Learning

17 of 22

Federated Learning

  • Protects your data from even being in the hands of ML companies
  • Each user has a local copy of an ML model
  • Based on the user’s interaction with the model, their personal copy is updated locally
    • A local weight update is calculated
  • These weight updates are sent to a server and aggregated to form a consensus change
  • The shared model is then used globally

Comic introducing FL
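The aggregation step above is commonly done with federated averaging (FedAvg); a minimal sketch with hypothetical client updates (the names and numbers are illustrative):

```python
import numpy as np

def federated_average(client_updates, client_sizes):
    # Weight each client's update by its local dataset size.
    total = sum(client_sizes)
    return sum((n / total) * u for u, n in zip(client_updates, client_sizes))

# Three hypothetical clients send only weight deltas, never raw data.
updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [100, 100, 200]
global_update = federated_average(updates, sizes)  # -> array([0.75, 0.75])
```

The server only ever sees weight deltas, which is why the next slide's concern is whether those deltas themselves can leak training data.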

18 of 22

Secure MPC - “Sharing” without Sharing

  • Challenge: adversaries can still partially reveal each participant’s training data based on the parameters they upload
  • A cryptographic protocol that allows multiple parties (devices) to securely compute a function (training) over their inputs while keeping those inputs private
  • Allows for truly secure aggregation of weight updates
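One building block for such secure aggregation is additive secret sharing; a toy sketch (the modulus and values are illustrative, and real protocols add masking and dropout handling on top):

```python
import random

P = 2**31 - 1  # arithmetic is done modulo a public prime

def share(secret, n_parties):
    # Split secret into n random shares that sum to it mod P.
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# Each client secret-shares its (integer-encoded) weight update.
secrets = [5, 17, 20]
all_shares = [share(s, 3) for s in secrets]

# Each server sums the shares it receives; no single server
# learns any individual client's value.
partial_sums = [sum(col) % P for col in zip(*all_shares)]
total = sum(partial_sums) % P  # equals 5 + 17 + 20 = 42
```

Each share alone is uniformly random, so the servers learn only the aggregate, never any single client's update.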

19 of 22

More:

  • Applied crypto for PPML
    • Homomorphic encryption
  • Data minimization
  • Privacy-preserving data aggregation

Governance and policy

  • GDPR: right to be forgotten, data portability, breach notification
  • CCPA: right to delete data, opt-out of data sale
  • HIPAA (health), ADPPA (privacy law), AI Act (EU)
  • ...

20 of 22

More Reading

21 of 22

Questions?

22 of 22

Thanks for coming! ❤️