1 of 50

Information-Theoretic Tools for Responsible Machine Learning

Shahab Asoodeh (McMaster University)

Flavio P. Calmon (Harvard University)

Mario Diaz (Universidad Nacional Autónoma de México)

Haewon Jeong (UC Santa Barbara)

2022 IEEE International Symposium on Information Theory (ISIT)

2 of 50

Part I: Overview

6 of 50

1948: Shannon established information theory

1950~2000s: Reliable data communication, processing, and storage

2000s~2010s: Fast development of ML

Today: Emerging challenges in ML

10 of 50

Challenges in Responsible Machine Learning

Protests against discriminatory A-levels grading algorithm

Microsoft “AI” deployed to predict teenage pregnancy in Salta, Argentina

Data-driven algorithms are increasingly applied to individual-level data to support decision-making in applications of individual-level consequence.

13 of 50

Several companies and governments are investing in fair and private ML…

Privacy

Fairness

20 of 50

Several companies and governments are investing in fair and private ML…but responsible ML is hard.

Alphabet (Google) 10-K filing (Feb 2019):

Microsoft 10-K filing (Aug 2018):

How can Information Theory help?

22 of 50

Shannon established information theory

Reliable data communication, processing, and storage

Fast development of ML

Emerging challenges in ML

Can we apply information theory to address these new challenges?

26 of 50

The information-theoretic blueprint

Problem

Model: Usually abstracts away computational considerations and focuses on underlying “information” modeled by probability measures

Mathematical analysis: Tools from probability theory, statistics, optimization, functional analysis, discrete math, etc.

Practice: Operational limits, new algorithms, coding techniques, etc.

27 of 50

The information-theoretic blueprint

Problem

Model

Mathematical analysis

Practice

Today’s goal:

Demonstrate that metrics and methods widely used in private and fair ML can be understood using familiar information-theoretic tools and analyzed using the IT blueprint

29 of 50

Imagine that you are a data scientist…

Model

(e.g. past grades, questionnaire answers)

(e.g. academic performance)

Classifier

30 of 50

Model

(e.g. past grades, questionnaire answers)

(e.g. academic performance)

Classifier

Examples:

  • Logistic Regression
  • Random Forests
  • Neural networks with soft-max output layer
  • Platt-scaled SVMs
  • (…)

33 of 50

“Training” the model

Model

Classifier

Model parameters

Training dataset

35 of 50

“Training” the model

Model

Classifier

vs.

Empirical loss minimization:
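
A minimal sketch of empirical loss minimization, assuming a logistic-regression classifier, the logistic (cross-entropy) loss, and plain gradient descent. The synthetic data and the names `empirical_loss`, `train`, and `theta_hat` are illustrative assumptions, not the tutorial's notation.

```python
import numpy as np

def empirical_loss(theta, X, y):
    """Average logistic loss of parameters theta on the training set (X, y)."""
    s = (2 * y - 1) * (X @ theta)          # signed margin, labels y in {0, 1}
    return np.mean(np.logaddexp(0.0, -s))  # stable log(1 + exp(-s))

def train(X, y, steps=500, lr=0.5):
    """Minimize the empirical loss by gradient descent (illustrative settings)."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))  # predicted P(Y = 1 | x)
        theta -= lr * X.T @ (p - y) / len(y)    # gradient of the average loss
    return theta

# Hypothetical training dataset: one noisy feature plus a bias column.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([rng.normal(size=n), np.ones(n)])
y = (X[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(float)

theta_hat = train(X, y)  # the "model parameters" learned from the training dataset
```

The learned `theta_hat` is a function of the training dataset, which is why the model parameters can carry information about individual training records.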

36 of 50

Challenge 1: Privacy

Model

Classifier

vs.

Empirical loss minimization:

Model parameters may reveal private information about the training dataset

39 of 50

Challenge 1: Privacy

Training

Differential privacy: neighboring datasets cannot be (statistically) distinguished
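
A toy illustration of the indistinguishability idea, assuming the simplest possible setting: the "dataset" is a single private bit, its only neighbor is the flipped bit, and the mechanism is randomized response. This is a sketch of pure epsilon-differential privacy in that setting, not a private training procedure; all function names are hypothetical.

```python
import math
import random

def truth_probability(epsilon):
    """Probability of reporting the true bit under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))

def randomized_response(bit, epsilon, rng=random):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit

def output_distribution(bit, epsilon):
    """Exact distribution [P(out = 0), P(out = 1)] for a given input bit."""
    p = truth_probability(epsilon)
    dist = [1.0 - p, 1.0 - p]
    dist[bit] = p
    return dist

def max_likelihood_ratio(epsilon):
    """Largest ratio between output probabilities under the two neighboring inputs."""
    p0 = output_distribution(0, epsilon)
    p1 = output_distribution(1, epsilon)
    return max(max(a / b, b / a) for a, b in zip(p0, p1))
```

For every epsilon the largest likelihood ratio equals e^epsilon, so an observer of the output cannot tell the two neighboring inputs apart by more than that factor.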

41 of 50

Challenge 2: Fairness

Model

Classifier

(e.g. past grades, questionnaire answers)

(e.g. academic performance)

(e.g. sex, age, race)

Should I use the group attribute as an input to the model?

42 of 50

Challenge 2: Fairness

Does the model performance change conditioned on a group attribute?

Model

Classifier

(e.g. past grades, questionnaire answers)

(e.g. academic performance)

(e.g. sex, age, race)
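
One way to probe the question above is to compute performance separately for each value of the group attribute. The sketch below assumes binary labels and hard binary predictions; the function name `group_metrics`, the choice of metrics, and the example arrays are illustrative assumptions rather than the tutorial's definitions.

```python
import numpy as np

def _rate(mask):
    # Mean of a boolean array, or NaN when the array is empty.
    return float(np.mean(mask)) if mask.size else float("nan")

def group_metrics(y_true, y_pred, group):
    """Accuracy, false-positive rate, and false-negative rate per group value."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    metrics = {}
    for g in np.unique(group).tolist():
        in_g = group == g
        yt, yp = y_true[in_g], y_pred[in_g]
        metrics[g] = {
            "accuracy": _rate(yt == yp),
            "fpr": _rate(yp[yt == 0] == 1),  # predicted 1 among true 0
            "fnr": _rate(yp[yt == 1] == 0),  # predicted 0 among true 1
        }
    return metrics

# Hypothetical labels, predictions, and a binary group attribute.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
metrics = group_metrics(y_true, y_pred, group)
```

In this made-up example the classifier is perfect on group 0 and only 50% accurate on group 1, the kind of gap that group-conditional evaluation is meant to surface.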

43 of 50

Tutorial Outline

  • Part II: Information-theoretic Tools for Differential Privacy
  • Part III: (Central) Differential Privacy in Machine Learning
  • Part IV: Fairness in Machine Learning
  • Part V: Fairness Interventions

44 of 50

Tutorial Outline

Part II

Part III

Part IV

Part V

Challenge

Focus

Privacy

Fairness

Metrics

Methods

47 of 50

Tutorial Outline

  • Part II: Information-theoretic Tools for Differential Privacy
  • Part III: (Central) Differential Privacy in Machine Learning
  • Part IV: Fairness in Machine Learning
  • Part V: Fairness Interventions

Metrics

48 of 50

Tutorial Outline

  • Part II: Information-theoretic Tools for Differential Privacy
  • Part III: (Central) Differential Privacy in Machine Learning
  • Part IV: Fairness in Machine Learning
  • Part V: Fairness Interventions

Metrics + Methods

49 of 50

By the end of the tutorial, you will be able to…

  1. Explain key definitions and mechanisms in differential privacy;
  2. Formulate differential privacy metrics in terms of f-divergences, and apply these divergences to prove properties of differentially private algorithms (e.g., composition, privacy-accuracy trade-offs, etc.);
  3. Describe open challenges in differential privacy applied to ML and how they can benefit from information-theoretic tools;
  4. Explain different fairness metrics and how to formulate these metrics mathematically while recognizing their limitations;
  5. Compare and contrast state-of-the-art fairness interventions both in terms of their mathematical formulations and their real-world performance;
  6. Identify fundamental open problems in fair ML.
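
As a preview of outcome 2, one standard way to express differential privacy through an f-divergence uses the hockey-stick divergence. The notation below (mechanism M, neighboring datasets D and D', parameters epsilon and delta) is assumed here for illustration and may differ from the notation used in the later parts of the tutorial.

```latex
% f-divergence for convex f with f(1) = 0, assuming P is absolutely continuous w.r.t. Q
D_f(P \,\|\, Q) = \mathbb{E}_Q\!\left[ f\!\left( \frac{dP}{dQ} \right) \right]

% Hockey-stick divergence: the choice f(t) = \max(t - \gamma, 0), with \gamma \ge 1
E_\gamma(P \,\|\, Q) = \sup_{A} \big[ P(A) - \gamma\, Q(A) \big]

% M is (\varepsilon, \delta)-differentially private if and only if,
% for all neighboring datasets D and D',
E_{e^{\varepsilon}}\big( M(D) \,\|\, M(D') \big) \le \delta
```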

50 of 50

Administrivia

Website: https://sites.google.com/view/isit2022tutorial/home

Slack Channel: see website

Code will be available soon!