1 of 64

Determining Fairness Goals when Designing AI Systems

Fairness Tree Workshop at APPAM 2023

https://dssg.github.io/fairness_tree_workshop/

Rayid Ghani

Kit T Rodolfa

Lingwei Cheng

2 of 64

Before we start

3 of 64

Agenda

1:45pm  Introduction and Goals
1:55pm  Case Studies - Using ML for Policy Problems
2:10pm  Machine Learning, Fairness, and Equity - An Overview
2:30pm  Defining the Goals - Breakout
3:10pm  Break
3:25pm  Impact of Actions and Interventions - Breakout
4:05pm  Determining Fairness Metrics - Breakout
4:45pm  Discussion
5:00pm  Wrap-up: Things to Remember and Additional Resources (All)

4 of 64

About us

5 of 64

11 MILLION people move through 3,100 jails

$22 BILLION in cost

64% suffer from mental illness

68% have a substance abuse disorder

44% suffer from chronic health problems

Reducing jail recidivism with proactive mental health interventions (Johnson County, KS)

6 of 64

Reducing mental and behavioral health crisis through proactive mental health interventions (Johnson and Douglas Counties, KS)

7 of 64

Using rental assistance support resources to prevent homelessness (Allegheny County, PA)

8 of 64

9 of 64

A Crisis in Tax Administration

10 of 64

Combined Human-AI/ML Systems

Allocation of Limited Resources

Balancing goals of equity, efficiency, and effectiveness

How do we develop responsible Human-AI collaborative systems to help make decisions that lead to fair and equitable outcomes?

11 of 64

How we (should) design AI systems

What values should the AI system be designed to achieve?

How do we build it to achieve those values?

How do we validate and monitor that it continues to achieve those values?

Values need to be explicitly embedded in every phase of the process

From scoping to design to development to deployment to monitoring

12 of 64

What values should we design for?

Fairness

Explainability

Robustness

Privacy

Transparency

Inclusiveness

Accountability

13 of 64

Objectives of this Workshop: Learn how to...

  1. Think about overall fairness and equity goals when building Data Science/ML/AI systems
  2. Understand and elicit the fairness concerns and goals of various stakeholders
  3. Map stakeholders' goals to fairness goals and ML fairness metrics for designing, deploying, and evaluating AI systems

14 of 64

Part 1

Think about overall fairness and equity when building Data Science/ML/AI systems

15 of 64

The goal is not to make the ML model fair but to

make the overall system and outcomes fair

16 of 64

The goal is not to make the ML model fair but to

make the overall system and outcomes fair

AI/ML Model

Actions

Outcomes

17 of 64

Compared to what?

Current (Human) Decisions

Actions

Outcomes

Does the new system need to be perfect or can it be better than the status quo and still worth implementing?

18 of 64

There are (unfortunately) many sources of bias

...it’s not (just) the data

World

AI/ML Pipeline

Actions

Outcomes

Data

(Optional) Human Review

19 of 64

How do we make the overall system and outcomes fair ?

20 of 64

What is/are the desired fairness goal(s)?

Scenario 1: Prioritizing patients for diabetes screening

21 of 64

What is/are the desired fairness goal(s)?

Scenario 2: Identifying police officers for early interventions to prevent adverse incidents

22 of 64

Many Bias Measures: How do we select what we care about?

  • Statistical/Demographic Parity
  • Impact Parity
  • False Discovery Rate (1 - Precision) Parity
  • False Omission Rate Parity
  • False Positive Rate Parity
  • False Negative Rate (1 - Recall) Parity
  • ...
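To make these parity measures concrete, here is a minimal sketch of how the underlying group-wise rates can be computed from binary predictions. The DataFrame column names (label, score, group) are illustrative assumptions, not tied to any particular tool:

```python
# Minimal sketch: group-wise error rates from binary predictions.
# Column names (label, score, group) are illustrative only.
import pandas as pd

def group_metrics(df: pd.DataFrame) -> pd.DataFrame:
    rows = {}
    for g, sub in df.groupby("group"):
        tp = ((sub.score == 1) & (sub.label == 1)).sum()
        fp = ((sub.score == 1) & (sub.label == 0)).sum()
        fn = ((sub.score == 0) & (sub.label == 1)).sum()
        tn = ((sub.score == 0) & (sub.label == 0)).sum()
        rows[g] = {
            "FPR": fp / (fp + tn),  # actual 0's predicted to be 1
            "FNR": fn / (fn + tp),  # actual 1's predicted to be 0
            "FDR": fp / (fp + tp),  # predicted 1's that are actual 0's
        }
    return pd.DataFrame(rows).T

metrics = group_metrics(pd.DataFrame({
    "label": [1, 0, 1, 0, 1, 0, 0, 1],
    "score": [1, 1, 0, 0, 1, 0, 1, 1],
    "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
}))
# Parity check: each group's rates relative to a reference group.
print(metrics / metrics.loc["A"])
```

Dividing each group's rates by a reference group's rates gives the kind of disparity ratios that bias audit tools such as Aequitas report.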

23 of 64

Many Bias Measures: How do we select what we care about?

24 of 64

25 of 64

Consider Three Metrics…

  • False Positive Rate
  • False Negative Rate
  • False Discovery Rate

26 of 64

Consider Three Metrics…

  • False Positive Rate: Among people who do not recidivate, the proportion the model incorrectly predicts will recidivate
  • False Negative Rate: Among people who do recidivate, the proportion the model incorrectly predicts will not recidivate
  • False Discovery Rate: Among people the model predicts will recidivate, the proportion who actually do not
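In confusion-matrix terms (TP, FP, TN, FN), these three rates are:

```latex
\mathrm{FPR} = \frac{FP}{FP + TN}, \qquad
\mathrm{FNR} = \frac{FN}{FN + TP}, \qquad
\mathrm{FDR} = \frac{FP}{TP + FP} = 1 - \mathrm{precision}
```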

27 of 64

28 of 64

ProPublica identified considerable disparities:

  • The algorithm was twice as likely to incorrectly predict that black individuals were at high risk of recidivating (FPR)
  • It was also nearly twice as likely to incorrectly predict that white individuals were at low risk of recidivating (FNR)

29 of 64

However, the creator of the algorithm pointed out that the algorithm is well-balanced across races on precision (equivalently, FDR), claiming this is the correct measure of fairness in this context.

30 of 64

Who is right?

Is the COMPAS algorithm biased?

Can’t the algorithm achieve both measures of fairness at the same time?

31 of 64

32 of 64

Incompatibility Between Fairness Metrics

33 of 64

Incompatibility Between Fairness Metrics

  • Prevalence: the fraction of actual 1’s in the population
  • False Negative Rate: among all actual 1’s, the fraction predicted to be 0
  • False Positive Rate: among all actual 0’s, the fraction predicted to be 1
  • False Discovery Rate: among all predicted 1’s, the fraction that are actual 0’s = (1 - precision)

Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2), 153-163.
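These four quantities are algebraically linked. From the confusion-matrix definitions above, one can derive the identity at the heart of Chouldechova's result:

```latex
\mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{\mathrm{FDR}}{1-\mathrm{FDR}}\cdot\bigl(1-\mathrm{FNR}\bigr)
```

where p is the prevalence. If p differs across groups while FDR is held equal, equalizing FNR forces FPR apart, and vice versa.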

34 of 64

Incompatibility Between Fairness Metrics

If prevalence is unequal across groups...

35 of 64

Incompatibility Between Fairness Metrics

If prevalence is unequal across groups...

…and FDR (or precision)

is equal across groups...

36 of 64

Incompatibility Between Fairness Metrics

If prevalence is unequal across groups...

…and FDR (or precision)

is equal across groups...

…then either FPR or FNR can be equal across groups, but not both
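A quick numeric check of this claim, using the identity above with purely hypothetical prevalence and error-rate values (not the COMPAS data):

```python
# Hypothetical check: two groups with unequal prevalence and equal FDR,
# using FPR = p/(1-p) * FDR/(1-FDR) * (1-FNR) from the earlier slide.
def fpr(p, fdr, fnr):
    return p / (1 - p) * fdr / (1 - fdr) * (1 - fnr)

fdr = 0.30             # held equal across both groups
p_a, p_b = 0.50, 0.30  # unequal prevalence (illustrative values)
fnr = 0.20             # suppose FNR is also equalized...

print(fpr(p_a, fdr, fnr), fpr(p_b, fdr, fnr))
# ~0.343 vs ~0.147: with unequal prevalence and equal FDR,
# equalizing FNR forces FPR apart (and vice versa).
```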

37 of 64

Does that mean we cannot achieve fairness in ML models?

38 of 64

Policy Menu

  • Designing for Efficiency: 72.7% efficient
  • Designing for Equality: additional cost of 2%
  • Designing for Equity: additional cost of 2%

39 of 64

Breakout Session 1: Determining the Goals

40 of 64

Breakout Session 1: Determining the Goals

Who are the stakeholders?

What is their perspective?

  • On what the system should help achieve?
  • If we prioritize efficiency as a goal?
  • If we prioritize fairness and equity as a goal?

41 of 64

Case Studies

Child Welfare

Homelessness and Rental Assistance

Tax Audits

42 of 64

Case Studies - Breakout on Goals (Summary)

43 of 64

Actions/Interventions

AI/ML Model

Actions

Outcomes

44 of 64

Breakout Session 2: Determining the Benefits and the Costs/Harms of the Actions/Interventions

  • Have need/warranted, intervention allocated: Assistive or Punitive?
  • Have need/warranted, intervention not allocated: Assistive or Punitive?
  • Do not have need/unwarranted, intervention allocated: Assistive or Punitive?
  • Do not have need/unwarranted, intervention not allocated: Assistive or Punitive?

45 of 64

Breakout Session 2: Determining the Benefits and the Costs/Harms of the Actions/Interventions

Diabetes Screening

  • Have need/warranted, intervention allocated: Patient: Help
  • Have need/warranted, intervention not allocated: Patient: Harm
  • Do not have need/unwarranted, intervention allocated: Patient: Neutral-ish; Program Administrator: Waste of Money
  • Do not have need/unwarranted, intervention not allocated: Patient: Neutral; Program Administrator: Positive

46 of 64

Case Studies - Breakout on Actions/Interventions (Summary)

Role

Punitive, Assistive, Both

47 of 64

Breakout Session 3: Determining Fairness Metrics to Prioritize

48 of 64

Fairness Tree

49 of 64

50 of 64

Fairness Tree (Zoomed in)

51 of 64

Diabetes Screening

  • Have need/warranted, intervention allocated: Patient: Help
  • Have need/warranted, intervention not allocated: Patient: Harm
  • Do not have need/unwarranted, intervention allocated: Patient: Neutral-ish; Program Administrator: Waste of Money
  • Do not have need/unwarranted, intervention not allocated: Patient: Neutral; Program Administrator: Positive

52 of 64

Case Studies - Breakout on Fairness Metrics (Summary)

  • Program Administrator (allocating resources to those who don’t need them is much worse than missing someone who may need them): prioritize False Positive Rate
  • Person being affected (missing people who need it is much worse than providing it to people who may not need it): prioritize False Negative Rate
  • Punitive interventions: False Positive Rate parity
  • Assistive interventions: False Negative Rate parity

53 of 64

Is the fairness tree “the answer”?

No… but it’s intended as a starting point to help guide a conversation between ML experts, policy makers, and those affected by the decisions.

Ultimately, the choice of fairness metric(s) is highly dependent on context and stakeholder values.

54 of 64

How do we make the overall system and outcomes fair ?

55 of 64

Wrap-Up

56 of 64

The goal is not to make the ML model fair but to

make the overall system and outcomes fair

AI/ML Model

Actions

Outcomes

57 of 64

Things to remember

Make bias, fairness, and equity an integral part of every project: Scoping, community engagement, metrics, validation, monitoring outcomes

Understand how different phases of the project could lead to downstream bias

Not all bias metrics are created equal - use the Fairness Tree to understand your problem/use case and select appropriate metrics

Audit for bias and explore bias reduction strategies

A perfectly fair model does not mean fair outcomes. Think about the entire system (including actions) and measure outcomes

Compared to what?

58 of 64

Some useful practices

Create an environment where informed ethical discussions can take place

Talk through ethical issues at each stage of the project (instead of waiting until the end or stopping after the initial setup)

Consider the entire chain of data - collection to analysis to action

Consider how it affects people throughout the chain – especially the people being affected (and include them in these discussions)

Embed ethics into both technical processes as well as people processes

59 of 64

Additional Resources

  • Open Source Data Science Tools
    • Triage: ML Toolkit
    • Aequitas: Bias Audit Tool
    • Code for all projects: www.github.com/dssg
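For reference, a sketch of what a group-level audit with Aequitas might look like, based on its public demo notebooks; the exact column names and API can vary by version, so treat this as illustrative rather than definitive:

```python
# Illustrative Aequitas usage; API details may differ across versions.
import pandas as pd
from aequitas.group import Group

# Aequitas expects binary predictions in 'score', ground truth in
# 'label_value', and one column per protected attribute.
df = pd.DataFrame({
    "score":       [1, 0, 1, 1, 0, 1],
    "label_value": [1, 0, 0, 1, 1, 1],
    "race":        ["a", "a", "a", "b", "b", "b"],
})

xtab, _ = Group().get_crosstabs(df)  # per-group counts and rates
print(xtab[["attribute_name", "attribute_value", "fpr", "fnr", "fdr"]])
```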

60 of 64

61 of 64

62 of 64

Resources

63 of 64

How do we scope data science projects?
More details at http://www.datasciencepublicpolicy.org/resources/data-science-project-scoping-guide/

Goals: Define the goal(s) of the project (equity, efficiency, effectiveness, etc.)

Actions: What actions/interventions will you inform?

Data: What data do you have internally? What data do you need? What can you augment from external and public sources?

Analysis: What analysis needs to be done? How will it be validated? How will the analysis achieve the goals defined above?

64 of 64