1 of 38

Trustworthy AI: a lens on fairness

Jessica Schrouff - GSK

2 of 38

Jessica Schrouff (she/her)

  • Startup experience: AI for code
  • DeepMind / Google Health: AI for healthcare
  • Google Research / DeepMind: responsible AI
  • GSK: Director of Responsible AI
  • PhD in Electrical Engineering, neuroscience focus
  • Postdoctoral research split across medical and CS contexts

3 of 38

Safe · Trustworthy · Responsible · Ethics · Alignment · Robustness · Privacy · Interpretability · Fairness · Long-term

4 of 38

AI Principles and sub-dimensions [1]

Principle | Sub-dimensions
Accountability | Auditability, responsibility
Diversity, non-discrimination and fairness | Accessibility, no unfair bias
Human agency and oversight | Human review and recourse, human well-being
Privacy and data quality | Data quality and privacy, lawful access
Technical robustness and safety | Accuracy, reliability under changing inputs or contexts, general safety, resilience (to attacks)
Transparency | Explainability, communication and traceability
Social and environmental well-being |

[1] Papagiannidis et al., 2024. https://doi.org/10.1016/j.jsis.2024.101885

5 of 38

Fairness definition

Unfairness := disparities in model output across demographic groups [1,2,3].

Examples across domains:

  • Criminal justice [4]
  • Hiring [5]
  • Medicine [6]

[1] Barocas, S., Hardt, M. & Narayanan, A., 2019. Fairness and Machine Learning. https://fairmlbook.org/

[2] Dwork et al., 2012. Fairness through awareness. ITCS '12: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference.

[3] Kusner et al., 2017. Counterfactual fairness. 31st Conference on Neural Information Processing Systems (NIPS).

[4] ProPublica: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

[5] Reuters.

[6] Obermeyer et al., 2019. Dissecting racial bias in an algorithm used to manage the health of populations. Science.

6 of 38

Case study: bias in EHR [1]

[1] Obermeyer et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science (2019).

  • US healthcare system

  • High-risk patient care management
    • Identify patients with poor health
    • Enrol them in care management programs
    • Resource-constrained

  • Algorithmic solution: predict patient health needs

7 of 38

Case study: bias in EHR [1]

[1] Obermeyer et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science (2019).

8 of 38

Case study: bias in EHR [1]

[1] Obermeyer et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science (2019).

  • Origins of the bias:
    • Unequal access to care
    • Lower expenses for Black patients
    • Black patients more ill when treated
    • Biased label, with cost being a poor proxy for health needs (see the audit sketch below)
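The central audit in [1] compares how ill patients actually are at the same algorithm-assigned risk level across groups. A minimal sketch of that check, assuming a DataFrame with hypothetical columns risk_score, n_chronic and race:

```python
import pandas as pd

def health_at_equal_risk(df, n_bins=10):
    """Mean number of active chronic conditions per risk-score decile and group.

    Assumed columns: 'risk_score' (the algorithm's output), 'n_chronic'
    (count of active chronic conditions), 'race'. If cost were an unbiased
    proxy for health need, the per-group curves would overlap.
    """
    binned = df.assign(risk_bin=pd.qcut(df["risk_score"], n_bins, labels=False))
    return binned.groupby(["risk_bin", "race"])["n_chronic"].mean().unstack()
```

In [1], Black patients had substantially more chronic conditions than White patients at the same risk score, exposing the biased cost label.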

9 of 38

Sources of unfairness

10 of 38

Biases in society

Figure from Leslie et al., 2021. Does “AI” stand for augmenting inequality in the era of covid-19 healthcare? BMJ.

[1] NHS population screening: identifying and reducing inequalities

[2] Sirugo et al., 2019. The Missing Diversity in Human Genetic Studies. Cell.

[3] Why we know so little about women’s health

[4] Women’s health research lacks funding

[5] Ebede. 2006. Disparities in dermatology educational resources. J Am Acad Dermatol.

  • Screening programs not available to all [1]
  • Clinical studies excluding certain subgroups [2,3]
  • Understudied health issues [4]
  • Biased clinical training [5]

11 of 38

Biases in data

Figure from Leslie et al., 2021. Does “AI” stand for augmenting inequality in the era of covid-19 healthcare? BMJ.

[1] Liu et al., 2020 A deep learning system for differential diagnosis of skin diseases. Nat. Med.

[2] Mullainathan and Obermeyer. 2021. On the Inequality of Predicting A While Hoping for B. AER Papers and Proceedings

[3] Rajkomar et al., 2018. Ensuring Fairness in Machine Learning to Advance Health Equity. Ann Int. Med.

  • Subgroups poorly represented [1]
  • Biased labels [2]
  • Data not missing at random (cohort or variables) [3]
  • Data granularity [3]

12 of 38

Biases in model building

Figure from Leslie et al., 2021. Does “AI” stand for augmenting inequality in the era of covid-19 healthcare? BMJ.

[1] Kasy and Abebe. 2021. Fairness, Equality, and Power in Algorithmic Decision-Making. FAccT.

[2] Asiedu et al. 2024. The Case for Globalizing Fairness: A Mixed Methods Study on Colonialism, AI, and Health in Africa. EAAMO.

[3] Hooker. 2021. Moving beyond “algorithmic bias is a data problem”. Patterns.

[4] D’Amour et al., 2022. Underspecification Presents Challenges for Credibility in Modern Machine Learning. JMLR.

  • Metric selection [1]
  • Metrics used out of their context [2]
  • Training choices, such as regularization [3]
  • The random seed a model was trained with [4]

13 of 38

Biases in deployment

Figure from Leslie et al., 2021. Does “AI” stand for augmenting inequality in the era of covid-19 healthcare? BMJ.

[1] Ge et al., 2025. Rethinking Algorithmic Fairness for Human-AI Collaboration. Informs.

  • Selective compliance: humans follow algorithmic recommendations for some groups more than others [1]

14 of 38

Biases in the system

Figure from Leslie et al., 2021. Does “AI” stand for augmenting inequality in the era of covid-19 healthcare? BMJ.

[1] Chen et al., 2021. Ethical Machine Learning in Healthcare. Annual Review of Biomedical Data Science.

[2] Gohar et al., 2024. Long-Term Fairness Inquiries and Pursuits in Machine Learning: A Survey of Notions, Methods, and Challenges. arXiv.

  • These biases feed into each other
  • Important to consider fairness and equity across the ML pipeline [1]
  • Long term fairness impacts [2]

15 of 38

Measures of unfairness

16 of 38

Statistical (group) fairness

[1] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/

[2] Dwork et al., 2012. Fairness through awareness.
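The equations on this slide did not survive the export. Given the Dwork et al. [2] citation, this slide most likely introduces the first of the three standard group criteria in [1], independence (demographic parity): the prediction is statistically independent of the sensitive attribute A. A reconstruction under that assumption:

```latex
% Independence / demographic parity (assumed reconstruction, following [1,2]):
\hat{Y} \perp A
\;\Longleftrightarrow\;
P(\hat{Y} = 1 \mid A = a) = P(\hat{Y} = 1 \mid A = b)
\quad \text{for all groups } a, b .
```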


17 of 38

Statistical (group) fairness

[1] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/

[2] Hardt et al., 2016. Equality of Opportunity in Supervised Learning. NeurIPS.
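The equations are again missing; given the Hardt et al. [2] citation, this slide presumably covers separation (equalized odds): the prediction is independent of the sensitive attribute conditional on the true label. A reconstruction under that assumption:

```latex
% Separation / equalized odds (assumed reconstruction, following [1,2]):
\hat{Y} \perp A \mid Y
\;\Longleftrightarrow\;
P(\hat{Y} = 1 \mid Y = y, A = a) = P(\hat{Y} = 1 \mid Y = y, A = b)
\quad \text{for } y \in \{0, 1\} .
```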


18 of 38

Statistical (group) fairness

[1] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/

[2] Hardt et al., 2016. Equality of Opportunity in Supervised Learning. NeurIPS.

 

  • Compare ROC curves per group
  • TPR[A=a] – TPR[A=b]
  • |TPR[A=a] – TPR[A=b]| + |FPR[A=a] – FPR[A=b]| (computed in the sketch below)
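A minimal sketch of the gap in the last bullet, assuming binary labels, thresholded binary predictions, and a group array with exactly two values (rate_gap is a hypothetical helper, not from the cited work):

```python
import numpy as np

def rate_gap(y_true, y_pred, group):
    """Equalized-odds-style gap: |TPR_a - TPR_b| + |FPR_a - FPR_b|.

    y_true, y_pred: binary (0/1) numpy arrays; group: array with two values.
    """
    rates = {}
    for g in np.unique(group):
        m = group == g
        tpr = y_pred[m & (y_true == 1)].mean()  # P(Yhat=1 | Y=1, A=g)
        fpr = y_pred[m & (y_true == 0)].mean()  # P(Yhat=1 | Y=0, A=g)
        rates[g] = (tpr, fpr)
    (tpr_a, fpr_a), (tpr_b, fpr_b) = rates.values()
    return abs(tpr_a - tpr_b) + abs(fpr_a - fpr_b)
```

Sweeping the decision threshold and plotting the per-group (FPR, TPR) pairs gives the ROC comparison in the first bullet.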

 

19 of 38

Statistical (group) fairness

[1] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/
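The equations are missing from the export; by elimination in the taxonomy of [1], and given the PPV/NPV and calibration bullets on the next slide, this slide presumably covers sufficiency (calibration by group): the true label is independent of the sensitive attribute given the score R. A reconstruction under that assumption:

```latex
% Sufficiency / calibration by group (assumed reconstruction, following [1]):
Y \perp A \mid R
\;\Longleftrightarrow\;
P(Y = 1 \mid R = r, A = a) = P(Y = 1 \mid R = r, A = b)
\quad \text{for all scores } r .
```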


20 of 38

Statistical (group) fairness

[1] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/

[2] Chouldechova, 2017. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data.

 

  • Compare PPV and NPV per group [2]
  • Calibration curves per group [2] (see the sketch below)
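A minimal sketch of per-group calibration curves with scikit-learn, assuming probabilistic scores y_prob and a group array (calibration_by_group is a hypothetical helper):

```python
import numpy as np
from sklearn.calibration import calibration_curve

def calibration_by_group(y_true, y_prob, group, n_bins=10):
    """Reliability curve per group; overlapping curves are evidence of sufficiency."""
    curves = {}
    for g in np.unique(group):
        m = group == g
        # Fraction of positives vs. mean predicted probability, per bin.
        frac_pos, mean_pred = calibration_curve(y_true[m], y_prob[m], n_bins=n_bins)
        curves[g] = (mean_pred, frac_pos)
    return curves
```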

 

21 of 38

Case study: bias in EHR [1]

[1] Obermeyer et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science (2019).

22 of 38

Statistical (group) fairness

[1] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/

[2] Chouldechova, 2017. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data.

  • Impossibility theorem: independence, separation and sufficiency cannot all hold simultaneously (outside degenerate cases, e.g., equal base rates)

Example (worked check below):

  • Group a: P(Y=1) = 0.4
  • Group b: P(Y=1) = 0.6
  • A perfect classifier (Ŷ = Y) satisfies separation and sufficiency, but is not independent
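Spelling the example out: for the perfect classifier Ŷ = Y, the positive-prediction rate in each group equals that group's base rate, so independence fails:

```latex
% Perfect classifier \hat{Y} = Y with unequal base rates:
P(\hat{Y} = 1 \mid A = a) = P(Y = 1 \mid A = a) = 0.4
\;\neq\;
0.6 = P(Y = 1 \mid A = b) = P(\hat{Y} = 1 \mid A = b) .
```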

23 of 38

Individual fairness

[1] Dwork et al., 2012. Fairness through awareness.

 

  • Based on a task-specific similarity metric d over individuals and a distance D over output distributions [1] (see the condition below)
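The formal condition from Dwork et al. [1]: a randomized mapping M from individuals to output distributions is individually fair if similar individuals receive similar outputs, i.e. M is Lipschitz with respect to d and D:

```latex
% Individual fairness as a Lipschitz condition [1]:
D\big(M(x), M(y)\big) \le d(x, y) \quad \text{for all individuals } x, y .
```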

24 of 38

Causal fairness

[1] Kusner et al., 2017. Counterfactual fairness. NeurIPS.

[2] Chiappa, 2019. Path-Specific Counterfactual Fairness. AAAI.

[3] Barocas et al., 2023. Fairness and Machine Learning. https://fairmlbook.org/

[4] Veitch et al., 2021. Counterfactual invariance to spurious correlations: why and how to pass stress tests. NeurIPS.

  • Based on causal interventions or counterfactuals [1,2,3] (see the definition below)
  • Not equivalent to simply replacing the attribute's value!
  • For some tasks, group fairness criteria match causal fairness [4]
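As a sketch of the definition in Kusner et al. [1]: a predictor is counterfactually fair if, for an individual with observed X = x and A = a, the prediction would have had the same distribution had A counterfactually been set to any other value:

```latex
% Counterfactual fairness [1]:
P\big(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\big)
=
P\big(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\big)
\quad \text{for all } y, a' .
```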

25 of 38

Fairness and other RAI fields

26 of 38

Relationship with other fields [1]

[1] NeurIPS 2023 tutorial by G. Farnadi, E. Creager and Q.V. Liao

[2] Veitch et al., 2021. Counterfactual invariance to spurious correlations: why and how to pass stress tests. NeurIPS.

[3] Makar et al., 2022. Fairness and robustness in anti-causal prediction. TMLR

[4] Kim et al., 2018. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV). ICML.

  • Robustness:
    • Robustness to demographic shift
    • Counterfactual invariance to the sensitive attribute [2]
    • Robustness methods as mitigations for unfairness [3]

  • Interpretability:
    • Finding the “importance” of demographic attributes [4]

Different interpretability methods yield different results depending on their assumptions: e.g., occlusion removes a feature, but not the features correlated with it (see the toy sketch below).
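A toy sketch of that caveat, on a synthetic setup (all names hypothetical): occluding (zeroing) the sensitive column does not remove the group signal, because a correlated proxy feature still carries it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
a = rng.binomial(1, 0.5, n)             # sensitive attribute
proxy = a + rng.normal(0, 0.1, n)       # feature strongly correlated with a
y = rng.binomial(1, np.where(a == 1, 0.7, 0.3))  # outcome depends on a

X = np.column_stack([a, proxy])
model = LogisticRegression().fit(X, y)

# "Occlude" the sensitive attribute by zeroing its column.
X_occ = X.copy()
X_occ[:, 0] = 0.0

# Predictions still differ by group, through the proxy.
p = model.predict_proba(X_occ)[:, 1]
print(p[a == 1].mean() - p[a == 0].mean())  # substantially above 0
```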

27 of 38

Fairness vs equity

28 of 38

Accuracy, equality and equity

[1] Wick et al., 2019. Unlocking Fairness: a Trade-off Revisited. NeurIPS.

[2] Brown et al., 2024. Detecting shortcut learning for fair medical AI using shortcut testing. Nature Communications.

[3] Schaekermann et al., 2024. Health equity assessment of machine learning performance (HEAL): a framework and dermatology AI model case study. eClinicalMedicine.

[Figure: health outcomes for Groups A, B and C under current care vs. with AI, contrasting accuracy, equality and equity; shown as an animated build in the original deck.]


33 of 38

What to do next?

  • Delay deployment until there is enough understanding / data to estimate the impact on patients and on health equity, or deploy for some subgroups only? [1]
  • Regulations?
  • Synthetic data?
  • Novel mitigation techniques?
  • Human-computer interaction?

[1] Vandersluis and Savulescu, 2024. The selective deployment of AI in healthcare. Bioethics.

34 of 38

The future of fairness

35 of 38

Fit in today’s landscape

Limitations

  • Mostly supervised settings
  • Simplified settings, with a single sensitive attribute
  • Hard to scale (e.g., when demographic data or metadata are unavailable)

What it brings

  • Compares distributions, not instances
  • Maps to related terms such as “distributional” or “representational” harms and “stereotyping”
  • Provides concrete metrics

36 of 38

Directions of research

  • Long-term fairness
  • Causal lens
  • Relationship with other fields [1]
  • Bias in LLMs, in RAG systems and in agents
  • Compounding biases
  • Not only technical but also ethical considerations

Very challenging and interdisciplinary field!

37 of 38

And it needs you!

  • Diverse and inclusive teams lead to better Responsible AI
  • Responsible AI leads to better models! [1]

[1] Papagiannidis et al., 2024. Responsible artificial intelligence governance: A review and research framework. The Journal of Strategic Information Systems.

38 of 38

Thank you

Jessica Schrouff – GSK

https://jessicaschrouff.github.io/