Fairness in Graphs
Evaggelia Pitoura and Panayiotis Tsaparas
Department of Computer Science & Engineering, University of Ioannina, Greece
Archimedes Unit, ATHENA RC, Greece
Archimedes RC, Athens, Greece, June 14, 2023
Part I: Roadmap
Introduction to Algorithmic Fairness
Centrality Fairness in Graphs
Knowledge graphs, entity resolution
Explainability for fairness
Introduction to Algorithmic Fairness
Algorithmic Fairness: Why?
We live in a world where opinions are formed, and decisions are assisted or even taken, by AI algorithms: often black boxes, driven by enormous amounts of data.

From simple, or not so simple, personal decisions:
Who to date?
Where to dine?
What's in the news?
What to read, watch, buy?
How to get informed, learn?

To more consequential ones:
Which job to take? Which school to attend? Whom to follow? Whom to vote for?

These algorithms shape our opinions and our view of the world.
Algorithmic Fairness: Why?
And not just by individuals: concerns are raised regarding how much we can, and should, trust AI algorithms.
Algorithmic (Un)Fairness: Examples
Example: the COMPAS recidivism prediction tool(1)
A commercial tool that uses a risk assessment algorithm to predict some categories of future crime.
Used in courts in the US for bail and sentencing decisions.
A study by ProPublica found the tool biased against black defendants: they were almost twice as likely as white defendants to be incorrectly labeled as high risk.
(1) https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Algorithmic (Un)Fairness: Examples
Example: word embeddings(2)
Word embeddings represent words/texts as vectors of numbers. Why? Algorithms work with numbers.
Trained on a corpus of Google News texts, the embeddings exhibit female/male gender stereotypes, learning, for example, that "doctor" is more similar to "man" than to "woman".
Such embeddings are used as input to downstream ML tasks, propagating the bias.
(2) Bolukbasi et al. "Man is to computer programmer as woman is to homemaker? Debiasing word embeddings." Advances in Neural Information Processing Systems (2016)
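The stereotype check is easy to reproduce. A minimal sketch, assuming the gensim library and its pretrained Google News vectors (the specific word choices below are illustrative, not from the slides):

```python
# Minimal sketch: probing gender stereotypes in pretrained word
# embeddings (assumes gensim is installed; downloads ~1.6 GB of vectors).
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")

# Is "doctor" more similar to "man" than to "woman"?
print(vectors.similarity("doctor", "man"))
print(vectors.similarity("doctor", "woman"))

# The analogy from the paper title: man : computer programmer :: woman : ?
print(vectors.most_similar(positive=["computer_programmer", "woman"],
                           negative=["man"], topn=3))
```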
What is the cause of unfairness: bias

What is bias? An overloaded term used to capture various forms of misusing data and information, prejudiced behavior, and favoritism (following the Oxford English Dictionary).

Statistical bias: non-representative sampling.
- Sampling bias: the dataset selected as input to an algorithm is not representative of the full population to which the algorithm will be applied.
- Selection bias: often used to refer to non-random selection of the input.

Societal bias: objectionable social structures, human biases, and preferences. A long list of biases: confirmation bias, normative biases, functional biases induced by the platforms, behavioral and temporal biases, cognitive biases.
Bias may come from the input data, the algorithm itself, or the way the output is used.

What is (algorithmic) fairness?
Algorithmic Fairness: What?
Algorithmic fairness as lack of discrimination: the results of an algorithm should not be influenced by protected, or sensitive, attributes such as gender, religion, age, sexual orientation, or race.

Fairness is defined at two levels: for individuals (individual fairness) and for groups (group fairness).

The exact definition also depends on the algorithmic task: classification, recommendation, ranking, set selection, clustering, etc.
Individual Fairness

Individual fairness: similar individuals should be treated similarly. A key difficulty is how to define distances, especially in the input space.
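Formally, individual fairness is usually stated as a Lipschitz condition (Dwork et al., 2012); a sketch of that standard formulation, not taken from the slides:

```latex
% Individual fairness as a Lipschitz condition (Dwork et al., 2012):
% a (randomized) mapping M from individuals to distributions over
% outcomes is fair if, for all individuals x and y,
\[
  D\bigl(M(x), M(y)\bigr) \le d(x, y),
\]
% where d measures similarity between individuals (the hard-to-define
% input-space distance) and D compares the output distributions.
```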
Group Fairness I
Setting: classification with a binary classifier, where output 1 is the positive class (i.e., the class that leads to a favorable decision, e.g., getting a job or being offered a specific medical treatment).
Group Fairness II
Demographic parity (also called statistical parity or independence) preserves the input ratio: the demographics of the individuals receiving the favorable outcome are the same as the demographics of the underlying population.

Equity: members of each group have the same chance of getting the favorable output.

Example: if 10% of the patients are women, then 10% of those who receive the treatment are women.
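In standard notation (not from the slides), with the prediction written as Y-hat and the protected attribute as A:

```latex
% Demographic (statistical) parity: \hat{Y} is independent of A.
\[
  P(\hat{Y} = 1 \mid A = a) \;=\; P(\hat{Y} = 1 \mid A = b)
  \quad \text{for all groups } a, b.
\]
```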
Group Fairness III
Equal TP (true positive) rates: the TP rate for the protected group equals the TP rate for the non-protected group. Also known as equal opportunity.

Confusion matrix (rows: predicted class, columns: actual class):

| | Actual + | Actual - |
| Predicted + | TP | FP |
| Predicted - | FN | TN |
Equal opportunity vs statistical parity: as with statistical parity, members of the two groups have the same chance of getting the favorable outcome, but only among those members who qualify.

Equalized odds: both the true positive and the false positive rates are equal for the two groups.
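In the same notation, with Y the actual label, the standard formulations read:

```latex
% Equal opportunity: equal true positive rates across groups.
\[
  P(\hat{Y} = 1 \mid Y = 1, A = a) \;=\; P(\hat{Y} = 1 \mid Y = 1, A = b)
\]
% Equalized odds additionally requires equal false positive rates:
\[
  P(\hat{Y} = 1 \mid Y = 0, A = a) \;=\; P(\hat{Y} = 1 \mid Y = 0, A = b)
\]
```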
Counterfactual Fairness
A decision is fair towards an individual if it is the same in both the actual world and a counterfactual world where the individual belonged to a different group. Based on causal analysis.
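The standard formalization is the one of Kusner et al. (2017), sketched here with U denoting the latent background variables of the causal model:

```latex
% Counterfactual fairness: the predicted outcome is unchanged under an
% intervention that flips the protected attribute A from a to a'.
\[
  P\bigl(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\bigr)
  \;=\;
  P\bigl(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\bigr)
\]
```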
Achieving Fairness: How

Fairness interventions can be applied at each stage of the pipeline (input, algorithm, output): pre-processing modifies the input data, in-processing modifies the algorithm itself, and post-processing modifies the output.
Where do we stand now?
Academic research: dedicated conferences such as the ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT).

Toolkits: fairness metrics and algorithms to mitigate unfairness
- TensorFlow Model Remediation: https://www.tensorflow.org/responsible_ai
Where do we stand now?
Additional resources: overview papers
- Evaggelia Pitoura, Kostas Stefanidis, Georgia Koutrika: Fairness in rankings and recommendations: an overview. VLDB J. 31(3): 431-458 (2022)
- Evaggelia Pitoura: Social-minded Measures of Data Quality: Fairness, Diversity, and Lack of Bias. ACM J. Data Inf. Qual. 12(3): 12:1-12:8 (2020)
- Marina Drosou, H. V. Jagadish, Evaggelia Pitoura, Julia Stoyanovich: Diversity in Big Data: A Review. Big Data 5(2): 73-84 (2017)
Part I: Roadmap
Introduction to Algorithmic Fairness
Centrality Fairness in Graphs
Knowledge graphs, entity resolution
Explainability for fairness
Centrality Fairness in Graphs
Results based on:
- Sotiris Tsioutsiouliklis, Evaggelia Pitoura, Panayiotis Tsaparas, Ilias Kleftakis, Nikos Mamoulis: Fairness-Aware PageRank. WWW 2021: 3815-3826
- Sotiris Tsioutsiouliklis, Evaggelia Pitoura, Konstantinos Semertzidis, Panayiotis Tsaparas: Link Recommendations for PageRank Fairness. WWW 2022: 3541-3551
Networks are everywhere. Some nodes hold more important positions in a network than others. PageRank importance: a node is as important as its followers.

PageRank fairness is group-based fairness: the nodes belong to two groups (say, red and blue, with red the protected group), and fairness concerns the share of the total PageRank weight that each group receives. The same questions arise for Personalized PageRank (PPR).
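To make the measurement concrete, here is a minimal sketch using networkx on a synthetic graph; the graph, the group assignment, and the target fraction are illustrative, not from the talk:

```python
# Minimal sketch: the PageRank share of a protected ("red") group.
import networkx as nx

G = nx.DiGraph(nx.scale_free_graph(200, seed=42))  # toy directed graph

# Hypothetical group labels: roughly 30% of the nodes are red.
red = {v for v in G if v % 10 < 3}

pr = nx.pagerank(G, alpha=0.85)
red_share = sum(pr[v] for v in red)
print(f"red-group PageRank share: {red_share:.3f}")
# In the phi-fairness sense of the WWW 2021 paper, the vector is fair
# for, e.g., phi = 0.3 only if the red share reaches that fraction.
```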
Research Questions
RQ1: What are the conditions that may lead to unfairness?
RQ2: Can we modify the PageRank algorithm to produce weights that are both fair and as accurate (i.e., as close to the original PageRank) as possible?
RQ3: Can we intervene in the evolution of a network through link recommendations to improve fairness?
RQ2: Can we modify the PageRank algorithm to produce weights that are both fair and as accurate (i.e., as close to the original PageRank) as possible?
Fair random walk: the walker picks the next node in a group-aware way, so that in expectation a fraction φ of the weight goes to the red (protected) group; here φ = 0.5.

Residual fair random walk: a relaxed variant that redistributes only the residual probability; the red nodes that receive it are chosen so that the modified PageRank weights are as close as possible to the original weights.
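As a concrete illustration, a simplified sketch of a group-aware transition rule in this spirit; it assumes every node has out-neighbors of both colors, and it is not the exact algorithm of the paper:

```python
# Simplified sketch of a phi-fair transition matrix: every node sends a
# fraction phi of its weight to red out-neighbors and 1 - phi to blue
# ones. Assumes each node has at least one out-neighbor of each color.
import numpy as np

def fair_transition_matrix(adj, red, phi):
    n = len(adj)
    P = np.zeros((n, n))
    for u, nbrs in enumerate(adj):
        r = [v for v in nbrs if v in red]
        b = [v for v in nbrs if v not in red]
        for v in r:
            P[u, v] = phi / len(r)
        for v in b:
            P[u, v] = (1 - phi) / len(b)
    return P

# Toy 4-node graph: nodes 0 and 1 are red; every node sees both colors.
adj = [[1, 2], [0, 3], [1, 3], [0, 2]]
P = fair_transition_matrix(adj, red={0, 1}, phi=0.5)
print(P.sum(axis=1))  # each row sums to 1
```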
RQ3: Can we intervene in the evolution of a network to improve fairness?
Future work: how centrality unfairness is reflected.

More in Part II.
Part I: Roadmap
Introduction to Algorithmic Fairness
Centrality Fairness in Graphs
Knowledge graphs, entity resolution
Explainability for fairness
Knowledge graphs
Related papers:
Entity Alignment (EA): given a source KG and a target KG, find pairs of matching entities.

[Figure: example source and target KGs, each with a "spouse" relation between the entities to be matched]
EA using Knowledge Graph Embeddings (KGE): solve EA as a representation learning task, embedding both KGs in a shared embedding space and then applying similarity-based matching.
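A minimal sketch of the similarity-based matching step, with random vectors standing in for trained KGE embeddings; the sizes and the greedy strategy are illustrative:

```python
# Minimal sketch: cosine-similarity matching between two embedded KGs.
import numpy as np

rng = np.random.default_rng(0)
src = rng.normal(size=(5, 32))   # source-KG entity embeddings
tgt = rng.normal(size=(7, 32))   # target-KG entity embeddings

# Normalize rows; a matrix product then gives all cosine similarities.
src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
sim = src_n @ tgt_n.T

# Greedy matching: each source entity aligns with its nearest target.
print(sim.argmax(axis=1))
```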
Structural Bias in KGE EA Methods - Example
Long-tail entities have a lower chance of being matched [Wang et al. 2022, Cheng et al. 2015].
Wang, S.: On the analysis of large integrated knowledge graphs for economics, banking and finance. In: EDBT/ICDT Workshops. (2022)
Cheng, W., Wang, C., Xiao, B., Qian, W., Zhou, A.: On statistical characteristics of real-life knowledge graphs. In: BPOE. (2015)
SUSIE benchmark: explores diverse areas of both KGs with respect to the size of the connected components.
Robustness of KGE EA Methods to Structural Diversity
[Figure: Hits@1 (H@1) of KGE EA methods on the D-Y, D-W, BBC-D, and MEM-E datasets]
Fanourakis, N., Efthymiou, V., Kotzinos, D., Christophides, V.: Knowledge graph embedding methods for entity alignment: An experimental review. CoRR abs/2203.09280 (2022)
Bias in Knowledge Graph embeddings
Focus on gender bias in occupations, with male and female as the possible gender values.

A link prediction task (knowledge graph completion) is used to quantify bias amplification by the embeddings: test whether link prediction reflects gender bias by predicting gender-dominated occupations for people of the corresponding gender more often than expected.

A debiasing approach removes gender bias and is tunable in the amount of bias it removes.
[Figure: expected and predicted probabilities of the top 3 male and top 3 female occupations, in the top 10, 100, 500, and 1000 results; Wikidata, TransE embeddings]
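The probing idea can be illustrated with a TransE-style scorer; a toy sketch with made-up entities and random embeddings (a real probe would use the trained Wikidata embeddings):

```python
# Toy sketch: which gender does a TransE-style model predict for a
# person, scoring (person, gender, ?) queries. Embeddings are random
# stand-ins for trained ones; entity names are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
dim = 16
ent = {e: rng.normal(size=dim) for e in ["alice", "bob", "male", "female"]}
rel = {"gender": rng.normal(size=dim)}

def score(h, r, t):
    # TransE plausibility of the triple (h, r, t): -||h + r - t||
    return -np.linalg.norm(ent[h] + rel[r] - ent[t])

for person in ["alice", "bob"]:
    ranked = sorted(["male", "female"],
                    key=lambda g: score(person, "gender", g), reverse=True)
    print(person, "->", ranked)
```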
Bias in Knowledge Graph embeddings: Future Work
Counterfactual Explanations for Fairness

Related paper:

What is a counterfactual? First approach.
Bias disparity: CF recommenders

Users are divided into groups (men, women); recommended items are also divided into groups (based on movie genre). Two questions: what is the bias (preference) of a user group for a specific item group, and does the recommender exaggerate it?

[Figure: bias increase/decrease of the recommendations for men and women on the Action and Romance genres]

Virginia Tsintzou, Evaggelia Pitoura, Panayiotis Tsaparas: Bias Disparity in Recommendation Systems. RMSE@RecSys 2019

Can we explain such exaggeration in graph-based recommendations through counterfactuals?
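A simplified sketch of the bias-disparity computation, comparing a group's preference ratio for a category in the input data and in the recommendations; it omits the paper's normalization by category size, and the data is made up:

```python
# Simplified bias-disparity sketch: positive values mean the recommender
# amplifies the group's input preference for the category.
def preference_ratio(pairs, group, category):
    """Fraction of the group's (user, category) pairs in `category`."""
    mine = [(u, c) for u, c in pairs if u in group]
    return sum(1 for _, c in mine if c == category) / len(mine)

def bias_disparity(input_pairs, rec_pairs, group, category):
    b_in = preference_ratio(input_pairs, group, category)
    b_rec = preference_ratio(rec_pairs, group, category)
    return (b_rec - b_in) / b_in

men = {"u1", "u2"}
input_pairs = [("u1", "action"), ("u1", "romance"), ("u2", "action")]
rec_pairs = [("u1", "action"), ("u2", "action"), ("u2", "action")]
print(bias_disparity(input_pairs, rec_pairs, men, "action"))  # 0.5
```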
End of Part I
Fairness in package-to-group recommenders

Setting: recommend a package of items to a group of people. This is related to fair division: dividing a set of resources among several people such that each person receives their due share.

Fairness proportionality: a user considers the package fair for her if there are at least m items that the user likes (items among her top preferences).

Envy-freeness: a user considers the package fair for her if there are at least m items for which the user does not feel envious (her ratings for these items are among the top ratings in the group).

Dimitris Serbos, Shuyao Qi, Nikos Mamoulis, Evaggelia Pitoura, Panayiotis Tsaparas: Fairness in Package-to-Group Recommendations. WWW 2017: 371-379
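A small sketch of the proportionality check (envy-freeness is analogous, but compares ratings within the group); the package, preferences, and m below are illustrative:

```python
# Sketch: a package is proportionally fair for a user if it contains at
# least m items from the user's top preferences.
def is_proportionally_fair(package, top_items, m):
    return len(set(package) & set(top_items)) >= m

package = ["i1", "i2", "i3", "i4"]
alice_top = {"i2", "i3", "i7"}
print(is_proportionally_fair(package, alice_top, m=2))  # True: i2, i3
```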