A Crash Course on Ethics in Natural Language Processing
Version 1.0
Annemarie Friedrich and Torsten Zesch
License: CC-BY
Ethics for NLP
What comes to your mind when you think of ethics?
What comes to your mind when you think about ethics for NLP?
Have you encountered any ethical problems in your life?
Why do you think this topic is important?
What do you expect to learn in this crash course?
Why does Ethics matter for NLP?
NLP has the aim of modeling language, an inherently human function
NLP works with textual data or human subjects → not free of bias, prejudice, …
Language technology is widely applied (e.g. on social media) → can potentially harm anyone
Language technology shapes the way we experience the world
Bias
Privacy
Fairness
Dual Use
Environmental Issues
...
Sources and Types of Harm - Overview
NLP System
Data
Bias
Unfair Outcomes
Direct Harm
Learning Goals
After this course, you will be able to:
What is Ethics?
Branch of Philosophy
Ethics is the philosophical study of morality. It is the study of what are good and bad ends to pursue in life and what is right and wrong to do in the conduct of life. It is [...] primarily a practical discipline.
(Deigh, 2010, p. 7)
Synonym for Moral Code
Sometimes “ethics” is used to refer to the moral code or system of a particular tradition.
Examples: Christian ethics, professional ethics
How do these meanings relate to “Ethics for NLP”?
What is Morality?
Universal Concept
Universal ideal of what one ought to do or ought not to do, guided by reason / rational grounds.
Conventional System of Community
The members’ shared beliefs about wrong and right, good and evil, and the corresponding customs and practices that prevail in the society.
How do these concepts relate to “Ethics for NLP”?
Whose Life Matters More?
http://moralmachine.mit.edu/hl/de
Try it out!
Two ethical theories
Deontology
Deon (Greek) = duty
“Identify your duty and act accordingly”
Generalization principle: prioritizes intent as the source of ethical action, should be reasonable.
Teleology
Telos (Greek) = goal
Outcome-oriented
Utilitarianism
“Choose that action that optimizes the outcome”
“An action is ethical only if it is not irrational for the agent to believe that no other action results in greater expected utility” (Bentham 1789)
Moral vs. Legal
| | legal | illegal |
| moral | Doing your homework | Civil disobedience |
| immoral | Cheating on your spouse | Murder |
Reading Assignment (Homework)
Hovy & Spruit: The Social Impact of Natural Language Processing. (ACL 2016)
Political correctness classifier?
Source of Harm - Direct
NLP System
analyzing medical documents
drug overdose killing the patient
Dual Use
| NLP Task | Beneficial Use | Malicious Use |
| Hate speech detection | Fighting hate crimes | Censorship of free speech |
| Detection of fake news / reviews | Fighting misinformation | Generation of fake news / reviews |
| ... | ... | ... |
Can you think of other NLP tasks that have beneficial but also potentially malicious uses?
Image by Clker-Free-Vector-Images from Pixabay
Assume you are publishing a piece of software on GitHub. Should you mention potential malicious uses in the corresponding Readme?
Source of Harm - Bias
Data
NLP System
Doctor vs. Nurse
The doctor recommended performing an X-ray.
He/She said …
The nurse recommended performing an X-ray.
He/She said …
Do you think “he” or “she” is a more likely continuation in the above cases (respectively)?
What would happen if you asked a large pre-trained language model?
Bias in Machine Translation
Image source: https://arxiv.org/pdf/1809.02208.pdf
Useful or harmful?
Bias in Machine Translation
Detecting gender-neutral queries
Generate gender-specific translations
Check for accuracy
What is Bias?
Cognitive bias arises due to the tendency of the human mind to categorize the world.
→ simplifies processing.
Social biases in data, algorithms, and applications
Statistical bias in machine learning
Image by Gordon Johnson on Pixabay.
What is Bias? (Technical View)
Bias in machine learning
Bayesian probabilities: prior
May be intended (e.g., domain adaptation) or unintended
Is bias always a bad thing?
Why is Bias Problematic? (Social View)
NLP Applications
Employment matching, advertisement placement, parole decisions, search, chatbots, face recognition, ...
Social Stereotypes
Gender, Race, Disability, Age, Sexual orientation, Culture, Class, Poverty, Language, Religion, National origin, ...
Sap et al.: The Risk of Racial Bias in Hate Speech Detection. ACL 2019.
Why is Bias Problematic?
Outcome Disparity
Error Disparity
Word Error Rate in automatic captioning is higher for female speakers than for male speakers (Tatman, 2017).
Because a “COOKING” event is taking place, the model is more likely to predict the agent to be a woman.
(Zhao et al., 2017)
Image sources: https://www.aclweb.org/anthology/W17-1606.pdf,
https://www.aclweb.org/anthology/D17-1323.pdf
See also Shah et al. (2020)
Why is Bias Problematic? (Technical View)
Outcome / Error disparity
Models might amplify bias
51:49 distribution in a feature may lead to 100:0 decision
Is it wrong to build models replicating “real world data”?
In what circumstances?
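The 51:49 example above can be made concrete with a small sketch. It assumes a hypothetical model that always outputs the most frequent training label for a given context; the labels and counts are invented for illustration.

```python
from collections import Counter

def majority_label(labels):
    """Return the most frequent label in the training data."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical training data: label distribution 51:49 for one context.
train_labels = ["woman"] * 51 + ["man"] * 49

# A model that always outputs the most likely label for this context.
prediction = majority_label(train_labels)
predictions = [prediction] * 100

train_share = train_labels.count("woman") / len(train_labels)
predicted_share = predictions.count("woman") / len(predictions)
print(f"share in data: {train_share:.2f}, share in predictions: {predicted_share:.2f}")
```

A slight imbalance in the data becomes a one-sided decision in the output. Real models are less extreme, but a similar effect can occur whenever a system returns only its top prediction.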
Sources of Bias in NLP (Shah et al., 2020)
Image Source: https://www.aclweb.org/anthology/2020.acl-main.468.pdf
De-Biasing of Word Embeddings
Image taken from Bolukbasi et al.,
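One step of this kind of de-biasing can be sketched as a projection: remove the component of a word vector that lies along a gender direction. The vectors below are invented toy values. Taking the direction from a single he/she pair is a simplification assumed here; it is not a reproduction of the procedure in Bolukbasi et al.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def neutralize(word_vec, direction):
    """Remove the component of word_vec that lies along direction."""
    g = normalize(direction)
    projection = dot(word_vec, g)
    return [w - projection * gi for w, gi in zip(word_vec, g)]

# Toy 3-d vectors, invented for illustration only.
he = [1.0, 0.2, 0.0]
she = [-1.0, 0.2, 0.0]
gender_direction = [a - b for a, b in zip(he, she)]

nurse = [-0.4, 0.5, 0.3]
nurse_neutral = neutralize(nurse, gender_direction)
print(nurse_neutral)
```

After this step, the toy vector for "nurse" has no component along the chosen direction, while its other components are unchanged. Whether such a projection removes bias or only hides it is a separate question.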
Bias Exercise
Source: http://wordbias.umiacs.umd.edu
Source of Harm - Unfair Outcomes
NLP System
filtering
job applications
Better chances for people living in a certain area
Fairness
Treating everyone equally is fair, right?
So, everyone gets the same grade from now on ;)
fundamental principle of justice “equals should be treated equally and unequals unequally”
Image by Gordon Johnson from Pixabay
Group vs. Individual Fairness
group fairness
individual fairness
Group fairness and individual fairness generally cannot both be achieved at the same time
Which groups are/should be protected?
How can we measure similarity of individuals?
https://medium.com/ibm-watson/ethics-in-ai-responsibilities-for-data-analysts-part-2-d76f2343e4d1
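Group fairness can be checked in several ways. The sketch below uses one common measure, the difference in acceptance rates between groups (often called demographic parity). The choice of measure, the group names, and the decisions are all assumptions made for illustration.

```python
def selection_rates(decisions):
    """Share of positive decisions per group; decisions are (group, accepted)."""
    totals, positives = {}, {}
    for group, accepted in decisions:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(accepted)
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_gap(decisions):
    """Largest difference in acceptance rate between any two groups."""
    rates = selection_rates(decisions)
    return max(rates.values()) - min(rates.values())

# Invented decisions of a hypothetical application filter.
decisions = (
    [("area_a", True)] * 6 + [("area_a", False)] * 4
    + [("area_b", True)] * 3 + [("area_b", False)] * 7
)
rates = selection_rates(decisions)
gap = demographic_parity_gap(decisions)
print(rates, round(gap, 2))
```

A gap of zero would mean equal acceptance rates across groups. It says nothing about whether two similar individuals were treated alike, which is the concern of individual fairness.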
Source of Harm - Input/Training Data
Data
NLP System
Privacy
“I’ve got nothing to hide.”
Do you have curtains? / Do you close your shutters at night?
Can I see your credit card bills from last year?
A Taxonomy of Privacy (Solove, 2007)
Privacy = intimacy?
Privacy = the right to be let alone?
Problems and harms related to privacy
“Privacy [...] is a plurality of different things that do not share one element in common but that nevertheless bear a resemblance to each other.”
Data Privacy Regulations
European Regulation 2016/679
General Data Protection Regulation (GDPR)
Main rights of the “data subject” (natural person):
Similar laws in the US: California Consumer Privacy Act
Applies to the personal data of people in the EU, even if the controller operates from a country outside the EU!
Data Privacy vs. Data Ethics
[based on: Lawler, 2019]
“Just because we can do something, doesn’t mean we should.”
Should a company sell user information to political campaigns?
Anonymization (De-Identification)
After having run some anonymization system on our data, is everything fine?
Image Source: https://www.aclweb.org/anthology/2020.lrec-1.870/
HitzalMed
(Lopez et al., 2020)
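To make the question above concrete, here is a minimal sketch of pattern-based masking. The two regular expressions and the example note are hypothetical and far simpler than a real de-identification system such as the one cited.

```python
import re

# Hypothetical patterns; real systems cover many more categories.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s/-]{6,}\d"),
}

def mask(text):
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

# Invented example note.
note = ("Contact the patient at jane.doe@example.org or 0151 2345678. "
        "She is the 93-year-old mayor of a village with 40 inhabitants.")
masked = mask(note)
print(masked)
```

The direct identifiers are masked, but the remaining description (age, role, size of the village) may still point to one person. Removing names and numbers alone does not guarantee anonymity.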
Authorship Attribution / Author Profiling
What are potential opportunities and risks of this type of technology?
Reading Assignment / Discussion
Daniel J. Solove. 'I've Got Nothing to Hide' and Other Misunderstandings of Privacy. San Diego Law Review, Vol. 44, p. 745, 2007
Germany’s complicated relationship with Google Street View. NY Times, April 2013.
Questions to think about / discuss:
Which dimensions of privacy matter most to you?
A software developer accidentally notices a document in which a user is drafting a suicide note. Should the developer contact the police to save a life, or respect the user’s secret?
Can you imagine a situation where interfering with someone’s privacy leads to an economic / financial issue for that person?
Word Error Rates for Automatic Captioning on YouTube (Tatman, 2017)
WER higher for Scottish speakers
WER higher for female speakers compared to male speakers
Image Source: https://www.aclweb.org/anthology/W17-1606
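Error disparity of this kind can be measured by computing the word error rate (WER) separately per speaker group. The sketch below uses word-level edit distance. The group names and transcripts are invented and are not taken from Tatman (2017); references are assumed to be non-empty.

```python
from collections import defaultdict

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def wer_by_group(samples):
    """Average WER per group; samples are (group, reference, hypothesis)."""
    scores = defaultdict(list)
    for group, ref, hyp in samples:
        scores[group].append(word_error_rate(ref, hyp))
    return {g: sum(v) / len(v) for g, v in scores.items()}

# Invented toy data for two hypothetical speaker groups.
samples = [
    ("group_a", "please turn the volume down", "please turn the volume down"),
    ("group_a", "what is the weather today", "what is the weather to day"),
    ("group_b", "please turn the volume down", "please turn volume town"),
    ("group_b", "what is the weather today", "what was the weather today"),
]
rates = wer_by_group(samples)
print(rates)
```

Reporting one overall WER would hide the difference between the two groups. Splitting the evaluation by group is what makes the disparity visible.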
Demographic Factors Improve Classification Performance (Hovy, 2015)
Distribution of categories by gender
Is it okay to leverage the author’s gender information as explicit features for text classification?
What would be recommended from a utilitarian / generalization perspective?
Image Source: https://www.aclweb.org/anthology/P15-1073.pdf
x-axis: Topics
Reading Assignment
Prabhumoye et al.: Case Study: Deontological Ethics in NLP. NAACL 2021.
Consider a project that you are working on / have worked on or pick a recent research paper from the ACL Anthology. Analyse the method / system both from a utilitarian and from a generalization perspective. How would scholars of each ethical theory evaluate the ethicality of the method/system?
“Applications”: NLP for Social Good
Civility in communication: techniques to monitor trolling, hate speech, abusive language, detect fake news, etc.
Image Source: https://www.stiftung-nv.de/de/publikation/kurzanalyse-zu-trumps-crime-tweet-deutschland-viel-aufmerksamkeit-wenig-unterstuetzung
Other topics
Explainability
Crowdsourcing
Pollution
Safety
Pick a topic of your choice and research its relationship to ethics. What are common arguments made? Do you agree? Can you find interesting examples for ethical or unethical behavior?
Reading Suggestions:
Environmental Issues
Bender et al.: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? FAccT ’21, March 3–10, 2021, Virtual Event, Canada.
Emma Strubell, Ananya Ganesh, Andrew McCallum: Energy and Policy Considerations for Deep Learning in NLP. ACL 2019.
Image by Peggy and Marco Lachmann-Anke from Pixabay
Misc / Practical Hints
IRB = Institutional Review Board (Ethics Review Board)
Reviews all human experimentation
Find out how to contact the IRB of your institution.
The ACL has adopted the ACM Code of Ethics and Professional Conduct and published an FAQ with hints on conducting research and publishing in an ethical manner (see, e.g., ACL-IJCNLP 2021 Ethics FAQ).
Association for Computational Linguistics
Practical summary
Analyse data, task and outcomes for potential harm.
Can benefits outweigh harms?
Retrospection
What have you learned?
What does that mean for you personally?
What was surprising?
References
Literature – Ethics in NLP
Overviews
Literature – Ethics in NLP
Bias
Fairness
Gender Stereotypes