1 of 30

Data Ethics

Katerina Allmendinger

Nickoal Eichmann-Kalwara

Data Camp 2024

2 of 30

Data are constructed: they are the products of biased human choices and perspectives.

3 of 30

Data Ethics and the Research Lifecycle

  • What does ethics [data / research] mean to you?

  • At which stages should you critically engage with your data?

4 of 30

Data Ethics and the Research Lifecycle

  • Uphold applicable statutes, regulations, professional practices, and ethical standards
  • Respect the public, individuals, and communities
  • Respect privacy and confidentiality
  • Act with honesty, integrity, and humility
  • Hold oneself and others accountable
  • Promote transparency

Based on Federal Data Strategy

5 of 30

Data as Texts/Artifacts

  • Data are not neutral or objective.
  • Data do not represent a self-evident, discoverable Truth.

Rather,

  • Data are constructed, the products of biased human choices and perspectives.
  • The interpretation, use, and analysis of data, like any text, is shaped by our biases and perspectives.

6 of 30

Code4Lib Journal, December 2023

7 of 30

8 of 30

machine actionability

social actionability

9 of 30

10 of 30

CARE Data Principles

  • From the perspective and ethical values of the involved community, what are the potential benefits and harms of collecting, using, or sharing these data now and in the future?
  • Do the data play into harmful stereotypes and stigmas, or do they empower the community to assert its rights and interests?
  • Are the provenances, purposes, and use limitations stated in the metadata or other documentation?
  • Do metadata standards incorporate the (Indigenous) communities' values and concepts?
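
One way to act on the documentation question above is to check whether a dataset's metadata actually states its provenance, purpose, and use limitations. The sketch below is only an illustration: the field names and the sample record are hypothetical and do not come from any particular metadata standard or from the CARE Principles themselves.

```python
# Hypothetical field names, chosen only for illustration.
REQUIRED_FIELDS = ("provenance", "purpose", "use_limitations")

def missing_fields(metadata: dict) -> list:
    """Return the required fields that are absent or left blank."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(metadata.get(field, "")).strip()
    ]

# A made-up record with one field left blank.
metadata = {
    "title": "Example community survey",
    "provenance": "Collected in partnership with the community (hypothetical)",
    "purpose": "Community-led language documentation",
    "use_limitations": "",
}

gaps = missing_fields(metadata)
print(gaps)
```

A check like this only flags empty fields. Whether the stated provenance and limitations reflect the community's own values still has to be decided with that community.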

11 of 30

Consent

Privacy

Care work

Harm Reduction

Sovereignty

12 of 30

Power

Privacy

Trust

Justice

13 of 30

Power

  • Who is collecting and analyzing the data?
  • What’s counted, or not?
  • How and when is something counted, and why?
  • How do current social and political concerns influence what is counted?
  • Right to be Forgotten (GDPR)

14 of 30

Privacy & Security

  • Consent: Permission to use these data (ownership?)
  • Anonymization: Removing sensitive and identifiable information
  • Confidentiality: Whose and what information is protected, and for how long?
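
As a rough illustration of the anonymization point above, the sketch below drops direct identifiers, replaces a participant ID with a salted hash, and coarsens an exact age into a band. All field names, the sample record, and the salt are hypothetical. Note that salted hashing is pseudonymization rather than full anonymization, and combinations of the remaining fields can still identify a person.

```python
import hashlib

# Hypothetical column names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}

def pseudonymize(value: str, salt: str) -> str:
    """Return a shortened, salted SHA-256 digest in place of an identifier."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]

def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age into a coarser band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def anonymize(record: dict, salt: str) -> dict:
    """Drop direct identifiers, pseudonymize the ID, and coarsen the age."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = pseudonymize(str(record["participant_id"]), salt)
    cleaned["age"] = age_band(record["age"])
    return cleaned

# A made-up record for demonstration.
record = {
    "participant_id": 17,
    "name": "Example Person",
    "email": "person@example.org",
    "age": 34,
    "response": "agree",
}

result = anonymize(record, salt="keep-this-secret")
print(result)
```

The salt has to be stored separately from the shared data. Anyone who holds it can recompute the hashes and re-link records to individuals.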

15 of 30

Trust

  • Forms of Knowledges
  • Accountability
  • Documentation
  • Reproducibility

16 of 30

Justice

  • Social impact
  • Missing and counter data
  • Protection
  • Co-liberation

17 of 30

Ethics for Artificial Intelligence (AI)

18 of 30

So while AI is novel, it often seems to continue long-standing paradigms of technology in the service of capital … scholarship suggests that the human harms documented in recent AI-driven initiatives are not merely “teething problems,” but part of a broader paradigm of capitalist and colonialist values at the core of our current economic and technological systems. (Munn, 2023)

19 of 30

Bias

Image labels: "a biological scientist", "a chief executive officer"

Image: Quote excerpt from Gross, N. (2023)

Image from Sun, L. et al. (2023)

20 of 30

Exploitation

"Dragon Cage" by Greg Rutkowski.

AI-generated art by Sandy60

21 of 30

Privacy and Security

  • How is user input data treated? Is it stored? Anonymized? Shared with other companies?
  • What are the safeguards against malicious attacks?
  • Are these practices transparent and shared with users for informed consent?

Image credit: Photo by Lianhao Qu on Unsplash

22 of 30

Misinformation

From left: 1) AI-generated image that falsely depicts Kamala Harris and Donald Trump smiling and high-fiving 2) Social media post from Donald Trump sharing an AI-generated image that

AI-Generated Image

23 of 30

Using Generative Artificial Intelligence

Questions to consider:

  • How was this tool created and how is it maintained? By whom?
  • Who can use this tool and for what purposes?
  • How does my use of this tool impact me, my immediate communities, creators of input data, larger social structures, ecosystems, and intersections of those groups?
  • … and what do we do if we cannot answer these questions?

24 of 30

How we work towards Ethical AI

25 of 30

Transformative and radical uses of AI

A note of hope …

26 of 30

Data are constructed: they are the products of biased human choices and perspectives.

27 of 30

Reflections

  • What are examples of data ethics concerns that you encounter in your daily work, scholarship, and/or teaching?
  • In what ways do you think power is reflected in your data?
  • What data are you missing? Was the omission intentional? How might these missing data be beneficial?
  • How might thinking about your data as a text/artifact change how you relate to, document, and share your work?

28 of 30

References

29 of 30

References

  • Center for Countering Digital Hate. Fake Image Factories: How AI image generators threaten election integrity and democracy. https://counterhate.com/research/fake-image-factories/
  • Sun, L., Wei, M., Sun, Y., Suh, Y. J., Shen, L., & Yang, S. (2023). Smiling Women Pitching Down: Auditing Representational and Presentational Gender Biases in Image Generative AI (arXiv:2305.10566). arXiv. http://arxiv.org/abs/2305.10566
  • Gross, N. (2023). What ChatGPT Tells Us about Gender: A Cautionary Tale about Performativity and Gender Biases in AI. Social Sciences, 12(8), 435. https://doi.org/10.3390/socsci12080435
  • Mylrea, M., & Robinson, N. (2023). Artificial Intelligence (AI) Trust Framework and Maturity Model: Applying an Entropy Lens to Improve Security, Privacy, and Ethical AI. Entropy, 25(10), 1429. https://doi.org/10.3390/e25101429
  • Ferrara, E. (2024). The Butterfly Effect in artificial intelligence systems: Implications for AI bias and fairness. Machine Learning with Applications, 15, 100525. https://doi.org/10.1016/j.mlwa.2024.100525
  • Meighan, P. J. (2021). Decolonizing the digital landscape: The role of technology in Indigenous language revitalization. AlterNative: An International Journal of Indigenous Peoples, 17(3), 397–405. https://doi.org/10.1177/11771801211037672

30 of 30

Data Ethics

Katerina Allmendinger

katerina.allmendinger@colorado.edu

Nickoal Eichmann-Kalwara

nickoal.eichmann@colorado.edu

THANKS!