1 of 47

Thomas Padilla

Libraries,

Collections as Data,

and AI

2 of 47

collections as data?

@maximgifmaker

7 of 47

Quantifying Kissinger

Micki Kaufman, 2014

Micki Kaufman, Quantifying Kissinger

9 of 47

collections as data

… ordered information

… stored digitally

… amenable to computation

10 of 47

design for qualities of collections

that define possible use

11 of 47

Always Already Computational: Collections as Data

2016-2018

thomas padilla, unlv

laurie allen, university of pennsylvania

stewart varner, university of pennsylvania

hannah frost, stanford university

sarah potvin, texas a&m university

elizabeth russey roke, emory university

12 of 47

Collections as Data: Part to Whole

2019-2023

thomas padilla, internet archive

hannah scates kettler, iowa state university

stewart varner, university of pennsylvania

yasmeen shorish, james madison university

17 of 47

How do we responsibly support computational engagement with memory organization collections as data?

19 of 47

Libraries & AI

  1. reusable - open, not 🙄 open 🙄
  2. accountable - development and use subject to specific community needs
  3. sustainable - adoption guided by stewardship mindset

20 of 47

Libraries & AI

  • reusable - open, not 🙄 open 🙄
  • accountable - development and use subject to specific community needs
  • sustainable - adoption guided by stewardship mindset

21 of 47

open AI =

Maximize our agency through the ability to comprehensively assess, freely build upon, reuse, and share

22 of 47

Irene Solaiman, The Gradient of Generative AI Release: Methods and Considerations

28 of 47

“Open source software often becomes an industry standard,” Zuckerberg told investors on an earnings call on Feb. 1. “When companies standardize on building with our stack, that then becomes easier to integrate new innovations into our products.”

Billy Perrigo, Yann LeCun On How An Open Source Approach Could Shape AI

31 of 47

Libraries & AI

  • reusable - open, not 🙄 open 🙄
  • accountable - development and use subject to specific community needs
  • sustainable - adoption guided by stewardship mindset

34 of 47

DAIR

35 of 47

DAIR

36 of 47

Libraries & AI

  • reusable - open, not 🙄 open 🙄
  • accountable - development and use subject to specific community needs
  • sustainable - adoption guided by stewardship mindset

37 of 47

Sustainability strategy depends on awareness of interdependence, threats, and opportunities.

Suggested methods:

  1. Exploded view - Cartography of generative AI
  2. Systems view - World Systems Theory (Wallerstein)
  3. Replacement view - AI is Automation (Bender)

39 of 47

  • Maintained distribution of capital concretizes system roles
  • Exclusivity - e.g., compute
  • Purgatory - e.g., raw materials, training data, content moderation

40 of 47

I think that discussions of this technology become much clearer when we replace the term AI with the word “automation”.

Then we can ask:

  • What is being automated?
  • Who’s automating it and why?
  • Who benefits from that automation?
  • How well does the automation work in its use case that we’re considering?
  • Who’s being harmed?
  • Who has accountability for the functioning of the automated system?
  • What existing regulations already apply to the activities where the automation is being used?

Emily Bender, Opening remarks on “AI in the Workplace: New Crisis or Longstanding Challenge”

42 of 47

call for collaboration!

46 of 47

Interested in collections as data or AI and libraries work?

Seeking partners for international convenings and workshops to address challenges and opportunities together.

Please do reach out!

47 of 47

🙏

Emily Bender, University of Washington Seattle

Chris Bourg, MIT

Laurie Bridges, Oregon State University

F. Stuart Chapin, University of Alaska Fairbanks

Kate Crawford, USC Annenberg

Timnit Gebru, Distributed AI Research Institute (DAIR)

Hanna Hajishirzi, Allen Institute for AI, University of Washington Seattle

Yacine Jernite, Hugging Face

Bergis Jules, Shift Collective

Mia Lund, Open Source Initiative

A Maxwell

Margaret Mitchell, Hugging Face

Rachael Samberg, UC Berkeley

Irene Solaiman, Hugging Face

Anna Tsing, UC Santa Cruz

Immanuel Wallerstein, Yale University

Sarah West, AI Now Institute

David Gray Widder, Carnegie Mellon University

Meredith Whittaker, Signal