always already computational
collections as data
illustration by adam ferriss
thomas padilla --- unlv
laurie allen --- upenn
stewart varner --- upenn
sarah potvin --- texas a&m
elizabeth russey roke --- emory
hannah frost --- stanford
American Historical Association; American Philosophical Society; British Library; Carnegie Museum of Art; Compute Canada; Cornell University; Deutsches Klimarechenzentrum; Digital Library Federation; Digital Public Library of America; Emory University; Getty Research Institute; HathiTrust Research Center; Haverford College; Indiana University; Indiana University-Purdue University; Internet Archive; James Madison University; Koninklijke Bibliotheek; Library of Congress; Max Planck Computing & Data Facility; McGill University; Michigan State University; Massachusetts Institute of Technology; Museum of Modern Art; National University of Singapore; New York Public Library; New York University; Northeastern University; Open Knowledge Foundation; Penn State University; Stanford University; Swarthmore College; Texas A&M University; Tufts University; University College London; University of British Columbia; University of California Santa Barbara; University of California Berkeley; University of California Los Angeles; University of Canberra; University of Delaware; University of Graz; University of Houston; University of Illinois at Urbana-Champaign; University of Maryland; University of Miami; University of Minnesota; University of North Carolina at Chapel Hill; University of Pennsylvania; University of Toronto; University of Utah; University of Texas Austin; Vanderbilt University; Wellcome Trust; Wheaton College; York University
goals
liz tatarintseva
The Santa Barbara Statement calls for a rethinking of data documentation . . . our more ethical collection documentation is also an instructional tool and we should see documentation as another opportunity for living out our professional commitments to information literacy.
. . . taking these principles seriously illuminates a path for pedagogy that frames our digital collections as something with which students can critically engage by assessing strengths and gaps, especially in terms of missing narratives.
2016
Predominant digital collection development focuses on replicating traditional ways of interacting with objects in a digital space. This approach does not meet the needs of the researcher, the student, the journalist, and others who would like to use computational methods and tools to work with …
collections as data.
collections as data
… ordered information
… stored digitally
… amenable to computation
procedural: data affords a capacity for computational processing, e.g. term frequency analysis, named entity extraction, and topic modeling
participatory: data affords a capacity for enrichment by a diverse set of users, e.g. crowdsourced transcription
encyclopedic: data affords a capacity for expanded access, e.g. parametric searching by granular features like line length, genre, author gender
spatial: data affords a capacity for spatial characteristics to be surfaced, e.g. place names can be geocoded and mapped
adapted from Janet H. Murray, affordance grid
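The grid above names term frequency analysis as one example of computational processing. A minimal sketch of that idea follows, assuming a collection item is available as plain text. The function name `term_frequencies`, the tokenizing pattern, and the sample string are illustrative assumptions, not part of any project described here.

```python
import re
from collections import Counter


def term_frequencies(text, top_n=5):
    """Return the top_n most frequent lowercase word tokens in text."""
    # Simplifying assumption: a token is a run of ASCII letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)


# Hypothetical sample; stands in for a plain-text file from a digitized collection.
sample = "Collections as data. Data as collections. Data for computation."
print(term_frequencies(sample, top_n=3))
```

A real collection would likely need language-aware tokenization and stop-word handling; the point is only that text offered as data can be counted, compared, and sampled directly.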
criticalhandgestures.tumblr.com
in very general terms, an agent is a being with the capacity to act, and ‘agency’ denotes the exercise or manifestation of this capacity.
stanford encyclopedia of philosophy, agency
discovering
annotating
comparing
referring
sampling
illustrating
representing
reuse
reproducibility
integrity
authenticity
permanence
attribution
How do we make our stuff more useful?
Increase fit for purpose
Enhance discoverability
Expand access methods
a social and technical challenge
What is the scope of any aspect of this work?
What types of use does this work serve?
Who does this work serve?
What partners can join you in the work?
What approaches to the work already exist?
What ethical considerations should be engaged?
What challenges exist?
What present and future opportunities exist?
Collections as Data Facets
facet \ˈfa-sət\: one side of something many-sided
=================================================
Collections as Data Facets document collections as data implementations.
An implementation consists of the people, services, practices, technologies, and infrastructure that aim to encourage computational use of cultural heritage collections.
1. Why do it
2. Making the Case
3. How you did it
4. Share the docs
5. Understanding use
6. Who supports use
7. Things people should know
8. What’s next
HathiTrust Research Center Extracted Features Dataset
Ticha: A Digital Text Explorer for Colonial Zapotec
Vanderbilt Library Legacy Data Projects
The Museum of Modern Art Exhibition Index
on the books
Society of American Archivists, July 27, 2017
Digital Humanities 2017, August 7, 2017
Digital Library Federation, October 25, 2017
CNI Fall Forum, 2017
American Historical Association, 2018
planned
National Forum 2 @ UNLV
NICAR, 2018
Open Repositories, 2018
DLF 2018
liz tatarintseva
goals