Cobweb
Collaborative Collection Development for Web Archives
#cobwebarchive
Stephen Abrams
Kathryn Stine
California Digital Library
Peter Broadwell
Andrew Wallace
UCLA
Janet Taylor
Ann Whiteside
Harvard University
Dodging the Memory Hole, Internet Archive, 15-16 November 2017
Imagine …
Tahrir Square, 2012
Ferguson, 2014
Kaitlyn Veto © 2014
Lesvos, 2015
Steve Evans © 2015
Flint, 2016
Fort Lee, 2013
Associated Press © 2013
Puerto Rico, 2017
Joe Raedle/Getty Images © 2017
Carlos Osorio/AP © 2016
… a fast-moving event unfolding online as much as on the ground
How can we respond as a community appropriately and responsibly?
Imagine …
Harnessing the domain knowledge and technical capabilities of the entire community
Enabling local collection development decisions based on global information
Complementary, cooperative, and collaborative collecting
Institutional participation at a level commensurate with local expertise and capacity
Increasing scholarly and public awareness
Cobweb
Centralized catalog of collection- and seed-level metadata
Establishment of thematic collecting projects
Open nomination of topical seed URLs by interested stakeholders
Claiming of seeds by archival institutions intending to harvest
Holdings records for seeds actually harvested
Thematic discovery of web archives of interest
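The nominate, claim, and hold workflow listed above can be sketched as a small data model. This is only an illustrative sketch in plain Python, not Cobweb's actual schema or API: the class names (Seed, Project), the method names, and the example URLs and institution names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Seed:
    """A nominated seed URL and its status (hypothetical model, not Cobweb's schema)."""
    url: str
    nominated_by: str
    claimed_by: set = field(default_factory=set)  # institutions intending to harvest
    held_by: set = field(default_factory=set)     # institutions that actually harvested


@dataclass
class Project:
    """A thematic collecting project that gathers nominated seeds."""
    title: str
    seeds: dict = field(default_factory=dict)  # maps url -> Seed

    def nominate(self, url, nominator):
        # Open nomination: the first nomination of a URL creates the seed record.
        return self.seeds.setdefault(url, Seed(url, nominator))

    def claim(self, url, institution):
        # A claim records intent to harvest; the seed must already be nominated.
        self.seeds[url].claimed_by.add(institution)

    def record_holding(self, url, institution):
        # A holdings record states that a harvest actually took place.
        self.seeds[url].held_by.add(institution)

    def unclaimed(self):
        # Seeds nobody has claimed or harvested: candidates for local collecting.
        return sorted(u for u, s in self.seeds.items()
                      if not s.claimed_by and not s.held_by)


project = Project("Example event")
project.nominate("https://example.org/news", "journalist")
project.nominate("https://example.org/blog", "scholar")
project.claim("https://example.org/news", "Institution A")
project.record_holding("https://example.org/news", "Institution A")
print(project.unclaimed())
```

Listing the unclaimed seeds is the kind of global information a curator could use to make a local collecting decision without duplicating another institution's work.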
Why Cobweb?
The demands of archiving the web in comprehensive breadth or thematic depth exceed the technical and financial capacity of any one institution
Curators cannot make rational collection development decisions without knowing what others have collected or intend to collect
Relevant seed URLs can be meaningfully contributed by various stakeholders: curators, archivists, subject area specialists, scholars, journalists, event participants, and the public
Apportioning collection responsibility into granular pieces encourages participation by smaller institutions and programs
Why Cobweb?
Peter Broadwell at UCLA was well into collecting “fake” news sites before it occurred to him to wonder if anyone else was doing something similar
There was: Mark Graham at the Internet Archive (IA)
Cobweb project
One-year collaborative project between CDL, Harvard University, and UCLA, funded by IMLS #LG-70-16-0093-16
Public online service hosted at CDL
Python/Django stack
MIT license
Targeting initial production release in conjunction with the November 2018 IIPC General Assembly and Web Archiving Conference
https://cdlib.org/cobweb
https://github.com/CobwebOrg/cobweb
Demo
Next steps
Cobweb is a tool for collecting communities…
we need you!
Next steps
Q & A
Questions for us?
Questions from us!
Cobweb
California Digital Library https://www.cdlib.org/
Harvard University Library http://library.harvard.edu/
UCLA Libraries http://www.library.ucla.edu/
https://cdlib.org/cobweb
https://github.com/CobwebOrg/cobweb
Kathryn Stine, Cobweb Outreach Manager
Kathryn.Stine@ucop.edu
Collaborative Collection Development for Web Archives
#cobwebarchive