Accessible Web Archives:
Rethinking and Designing Usable Infrastructure for Sustainable Research Platforms
Samantha Fritz, MLIS
Project Manager, Archives Unleashed
WAC, 15-16 June 2021
of logging onto the WWW
“Meet the demand for automated information-sharing between scientists in universities and institutes around the world”
fastest growing communications medium of all time
The web has shaped how we connect with one another and interact with information.
We all have a relationship to data
We also use data to interpret and understand the world around us
The web has provided a new context for research data, which impacts the way we produce, preserve, and interact with information.
4.66 billion internet users
95 million photos and videos per day
306.4 billion emails per day
1.7 MB of data per second per person
800 web pages have been created since the start of this presentation
we risk losing potentially significant information
Development of Web Archiving, 1996-2021
Increasing adoption of web archiving mandates among memory institutions around the world
1992: World Wide Web is launched
1996: Conscious effort to preserve born-digital content; first large-scale preservation projects
The web is a critical source for studying our digital cultural heritage
Opportunities
Expands scope to incorporate a wider and more diverse range of voices and perspectives
Shift in scale from resource scarcity to abundance (Roy Rosenzweig)
Challenges
Challenges are inevitable when dealing with data, and they occur throughout the web archiving lifecycle.
Despite the volume of data captured, web archives have largely remained inaccessible
Barriers to Access & Use
How can we lower barriers of access and use to web archives?
Archives Unleashed Project, 2017-2020
Image: Archives Unleashed Project Timeline
Tools & Platforms
Archives Unleashed Toolkit
Image: Archives Unleashed Toolkit via Sparkshell
Image: Archives Unleashed Toolkit Documentation
Archives Unleashed Cloud
Image: Archives Unleashed Cloud
Image: ARS-Cloud Prototype, Concept Design
Accessibility & Usability
Defining Access & Use
Access / Accessibility
the ability to make use of something, or capability of being reached, used, understood or appreciated
Usability
“The quality or state of being usable; ease of use”
We cannot talk about access without acknowledging the vital role usability plays
Applications of Access and Use Concepts
Code Base
User Interface
Supporting Materials
Datasets
Collaboration to process web archive collections and make derivatives available for all to use and explore.
Great starting point for scholars who might not have access to web archive collections.
Learning Resources
Learning guides provide instructions on how to use and explore Cloud derivatives with external tools like Gephi and AntConc.
Toolkit Documentation
Cookbook approach, with pre-built scripts that users can plug in to address common analytic tasks.
Addresses uncertainty about how to use the Toolkit.
General Takeaways
If you build it, they will come
In designing infrastructure for sustainable research platforms, we need to thoughtfully apply concepts of access and usability
If you design it, will it be usable?
CREDITS
This work is primarily supported by the Andrew W. Mellon Foundation. Other financial and in-kind support has come from the Social Sciences and Humanities Research Council, Compute Canada, the Ontario Ministry of Research, Innovation, and Science, York University Libraries, Start Smart Labs, and the Faculty of Arts and David R. Cheriton School of Computer Science at the University of Waterloo.
References
Images Used
In order of appearance; image title provided where possible.
https://archivesunleashed.org