Samvera Connect 2018
October 11, 2018
Casey Davis Kaufman, WGBH Media Library and Archives Associate Director / AAPB Project Manager
Sadie Roosa, WGBH Media Library and Archives Technical Project Manager / AMS 2.0 Product Owner
Andrew Myers, WGBH Media Library and Archives Supervising Developer
Jason Corum, WGBH Media Library and Archives Web Developer
Building on Hyrax and Avalon for the
Outline
AAPB BACKGROUND
A collaboration between WGBH and the Library of Congress seeking to preserve and make accessible significant historical content created by public media, and to coordinate a national effort to save at-risk public media before its content is lost to posterity
More than 50,000 hours of digitized and born digital material from over 100 public broadcasting stations and organizations
Website launched October 2015
>35,000 streaming video and audio files in an Online Reading Room (39% of full collection)
Public access to the full collection of video and audio on-site at WGBH and the Library of Congress
>2.5 million inventory records from 120 stations, in addition to the digitized items, are available for research
AAPB Collaboration
Shared Responsibilities
Preservation
Access & Outreach
Goals
Coordinate a national effort to preserve and make accessible as much significant public broadcasting material as possible
Become a focal point for discoverability
Provide standards and best practices for storing, processing, preserving, and making accessible historical content
Facilitate the use of archival content by scholars, educators, students, journalists, media producers, researchers, and the public
Increase public awareness of the significance of historical public media and the need to preserve it and make it accessible
Commitment to Growth
The Library and WGBH are committed to growing the AAPB collection by up to 25,000 hours of digitized or “born digital” content per year
This year, we have targeted outreach to stations in states, regions and communities currently underrepresented in the AAPB.
We are currently lacking any content in the archive from 12 states and the territories (excluding Guam)
We are providing grant writing assistance to organizations submitting applications for digitization grant programs
We have created a grant proposal document package for organizations that want to collaborate with us on proposals
The Value of AAPB Participation
AAPB REPOSITORY PLANS
Archival Management System
Participating organizations have access to the Archival Management System (AMS) where station administrators can:
Unique Features of AMS
Does not manage files
Hierarchical PBCore data model
A lot of batch operations
Not for general access purposes
How to move forward?
Compared 3 options against our updated system requirements/desires
2. Building a new system from scratch (pretty much immediately discarded this one)
3. Building something Samvera-based
[Diagram: three Physical Instantiations, each with two Essence Tracks]
Change 1: Update PBCore data model
Current AMS
Centers on the Contributing Organization
Built based on physical inventories
Designed primarily to manage digitization projects
Updated AMS
Centers on the Asset
Only Instantiations “belong” to an organization
Multiple organizations can add child Instantiations to a central Asset
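The asset-centric model above can be illustrated with a small sketch. This is not AMS code: the class names, field names, and sample values below are assumptions chosen only to show that ownership sits on the Instantiation while the Asset stays shared.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instantiation:
    # Only the Instantiation "belongs" to an organization (hypothetical fields).
    organization: str
    format_name: str


@dataclass
class Asset:
    # The Asset is central and carries no owning organization of its own.
    title: str
    instantiations: List[Instantiation] = field(default_factory=list)

    def add_instantiation(self, organization: str, format_name: str) -> Instantiation:
        """Attach a child Instantiation contributed by the given organization."""
        inst = Instantiation(organization=organization, format_name=format_name)
        self.instantiations.append(inst)
        return inst

    def formats_by_organization(self) -> Dict[str, List[str]]:
        """Group the formats of child Instantiations by contributing organization."""
        grouped: Dict[str, List[str]] = defaultdict(list)
        for inst in self.instantiations:
            grouped[inst.organization].append(inst.format_name)
        return dict(grouped)


# Two organizations (made-up names) describe copies of the same Asset.
asset = Asset(title="Example Program")
asset.add_instantiation("Station A", "Betacam SP")
asset.add_instantiation("Station B", "MPEG-4")
```

In this sketch a second organization never needs its own copy of the Asset record; it only adds a child.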
Change 2: Use actively maintained plug-ins
Current AMS
Uses MINT 1 plugin for data mapping from XML and CSVs
MINT 1 is no longer supported
MINT currently works in only one browser
Updated AMS
Samvera works to actively maintain core components
WGBH is part of Samvera and can assist with continued maintenance
Change 3: Build a system we can continue to develop as needs change
Current AMS
Built at the beginning of a project that has changed a lot
Built completely by outside contractors in PHP
Updated AMS
Built with a clearer idea of the needs of a now more stable project
Built within a robust community in which our developers have expertise
Initial Plan
Building on Avalon gave us the most to start working with, letting us focus on fine-tuning the AAPB-specific features.
The Andrew W. Mellon Foundation awarded AAPB a grant in part to carry out this work, with WGBH and AVP developers and Indiana University Avalon team advisors.
Hiccups
By the time the AMS development project launched, Avalon had decided to move to Hyrax for Avalon 7.
Avalon would be moving to Hyrax during the same period in which AMS was being developed.
We couldn’t wait for Avalon to finish moving to Hyrax before kicking off development because of grant deadlines and project needs.
New Plan
Build AMS on Hyrax
Build new AAPB-specific features, some baked into the application and some as components or flip-flop features.
Use Avalon components where we can, like the Avalon Media Player.
Build components with the Avalon team to address common needs, like batch import.
AMS 2.0 Core Team
Sadie Roosa (WGBH) – Product Owner
Adeel Ahmad (AVP) – Developer
Andrew Myers (WGBH) – Developer
Jason Corum (WGBH) – Developer
Kara Van Malssen (AVP) – Scrum Master
AMS 2.0 Roadmap
Milestone 1 – June 19, 2018
Goal: Create/Edit Records in the UI
Milestone 2 – September 11, 2018
Goal: Search/Discovery and Record Page Design
Milestone 3 – November 6, 2018
Goal: Media Display
Milestone 4 – November 20, 2018
Goal: Batch Import/Edit
Milestone 5 – December 4, 2018
Goal: Export and Reporting
Milestone 6 – December 18, 2018
Goal: Batch Edit
Milestone 7 – February 12, 2019
Goal: Migration
AAPB METADATA
PBCore Description Document
Asset
Title, Description, Subject, Contributors, Rights, etc.
Instantiations
Physical Tape Format or Digital File Format, Duration, Media Type, Generation, Color, etc.
Essence Tracks
Data Rate, Encoding, Sampling Rate, Aspect Ratio, etc.
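The three levels listed above nest inside one another. A minimal sketch of that nesting follows, assuming simplified field names that mirror only the examples on this slide; it is not the PBCore schema itself and not the AMS implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EssenceTrack:
    # Technical details of one track (field names are illustrative).
    encoding: Optional[str] = None
    data_rate: Optional[str] = None
    sampling_rate: Optional[str] = None
    aspect_ratio: Optional[str] = None


@dataclass
class Instantiation:
    # One physical tape or digital file that embodies the Asset.
    media_type: str
    format_name: str
    duration: Optional[str] = None
    generation: Optional[str] = None
    essence_tracks: List[EssenceTrack] = field(default_factory=list)


@dataclass
class DescriptionDocument:
    # Asset-level description plus its repeatable child sections.
    title: str
    description: Optional[str] = None
    subjects: List[str] = field(default_factory=list)
    instantiations: List[Instantiation] = field(default_factory=list)

    def essence_track_count(self) -> int:
        """Count Essence Tracks across every Instantiation of this Asset."""
        return sum(len(inst.essence_tracks) for inst in self.instantiations)
```

Because Instantiations and Essence Tracks are both repeatable, a single Description Document can fan out into many nested records.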
PBCore Work Types in Hyrax
1 Description Document ≠ 1 Work
Each repeatable section gets its own work type related to the others.
[Diagram of related work types: Asset, Physical Instantiation, Digital Instantiation, Essence Track, Contribution]
What is a Contribution work?
Models the idea of a person/org contributing to an Asset in a specific way.
Too many roles to have each one as a predicate
Contributions consist of
- Name (required)
- Role
- Affiliation
- Portrayal
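As a rough illustration of why a role works better as data on a Contribution than as one predicate per role, here is a small sketch. The class, the `names_by_role` helper, and the sample names are hypothetical and are not taken from the AMS codebase.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Contribution:
    # Name is the only required field; the rest are optional.
    name: str
    role: Optional[str] = None
    affiliation: Optional[str] = None
    portrayal: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject empty or whitespace-only names.
        if not self.name or not self.name.strip():
            raise ValueError("A Contribution requires a name")


def names_by_role(contributions: Iterable[Contribution]) -> Dict[str, List[str]]:
    """Group contributor names by role; any role string works without schema changes."""
    grouped: Dict[str, List[str]] = {}
    for contribution in contributions:
        key = contribution.role or "(no role)"
        grouped.setdefault(key, []).append(contribution.name)
    return grouped
```

Adding a new role in this sketch means storing a new string value, not defining a new property on the Asset.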
Simple Cataloging Workflow
Embedding Smaller Child Works
Asset-centric Search
Future Data Modeling Ideas
Broadcast Series
Currently series data is repeated on each Asset in the Series
Maybe use Custom Collection Type rather than Work Type
Assets can, but don’t have to, belong to a Series
Assets can belong to multiple Series
When new Assets are imported, an interface can help importers match them with the Series they should belong to
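One possible shape for the matching step is sketched below. It assumes imported Assets still carry the repeated series titles as plain strings and that matching is a simple case- and whitespace-insensitive comparison; both the function names and the matching rule are assumptions, and the sample titles are made up.

```python
from typing import Dict, Iterable, List


def normalize(title: str) -> str:
    """Lowercase a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def suggest_series(asset_series_titles: Iterable[str],
                   known_series: Iterable[str]) -> List[str]:
    """Return the known Series whose titles match those repeated on an Asset.

    An Asset may match zero, one, or several Series.
    """
    index: Dict[str, str] = {normalize(name): name for name in known_series}
    suggestions: List[str] = []
    for title in asset_series_titles:
        match = index.get(normalize(title))
        if match is not None and match not in suggestions:
            suggestions.append(match)
    return suggestions
```

An importer would still confirm each suggestion, since an Asset can legitimately belong to no Series at all.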
ADVANTAGES OF HYRAX
What you get out of the box
Code Generation – Simple to generate model, view, controller, actor stacks, tests
Styling – Looks nice and clean with minimal CSS additions required
Flip-flop functionality – Easy to turn on/off features we may not need
Flexibility
Configurable Work Types – Easy for us to create multiple work types and define the possible relationships between them
Custom Collection Types – Looking into whether we can use these to solve some data modeling challenges
Changing Requirements – Able to change whether file upload, the deposit agreement checkbox, and setting visibility levels are required
Samvera and Hyrax Communities
Real-time support and feedback – Regularly hear back quickly in Slack
Access to original developers – Easy to ping code authors with questions
Quick code resolution – The PRs we’ve submitted back to Hyrax core have been merged within a day or so
Maturity
More mature than previous Samvera applications
Started development on Hyrax 2.0 and have been able to keep up with releases since then
CHALLENGES SO FAR
Everything’s under development
Hyrax – Building our application as our foundation changes
Avalon Components – In the process of componentizing features we want to use
Do we wait?
Sometimes yes – Avalon Media Player, Custom Collection Types
Sometimes no – generic batch import solution
What should we contribute back?
PBCore Data Model?
Documentation, but not code
Batch Import component?
Building with the Avalon team, hopefully usable by others
Small Hyrax patches?
Easier to submit, quick turnaround, helps us help ourselves
New minor features?
As flip-flop features? Will anyone else use them?
Intended to contribute big chunks of functionality
Is there documentation for this?
Scattered Sources – A lot of places to look, even for just one type of information or one section of the code
Missing Information – The documentation we have is great and helpful, but there isn’t always a lot of it
Known problem without enough resources to solve – Join or support the Documentation Working Group if you can!
How much do we test?
Feature specs
Setting up tests properly is cumbersome and mysterious
Using RSpec and Capybara to DRY things up and improve consistency
- We’ve created some custom matchers and helpers
These could be useful to the broader community
Should we do our own thing?
Fighting the Framework – When do we go around? And how?
Is Hyrax a Framework or an Application?
- Premise is a framework
- Behavior is an application
- Customizations seem expected
A ticking time bomb – Customizations might cause us more headaches as we move to new versions of Hyrax.
The Future
AAPB Future with AMS
Continue Development
Roll Out to Contributing Organizations
Additional Future Development
WGBH Media Library and Archives received a 5-year challenge grant from NEH, in part to implement and further develop AMS for WGBH’s needs.
Build additional features for WGBH’s instance of AMS
Integrate with WGBH’s digital preservation metadata system Phydo
americanarchive.org
@amarchivepub
facebook.com/amarchivepub
THANK YOU!
Casey Davis Kaufman, WGBH Media Library and Archives Associate Director / AAPB Project Manager
casey_davis-kaufman@wgbh.org
Sadie Roosa, WGBH Media Library and Archives Technical Project Manager / AMS 2.0 Product Owner
sadie_roosa@wgbh.org
Andrew Myers, WGBH Media Library and Archives Supervising Developer
andrew_myers@wgbh.org
Jason Corum, WGBH Media Library and Archives Web Developer
jason_corum@wgbh.org