1 of 47

Samvera Connect 2018

October 11, 2018

Casey Davis Kaufman, WGBH Media Library and Archives Associate Director / AAPB Project Manager

Sadie Roosa, WGBH Media Library and Archives Technical Project Manager / AMS 2.0 Product Owner

Andrew Myers, WGBH Media Library and Archives Supervising Developer

Jason Corum, WGBH Media Library and Archives Web Developer

Building on Hyrax and Avalon for the

2 of 47

  • AAPB Background

  • Plans for the Repository

  • PBCore Metadata

  • Advantages of Hyrax

  • Challenges so far

  • The Future

Outline

3 of 47

AAPB BACKGROUND

4 of 47

a collaboration between

WGBH and the Library of Congress

seeking to preserve and make accessible significant historical content created by public media, and to coordinate a national effort to save at-risk public media before its content is lost to posterity

5 of 47

More than 50,000 hours of digitized and born digital material from over 100 public broadcasting stations and organizations

Website launched October 2015

>35,000 streaming video and audio files in an Online Reading Room (39% of full collection)

Public access to the full collection of video and audio on-site at WGBH and the Library of Congress

>2.5 million inventory records from 120 stations, in addition to the digitized items, are available for research

The AAPB Collection

www.americanarchive.org

6 of 47

AAPB Collaboration

Shared Responsibilities

    • Overall governance
    • Policy
    • Collection development
    • Ingest
    • Rights decisions

Preservation

Access & Outreach

7 of 47

8 of 47

Goals

Coordinate a national effort to preserve and make accessible as much significant public broadcasting material as possible

Become a focal point for discoverability

Provide standards and best practices for storing, processing, preserving, and making accessible historical content

Facilitate the use of archival content by scholars, educators, students, journalists, media producers, researchers, and the public

Increase public awareness of the significance of historical public media and the need to preserve it and make it accessible

9 of 47

Commitment to Growth

The Library and WGBH are committed to growing the AAPB collection by up to 25,000 hours of digitized or “born digital” content per year

This year, we have targeted outreach to stations in states, regions and communities currently underrepresented in the AAPB.

We currently have no content in the archive from 12 states or from the territories (excluding Guam)

We are providing grant writing assistance to organizations submitting applications for digitization grant programs

We have created a grant proposal document package for organizations that want to collaborate with us on proposals

   

10 of 47

The Value of AAPB Participation

  • Copies preserved long-term at the Library of Congress
  • AAPB manages the access platform and media servers
  • Each organization's collection becomes part of a national initiative and has broader reach
  • AAPB provides guidance, fundraising assistance, and digitization project management support
  • Donated materials are included in AAPB initiatives such as transcript creation, crowdsourcing, and automated metadata creation
  • Organizations have access to the AAPB’s Archival Management System (metadata repository) where they can search, manage, update, and access their records and media

11 of 47

AAPB

REPOSITORY

PLANS

12 of 47

Archival Management System

Participating organizations have access to the Archival Management System (AMS) where station administrators can:

    • Search and access their metadata records
    • Export their records
    • Import additional records
    • Edit their records
    • Watch/listen to their low-res proxy files
    • Download their low-res proxy files

13 of 47

Unique Features of AMS

Does not manage files

Hierarchical PBCore data model

A lot of batch operations

Not for general access purposes

Private created by Becris from Noun Project is licensed under CC BY 3.0

14 of 47

How to move forward?

We compared three options against our updated system requirements and wish list:

  1. Extending AMS 1.0

  2. Building a new system from scratch (pretty much immediately discarded this one)

  3. Building something Samvera-based

    • Avalon
    • Hyrax

Diagram: three Physical Instantiations, each with two Essence Tracks

15 of 47

Change 1: Update PBCore data model

Current AMS

Centers on the Contributing Organization

Built based on physical inventories

Designed primarily to manage digitization projects

Updated AMS

Centers on the Asset

Only Instantiations “belong” to an organization

Multiple organizations can add child Instantiations to a central Asset
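The asset-centric model above can be illustrated with a minimal plain-Ruby sketch. The class and method names here (`Asset`, `Instantiation`, `add_instantiation`, `holdings_by_organization`) and the sample values are hypothetical and are not taken from the AMS codebase or from Hyrax; the point is only that the Asset carries no owner, while each Instantiation records the organization that holds it.

```ruby
# Hypothetical sketch of the updated model; not the AMS or Hyrax classes.
# Each Instantiation records the organization that holds that copy.
Instantiation = Struct.new(:format, :organization)

class Asset
  attr_reader :title, :instantiations

  def initialize(title)
    @title = title
    @instantiations = []
  end

  # Any organization may attach its own copy to the shared Asset.
  def add_instantiation(format:, organization:)
    instantiation = Instantiation.new(format, organization)
    @instantiations << instantiation
    instantiation
  end

  # The Asset itself has no owner; ownership is read off its children.
  def holdings_by_organization
    @instantiations.group_by(&:organization)
  end
end

asset = Asset.new("Example Program")
asset.add_instantiation(format: "Betacam SP", organization: "Station A")
asset.add_instantiation(format: "MPEG-4", organization: "Station B")
asset.add_instantiation(format: "WAV", organization: "Station A")

asset.holdings_by_organization.each do |organization, copies|
  puts "#{organization}: #{copies.map(&:format).join(', ')}"
end
```

Grouping by organization on demand, rather than storing an owner on the Asset, is one way to let several stations contribute copies of the same program without duplicating the descriptive record.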

16 of 47

Change 2: Use actively maintained plug-ins

Current AMS

Uses MINT 1 plugin for data mapping from XML and CSVs

MINT 1 is no longer supported

MINT currently works in only one browser

Updated AMS

Samvera works to actively maintain core components

WGBH is part of Samvera and can assist with continued maintenance


17 of 47

Change 3: Build a system we can continue to develop as needs change

Current AMS

Built at the beginning of a project that has changed a lot

Built completely by outside contractors in PHP

Updated AMS

Built with a clearer understanding of the needs of a now more stable project

Built within a robust community, on technology in which our developers have expertise


18 of 47

Initial Plan

Building on Avalon gave us the most functionality to start with, letting us focus on fine-tuning the AAPB-specific features.

The Andrew W. Mellon Foundation awarded AAPB a grant, in part to carry out this work with developers from WGBH and AVP and advisors from the Indiana University Avalon team.


19 of 47

Hiccups

By the time the AMS development project launched, Avalon had decided to move to Hyrax for Avalon 7.

Avalon would be moving to Hyrax during the same period in which AMS was being developed.

We couldn’t wait for Avalon to finish moving to Hyrax before kicking off development because of grant deadlines and project needs.

20 of 47

New Plan

Build AMS on Hyrax

Build new AAPB-specific features, some baked into the application and some as components or flip-flop features.

Use Avalon components where we can, like the Avalon Media Player.

Build components with the Avalon team to address common needs, like batch import.

21 of 47

AMS 2.0 Core Team

Sadie Roosa (WGBH) – Product Owner

Adeel Ahmad (AVP) – Developer

Andrew Myers (WGBH) – Developer

Jason Corum (WGBH) – Developer

Kara Van Malssen (AVP) – Scrum Master

22 of 47

AMS 2.0 Roadmap

Milestone 1 – June 19, 2018

Goal: Create/Edit Records in the UI

Milestone 2 – September 11, 2018

Goal: Search/Discovery and Record Page Design

Milestone 3 – November 6, 2018

Goal: Media Display

Milestone 4 – November 20, 2018

Goal: Batch Import/Edit

Milestone 5 – December 4, 2018

Goal: Export and Reporting

Milestone 6 – December 18, 2018

Goal: Batch Edit

Milestone 7 – February 12, 2019

Goal: Migration

23 of 47

AAPB

METADATA

24 of 47

PBCore Description Document

Asset

Title, Description, Subject, Contributors, Rights, etc.

Instantiations

Physical Tape Format or Digital File Format, Duration, Media Type, Generation, Color, etc.

Essence Tracks

Data Rate, Encoding, Sampling Rate, Aspect Ratio, etc.

25 of 47

PBCore Work Types in Hyrax

1 Description Document ≠ 1 Work

Each repeatable section gets its own work type related to the others.

Physical Instantiation

Digital Instantiation

Contribution

Asset

Essence Track

Essence Track
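To make "1 Description Document ≠ 1 Work" concrete, here is a rough plain-Ruby sketch of splitting one nested description document into several linked records, one per repeatable section. The function name `split_into_works`, the hash keys, the ID scheme, and the sample data are all assumptions made for illustration; this does not use Hyrax work types or show how AMS actually stores these relationships.

```ruby
# Hypothetical sketch: turn one nested description document into a flat
# list of records, each pointing at its parent record by ID.
def split_into_works(document)
  works = []
  asset_id = "asset-1"
  works << { id: asset_id, type: :asset, parent: nil, title: document[:title] }

  # Each contribution becomes its own record under the Asset.
  document.fetch(:contributions, []).each_with_index do |contribution, i|
    works << { id: "#{asset_id}-contribution-#{i + 1}", type: :contribution,
               parent: asset_id, name: contribution[:name] }
  end

  # Each instantiation becomes its own record under the Asset, and each
  # essence track becomes a record under its instantiation.
  document.fetch(:instantiations, []).each_with_index do |inst, i|
    inst_id = "#{asset_id}-instantiation-#{i + 1}"
    works << { id: inst_id, type: inst[:kind], parent: asset_id,
               format: inst[:format] }
    inst.fetch(:essence_tracks, []).each_with_index do |track, j|
      works << { id: "#{inst_id}-track-#{j + 1}", type: :essence_track,
                 parent: inst_id, encoding: track[:encoding] }
    end
  end

  works
end

document = {
  title: "Example Program",
  contributions: [{ name: "Jane Doe" }],
  instantiations: [
    { kind: :physical_instantiation, format: "Betacam SP", essence_tracks: [] },
    { kind: :digital_instantiation, format: "MPEG-4",
      essence_tracks: [{ encoding: "H.264" }, { encoding: "AAC" }] }
  ]
}

works = split_into_works(document)
works.each { |work| puts "#{work[:id]} (#{work[:type]}) -> #{work[:parent].inspect}" }
```

In this example a single document yields six records: one Asset, one Contribution, two Instantiations, and two Essence Tracks.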

26 of 47

What is a Contribution work?

Models the idea of a person/org contributing to an Asset in a specific way.

Too many roles to have each one as a predicate

Contributions consist of

- Name (required)

- Role

- Affiliation

- Portrayal
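As a small illustration of the fields listed above, the following plain-Ruby sketch treats the role as data on a Contribution record rather than as one predicate per role, and enforces only the required name. The `Contribution` struct and its `valid?` method are hypothetical and are not the Hyrax work type used in AMS.

```ruby
# Hypothetical sketch of a Contribution record. Role is stored as a value,
# so adding a new role does not require a new field or predicate.
Contribution = Struct.new(:name, :role, :affiliation, :portrayal,
                          keyword_init: true) do
  # Only the name is required; the other fields may be left empty.
  def valid?
    !name.to_s.strip.empty?
  end
end

host = Contribution.new(name: "Jane Doe", role: "Host", affiliation: "Station A")
unnamed = Contribution.new(role: "Interviewee")

puts host.valid?    # the name is present
puts unnamed.valid? # the name is missing
```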

27 of 47

Simple Cataloging Workflow

28 of 47

Embedding Smaller Child Works

29 of 47

Embedding Smaller Child Works

30 of 47

Asset-centric Search

31 of 47

Future Data Modeling Ideas

Broadcast Series

Currently series data is repeated on each Asset in the Series

Maybe use Custom Collection Type rather than Work Type

Assets can, but don’t have to, belong to a Series

Assets can belong to multiple Series

When new Assets are imported, an interface can help importers match them with the Series they should belong to
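The Series idea could work roughly as in the sketch below, written in plain Ruby. Everything here is an assumption for illustration: the `Series` class, the `suggest_series` helper, and especially the matching rule (a case-insensitive check that the Series title appears in the incoming Asset title), which stands in for whatever matching an eventual import interface would use.

```ruby
require "set"

# Hypothetical sketch: a Series is a named group of Asset IDs. An Asset may
# appear in zero, one, or several Series.
class Series
  attr_reader :title, :asset_ids

  def initialize(title)
    @title = title
    @asset_ids = Set.new
  end

  def add(asset_id)
    @asset_ids << asset_id
  end
end

# Suggest Series whose title appears in the incoming Asset's title.
# This matching rule is a placeholder assumption, not the planned behavior.
def suggest_series(asset_title, all_series)
  normalized = asset_title.downcase
  all_series.select { |series| normalized.include?(series.title.downcase) }
end

all_series = [Series.new("Evening Report"), Series.new("Science Hour")]

matches = suggest_series("Evening Report; Episode 12", all_series)
matches.each { |series| series.add("asset-12") }

no_match = suggest_series("Standalone Documentary", all_series)
puts matches.map(&:title).inspect
puts no_match.empty?
```

Storing membership on the Series rather than repeating series data on every Asset is what removes the duplication mentioned above.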

32 of 47

ADVANTAGES

OF HYRAX

33 of 47

What you get out of the box

Code Generation – Simple to generate model, view, controller, actor stacks, tests

Styling – Looks nice and clean with minimal CSS additions required

Flip flop functionality – Easy to turn on/off features we may not need
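For readers unfamiliar with feature toggles, the idea can be sketched in a few lines of plain Ruby. This is a generic illustration only: it does not use the Flipflop gem's API or Hyrax's configuration, and the `FeatureToggles` class and the feature names are hypothetical.

```ruby
# Hypothetical sketch of a feature-toggle registry. Each feature has a
# default state that can be switched without touching the calling code.
class FeatureToggles
  def initialize(defaults)
    @states = defaults.dup
  end

  # Raises KeyError if the feature was never declared.
  def enabled?(name)
    @states.fetch(name)
  end

  def turn_on(name)
    set(name, true)
  end

  def turn_off(name)
    set(name, false)
  end

  private

  def set(name, value)
    @states.fetch(name) # raises KeyError for an unknown feature
    @states[name] = value
  end
end

features = FeatureToggles.new({ batch_import: true, proxy_download: true })
features.turn_off(:proxy_download)

puts "batch import available" if features.enabled?(:batch_import)
puts "proxy download hidden" unless features.enabled?(:proxy_download)
```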

34 of 47

Flexibility

Configurable Work Types – Easy for us to create multiple work types and define the possible relationships between them

Custom Collection Types – Investigating whether we can use these to solve some data modeling challenges

Changing Requirements – Able to change the file upload requirement, the deposit agreement checkbox, and the visibility level settings

35 of 47

Samvera and Hyrax Communities

Real time support and feedback – We regularly get quick responses in Slack

Access to original developers – Easy to ping code authors with questions

Quick code resolution – The PRs we’ve submitted back to Hyrax core have been merged within a day or so

36 of 47

Maturity

More mature than previous Samvera applications

We started development on Hyrax 2.0 and have kept up with releases since then

37 of 47

CHALLENGES

SO FAR

38 of 47

Everything’s under development

Hyrax – Building our application as our foundation changes

Avalon Components – In the process of componentizing features we want to use

Do we wait?

Sometimes yes – Avalon Media Player, Custom Collection Types

Sometimes no – generic batch import solution

39 of 47

What should we contribute back?

PBCore Data Model?

Documentation, but not code

Batch Import component?

Building with the Avalon team, hopefully usable by others

Small Hyrax patches?

Easier to submit, quick turnaround, helps us help ourselves

New minor features?

As flip-flop features? Will anyone else use them?

Intended to contribute big chunks of functionality

40 of 47

Is there documentation for this?

Scattered Sources – A lot of places to look, even for just one type of information or one section of the code

Missing Information – The documentation we have is great and helpful, but there isn’t always a lot of it

Known problem without enough resources to solve – Join or support the Documentation Working Group if you can!

41 of 47

How much do we test?

Feature specs

  • Initially relied on them a lot; then pulled back a bit
  • What’s the proper balance?

Setting up tests properly is cumbersome and mysterious

  • Factories that build/create dependent objects have helped
  • Would be great if we could create mocks with more efficiency

Using RSpec and Capybara to DRY things up and improve consistency

- We’ve created some custom matchers and helpers

These could be useful to the broader community
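The point about factories that build dependent objects can be sketched without any test framework. The plain-Ruby example below is an assumption-laden illustration: the `TestFactories` module and the `Sample*` structs are hypothetical stand-ins and are not the factories, matchers, or helpers the team wrote. Each factory method builds the child objects it depends on by default, so a test only spells out what it cares about.

```ruby
# Hypothetical stand-ins for the records a test would need.
SampleEssenceTrack = Struct.new(:encoding, keyword_init: true)
SampleInstantiation = Struct.new(:format, :essence_tracks, keyword_init: true)
SampleAsset = Struct.new(:title, :instantiations, keyword_init: true)

# Each factory builds its dependent objects unless the caller overrides them.
module TestFactories
  module_function

  def essence_track(encoding: "H.264")
    SampleEssenceTrack.new(encoding: encoding)
  end

  def instantiation(format: "MPEG-4", essence_tracks: [essence_track])
    SampleInstantiation.new(format: format, essence_tracks: essence_tracks)
  end

  def asset(title: "Test Asset", instantiations: [instantiation])
    SampleAsset.new(title: title, instantiations: instantiations)
  end
end

# One call yields an Asset with an Instantiation and an Essence Track.
default_asset = TestFactories.asset

# Override only the part the test is about.
custom_asset = TestFactories.asset(
  title: "Radio Program",
  instantiations: [TestFactories.instantiation(format: "WAV", essence_tracks: [])]
)

puts default_asset.instantiations.first.essence_tracks.first.encoding
puts custom_asset.instantiations.first.format
```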

42 of 47

Should we do our own thing?

Fighting the Framework – When do we go around? And how?

Is Hyrax a Framework or an Application?

- Premise is a framework

- Behavior is an application

- Customizations seem expected

A ticking time bomb – Customizations might cause us more headache as we move to new versions of Hyrax.

43 of 47

The Future

44 of 47

AAPB Future with AMS

Continue Development

  • Hit the rest of our milestones
  • Migrate in February 2019
  • Continue to maintain and update after the grant ends to accommodate changing requirements

Roll Out to Contributing Organizations

  • Migrate their existing data for them
  • Training on new cataloging and batch import workflows
  • Promote as a viable option for small stations to manage their collections

45 of 47

Additional Future Development

WGBH Media Library and Archives received a five-year challenge grant from NEH, in part to implement and further develop AMS for WGBH’s needs.

Build additional features for WGBH’s instance of AMS

  • Limiting access by IP
  • Exploring using Valkyrie with Hyrax

Integrate with WGBH’s digital preservation metadata system Phydo

  • Share IDs
  • Allow for navigation from one system to the other
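Limiting access by IP is listed above only as a planned feature, so the following is a speculative sketch of one way it could work, using Ruby's standard `IPAddr` class. The `IpAccessPolicy` class is hypothetical, and the address ranges are reserved documentation ranges rather than real WGBH networks.

```ruby
require "ipaddr"

# Hypothetical sketch: allow a request only if its address falls inside
# one of the configured CIDR ranges.
class IpAccessPolicy
  def initialize(allowed_cidrs)
    @allowed = allowed_cidrs.map { |cidr| IPAddr.new(cidr) }
  end

  # Returns false for addresses outside every range and for malformed input.
  def allowed?(remote_ip)
    address = IPAddr.new(remote_ip)
    @allowed.any? { |range| range.include?(address) }
  rescue IPAddr::InvalidAddressError
    false
  end
end

# Documentation-only ranges used as placeholders.
policy = IpAccessPolicy.new(["192.0.2.0/24", "2001:db8::/32"])

puts policy.allowed?("192.0.2.17")
puts policy.allowed?("198.51.100.5")
```

In a Rails application a check like this would typically sit in a controller filter or middleware; where it belongs in AMS is not stated in the slides.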

46 of 47

americanarchive.org

@amarchivepub

facebook.com/amarchivepub

@amarchivepub

47 of 47

THANK YOU!

Casey Davis Kaufman, WGBH Media Library and Archives Associate Director / AAPB Project Manager

casey_davis-kaufman@wgbh.org

Sadie Roosa, WGBH Media Library and Archives Technical Project Manager / AMS 2.0 Product Owner

sadie_roosa@wgbh.org

Andrew Myers, WGBH Media Library and Archives Supervising Developer

andrew_myers@wgbh.org

Jason Corum, WGBH Media Library and Archives Web Developer

jason_corum@wgbh.org