1 of 21

Scholars Portal Dataverse Institutional Contacts Meeting

April 17, 2019

Amber Leahey

Kaitlin Newson

Meghan Goodchild

Grant Hurley

2 of 21

Agenda

  1. Official welcome to BCI (Amber)
  2. Dataverse-Archivematica integration (Grant)
  3. Latest release testing (Kaitlin)
  4. Plans for institutional login (Amber)
  5. Permissions update (Meghan)

3 of 21

Integrating Dataverse and Archivematica

Dataverse + Archivematica = Digital Preservation

4 of 21

Archivematica-Dataverse Integration Overview

  • What is Archivematica?
  • Features and limitations of the integration
  • Brief demo
  • Things to think about / next steps

5 of 21

What is Archivematica?

  • An open source workflow tool for processing digital objects for preservation and access in a standards-based way
    • Modeled on OAIS standard (ISO 14721), implements METS/PREMIS for metadata
    • Implements many single-function tools into a configurable workflow using a micro-services design approach
  • Takes a pre-structured dataset and performs a series of configurable, linked preservation-supporting functions:
    • Checksum verifications
    • Virus scans
    • File format identification, characterization and validation
    • Normalization for preservation and access
    • Creation of METS file that describes the digital objects, their relationships and associated events
    • Many more things!
  • Creates an Archival Information Package (AIP) for preservation and Dissemination Information Package (DIP) for access

6 of 21

Integration Project

  • OCUL sponsored work with Artefactual Systems Inc.
    • Phase 1 - Proof of Concept (2015)
    • Phase 2 - Public release in v. 1.8 (November 2018)
    • Archivematica 1.9 (March 2019) - fixed some outstanding issues
  • Sandbox now available for testing and feedback
  • Assumptions made:
    • Modular approach enabling selection/curation
    • Dataverse provides baseline preservation functions; Archivematica takes advantage of these features and adds further value by creating independent packages for long-term preservation and management
    • Preservers have access to a Dataverse instance, Archivematica instance(s) and storage
      • Access to restricted files requires an admin API key

7 of 21

Integration - Features

  • Queries Dataverse APIs to allow preserver to browse and select datasets
  • Retrieves selected Dataset and associated metadata
    • Includes DDI metadata, Dataverse checksums, citation-related text files, and tabular derivatives (if applicable)
  • Processes dataset in Archivematica according to:
    • Specific Dataverse-related micro-services
    • User-configured choices, file format normalization policies
  • Produces standards-compliant packages for storage. Includes:
    • METS file with DDI metadata & file section indicating relationships between objects
    • Originals and normalized copies of files
    • Verified Dataverse checksums for originals
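For readers who want a sense of what querying the Dataverse APIs can look like, here is a minimal sketch that only builds the HTTP requests and does not send them. It is not the integration's actual code. The endpoint paths and the `X-Dataverse-key` header reflect the Dataverse native API as generally documented and should be checked against the guide for your installed version; the function names, base URL, and alias are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def contents_request(base_url: str, alias: str, api_key: str) -> Request:
    """Build a request listing the contents of a dataverse (assumed endpoint)."""
    url = f"{base_url.rstrip('/')}/api/dataverses/{quote(alias, safe='')}/contents"
    return Request(url, headers={"X-Dataverse-key": api_key})


def ddi_export_request(base_url: str, persistent_id: str, api_key: str) -> Request:
    """Build a request for a dataset's DDI metadata export (assumed endpoint)."""
    query = urlencode({"exporter": "ddi", "persistentId": persistent_id})
    url = f"{base_url.rstrip('/')}/api/datasets/export?{query}"
    return Request(url, headers={"X-Dataverse-key": api_key})
```

Either request can be passed to `urllib.request.urlopen` to perform the call. Building the request separately keeps the URL construction easy to inspect and test without network access.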

[Workflow diagram: Select & transfer dataset, Process, Long-term storage]

8 of 21

Integration - Limitations

  • Limited set of DDI metadata (title, identifier, author(s), distributor, version, restrictions)
  • Transfer browser interface is not searchable
  • Not automated - but could be
  • No messaging back to Dataverse to indicate a dataset has been processed

9 of 21

Archivematica Integration - Demo

10 of 21

Things to Think About

  • The integration is a tool, not a service (yet!)
  • No framework yet in place for a shared preservation service
  • Institutions are still developing policies, procedures, capacity in this area
  • Preservation needs:
    • Active commitment from stewards and service providers
    • Money & time
    • Policies/procedures around preservation decision-making
    • Management and maintenance into the future

11 of 21

Archivematica Integration - Next steps

12 of 21

Latest release testing

  • Patch release with some customizations
  • This update includes:
    • Institutional affiliation list in the support form, which will appear in the subject line of emails sent to the support list
    • Change to user guides in navigation, which removes “About” and changes the “User Guide” link to go to the SP guide
    • BCI schools available in affiliations list, and bilingual names for other institutions where applicable
  • Testing has found a few issues which have been resolved. We have a couple of minor issues to fix before release.
  • No downtime anticipated as this is not a full version upgrade
  • Releasing next week - thanks to those who tested!

13 of 21

Plans for institutional login

Goal: Offer institutional login (single sign-on) and verify that users are affiliated with a participating member institution before they can deposit.

  1. Validate e-mail addresses of new users at Sign Up (using e-mail domain validations in form) & send confirmation e-mail to users
  2. Offer institutional login using Dataverse's Shibboleth login for selected schools (incremental additions over time)

****PLEASE NOTE: These are just mockups; the workflow has not been fully designed yet.

14 of 21

Sign Up

The e-mail address will be validated in the form against a list of known institutional e-mail domains (e.g. “...utoronto.ca”). If a user does not enter a valid address, an error message will remind them to use their institutional e-mail; users without an affiliated e-mail address will be directed to “contact us”.
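The domain check described here could be sketched roughly as follows. Since the workflow has not been designed yet, this is only an illustration under assumptions: the function name and the allow-list are hypothetical (only “utoronto.ca” comes from the slide), and subdomains of a listed domain are assumed to be acceptable.

```python
# Hypothetical allow-list; the real list of institutional domains is not given in the slides.
ALLOWED_DOMAINS = frozenset({"utoronto.ca", "example.edu"})


def is_institutional_email(address: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Return True if the address's domain is a listed domain or a subdomain of one."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        return False  # no "@", or nothing before/after it
    domain = domain.lower().rstrip(".")
    # Match the exact domain or a dot-separated subdomain, so "notutoronto.ca" is rejected.
    return any(domain == d or domain.endswith("." + d) for d in allowed)
```

Matching on a leading dot rather than a plain suffix avoids accepting look-alike domains that merely end with the same characters. A check like this only filters sign-ups; the confirmation e-mail in step 1 is still what proves the user controls the address.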

15 of 21

Login page

Users can choose to log in using Shibboleth Login (only for select schools)

16 of 21

If the institution has Shibboleth login, the user would be brought to the institution’s sign-in page

17 of 21

Pre-populated fields (Shibboleth login)

18 of 21

Permissions update

  • Enabled the inherit permissions feature so that a sub-dataverse inherits permissions from its parent dataverse.
  • SP Dataverse guide has been updated with explanation of permissions settings and our recommendations:
  • Custom “Trusted” role
    • Verified users would be able to add datasets, manage permissions on their datasets, and publish their own datasets
    • More investigation required due to inheriting permissions issue

19 of 21

20 of 21

Permissions - Other recommendations

  • Help users understand how to deposit within your dataverse by adding information
    • Tagline (Edit>Theme + Widgets)
    • Description (Edit>General Information)
  • Use demodv to test different workflows and ensure the settings work for your context

21 of 21

Questions/Comments?

  • Other feedback?
  • Set up regular meetings?