1 of 9

APTrust

Academic Preservation Trust

Library of Congress

Designing Storage Architectures

March 27-28, 2023

2 of 9

Who are we?

A consortium of 18+ universities and memory institutions across the US.

3 of 9

What do we do?

  • Distributed digital preservation
    • Multiple geographic regions
    • Multiple storage technologies
    • Multiple cloud storage providers
  • Regular fixity checks
  • Access controls
  • Auditing through PREMIS events
  • Safe, multi-approval deletion workflows
  • Large-scale data restorations
  • Restoration spot tests
  • Registry & REST API for managing deposited materials
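
A minimal sketch of what a regular fixity check with an audit record might look like. It assumes SHA-256 as the digest algorithm and a local file path; the function names and the event fields are hypothetical, loosely modeled on a PREMIS event, and are not APTrust's actual code or schema.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Stream the file in chunks so large objects never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_check(path, expected_sha256):
    """Compare the current digest with the one recorded at ingest.

    Returns a dict loosely modeled on a PREMIS event.
    The keys are illustrative only, not a real schema.
    """
    actual = sha256_of(path)
    passed = actual == expected_sha256.lower()
    return {
        "event_type": "fixity check",
        "date_time": datetime.now(timezone.utc).isoformat(),
        "outcome": "success" if passed else "failure",
        "outcome_detail": f"sha256:{actual}",
        "object": str(path),
    }
```

Recording the outcome of every check, including successes, is what makes the audit trail useful: a later failure can be bounded by the date of the last recorded success.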

4 of 9

Why us?

  • Long-term preservation solutions are hard to build and maintain in-house.
    • Lack of expertise
    • Lack of political will and funding from above
    • Staff turnover & knowledge continuity
  • Our members wanted a community-driven solution
    • Guided by & answering to depositors, not investors

5 of 9

What we built

  • A stable, highly available, auto-scaling suite of services
    • Microservices running in Docker containers: Ingest, Restoration, Fixity, Deletion, Search & Discovery
    • Predictable performance under unpredictable loads
    • Cost effective
    • Requires near-zero human intervention
  • It took three tries and 8.5 years to get all of this right

6 of 9

Our Challenges

  • We’re a small team
    • One admin-security-ops lead
    • One developer
  • Cost Control
    • Architecting services for efficiency
    • Rooting out and taming hidden fees
  • Regular fixity checks aren’t feasible for all storage solutions
    • Glacier
    • Glacier Deep Archive
    • Anything on tape or offline
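
One way to picture this constraint is a scheduler that only queues routine checks for copies on tiers that can be read back directly. The sketch below is an illustration under stated assumptions: the tier names, the 90-day interval, and the `StoredCopy` type are all hypothetical, not APTrust's configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: tiers where reading the bytes back is typically slow
# or billed per retrieval, so routine checks are skipped.
COLD_TIERS = {"glacier", "glacier-deep-archive", "tape", "offline"}

# Assumption: a 90-day interval, chosen only for illustration.
FIXITY_INTERVAL = timedelta(days=90)


@dataclass
class StoredCopy:
    identifier: str
    storage_tier: str
    last_fixity_check: datetime


def copies_due_for_fixity(copies, now):
    """Return copies on readable tiers whose last check is at least one interval old."""
    due = []
    for copy in copies:
        if copy.storage_tier.lower() in COLD_TIERS:
            continue  # cold tier: no routine read-back
        if now - copy.last_fixity_check >= FIXITY_INTERVAL:
            due.append(copy)
    return due
```

Copies on the skipped tiers still need some other form of assurance, which is where multiple copies across regions and providers come in.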

7 of 9

Our Challenges (continued)

  • Multi-Cloud
    • Negotiating direct connects to reduce data transfer fees
    • Adding new vendors requires more legal/administrative work than tech work

  • Data Privacy and Security
    • No uniform or permanent solution for PII: depositors must encrypt data before sending it
    • Identity and access management across a large suite of services
    • Minimizing our attack surface

8 of 9

Want to know more?

9 of 9

Image Credits

  • Disk Drive
  • Card Catalog
  • Guy Fawkes Mask: photo by Adnan Khan on Unsplash
  • Padlock on Computer: photo by FLY:D on Unsplash
  • Cloud: photo by C Dustin on Unsplash
  • Glacier
  • Tape Drive
  • Burning Cash: photo by Jp Valery on Unsplash