1 of 87

Introduction to Fedora

Before We Begin

  • Ensure you’re running Java 11
  • Go to: https://wiki.lyrasis.org/display/FF/2022-09+iPres+2022+-+Introduction+to+Fedora+6.0+Workshop
    • Download the latest version of Fedora from GitHub
    • Download Migration Validator *
    • Download the 2 sample images, or have any 2 images available on your machine
    • Download the Migration Example
  • Have Docker installed on your machine

* Only if you’re mildly comfortable on the command line and want to run some test scripts in terminal - this is optional. I will demo it.

2 of 87

Introduction to Fedora

Overview

Arran Griffith, Program Manager, LYRASIS

3 of 87

Navigate to your sandbox instance

    • username: fedoraAdmin
    • password: fedoraAdmin

NOTE to sandbox users: replace “localhost” with your sandbox IP address when copying example commands.

Or you can use Docker!

docker run -d --rm -p8080:8080 -p61616:61616 -p8181:8181 --name=fcrepo6 fcrepo/fcrepo:latest

docker exec -it fcrepo6 bash

4 of 87

Learning Outcomes

Understand what Fedora can do for you

Understand the key features of the software

Gain hands-on experience with Fedora

5 of 87

LYRASIS supports enduring access to our shared academic, scientific and cultural heritage through leadership in open technologies, content services, digital solutions and collaboration with archives, libraries, museums and knowledge communities worldwide.

6 of 87

Stores, preserves, and provides access to digital objects

Supports flexible content models for objects

Supports semantic relationships between objects

Supports millions of objects, both large and small

Interoperates with other applications and services

7 of 87

Why use Fedora?

Fedora is flexible: it can handle both simple and complex use cases

Content in Fedora is durable: Fedora supports long-term preservation

Fedora is standards-based

Fedora powers successful digital repository and DAM applications

Fedora is backed by a thriving community

8 of 87

What’s new with Fedora 6?

Enhanced digital preservation capabilities

Robust migration tooling

Built-in search

Improved performance and scale

Metrics collection and reporting

9 of 87

Fedora Front-Ends

Fedora is middleware

You can build a custom framework, or join a broader community:

10 of 87

Fedora Examples

11 of 87

Institutional Repository

https://arminda.whitman.edu

12 of 87

Research Data

https://deepblue.lib.umich.edu/data

13 of 87

Newspapers

https://www.lib.umd.edu/univarchives/student-newspapers

14 of 87

Special Collections

https://rgst.staatsbibliothek-berlin.de

https://youtu.be/v9ycBYjO0KU

15 of 87

Fundamental Concepts

16 of 87

COMPONENT of ECOSYSTEM

[Diagram: Fedora at the center, exposing binaries and an HTTP API. Surrounding components: IIIF server (?), site (?), profiles (?), data export applications, research data (?), Camel Toolbox, triplestore, Solr, and your own HTTP service.]

17 of 87

Web Resources

Everything is a web resource with an identifier (URI)

  • Container (metadata) resource
  • Binary (file) resource

Resources can have properties expressed as RDF triples

Resources can contain other resources

  • Containers or
  • Binaries

18 of 87

Structure

  • Hierarchical
  • Flat

19 of 87

Book Example: Hierarchical

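Book Collection (container)

  • Book 1 (container)
    • Page 1 (container)
      • Page1.tiff (binary)
      • Page1.jpg (binary)
    • Page 2 (container)
      • Page2.tiff (binary)
      • Page2.jpg (binary)
  • Book 2 (container)
    • Page 1 (container)
      • Page1.tiff (binary)
      • Page1.jpg (binary)
    • Page 2 (container)
      • Page2.tiff (binary)
      • Page2.jpg (binary)

Example URL:

https://host/fedora/BookCollection/Book1/Page1/Page1.jpg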

20 of 87

Book Example: Flat

Containers (all at the top level, linked with pcdm:hasMember):

  • Book Collection
  • Book 1, Book 2
  • Page 1, Page 2 (for each book)

Binaries (each stored in its page container):

  • Page1.tiff, Page1.jpg
  • Page2.tiff, Page2.jpg

Example URL:

https://host/fedora/Page1/Page1.jpg

21 of 87

RDF Properties

22 of 87

Core Features

23 of 87

Core Services and Standards

      Service                      Standard
  01  Create/Read/Update/Delete    Linked Data Platform
  02  Versioning                   Memento
  03  Authorization                Solid, WebAC
  04  Fixity                       Digest, Want-Digest Header
  05  Persistence                  Oxford Common File Layout (OCFL)
  06  Messaging                    Activity Streams 2.0

24 of 87

Oxford Common File Layout

A simple, non-proprietary, specified, open-standards approach to the layout of preservation persistence.

25 of 87

OCFL Offers...

  • Parsability
  • Robustness
  • Versioning
  • Storage Diversity
  • Completeness

26 of 87

Benefits of Fedora + OCFL

Application-independent persistence

Fewer migrations in the future

27 of 87

Standards-based pattern for creating and managing resources

Basic tools for modeling content

28 of 87

HTML Interface Cheatsheet

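[Screenshot: the Fedora HTML interface, annotated with the HTTP operations it maps to: POST (with the Slug header), PATCH, GET, and DELETE on the resource URI.]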

29 of 87

Navigate to your sandbox instance

    • username: fedoraAdmin
    • password: fedoraAdmin

NOTE to sandbox users: replace “localhost” with your sandbox IP address when copying example commands.

Or you can use Docker!

docker run -d --rm -p8080:8080 -p61616:61616 -p8181:8181 --name=fcrepo6 fcrepo/fcrepo:latest

docker exec -it fcrepo6 bash

30 of 87

Step 1a: Resource Creation (POST)

  • Go to http://localhost:8080/fcrepo/rest
  • In “Type” select field choose “basic container” (default)
  • In “Identifier” text field enter “image-collection”
  • Press “add” button

$ curl -i -X POST -ufedoraAdmin:fedoraAdmin -H "Slug: image-collection" http://localhost:8080/fcrepo/rest

31 of 87

Step 1b: Resource Creation (POST)

  • You will be redirected to http://localhost:8080/fcrepo/rest/image-collection/
  • In “Type” select field choose “basic container” (default)
  • In “Identifier” text field enter “andromeda-galaxy”
  • Press “add” button

$ curl -i -X POST -ufedoraAdmin:fedoraAdmin -H "Slug: andromeda-galaxy" http://localhost:8080/fcrepo/rest/image-collection

32 of 87

Step 2: Resource Retrieval (GET)

  • Each time you were redirected after creating a container, your browser issued a GET.
  • A resource is retrieved directly at its path; the response contains both user-supplied and server-managed RDF triples.

$ curl -i -H "Accept:text/turtle" -ufedoraAdmin:fedoraAdmin http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy
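Because the response mixes user-supplied and server-managed triples, it can help to ask for the user-supplied ones only. The sketch below is one way to do that with a `Prefer` header. It assumes Fedora honors the `ServerManaged` omit URI shown in the code (worth confirming against the documentation for your version); the function names are hypothetical helpers, not part of Fedora.

```shell
# Base URL and credentials as used in the workshop; override for a sandbox.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"
FEDORA_AUTH="${FEDORA_AUTH:-fedoraAdmin:fedoraAdmin}"

# Assumption: the Prefer URI that identifies server-managed triples.
SERVER_MANAGED="http://fedora.info/definitions/v4/repository#ServerManaged"

# Build the Prefer header that asks the server to leave those triples out.
prefer_omit_server_managed() {
  printf 'Prefer: return=representation; omit="%s"' "$SERVER_MANAGED"
}

# GET a resource as Turtle without server-managed triples (hypothetical helper).
# $1 is the path below the REST root, e.g. image-collection/andromeda-galaxy
get_user_triples() {
  curl -s -u "$FEDORA_AUTH" \
       -H "Accept: text/turtle" \
       -H "$(prefer_omit_server_managed)" \
       "$FEDORA_BASE/$1"
}
```

Usage, once the functions are loaded into your shell: `get_user_triples image-collection/andromeda-galaxy`.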

33 of 87

Binaries

We’ll use an image for (most) examples of binaries

Download from: https://go.nasa.gov/2Hc15jG

(Save as andromeda.jpg)

Andromeda Galaxy

Credit: NASA

34 of 87

Step 3: Binary Resource Creation (POST)

  • Go to http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy
  • In “Type” select field choose “binary”
  • In “Identifier” text field enter “andromeda.jpg”
  • In “File” choose the andromeda.jpg image
  • Press “add” button

$ curl -i -X POST -ufedoraAdmin:fedoraAdmin -H"Content-Type: image/jpeg" -H"Slug: andromeda.jpg" --data-binary "@andromeda.jpg" http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy

35 of 87

Step 4: Binary Resource Retrieval (GET)

  • You will be redirected to http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy/andromeda.jpg/fcr:metadata
  • Notice the fcr:metadata part!
    • Image content is at “andromeda.jpg”
    • Its metadata (RDF properties) is at a virtual subpath named /fcr:metadata
    • The fcr:metadata part is an implementation detail, but the fact that the RDF that describes a binary has a URI and can be linked to is important.

36 of 87

Step 5: Update RDF (PATCH)

  • Go to http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy
  • We will add a “dc:title” property using “Update Properties”

DELETE {}
INSERT { <> dc:title "Andromeda Galaxy" }
WHERE {}

  • Press “Update”

$ curl -i -XPATCH -H"Content-Type:application/sparql-update" -ufedoraAdmin:fedoraAdmin -d "INSERT DATA {<> <http://purl.org/dc/elements/1.1/title> 'Andromeda Galaxy'}" http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy/andromeda.jpg/fcr:metadata

37 of 87

Our updated RDF Properties from step 5.

38 of 87

Last step: Delete a resource (DELETE)

  • Go to http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy/andromeda.jpg
  • Press “DELETE” (the red one)
  • Reload http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy/andromeda.jpg

What do you see?

$ curl -i -XDELETE -ufedoraAdmin:fedoraAdmin http://localhost:8080/fcrepo/rest/image-collection/andromeda-galaxy/andromeda.jpg

39 of 87

Departed

Fedora creates a tombstone resource at the URL “original/path/fcr:tombstone”, in this case:

“image-collection/andromeda-galaxy/andromeda.jpg/fcr:tombstone”

To recreate a resource at that same path, you need to delete the tombstone placeholder first

Discovered tombstone resource at /image-collection/andromeda-galaxy/andromeda.jpg, departed: 2019-08-21T17:16:17.047Z
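Deleting the tombstone is one more DELETE, this time against the fcr:tombstone URL described above. A minimal sketch, with hypothetical helper names and the workshop's default base URL and credentials:

```shell
# Base URL and credentials as used in the workshop; override for a sandbox.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"
FEDORA_AUTH="${FEDORA_AUTH:-fedoraAdmin:fedoraAdmin}"

# Print the tombstone URL for a deleted resource path (leading slash optional).
tombstone_url() {
  printf '%s/%s/fcr:tombstone' "$FEDORA_BASE" "${1#/}"
}

# Remove the tombstone so the original path can be reused (hypothetical helper).
delete_tombstone() {
  curl -i -X DELETE -u "$FEDORA_AUTH" "$(tombstone_url "$1")"
}
```

Usage: `delete_tombstone image-collection/andromeda-galaxy/andromeda.jpg`, after which a POST or PUT can recreate andromeda.jpg at the same path.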

40 of 87

Versioning

41 of 87

Versioning

Versions: retrieved via Memento protocol

Versions: created automatically or on demand via the REST-API

42 of 87

What is a “version”?

  • Web Resource (so it has a URI, can be linked to, etc)
  • Contains the contents of a resource at a given point in time
  • Immutable
  • Timestamped (to the second)
  • Discoverable (i.e. does this resource have any versions?)

Icon made by ultimatearm from Flaticon

43 of 87

Key Memento Concepts

  • Memento - versioned body of the original resource
  • Time Gate - an endpoint supporting time-based queries for Mementos
  • Time Map - list of mementos
  • Original Resource - the resource that was versioned

44 of 87

Creating a Version

Create a resource in Fedora:

Download the Old California Map and create a binary resource in the image-collection.

http://localhost:8080/fcrepo/rest/image-collection/map

45 of 87

Modify the Resource

Update the metadata of http://localhost:8080/fcrepo/rest/image-collection/map

DELETE {}

INSERT { <> dc:title "Map of California" }

WHERE {}

46 of 87

View the Version History

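View versions via HTML Interface

A POST to fcr:versions creates a new version on demand; a GET on the same URL lists the existing versions.

$ curl -i -X POST -ufedoraAdmin:fedoraAdmin http://localhost:8080/fcrepo/rest/image-collection/map/fcr:versions

$ curl -i -ufedoraAdmin:fedoraAdmin http://localhost:8080/fcrepo/rest/image-collection/map/fcr:versions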

47 of 87

Advanced Versioning

You can do many more things with versions:

  • Get a list of all versions
  • Find the closest version to a specific date
  • And more!

https://wiki.lyrasis.org/display/FEDORA6x/Versioning
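The two operations listed above can be sketched with curl. This assumes Fedora follows the Memento protocol as described in the earlier slides: the fcr:versions URL acts as the Time Map, and the resource itself answers datetime negotiation through the `Accept-Datetime` header. The helper names are hypothetical, and `date -d @SECONDS` is GNU date syntax.

```shell
# Base URL and credentials as used in the workshop; override for a sandbox.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"
FEDORA_AUTH="${FEDORA_AUTH:-fedoraAdmin:fedoraAdmin}"

# Format a Unix timestamp as the RFC 1123 date used by Accept-Datetime.
# Assumes GNU date; LC_ALL=C keeps day and month names in English.
rfc1123_date() {
  LC_ALL=C date -u -d "@$1" '+%a, %d %b %Y %H:%M:%S GMT'
}

# List all versions of a resource (the Time Map) in link format.
list_versions() {
  curl -s -u "$FEDORA_AUTH" -H "Accept: application/link-format" \
       "$FEDORA_BASE/$1/fcr:versions"
}

# Ask for the version closest to a timestamp; the response headers are
# expected to point at the matching Memento.
# $1 = resource path, $2 = Unix timestamp
closest_version() {
  curl -s -I -u "$FEDORA_AUTH" \
       -H "Accept-Datetime: $(rfc1123_date "$2")" \
       "$FEDORA_BASE/$1"
}
```

Usage: `list_versions image-collection/map`, or `closest_version image-collection/map "$(date +%s)"` to find the version nearest to now.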

48 of 87

Bundle multiple changes into a single version

With Transactions

  • Start Transaction

  • Take the value of the Location header and use it to make updates

  • Commit transaction

$ curl -i -u fedoraAdmin:fedoraAdmin -X POST "http://localhost:8080/fcrepo/rest/fcr:tx"

$ curl -H"Atomic-ID: <Location>" -X PUT "http://localhost:8080/fcrepo/rest/test1234" -u fedoraAdmin:fedoraAdmin

$ curl <Location> -X PUT -u fedoraAdmin:fedoraAdmin
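The three steps above can be chained in a script by capturing the Location header instead of copying it by hand. This is a sketch rather than a hardened client: the helper names and the second resource (test5678) are hypothetical, and curl is run without `-f`, so HTTP error responses are not treated as failures.

```shell
# Base URL and credentials as used in the workshop; override for a sandbox.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"
FEDORA_AUTH="${FEDORA_AUTH:-fedoraAdmin:fedoraAdmin}"

# Read raw HTTP response headers on stdin and print the Location value.
extract_location() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Start a transaction, make two updates inside it, then commit.
run_in_transaction() {
  # Step 1: start the transaction and keep its URI.
  tx="$(curl -s -i -u "$FEDORA_AUTH" -X POST "$FEDORA_BASE/fcr:tx" | extract_location)"
  [ -n "$tx" ] || { echo "no transaction URI returned" >&2; return 1; }

  # Step 2: every request that carries Atomic-ID joins the transaction.
  curl -s -u "$FEDORA_AUTH" -H "Atomic-ID: $tx" -X PUT "$FEDORA_BASE/test1234" || return 1
  curl -s -u "$FEDORA_AUTH" -H "Atomic-ID: $tx" -X PUT "$FEDORA_BASE/test5678" || return 1

  # Step 3: commit with a PUT to the transaction URI.
  curl -s -u "$FEDORA_AUTH" -X PUT "$tx"
}
```

Neither resource should be visible outside the transaction until the final PUT succeeds.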

49 of 87

Fixity

  • Over time, digital objects can become corrupt
  • Fixity checks help preserve digital objects by verifying their integrity
  • Fixity works as part of Fedora’s preservation tooling to ensure the long term stability of your resources.

50 of 87

Fixity continued...

On ingest, Fedora can verify a user-provided digest against the calculated value

On demand, a digest can be recalculated and compared at any time via a REST-API request
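For the ingest case, the client computes a checksum locally and sends it with the upload, so Fedora can reject a file that was damaged in transit. The sketch below assumes Fedora accepts a `Digest` header of the form `sha-256=<hex digest>`; check the exact algorithm names against the Fedora documentation for your version. The helper names are hypothetical, and `sha256sum` is the GNU coreutils tool.

```shell
# Base URL and credentials as used in the workshop; override for a sandbox.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"
FEDORA_AUTH="${FEDORA_AUTH:-fedoraAdmin:fedoraAdmin}"

# Build a Digest header from a local file's SHA-256 checksum.
# Assumption: Fedora expects the hex form of the digest.
digest_header() {
  printf 'Digest: sha-256=%s' "$(sha256sum < "$1" | cut -d' ' -f1)"
}

# Upload a binary together with its digest (hypothetical helper).
# $1 = local file, $2 = slug, $3 = MIME type, $4 = parent container path
ingest_with_digest() {
  curl -i -X POST -u "$FEDORA_AUTH" \
       -H "Content-Type: $3" \
       -H "Slug: $2" \
       -H "$(digest_header "$1")" \
       --data-binary "@$1" \
       "$FEDORA_BASE/$4"
}
```

Usage: `ingest_with_digest andromeda.jpg andromeda.jpg image/jpeg image-collection/andromeda-galaxy`. If the digest does not match what Fedora calculates, the request should fail instead of storing the file.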

51 of 87

Limitations of Fixity Checking

  • It’s a check. Plan for what to do should you encounter a bad binary!
    • Versioning
    • Alternate Backups
  • Only works on Fedora-managed content
    • Doesn’t retrieve external content
  • Digest algorithms implemented: sha1, sha256, sha512, and md5

52 of 87

Fixity: Hands On

Check the fixity of the California map

  • Go to http://localhost:8080/fcrepo/rest/image-collection/map/fcr:metadata
  • Click the button labeled Fixity

$ curl -i -u fedoraAdmin:fedoraAdmin -X GET http://localhost:8080/fcrepo/rest/image-collection/map/fcr:fixity

$ curl -I -H "Want-Digest: MD5" -u fedoraAdmin:fedoraAdmin http://localhost:8080/fcrepo/rest/image-collection/map

53 of 87

Simple Search

54 of 87

Simple Search

New resources are automatically indexed

Search using the following fields:

  • fedora_id*
  • rdf_type*
  • mime_type*
  • content_size
  • created_date
  • updated_date

* Wildcards (*) supported
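A search over these fields can also be run from the command line. The sketch below assumes the search endpoint is `fcr:search` under the REST root and that it takes a `condition` query parameter of the form field, operator, value; confirm both against the Fedora 6 search documentation. The helper names are hypothetical.

```shell
# Base URL and credentials as used in the workshop; override for a sandbox.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"
FEDORA_AUTH="${FEDORA_AUTH:-fedoraAdmin:fedoraAdmin}"

# Build one condition parameter from a field, an operator, and a value.
search_condition() {
  printf 'condition=%s%s%s' "$1" "$2" "$3"
}

# Run a search; curl URL-encodes the condition value (hypothetical helper).
simple_search() {
  curl -s -G -u "$FEDORA_AUTH" \
       --data-urlencode "$(search_condition "$1" "$2" "$3")" \
       "$FEDORA_BASE/fcr:search"
}
```

Usage: `simple_search mime_type = 'image/*'` looks for binaries with an image MIME type. Quote the value so the shell does not expand the wildcard.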

55 of 87

Simple Search

56 of 87

Stats

57 of 87

Stats API

curl -u fedoraAdmin:fedoraAdmin http://localhost:8080/rest/fcr:stats

curl -u fedoraAdmin:fedoraAdmin http://localhost:8080/rest/fcr:stats/binaries

curl -u fedoraAdmin:fedoraAdmin http://localhost:8080/rest/fcr:stats | jq

curl -u fedoraAdmin:fedoraAdmin http://localhost:8080/rest/fcr:stats/binaries | jq

curl -u fedoraAdmin:fedoraAdmin http://localhost:8080/rest/fcr:stats/rdf-types | jq

58 of 87

Metrics

  • Turn on metrics collection in Fedora (-Dfcrepo.metrics.enabled=true)
  • Run Prometheus collector to aggregate metrics
  • Run Grafana to visualize metrics

Documentation

59 of 87

60 of 87

61 of 87

External Services Overview

62 of 87

COMPONENT of ECOSYSTEM

[Diagram: Fedora at the center, exposing binaries and an HTTP API. Surrounding components: IIIF server (?), site (?), profiles (?), data export applications, research data (?), Camel Toolbox, triplestore, Solr, and your own HTTP service.]

63 of 87

Message Based Integrations

External services listen for messages from Fedora

Relevant messages trigger services

This system is scalable and fault-tolerant

64 of 87

External - Indexing

Index repository content for search

Indexing is configurable - could be based on any property

Solr and Elasticsearch have been tested

65 of 87

The Fedora & The Camel Toolbox

66 of 87

What is Camel Toolbox?

Camel is a platform for easily implementing a host of Enterprise Messaging Patterns.

Camel Toolbox (CTB) is a Swiss Army knife of Fedora-related services

67 of 87

What can Camel Toolbox do?

  • Make metadata searchable in Solr
  • Push metadata to a SPARQL-compliant triplestore
  • Perform Fixity Checks
  • Store audit info in a triplestore.
  • Integrate with any HTTP endpoint

68 of 87

Demo: Solr and Triplestore Indexing

  • Spin up Camel Toolbox with Docker Compose

  • Solr Indexing
  • Triple store indexing

$ git clone git@github.com:fcrepo-exts/fcrepo-camel-toolbox.git

$ cd fcrepo-camel-toolbox/docker-compose

$ docker compose up -d

# in separate terminal windows - view output from each container

$ docker logs -f docker-compose_fcrepo_1

$ docker logs -f docker-compose_fuseki_1

$ docker logs -f docker-compose_solr_1

$ docker logs -f docker-compose_camel-toolbox_1

69 of 87

Camel Toolbox Demo Video

70 of 87

Migrating to Fedora 6

71 of 87

IMLS Grant-funded Work

Fedora Migration Paths and Tools: A Pilot Project

  • $249,859 IMLS Funded grant over 18 months
    • No-cost 18 month extension granted in summer 2021
  • Focused on Fedora 3.x to 6.x migrations
  • 2 pilot partner institutions:
    • University of Virginia
    • Whitman College

72 of 87

Resources

Migration-utils: https://github.com/fcrepo-exts/migration-utils

  • A framework to support migration of data from Fedora 3 to Fedora 6 repositories

Migration Validator: https://github.com/fcrepo-exts/fcrepo-migration-validator

  • A command-line tool for validating migrations of Fedora 3 datasets to Fedora 6

Metadata Remediation Guide: https://docs.google.com/document/d/1AM8TMb0H0Q5RMCp96YHMXOROx--KpWDV8Q8rOEf1z-Q/edit

  • Templates, tools, best practices and helpful links for cleaning up and analyzing metadata

Migration Guide: https://wiki.lyrasis.org/display/FEDORA6x/Migrate+to+Fedora+6

  • Step-by-step instructions for migrating to Fedora 6

73 of 87

Migration Routes to Fedora 6

[Diagram: migration routes from Fedora 3 (F3), Fedora 4 (F4), and Fedora 5 (F5) to Fedora 6 (F6)]

74 of 87

Migration-utils

migration-utils takes command-line arguments

For an initial test migration:

  • --migration-type: FEDORA_OCFL | PLAIN_OCFL
  • --source-type: legacy | akubra | exported
  • --limit: stop after processing X resources
  • --target-dir: directory for exported objects
  • --objects-dir: Fedora 3 objects directory
  • --datastreams-dir: Fedora 3 datastreams directory
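The options above can be combined into a small wrapper for a first, limited run. This is a sketch under assumptions: the directory defaults are placeholders for your own Fedora 3 layout, the jar name is the one used in this workshop's example, and `DRY_RUN` is a convenience of this script (not a migration-utils option) that prints the command instead of running it.

```shell
# Jar name from the workshop example; adjust to the version you downloaded.
MIGRATION_JAR="${MIGRATION_JAR:-migration-utils-6.2.0-SNAPSHOT-driver.jar}"

# Placeholder locations (assumptions); point these at your own data.
F3_DATASTREAMS="${F3_DATASTREAMS:-source/datastreams}"
F3_OBJECTS="${F3_OBJECTS:-source/objects}"
TARGET_DIR="${TARGET_DIR:-target}"

# Run a test migration that stops after $1 resources.
# With DRY_RUN set, the command is echoed rather than executed.
run_test_migration() {
  ${DRY_RUN:+echo} java -jar "$MIGRATION_JAR" \
    --migration-type=FEDORA_OCFL \
    --source-type=legacy \
    --limit="$1" \
    --datastreams-dir="$F3_DATASTREAMS" \
    --objects-dir="$F3_OBJECTS" \
    --target-dir="$TARGET_DIR"
}
```

Usage: `DRY_RUN=1 run_test_migration 10` shows the command for a 10-resource trial; drop `DRY_RUN=1` to run it.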

75 of 87

Migration-utils - Common Configuration Options

7. --exported-dir: directory of exported Fedora 3 content

8. --resume: resume from last migrated object

9. --continue-on-error: skip objects with errors

10. --username: Username to associate with resources

76 of 87

Migration-utils - Common Configuration Options

11. --pid-file: a list of PIDs to migrate

12. --extensions: add file extensions to migrated files

13. --f3hostname: hostname to replace placeholders

14. --algorithm: Checksum algorithm for OCFL objects

15. --no-checksum-validation: Disable datastream checksum validation

16. --enable-metrics: Enable Prometheus metrics

77 of 87

Example test migration

java -jar migration-utils-6.2.0-SNAPSHOT-driver.jar \
  --source-type=legacy \
  --datastreams-dir=brown/source/brown-subset/repoarchive/datastreams_2019/2019 \
  --objects-dir=brown/source/brown-subset/repostore/data_2019/objects/2019 \
  --target-dir=brown/fedora_home \
  --working-dir=brown/work

78 of 87

Example test migration

java -jar migration-utils-6.2.0-SNAPSHOT-driver.jar -t legacy -d brown/source/brown-subset/repoarchive/datastreams_2019/2019 -o brown/source/brown-subset/repostore/data_2019/objects/2019 -a brown/fedora_home -i brown/work

  • -t: Source Type
  • -d: Datastreams Directory
  • -o: Objects Directory
  • -a: Target Directory
  • -i: Working Directory

79 of 87

Migrate a sample data set

Migration Sample Data: Migration Example

  • Prepare source data set
  • Perform migration
  • Stand up new Fedora on top of migrated data

Migration Documentation: https://wiki.lyrasis.org/display/FEDORA6x/Migrate+to+Fedora+6

80 of 87

Migration Validator

Validations include:

  • Object metadata
  • List of datastreams
  • Datastream size
  • Datastream checksum (optional)
  • Object count (optional)
  • Number of versions

81 of 87

Migration Validator - Example

General usage:

java -jar fcrepo-migration-validator-1.0.0-driver.jar [cli options | --help]

82 of 87

Example Migration Validation

java -jar fcrepo-migration-validator-1.1.0-driver.jar \
  --source-type=legacy \
  --datastreams-dir=brown/source/brown-subset/repoarchive/datastreams_2019/2019 \
  --objects-dir=brown/source/brown-subset/repostore/data_2019/objects/2019 \
  --ocfl-root-dir=brown/fedora_home/data/ocfl-root \
  --results-dir=/validation-results

83 of 87

Easy as that!

84 of 87

Thank You!

Arran Griffith - Fedora Program Manager, LYRASIS

arran.griffith@lyrasis.org

Ways to Get Involved:

  • Weekly Tech Call (Thursdays @11am Eastern)
  • Mailing Lists
  • Slack

85 of 87

86 of 87

Resources & Helpful Links

About: https://wiki.lyrasis.org/display/FF/

Communication Channels: https://wiki.lyrasis.org/display/FF/Mailing+Lists+etc

  • Slack, Mailing Lists, Newsletter

IMLS Grant Details: https://wiki.lyrasis.org/display/FF/Fedora+Migration+Paths+and+Tools

Oxford Common File Layout Community: https://ocfl.io

Get Fedora:

Download the latest version: https://github.com/fcrepo/fcrepo/releases

Fedora User Docs: https://wiki.lyrasis.org/display/FEDORA6x

Migration Support: https://wiki.lyrasis.org/display/FEDORA6x/Migrate+to+Fedora+6

Camel Toolbox: https://github.com/fcrepo-exts/fcrepo-camel-toolbox

BECOME A MEMBER

87 of 87

Wrap Up - Questions?