1 of 16

Blue Core

A Community-Operated Shared BIBFRAME Datastore

AKA LD4P3 Phase 2

Tom Cramer, Stanford University

Simeon Warner, Cornell University

12 December 2023

CNI

2 of 16

Conclusion

  • BIBFRAME and linked data are the future of bibliographic description
  • The LD4P grants & new developments in our systems environment have brought us to an inflection point
  • We are here to introduce Blue Core, a project to develop a truly shared bibliographic store of linked data, supported by rich tooling and connections across the ecosystem.

3 of 16

History of LD4 Grants

  • LD4L (2014-2016) – Exploration, use cases & ontology work
  • LD4L Labs (2016-2018) – Tooling for libraries & more ontology
  • LD4P (2016-2018) – Sketching metadata workflows & ontology extensions
  • LD4P2 (2018-2020) – Sandboxes: cataloging, conversion & authority tooling
  • LD4P3 (2020-present) – “Closing the Loop”

4 of 16

Progress Made

  • BIBFRAME 2.x
  • Sinopia (linked data cataloging tool)
  • QA (authority lookup & cataloging aid)
  • ShareVDE converter & PCC Data Pool
  • Discovery (knowledge panels & browse interfaces)
  • PCC engagement (Program for Cooperative Cataloging)
  • LD4 Organization & annual conference

The world enriched with library data; libraries enriched with the world’s data.

5 of 16

But…

  • Why does every institution need to mint its own set of URIs?
  • Is it realistic that every institution run its own RDF store?
  • Are we mindlessly recreating the MARC environment by copying metadata to edit it locally?
  • Isn’t the power of linked data in authoritative, shared URIs with enriching links?
  • Is this an opportunity to create a 21st-century metadata ecosystem with new models (BIBFRAME), new tooling, new workflows, and new efficiencies and affordances for discovery?

6 of 16

MARC - A Spectacularly Successful 58-Year-Old Standard

Development started by Henriette Avram at LC in 1965; US standard by 1971, MARC 21 in 1999, MARCXML in 2002. We all rely on it!

Fireworks by Warren Tobias: https://commons.wikimedia.org/wiki/File:Fireworks_%28Unsplash%29.jpg

7 of 16

BUT… MARC combines in one record information about different entities that should be managed separately.

AND… MARC doesn’t use identifiers well for linking or for controlled vocabularies. Too many strings. (Linky-MARC approaches are a stop-gap.)

8 of 16

BIBFRAME for New Growth!

  1. Properly separates key entities
  2. Designed to support migration from and to MARC, necessary for the inevitably lengthy transition
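
To illustrate what "properly separates key entities" can mean in practice, here is a minimal sketch that splits one flat, MARC-like record into a BIBFRAME Work and a BIBFRAME Instance linked by URI. It is an illustration under stated assumptions, not Blue Core's conversion logic: the `FlatRecord` fields, the `https://example.org/` base URI, and the `ex:` predicates are hypothetical and heavily simplified. Only the BIBFRAME namespace, the `Work` and `Instance` classes, and the `instanceOf` property come from the BIBFRAME ontology.

```python
from dataclasses import dataclass

# BIBFRAME ontology namespace and rdf:type (real vocabulary URIs).
BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI for minted entities (an assumption for this sketch).
BASE = "https://example.org/"


@dataclass(frozen=True)
class FlatRecord:
    """A hypothetical flat record mixing Work-level and Instance-level data."""
    record_id: str
    title: str
    creator: str
    publisher: str
    year: str


def split_record(rec: FlatRecord) -> list[tuple[str, str, str]]:
    """Return (subject, predicate, object) triples for a Work and an Instance.

    The "ex:" predicates are simplified placeholders, not real BIBFRAME
    properties; BIBFRAME models titles and provision activity with
    richer structures.
    """
    work = f"{BASE}works/{rec.record_id}"
    instance = f"{BASE}instances/{rec.record_id}"
    return [
        # Work: the conceptual content, shared by every edition.
        (work, RDF_TYPE, BF + "Work"),
        (work, "ex:title", rec.title),
        (work, "ex:creator", rec.creator),
        # Instance: one published embodiment, pointing back at its Work.
        (instance, RDF_TYPE, BF + "Instance"),
        (instance, BF + "instanceOf", work),
        (instance, "ex:publisher", rec.publisher),
        (instance, "ex:year", rec.year),
    ]


record = FlatRecord("r1", "Example Title", "Example, Author", "Example Press", "2000")
for subject, predicate, obj in split_record(record):
    print(subject, predicate, obj)
```

Because the Instance refers to the Work by URI rather than repeating its description, a second edition would add only a new Instance and leave the Work untouched.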

Cherry blossoms at Olin Library, Cornell University.

9 of 16

BIBFRAME: Share and Reuse

Linked Data Authorities

Shared Bibliographic Description

Minimal Local Data
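
As a rough sketch of how "shared bibliographic description" and "minimal local data" could fit together, the example below keeps the description in one shared store keyed by URI, while a library holds only a reference to that URI plus its own holdings details. Everything here is an assumption made for illustration: the store layout, the field names, the URI, and the `discovery_view` helper are hypothetical and do not describe Blue Core's actual design.

```python
# Hypothetical shared store: one description per Instance URI, maintained in common.
SHARED_STORE = {
    "https://example.org/instances/i1": {
        "title": "Example Title",
        "publisher": "Example Press",
    },
}

# Hypothetical local record: no copied description, only a link plus local facts.
LOCAL_HOLDING = {
    "instance": "https://example.org/instances/i1",
    "call_number": "Z699.35 .E93",
    "location": "Main Stacks",
}


def discovery_view(shared_store: dict, local_record: dict) -> dict:
    """Combine the shared description with local holdings at display time."""
    shared = shared_store[local_record["instance"]]
    return {
        **shared,
        "call_number": local_record["call_number"],
        "location": local_record["location"],
    }


print(discovery_view(SHARED_STORE, LOCAL_HOLDING))
```

In this arrangement a correction made once in the shared description appears in every library's discovery view, since no institution holds its own copy to fall out of sync.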

10 of 16

What are we building?

  • A shared linked datastore for BIBFRAME Works, Instances, and more
  • Maintained and operated by a consortium of libraries that is seeded by Cornell, LC, Penn and Stanford, and structured to grow
  • That integrates with bibliographic systems and providers
    • Like EBSCO, Ex Libris, OCLC, ShareVDE, et al.
  • That breaks the “institutional copy” model of cataloging and moves to shared data norms
  • And locks bibliographic data open for reuse, accessible to other institutions

11 of 16

Why are we building this?

  • Avoids duplication of effort
  • Allows linked data to work at scale
  • Allows libraries to better focus their efforts towards making richer knowledge linkages
  • Enables data use by and (longer term) contributions from small and diverse libraries
  • Lays the foundation for a new future where a diverse set of institutions work together to incorporate and apply linked open data cataloging
  • Avoids vendor lock-in associated with data, systems, or services

12 of 16

Blue Core

Architecture

Traditional workflows in and out, different store

Local discovery at each institution driven from shared store data

Entity-focused RDF cataloging

13 of 16

Additional Considerations

  1. Workflow & Dataflow
    1. What do cataloging workflows look like with Blue Core?
    2. Integrations with ILSes, editors, et al.
    3. Dataflow, system of record, updates across the ecosystem
  2. Organizational Design
    • How to define cataloging norms for a truly shared pool
    • Governance
    • Resourcing for build out & sustainability phases
  3. Project Details
    • Participation from catalogers, libraries, commercial sector, et al.
    • Project structure, plan and timeline

14 of 16

Approach

  • Iterative
    • Success in linked data initiatives often looks different from what was originally envisioned
    • Developing an ecosystem, not a single system
    • Build in time to assess and adapt to findings
  • Incrementalist
    • Every phase a baby step
    • Each with its own value
    • Minimize risk
  • Prototype and build the tooling as we go
    • Cataloging can’t progress if work is only conceptual

15 of 16

Timeline

2023 – Conceptualize

2024 – Plan & prototype

2025 – Move to implementation

16 of 16

Discussion