MSCWG Requirements Specification

Alex Ball   Keith Jeffery   Rebecca Koskela

The Metadata Standards Catalog Working Group (MSCWG) plans to develop the next generation of the Metadata Standards Directory. In broad outline, we want the information held by the Catalog to be more structured and granular, and we want the Catalog to provide an API so that automated tools can query the information and receive it in a form they can parse and act upon. The Catalog will also be used by the Metadata Interest Group (MIG) to store mappings to metadata packages; these packages are intended to ease the task of comparing and translating between different standards.

This document refines these broad aspirations into a more precise set of requirements, based on the set of User Stories contributed by members of the MSCWG and its predecessor, the Metadata Standards Directory Working Group (MSDWG). The User Stories themselves are not repeated in this document, but are referred to by their identifier (e.g. ‘A-ID’).

Each requirement consists of the following information:

  • Requirement: The requirement itself. Some requirements have been labelled according to the aspect of the Catalog to which they relate.
  • Validation: The User Story ID from which the requirement was derived. If the requirement is based on preserving functionality from the Metadata Standards Directory rather than a User Story, the code ‘MSD’ is used in place of a User Story ID.
  • Prerequisite: An implication that the requirement has for the internal design of the Catalog.
      • Prerequisites marked ‘Data’ are pieces of information that the Catalog would need to hold about a metadata scheme in order to provide the functionality.
      • Prerequisites marked ‘Function’ are internal functions the Catalog would have to implement.
  • Priority: One of the following:
      • Must: Meeting this requirement is an essential success criterion. The Catalog cannot be released without it being met.
      • Should: Meeting this requirement is a high priority but not an essential criterion. We will try hard to meet it but may release the Catalog without having done so.
      • Could: Meeting this requirement is a low priority. We will factor it into the design of the Catalog, so that if we do not have time or effort available to meet it prior to release, we can come back to it after release.

System description

The Metadata Standards Catalog (MSC) will be an online application with two interfaces:

  • a set of Web pages designed for human interaction (GUI);
  • a Web service endpoint aimed at machine-to-machine interaction (API).

The aim of the MSC is to provide information about metadata standards and related resources/entities, to reduce the barriers to researchers using these standards to document their research data.

Data model

The MSC will hold detailed records for metadata schemes, understood as defined sets of elements (e.g. properties, keys, relations) that may be used to describe/document datasets, particularly research datasets. These include sets that are normally serialized as a standalone document or record, and those that are normally embedded in the same file as the data themselves.

The MSC will record relationships between metadata schemes. The relationship ‘is profile of’ (inverse: ‘has profile’) indicates that the first scheme directly borrows elements from the second, though perhaps with different usage instructions.
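As a minimal sketch of how this relationship might be held, the Python below stores only the ‘is profile of’ direction on each record and derives the ‘has profile’ inverse on demand. The class name, field names and record identifiers are illustrative assumptions and are not part of this specification.

```python
from dataclasses import dataclass, field


@dataclass
class SchemeRecord:
    """Hypothetical minimal record for a metadata scheme."""
    id: str
    title: str
    # IDs of the schemes from which this scheme directly borrows elements.
    is_profile_of: list[str] = field(default_factory=list)


def has_profile(records: dict[str, SchemeRecord], scheme_id: str) -> list[str]:
    """Derive the inverse relationship: IDs of schemes that are profiles of scheme_id."""
    return sorted(r.id for r in records.values() if scheme_id in r.is_profile_of)


# Illustrative records; the identifiers are made up for this example.
records = {
    "msc:m1": SchemeRecord("msc:m1", "Base scheme"),
    "msc:m2": SchemeRecord("msc:m2", "Profile A", is_profile_of=["msc:m1"]),
    "msc:m3": SchemeRecord("msc:m3", "Profile B", is_profile_of=["msc:m1"]),
}

print(has_profile(records, "msc:m1"))  # ['msc:m2', 'msc:m3']
```

Storing one direction and computing the other avoids the two halves of the relationship drifting out of step when a record is edited.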

The term ‘standard’ is used to mean either that the scheme is maintained by a recognized standards body (de jure) or that it is used independently by more than one user group (de facto). Priority for inclusion will be given to standards and profiles of standards.

Certain properties of metadata schemes will be expressed as relationships to other records, representing other entities. Examples include tools, user organizations, standards maintaining bodies and funding bodies. Reliable external registries will be preferred if available; otherwise the MSC will hold minimal records for these entity types.

Functional Requirements[a]

Search or browse

Requirement

GUI search: Responding to a search query of a metadata scheme identifier, display the corresponding record.

API search: Responding to a search query of a metadata scheme identifier, return the corresponding record.

Validation

A-ID, P-DLREC

Prerequisite

Data: ID, Alternate ID

Priority

Must
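The identifier lookup could be sketched as follows, assuming each record carries one primary ID and a list of alternate IDs (the two ‘Data’ prerequisites above). The record layout, the function name and the sample identifiers are hypothetical.

```python
from typing import Optional

# Hypothetical store; the identifiers below are invented for illustration.
RECORDS = [
    {"id": "msc:m1", "alternate_ids": ["doi:10.1234/example"], "title": "Example scheme"},
    {"id": "msc:m2", "alternate_ids": [], "title": "Another scheme"},
]


def find_by_identifier(identifier: str) -> Optional[dict]:
    """Return the record whose ID or alternate ID matches, or None if there is none."""
    for record in RECORDS:
        if identifier == record["id"] or identifier in record["alternate_ids"]:
            return record
    return None


print(find_by_identifier("doi:10.1234/example")["title"])  # Example scheme
```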

Requirement

GUI search: Responding to a search query of a metadata scheme name, display a list of matching records.

API search: Responding to a search query of a metadata scheme name, return a list of corresponding record IDs.

Validation

Prerequisite

Data: Title

Priority

Must

Requirement

GUI search: Responding to a search query of a set of subject terms, display a list of matching records.

API search: Responding to a search query of a set of subject terms, return a list of corresponding record IDs.

Validation

A-SUBJECT, F-SUBJECT, P-SUBJECT

Prerequisite

Data: Subject classification

Priority

Must

Requirement

GUI browse: Filter out records that do not match a given set of subject terms.

Validation

MSD, F-FILTER

Prerequisite

Data: Subject classification

Priority

Must
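Both the subject search and the subject filter rely on the ‘Subject classification’ data. The sketch below shows one possible reading, in which a record matches if it shares at least one term with the query, compared case-insensitively. That matching rule, the record layout and the sample terms are all assumptions; the requirement does not say whether ‘match’ means any term or all terms.

```python
def filter_by_subjects(records, terms):
    """Return IDs of records sharing at least one subject term with the query."""
    wanted = {term.casefold() for term in terms}
    return [
        record["id"]
        for record in records
        if wanted & {subject.casefold() for subject in record["subjects"]}
    ]


# Illustrative records with made-up identifiers and subject terms.
sample_records = [
    {"id": "msc:m1", "subjects": ["Earth sciences", "Oceanography"]},
    {"id": "msc:m2", "subjects": ["Social sciences"]},
    {"id": "msc:m3", "subjects": ["oceanography", "Biology"]},
]

print(filter_by_subjects(sample_records, ["Oceanography"]))  # ['msc:m1', 'msc:m3']
```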

Requirement

GUI search: Responding to a search query of (a) a funding body or (b) a data type, display a list of matching records.

API search: Responding to a search query of (a) a funding body or (b) a data type, return a list of corresponding record IDs.

Validation

A-FUND, A-TYPE, P-SUBJECT

Prerequisite

Data: Funder, Data type

Any others?

Priority

Should

Requirement

GUI browse: Filter out records that do not match a given set of criteria:

  • Maintained by a standards body
  • Number of known user organizations (banded)

Validation

F-FILTER

Prerequisite

Data: Standards maintaining body, User organizations, Standard status (de jure, de facto, draft?, deprecated?)

Any others?

Priority

Should
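The two filter criteria above might be combined as in the sketch below. The band boundaries are purely illustrative, since this document does not define them, and the record layout and function names are likewise assumptions.

```python
def user_band(count: int) -> str:
    """Map a count of known user organizations to an illustrative band label."""
    if count < 0:
        raise ValueError("count cannot be negative")
    if count == 0:
        return "none"
    if count <= 5:
        return "1-5"
    if count <= 20:
        return "6-20"
    return "over 20"


def filter_records(records, maintained_by_body=None, band=None):
    """Return IDs of records meeting every criterion that is not None."""
    matches = []
    for record in records:
        if maintained_by_body is not None:
            # A record counts as maintained if it names a standards body.
            if bool(record["standards_body"]) != maintained_by_body:
                continue
        if band is not None and user_band(len(record["user_organizations"])) != band:
            continue
        matches.append(record["id"])
    return matches


# Illustrative records; all names are invented.
catalog = [
    {"id": "msc:m1", "standards_body": "Body X", "user_organizations": ["Org A", "Org B"]},
    {"id": "msc:m2", "standards_body": None, "user_organizations": []},
    {"id": "msc:m3", "standards_body": "Body Y", "user_organizations": []},
]

print(filter_records(catalog, maintained_by_body=True, band="none"))  # ['msc:m3']
```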

Requirement

API search: Responding to a search query of a set of element names, return a list of record IDs corresponding to metadata schemes containing those elements.

Validation

P-ID

Prerequisite

Data: Normalized specification

Priority

Should
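A sketch of the element-name search follows, assuming the normalized specification can be reduced to a set of element names per scheme and that a scheme matches only if it contains every queried name. The data layout, identifiers and element names are hypothetical.

```python
def schemes_with_elements(specs: dict[str, set[str]], names) -> list[str]:
    """Return IDs of schemes whose element sets contain all of the given names."""
    wanted = {name.casefold() for name in names}
    return sorted(
        scheme_id
        for scheme_id, elements in specs.items()
        if wanted <= {element.casefold() for element in elements}
    )


# Illustrative normalized specifications, reduced to element names only.
specs = {
    "msc:m1": {"title", "creator", "date"},
    "msc:m2": {"title", "abstract"},
    "msc:m3": {"Title", "Creator"},
}

print(schemes_with_elements(specs, ["title", "creator"]))  # ['msc:m1', 'msc:m3']
```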

Requirement

GUI search: Responding to a search query of a set of DOIs, display a list of matching records.

API search: Responding to a search query of a set of DOIs, return a list of corresponding record IDs.

Validation

F-SUBJECT

Prerequisite

Function: Translate abstract text/keywords into subject classification used by MSC

Priority

Could

Requirement

API search: Responding to a search query of a value encoding, return a list of record IDs corresponding to metadata schemes that use that encoding, and for each scheme, the names of the elements using it.

Validation

P-DLENC

Prerequisite

Data: Normalized specification

Priority

Could
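One way to picture the response to an encoding query is sketched below. It assumes the normalized specification maps each element name to a single encoding label; the layout, the function name and the sample values are illustrative only.

```python
def elements_using_encoding(specs, encoding):
    """Map each scheme ID to the sorted names of its elements that use the encoding.

    Schemes with no matching element are left out of the result.
    """
    result = {}
    for scheme_id, elements in specs.items():
        names = sorted(name for name, enc in elements.items() if enc == encoding)
        if names:
            result[scheme_id] = names
    return result


# Illustrative specifications: element name -> encoding label (both invented).
encoding_specs = {
    "msc:m1": {"created": "ISO 8601", "modified": "ISO 8601", "title": "free text"},
    "msc:m2": {"title": "free text"},
    "msc:m3": {"issued": "ISO 8601"},
}

print(elements_using_encoding(encoding_specs, "ISO 8601"))
# {'msc:m1': ['created', 'modified'], 'msc:m3': ['issued']}
```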

Display

Requirement

Display lists of (a) schemes that are profiles of this scheme, and (b) schemes of which this scheme is a profile.

Validation

MSD, D-RELATE, F-COMPARE3

Prerequisite

Data: Is profile of

Priority

Must

Requirement

Display elements used for search, either directly or as a link.

Validation

MSD

Prerequisite

Data: ID, Alternate ID, Title, Subject classification, Funder, Data type, Standards body, User organizations, Normalized specification

Priority

Must

Requirement

Display other elements used by MSD.

Validation

MSD

Prerequisite

Data: Description, Native specification URL, Documentation URL, Externally maintained value constraints, Existing crosswalks, Status, Tools

Priority

Must

Requirement

API: Responding to a query of a metadata scheme identifier and an MSC record element name, return the value of that element for the given metadata scheme.

Validation

P-DLPROP

Prerequisite

Priority

Must

Requirement

Display version history of this scheme as a list of version numbers and corresponding dates.

Validation

D-HISTORY1

Prerequisite

Data: Versions

Priority

Should

Requirement

Display name of/link to responsible standards body.

Validation

MSD, D-BODY

Prerequisite

Data: Standards body

Priority

Should

Requirement

Display version history of this scheme as a timeline.

Validation

D-HISTORY2

Prerequisite

Data: Versions

Priority

Could

Requirement

Provide links to (internally hosted) sample records that use this scheme.

Validation

D-SAMPLE

Prerequisite

Data: Sample records

Priority

Could

Requirement

Display any community endorsements of the scheme along with a reference to the documentation for this endorsement.

Validation

D-BADGE

Prerequisite

Data: Endorsement (by whom, documentation link)

Priority

Could

Requirement

Display automatically calculated maturity rating.

Validation

D-MATURE

Prerequisite

Data: Standards body, User organizations, Versions…

Function: Calculate maturity rating

Priority

Could

Update

Requirement

GUI: Provide Web form for adding or editing records.

Validation

F-EDREC

Prerequisite

Priority

Must

Requirement

API: Responding to an uploaded MSC record, add it to the MSC or replace the corresponding existing record.

Validation

P-ULREC

Prerequisite

Priority

Should

Compare

Requirement

GUI: Display visualizations of two metadata standards side-by-side, such that common (or similar) elements and value constraints are visible.

Validation

F-COMPARE1, F-COMPARE2, F-COMPARE3

Prerequisite

Data: Normalized specification

Priority

Should

Requirement

API: Responding to a pair of metadata scheme identifiers, return a crosswalk (or a sequence of crosswalks) for translating from the first to the second.

Validation

P-MIGRATE, P-XWALK

Prerequisite

Data: Normalized specification, Normalized crosswalks

Function: Calculate crosswalk from two specifications, Calculate migration pathway from existing crosswalks

Priority

Could
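The ‘Calculate migration pathway from existing crosswalks’ function could be approached as a shortest-path search. The sketch below treats each existing crosswalk as a directed edge between two scheme IDs and uses breadth-first search to find the shortest chain. Treating crosswalks as one-directional, preferring the fewest hops, and all of the names used here are assumptions rather than decisions recorded in this document.

```python
from collections import deque


def crosswalk_path(crosswalks, source, target):
    """Return the shortest chain of (from_id, to_id) crosswalks from source to target.

    Returns [] when source equals target and None when no pathway exists.
    """
    # Build an adjacency list from the directed crosswalk pairs.
    graph = {}
    for src, dst in crosswalks:
        graph.setdefault(src, []).append(dst)

    queue = deque([source])
    came_from = {source: None}  # scheme ID -> scheme ID it was reached from
    while queue:
        current = queue.popleft()
        if current == target:
            break
        for nxt in graph.get(current, []):
            if nxt not in came_from:
                came_from[nxt] = current
                queue.append(nxt)

    if target not in came_from:
        return None

    # Walk back from the target to the source, then reverse.
    path = []
    node = target
    while came_from[node] is not None:
        path.append((came_from[node], node))
        node = came_from[node]
    path.reverse()
    return path


# Illustrative crosswalks between made-up scheme IDs.
known = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E"), ("E", "C")]
print(crosswalk_path(known, "A", "C"))  # [('A', 'B'), ('B', 'C')]
```

Fewest hops is only one possible preference; a real implementation might instead weight each crosswalk by how much information it loses.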

Non-functional requirements

Requirement

Reliability: Give each metadata standard record its own Web page with a Cool URI.

Validation

Prerequisite

Priority

Must

Requirement

Reliability: Evolve the public API in a backwards-compatible way.

Validation

Prerequisite

Priority

Must

Requirement

Usability: On form elements requiring a controlled term, prompt the user with the available options.

Validation

Prerequisite

Priority

Must

Requirement

Security: Allow registered users to update records, with version control to allow rollback of bad edits.

Validation

Prerequisite

Priority

Must
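As a minimal sketch of version control with rollback, the class below keeps an append-only list of revisions for one record. A rollback re-appends the previous revision rather than deleting the bad one, so the full edit history is preserved. The class and method names are hypothetical, and user registration and authentication are outside the scope of the sketch.

```python
class VersionedRecord:
    """Hypothetical record wrapper that keeps every revision."""

    def __init__(self, initial: dict):
        self._revisions = [dict(initial)]

    @property
    def current(self) -> dict:
        """Return a copy of the latest revision."""
        return dict(self._revisions[-1])

    @property
    def revision_count(self) -> int:
        return len(self._revisions)

    def update(self, changes: dict) -> None:
        """Record an edit as a new revision built on the latest one."""
        self._revisions.append({**self._revisions[-1], **changes})

    def rollback(self) -> None:
        """Undo the latest edit by re-appending the revision before it."""
        if len(self._revisions) < 2:
            raise ValueError("no earlier revision to roll back to")
        self._revisions.append(dict(self._revisions[-2]))


record = VersionedRecord({"title": "Example scheme"})
record.update({"title": "Exampel scheem"})  # a bad edit
record.rollback()
print(record.current["title"])  # Example scheme
```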

[a]Some comments from Melanie Wacker: 1) I would like to be able to easily tell in what format the standard is available, e.g. RDF in one or more serializations or traditional XML standard?

2) Which aggregators are using which standards or profiles. There was a discussion at ALA about how unsustainable it is that each aggregator/portal has its own set of standards and requirements for metadata sharing. If there would be at least one central location to find that information it might make it a little bit better.