1 of 10

A very brief introduction to making metadata with JSON-LD

Stian Soiland-Reyes

The University of Manchester

Abigail Miller

The Jackson Laboratory

2 of 10

3 of 10

Benefits of JSON-LD

When using JSON-LD, we’re not just formatting data; we’re giving it meaning

Semantic clarity

  • Links data to well-defined concepts using vocabularies like schema.org

  • Allows disambiguating terms like “name” by giving them context

Interoperability

  • Enables consistent data exchange across systems, APIs, and the web

Machine and human readable

  • Retains the readability of JSON while providing structure that can be parsed by machines

Prepares data for LLM integration

  • Organizing data in JSON-LD gives it clear structure and meaning, making it easier for LLMs to understand and reason about
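As a rough illustration of the "context" point on this slide, the Python sketch below shows how the same short key, "name", can resolve to different full identifiers depending on which context is applied. The helper `expand_keys` is hypothetical and only performs a dictionary lookup; it is not the full JSON-LD expansion algorithm, and the FOAF mapping is included purely for contrast.

```python
import json

# Two hand-written contexts that map the short key "name" to different vocabularies.
# Real JSON-LD contexts are richer; this is a deliberately simplified lookup.
SCHEMA_CONTEXT = {"name": "http://schema.org/name"}
FOAF_CONTEXT = {"name": "http://xmlns.com/foaf/0.1/name"}


def expand_keys(doc, context):
    """Replace each key found in the context with its full IRI (hypothetical helper)."""
    return {context.get(key, key): value for key, value in doc.items()}


record = {"name": "NanoCommons"}
print(json.dumps(expand_keys(record, SCHEMA_CONTEXT)))
print(json.dumps(expand_keys(record, FOAF_CONTEXT)))
```

The plain JSON is identical in both calls; only the context decides which concept "name" refers to.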

4 of 10

Metadata about training material

{ "@context": "https://schema.org/",
  "@type": "LearningResource",
  "http://purl.org/dc/terms/conformsTo": {
    "@type": "CreativeWork",
    "@id": "https://bioschemas.org/profiles/TrainingMaterial/1.0-RELEASE"
  },
  "name": "Adding nanomaterial data",
  "version": "0.9.3",
  "description": "This tutorial describes how nanomaterial data can be added to an eNanoMapper server using a RDF format.",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "keywords": "ontologies, enanomapper, RDF",
  "url": "https://nanocommons.github.io/tutorials/enteringData/",
  "provider": {
    "@type": "Organization",
    "name": "NanoCommons",
    "url": "https://www.nanocommons.eu/"
  },
  "…": {}
}

JSON-LD context: "@context"

Type (identifier implied): "@type": "LearningResource"

Bioschemas profile: "http://purl.org/dc/terms/conformsTo"

Nested object with its own @type and attributes: "provider"
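A description like the one on this slide can also be generated from code rather than written by hand. The following is a minimal sketch using only Python's standard `json` module; the function name `training_material` and its parameters are hypothetical, and only a few of the slide's properties are included.

```python
import json


def training_material(name, url, provider_name, provider_url):
    """Build a minimal LearningResource description (hypothetical helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "LearningResource",
        # Declares which Bioschemas profile the description claims to follow.
        "http://purl.org/dc/terms/conformsTo": {
            "@type": "CreativeWork",
            "@id": "https://bioschemas.org/profiles/TrainingMaterial/1.0-RELEASE",
        },
        "name": name,
        "url": url,
        # Nested object with its own @type; no @id is given, so the identifier is implied.
        "provider": {
            "@type": "Organization",
            "name": provider_name,
            "url": provider_url,
        },
    }


metadata = training_material(
    "Adding nanomaterial data",
    "https://nanocommons.github.io/tutorials/enteringData/",
    "NanoCommons",
    "https://www.nanocommons.eu/",
)
print(json.dumps(metadata, indent=2))
```

Because JSON-LD is ordinary JSON, no special library is needed to write it; a JSON-LD processor is only required when the data has to be expanded or converted.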

5 of 10

RO-Crate Metadata File

{ "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    { "@type": "CreativeWork",
      "@id": "ro-crate-metadata.json",
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
      "about": { "@id": "./" }
    },
    { "@id": "./",
      "identifier": "https://doi.org/10.5281/zenodo.1009240",
      "@type": "Dataset",
      "hasPart": [
        { "@id": "cp7glop.ai" },
        { "@id": "lots_of_little_files/" },
        { "@id": "communities-2018.csv" },
        { "@id": "https://doi.org/10.4225/59/59672c09f4a4b" },
        { "@id": "SciDataCon Presentations/AAA_Pilot_Project_Abstract.html" }
      ],
      "author": { "@id": "https://orcid.org/0000-0002-8367-6908" },
      "publisher": { "@id": "https://ror.org/03f0f6041" },
      "citation": { "@id": "https://doi.org/10.1109/TCYB.2014.2386282"},
      "name": "Presentation of user survey 2018"
    },
    { "@id": "cp7glop.ai",
      "@type": "File",
      "name": "Diagram showing trend to increase"
    },

JSON-LD preamble: "@context" and "@graph"

RO-Crate metadata file descriptor: the "ro-crate-metadata.json" entity

RO-Crate root dataset: the "./" entity

..collection of Data entities: "hasPart"

..described with contextual entities: "author", "publisher", "citation"

Flat list of metadata per entity: each entry in "@graph"

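To show how a client might navigate this flat structure, here is a small Python sketch that starts at the metadata file descriptor, follows its "about" reference to the root dataset, and lists the data entities in "hasPart". The `CRATE` dictionary is a trimmed copy of the slide's example, and `find_root` is a hypothetical helper that assumes the descriptor has the fixed @id "ro-crate-metadata.json", as it does on the slide.

```python
# Trimmed copy of the slide's example, closed so that it is a complete document.
CRATE = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@type": "CreativeWork",
            "@id": "ro-crate-metadata.json",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "Presentation of user survey 2018",
            "hasPart": [
                {"@id": "cp7glop.ai"},
                {"@id": "communities-2018.csv"},
            ],
        },
        {
            "@id": "cp7glop.ai",
            "@type": "File",
            "name": "Diagram showing trend to increase",
        },
    ],
}


def find_root(crate):
    """Return the root dataset by following the descriptor's "about" link."""
    by_id = {entity["@id"]: entity for entity in crate["@graph"]}
    descriptor = by_id["ro-crate-metadata.json"]  # assumed fixed identifier
    return by_id[descriptor["about"]["@id"]]


root = find_root(CRATE)
parts = [part["@id"] for part in root["hasPart"]]
print(root["name"], parts)
```

Indexing the @graph array by @id is what makes the flat list practical to work with: every cross-reference becomes a single dictionary lookup.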

6 of 10

Metadata

{
  "@id": "https://orcid.org/0000-0002-8367-6908",
  "@type": "Person",
  "affiliation": { "@id": "https://ror.org/03f0f6041" },
  "name": "J. Xuan"
}

{
  "@id": "https://ror.org/03f0f6041",
  "@type": "Organization",
  "name": "University of Technology Sydney",
  "url": "https://www.uts.edu.au/"
}

{
  "@id": "figure.png",
  "@type": ["File", "ImageObject"],
  "name": "XXL-CT-scan of an XXL Tyrannosaurus rex skull",
  "identifier": "https://doi.org/10.5281/zenodo.3479743",
  "author": {"@id": "https://orcid.org/0000-0002-8367-6908"},
  "encodingFormat": "image/png"
}

Linked Data: Reference by URI

Types and properties are expanded to full URIs by the context, e.g. ImageObject becomes http://schema.org/ImageObject

Entities can be cross-referenced with @id within the same JSON-LD document

Flattened JSON-LD style: each entity is listed separately in the @graph array; @id is required

Compacted JSON-LD style: an entity can be nested wherever it is cross-referenced; @id is optional

Clients can still follow the links for potentially more data (e.g. ORCID lists publications)
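The difference between the two styles can be sketched in Python by resolving @id cross-references in a flat list and nesting the referenced entities in place. The helper `embed` is hypothetical and only illustrates the idea; it is not the JSON-LD framing or compaction algorithm. It treats a dictionary whose only key is "@id" as a reference and leaves references it cannot resolve untouched.

```python
GRAPH = [
    {
        "@id": "https://orcid.org/0000-0002-8367-6908",
        "@type": "Person",
        "affiliation": {"@id": "https://ror.org/03f0f6041"},
        "name": "J. Xuan",
    },
    {
        "@id": "https://ror.org/03f0f6041",
        "@type": "Organization",
        "name": "University of Technology Sydney",
    },
    {
        "@id": "figure.png",
        "@type": ["File", "ImageObject"],
        "author": {"@id": "https://orcid.org/0000-0002-8367-6908"},
    },
]


def embed(value, by_id, seen=frozenset()):
    """Return a copy of value with bare @id references replaced by the entity."""
    if isinstance(value, list):
        return [embed(item, by_id, seen) for item in value]
    if isinstance(value, dict):
        target = value
        ref = value.get("@id")
        # A bare reference has "@id" as its only key; "seen" guards against cycles.
        if set(value) == {"@id"} and ref in by_id and ref not in seen:
            target = by_id[ref]
            seen = seen | {ref}
        return {key: embed(item, by_id, seen) for key, item in target.items()}
    return value


by_id = {entity["@id"]: entity for entity in GRAPH}
nested = embed(by_id["figure.png"], by_id)
print(nested["author"]["affiliation"]["name"])
```

The flat list stays unchanged; `nested` is a separate copy in which the author and, inside it, the affiliation appear as full objects instead of references.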

7 of 10

Using common vocabularies

.. extending only when needed

8 of 10

https://schema.org/Dataset is based on W3C DCAT

9 of 10

Testing in the JSON-LD Playground

10 of 10

Testing in Schema Markup Validator
