1 of 39

How to publish data in PANGAEA

23 August 2023

Dr. Flavia Höring

Data management consultant for DAM research missions

Data curator at PANGAEA

2 of 39

Today's info event

PANGAEA info event - Flavia Höring

2

23.08.2023

Introduction to PANGAEA

Preparation of data and metadata

Submission

Data curation

Data publication

3 of 39

PANGAEA - Organisation

Hosted by:

Managed by:

Team

4 of 39

PANGAEA - Clients

  • Research projects

  • Institutions

  • Individual researchers

5 of 39

PANGAEA - Content

Datasets: > 420,000

Measurements: ~ 25 billion

Projects: > 800

New datasets per year: ~ 10,000

6 of 39

PANGAEA - Data types

Focus on georeferenced observational and experimental data:

  • Tabular data of e.g. environmental time series, biodiversity, sediment samples…

  • Binary files e.g. images, movies, netCDF files…

DON’T SUBMIT:

  • Model data
  • Nucleotide sequence data (e.g. FASTA files)
  • Software code

7 of 39

PANGAEA – Website

8 of 39

FAIR data publishing in PANGAEA

Full citation with DOI!

METADATA

9 of 39

FAIR data publishing in PANGAEA

METADATA

Open Access Licence

10 of 39

FAIR data publishing in PANGAEA

DATA

11 of 39

Benefits of FAIR data publication

  • Citation – appropriate attribution and credit
  • Reusability – Data sharing and collaboration
  • Reproducibility of findings
  • Increasing the efficiency of research

12 of 39

Looking for help? Check out the Wiki!

Website: https://wiki.pangaea.de/

Submission guidelines, templates and more

13 of 39

Data and metadata preparation

14 of 39

Step 1: Prepare the data

  • The preferred format for data tables is TAB-delimited text (UTF-8 encoding) or Excel
  • Position(s) (latitude/longitude in decimal degrees) must be provided for every sample, observation and measurement carried out anywhere on Earth
  • Include a third dimension, e.g. water depth, altitude, depth in sediment …
  • Date/Time must be provided in ISO 8601 format (e.g. 1954-12-07T13:34:11) in Coordinated Universal Time (UTC)
  • Every parameter must be accompanied by a unit
  • Event/station ID as first column
  • Explain abbreviations
  • Read guidelines at: https://wiki.pangaea.de/wiki/Data_submission
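
The position and date/time rules above can be sketched in Python. This is a minimal illustration, not a PANGAEA tool: the helper names `dms_to_decimal` and `to_utc_iso` are hypothetical, and the UTC+1 offset in the usage lines is an assumed example value.

```python
from datetime import datetime, timedelta, timezone


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def to_utc_iso(local_time, utc_offset_hours):
    """Convert a naive local timestamp to an ISO 8601 string in UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    aware = local_time.replace(tzinfo=local_tz)
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


# Hypothetical sample values, chosen to reproduce the timestamp from the slide.
lat = dms_to_decimal(53, 30, 0, "N")
lon = dms_to_decimal(8, 15, 0, "W")
stamp = to_utc_iso(datetime(1954, 12, 7, 14, 34, 11), utc_offset_hours=1)
print(lat, lon, stamp)
```

A fixed offset is used here for simplicity; data recorded across a daylight-saving change would need a proper time zone database instead.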

15 of 39

Campaigns and events

Already available campaigns and events: https://pangaea.de/expeditions/

16 of 39

Parameters

  • Parameter names: Mix of relevant standard vocabularies (e.g. WoRMS, IUPAC) and community standards
  • Complete parameter list: https://www.pangaea.de/lists/parameter/all-byname
  • Check out similar datasets in the PANGAEA database

Most important: Full parameter name with unit!
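
One way to spot column headers that lack a unit is sketched below. The `Full parameter name [unit]` header layout and the function names are assumptions made for illustration; dimensionless parameters and columns such as the event label would need to be exempted in practice.

```python
import re

# Assumed layout: "Full parameter name [unit]"
HEADER_PATTERN = re.compile(r"^(?P<name>[^\[\]]+?)\s*\[(?P<unit>[^\[\]]+)\]$")


def split_header(header):
    """Return (name, unit), or raise ValueError if no unit is found."""
    match = HEADER_PATTERN.match(header.strip())
    if match is None:
        raise ValueError(f"no unit found in header: {header!r}")
    return match.group("name"), match.group("unit").strip()


def headers_without_unit(headers):
    """List the headers that do not follow the assumed layout."""
    missing = []
    for header in headers:
        try:
            split_header(header)
        except ValueError:
            missing.append(header)
    return missing


# Hypothetical headers for demonstration only.
print(headers_without_unit(["Temperature, water [°C]", "Chlorophyll a"]))
```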

17 of 39

Data examples

Date/Time and Depth are missing!

Unit is missing!

Abbreviation not explained!

No unique event labels
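
Some of the problems flagged above can be caught before submission with a small check. The sketch below is only an illustration: the column labels are assumed, and "not unique" is interpreted here as one event label being attached to more than one position, which may not be the only meaning intended.

```python
import csv
import io

# Assumed column labels; a real template may name them differently.
REQUIRED_PREFIXES = ("Event", "Latitude", "Longitude", "Date/Time", "Depth")


def check_table(tab_text):
    """Return a list of problems found in a TAB-delimited data table."""
    rows = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))
    if not rows:
        return ["table is empty"]
    header, body = rows[0], rows[1:]

    # Map each required prefix to the index of the first matching column.
    index = {}
    for prefix in REQUIRED_PREFIXES:
        for i, name in enumerate(header):
            if name.startswith(prefix):
                index[prefix] = i
                break
    problems = [f"column missing: {p}" for p in REQUIRED_PREFIXES if p not in index]

    # Interpretation (assumption): a label is not unique when it is
    # attached to more than one position.
    if all(p in index for p in ("Event", "Latitude", "Longitude")):
        positions = {}
        for row in body:
            if len(row) < len(header):
                continue  # skip blank or truncated rows in this sketch
            label = row[index["Event"]]
            position = (row[index["Latitude"]], row[index["Longitude"]])
            positions.setdefault(label, set()).add(position)
        for label, seen in positions.items():
            if len(seen) > 1:
                problems.append(f"event label not unique: {label}")
    return problems
```

The function returns an empty list when none of these checks fail; it does not validate units or abbreviations.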

18 of 39

Data examples

From generic template for field observation data: https://wiki.pangaea.de/wiki/Best_practice_manuals_and_templates

19 of 39

Step 2: Prepare the metadata

  • Data authors (≠ paper authors): Contributors to collection and processing of data
  • Titles and abstracts for each dataset (≠ paper, see https://wiki.pangaea.de/wiki/Abstract )
  • (Preliminary) paper citation & other references
  • Projects/awards
  • Include separate metadata files/tables:
      • Multiple abstracts/dataset titles
      • Campaign info and Event table (if not listed at https://pangaea.de/expeditions/)
      • Parameter table (full names, units, PIs, methods, comments)

20 of 39

Examples – Campaign & Event information

Red parameters are mandatory!

From generic template for field observation data: https://wiki.pangaea.de/wiki/Best_practice_manuals_and_templates

21 of 39

Examples – Parameter information

From generic template for field observation data: https://wiki.pangaea.de/wiki/Best_practice_manuals_and_templates

Please provide the primary instrument of measurement for each parameter.

    • Syntax: Instrument type, Manufacturer, Model name [further information]
    • For more details, see https://wiki.pangaea.de/wiki/Method
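
The method syntax above can be assembled consistently with a small helper. This is a sketch only: `format_method` is a hypothetical name, and the manufacturer and model in the usage line are placeholders, not real products.

```python
def format_method(instrument_type, manufacturer, model, further_info=None):
    """Build 'Instrument type, Manufacturer, Model name' plus optional bracketed details."""
    parts = [instrument_type, manufacturer, model]
    if any(not part or not part.strip() for part in parts):
        raise ValueError("instrument type, manufacturer and model are all required")
    text = ", ".join(part.strip() for part in parts)
    if further_info:
        text += f" [{further_info.strip()}]"
    return text


# Placeholder values for demonstration only.
print(format_method("CTD", "Example Manufacturer", "Model X-1"))
```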

22 of 39

Step 3: Data submission

  • Data prepared
  • Metadata prepared
  • Ready for submission

Processing time: up to several months

An access restriction (moratorium) of up to two years can be requested

Please submit your data as early as possible!

23 of 39

Step 3: Data submission

24 of 39

Submission form – Basic information

  • General dataset title
  • Authors
  • Keywords
  • General abstract
  • License

25 of 39

Submission form - References

  • Any references or manuscripts (in prep)
  • Full reference, if available
  • Please include the DOI!

26 of 39

Submission form – Projects and Grants

  • Add related projects/funding

27 of 39

Submission form – File upload

  • Data and metadata files
  • Max. file size: 100 MB
  • Max. number of files: 20
  • Please request an upload link for larger/more files

For datasets of hundreds of gigabytes or terabytes in size, please contact us first!
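
The form limits above can be checked locally before uploading. The sketch below is an illustration under assumptions: `check_upload` is a hypothetical helper, and 100 MB is read as 100,000,000 bytes, which may differ from how the form counts.

```python
from pathlib import Path

MAX_FILE_BYTES = 100 * 1000 * 1000  # 100 MB as decimal megabytes (assumption)
MAX_FILE_COUNT = 20


def check_upload(paths):
    """Return a list of reasons why the files exceed the form limits."""
    files = [Path(p) for p in paths]
    problems = []
    if len(files) > MAX_FILE_COUNT:
        problems.append(f"{len(files)} files exceed the limit of {MAX_FILE_COUNT}")
    for path in files:
        size = path.stat().st_size
        if size > MAX_FILE_BYTES:
            problems.append(f"{path.name} is {size} bytes, above {MAX_FILE_BYTES}")
    return problems
```

An empty result means the files fit the form; anything else suggests requesting an upload link.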

28 of 39

Submission form – Submit

  • Add comment
  • Request moratorium
  • Accept PANGAEA’s terms of use

29 of 39

Submission done!

An issue was created in our ticket system.

30 of 39

Step 4: Data curation – Processing steps

31 of 39

Dataset status

in review

  • Update of dataset possible
  • DOI is not yet registered
  • With (default) or without access restriction

published

  • Final version of the dataset
  • DOI is registered
  • With or without access restriction
  • Update of paper reference is possible

Automatic registration after 28 days

Access to the data may be restricted for up to 2 years!

32 of 39

Make your data citation count!

Data availability statement:

e.g. “All data are available in the public repository, PANGAEA, https://doi.org/10.1594/PANGAEA.004444 (Höring et al., 2022).”

Please always add the full citation of the data to the reference list of your paper!

Only refer to persistent links!
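
A quick way to catch non-persistent links in a manuscript is to test whether each one is a DOI link. The sketch below is a rough heuristic, not an official check: the helper name is hypothetical, and the accepted hosts are simply the two resolvers that appear in this deck's example statements.

```python
import re
from urllib.parse import urlparse

# Hosts taken from the example statements; treat the list as an assumption.
DOI_HOSTS = {"doi.org", "doi.pangaea.de"}
DOI_PATH = re.compile(r"^/10\.\d{4,9}/\S+$")


def is_doi_link(url):
    """Return True if the URL looks like a DOI link on a known resolver host."""
    parsed = urlparse(url)
    return (
        parsed.scheme == "https"
        and parsed.netloc.lower() in DOI_HOSTS
        and DOI_PATH.match(parsed.path) is not None
    )


print(is_doi_link("https://doi.org/10.1594/PANGAEA.004444"))
```

The check only looks at the shape of the link; it does not confirm that the DOI resolves.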

33 of 39

Information on how to contact us

  • General inquiries: https://pangaea.de/contact/

  • Your submission: Comment via the ticket system (https://issues.pangaea.de/browse/PDI-XXXXX)

  • Questions about this event: Flavia Höring (Email: fhoering@marum.de)

  • Please fill in the feedback form for this event: https://cloud.marum.de/apps/forms/s/8tjbnWSanSTz8zaYM2pYmm6S

34 of 39

FAIR data publishing in PANGAEA

Findable

    • Persistent identifier
    • Rich metadata
    • Searchable online

Accessible

    • Retrievable online using standardized protocols
    • Access restrictions if necessary

Interoperable

    • Common formats and standards
    • Controlled vocabularies

Reusable

    • Rich metadata
    • License and provenance information

35 of 39

Title/Abstract – Guidelines

An individual abstract/title must be provided for each dataset!

Title:

  • Include information about the WHAT, WHERE & WHEN

Abstract:

  • Include information about the WHAT, WHERE, WHEN, WHY & HOW
  • May contain necessary references
  • For details, see https://wiki.pangaea.de/wiki/Abstract

AVOID

Copying the manuscript title/abstract

Using abbreviations

Interpreting results

36 of 39

Licenses

CC-BY: Creative Commons Attribution 4.0 International

  • Data is freely available to everyone
  • Usage requires citing (attributing) the original author(s)
  • No further restrictions on usage

CC-BY-SA: Creative Commons Attribution-ShareAlike 4.0 International

  • Attribution required
  • If the data are included in a work (e.g. a paper), the resulting work must also be licensed under CC-BY-SA

CC0: Creative Commons Zero 1.0 Universal

  • Public Domain
  • No attribution required, but recommended according to Good Scientific Practice

37 of 39

Step 4: Data curation - Flow

Web Portal – submission, search, review datasets

Ticket System – communication, documentation, organization

Editorial System – Curation, Editorial

PANGAEA Wiki – Help, SOPs

Tools

38 of 39

Make your data citation count!

Data availability statement:

“All data are available in the public repository, PANGAEA, https://doi.pangaea.de/10.1594/PANGAEA.004444.”

39 of 39

(Meta-)Data Access & Reuse for Data Science

  • Various entry points to explore and re-use data (frontend, APIs, community specific portals…)
  • Support of commonly used tools (e.g. Python and R)
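
For re-use in Python without extra packages, a downloaded text export can be split into its description and its data table. The sketch below assumes a layout in which a metadata block wrapped in `/* ... */` precedes a TAB-delimited table; that layout, the function name and the sample text are assumptions for illustration and are not stated in these slides. Dedicated packages would normally handle this step.

```python
import csv
import io


def split_export(text):
    """Split an export into (metadata_text, header, rows).

    Assumes the metadata block is wrapped in /* ... */ and is followed
    by a TAB-delimited table whose first line holds the column headers.
    """
    metadata = ""
    table_text = text
    if text.lstrip().startswith("/*"):
        start = text.index("/*") + 2
        end = text.index("*/", start)
        metadata = text[start:end].strip()
        table_text = text[end + 2:]
    reader = csv.reader(
        io.StringIO(table_text.strip("\r\n")),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    rows = [row for row in reader if row]
    if not rows:
        return metadata, [], []
    return metadata, rows[0], rows[1:]


# Invented sample text, not a real dataset.
sample = "/* DATA DESCRIPTION:\nCitation:\tExample\n*/\nEvent\tDepth, water [m]\nSt1\t5\nSt2\t10\n"
metadata, header, rows = split_export(sample)
print(header, len(rows))
```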

PANGAEA search:

Python module ‘pangaeapy’:

R package ‘pangaear’: