B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | AB | AC | AD | AE | AF | AG | AH | AI | AJ | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | License for this material: CC BY-SA 3.0. | BioCADDIE WG: Metadata WG3 | Authors: Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, with George Alter, Mary Vardigan, Jeffrey Grethe, Hua Xu and the DataMed development team; and with contributions from the members of the bioCADDIE Metadata WG 3 and WG7 | |||||||||||||||||||||||||||||||||
2 | ||||||||||||||||||||||||||||||||||||
3 | Entity | Property | Definition | Value(s) | Cardinality | Requirement Level | Relevant Competency Question(s) | Notes or Example(s) | Presence in Schema(s)/Model(s) | |||||||||||||||||||||||||||
4 | CORE DATS | |||||||||||||||||||||||||||||||||||
5 | ||||||||||||||||||||||||||||||||||||
6 | Dataset | A set of dimensions about an entity being observed. A collection of data, published or curated by a single agent, and available for access or download in one or more formats (from DCAT: http://www.w3.org/TR/vocab-dcat/#Class:_Dataset) A body of structured information describing some topic(s) of interest (from: http://schema.org/Dataset) | BGUC5-2;BGUC5-4;BGUC5-5;UC2;UC15;WPUC5-p7;WPUC7-p7;WPUC8-p7;WPUC10-p7 | |||||||||||||||||||||||||||||||||
7 | identifier | Primary identifier for the dataset. | IdentifiersInformation | 0..1 | SHOULD | BGUC5 | ||||||||||||||||||||||||||||||
8 | alternateIdentifiers | Alternate identifiers for the dataset. | AlternateIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
9 | relatedIdentifiers | Related identifiers for the dataset. | RelatedIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
10 | title | The name of the dataset, usually one sentence or short description of the dataset. | string | 1 | MUST | DataCite[/resource/titles];DataCite[/resource/titles/title];Schema.org[https://schema.org/headline];HCLS[(dct:title,rdf:langString)] | ||||||||||||||||||||||||||||||
11 | types | A term, ideally from a controlled terminology, identifying the dataset type or nature of the data, placing it in a typology. | DataType | 1..n | MUST | BGUC1-1;BGUC1-2;BGUC3-2;BGUC3-3;BGUC5;BGUC5-1;WPUC1;WPUC2;WPUC3;WPUC9-p7;UC1 | For example: microscopy imaging, gene expression profile, genomic sequence, fMRI, pathway simulation. | |||||||||||||||||||||||||||||
12 | creators | The person(s) or organization(s) which contributed to the creation of the dataset. | Person or Organization | 1..n | MUST | UC2 | ||||||||||||||||||||||||||||||
13 | dates | Relevant dates for the dataset. A date must be added, e.g. creation date or last modification date. | Date | 0..n | MAY | equivalent to TemporalCoverage | ||||||||||||||||||||||||||||||
14 | spatialCoverage | The geographical extension and span covered by the dataset and its measured dimensions/variables. | Place | 0..n | MAY | |||||||||||||||||||||||||||||||
15 | licenses | The terms of use of the dataset. | License | 0..n | SHOULD | BGUC5-4 | BGUC5-1;BGUC5-4;BGUC5-8 | |||||||||||||||||||||||||||||
16 | distributions | The distribution(s) by which datasets are made available (for example: mySQL dump). | DatasetDistribution | 0..n | SHOULD | |||||||||||||||||||||||||||||||
17 | description | A textual narrative comprised of one or more statements describing the dataset. | string | 0..1 | SHOULD | DataCite[/resource/descriptions; /resource/descriptions/description; /resource/descriptions/description/descriptionType] | ||||||||||||||||||||||||||||||
18 | storedIn | The data repository hosting the dataset. | DataRepository | 0..1 | MAY | BGUC1-1;UC2 | While from the DDI perspective every dataset may come from a data repository, we apply a less strict requirement, allowing for datasets that are available online but not in a repository. | |||||||||||||||||||||||||||||
19 | dimensions | The different dimensions (granular components) making up a dataset. | Dimension | 0..n | MAY | BGUC2;BGUC5-4 | ||||||||||||||||||||||||||||||
20 | primaryPublications | The primary publication(s) associated with the dataset, usually describing how the dataset was produced. | Publication | 0..n | MAY | BGUC5-2 | ||||||||||||||||||||||||||||||
21 | citations | The publication(s) that cite this dataset. | Publication | 0..n | MAY | |||||||||||||||||||||||||||||||
22 | citationCount | The number of publications that cite this dataset (enumerated in the citations property). | integer | 0..1 | MAY | |||||||||||||||||||||||||||||||
23 | producedBy | A study process which generated a given dataset, if any. | Study | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
24 | isAbout | Different entities (biological entity, taxonomic information, disease, molecular entity, anatomical part, treatment) associated with this dataset. | BiologicalEntity or TaxonomicInformation or Disease or MolecularEntity or AnatomicalPart or Treatment or Annotation | 0..n | SHOULD | |||||||||||||||||||||||||||||||
25 | hasPart | A Dataset that is a subset of this Dataset. Datasets declaring the 'hasPart' relationship are considered a collection of Datasets; the aggregation criteria could be included in the 'description' field. | Dataset | 0..n | MAY | |||||||||||||||||||||||||||||||
26 | keywords | Tags associated with the dataset, which will help in its discovery. | Annotation | 0..n | MAY | |||||||||||||||||||||||||||||||
27 | version | A release point for the dataset when applicable. | string | 0..1 | SHOULD | WPUC5-p7 | ||||||||||||||||||||||||||||||
28 | acknowledges | The grant(s) which funded and supported the work reported by the dataset. | Grant | 0..n | MAY | |||||||||||||||||||||||||||||||
29 | availability | A qualifier indicating the different types of availability for a dataset (available, unavailable, embargoed, available with restriction, information not available) | string | 0..1 | SHOULD | see CV | ||||||||||||||||||||||||||||||
30 | refinement | A qualifier to describe the level of data processing of the dataset and its distributions. | string | 0..1 | SHOULD | see CV | ||||||||||||||||||||||||||||||
31 | aggregation | A qualifier indicating if the entity represents an 'instance of dataset' or a 'collection of datasets'. | string | 0..1 | SHOULD | see CV | ||||||||||||||||||||||||||||||
32 | extraProperties | Extra properties that do not fit in the previous specified attributes. | CategoryValuesPair | 0..n | MAY | |||||||||||||||||||||||||||||||
33 | ||||||||||||||||||||||||||||||||||||
34 | DatasetDistribution | A specific available form of a dataset. Each dataset might be available in different forms, these forms might represent different formats of the dataset or different endpoints. Examples of distributions include a downloadable CSV file, an API or an RSS feed. (From DCAT) | ||||||||||||||||||||||||||||||||||
35 | identifier | Primary identifier for the dataset distribution. | IdentifiersInformation | 0..1 | SHOULD | BGUC5 | ||||||||||||||||||||||||||||||
36 | alternateIdentifiers | Alternate identifiers for the dataset distribution. | AlternateIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
37 | relatedIdentifiers | Related identifiers for the dataset distribution. | RelatedIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
38 | title | The name of the dataset distribution, usually one sentence or short description of the dataset. | string | 0..1 | MAY | |||||||||||||||||||||||||||||||
39 | description | A textual narrative comprised of one or more statements describing the dataset distribution. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
40 | dates | Relevant dates for the dataset distribution, e.g. creation date or last modification date, may be added. | Date | 0..n | MAY | equivalent to TemporalCoverage | ||||||||||||||||||||||||||||||
41 | storedIn | The data repository hosting the dataset distribution. | DataRepository | 0..1 | MAY | BGUC1-1;UC2 | While from the DDI perspective every dataset may come from a data repository, we apply a less strict requirement, allowing for datasets that are available online but not in a repository. | |||||||||||||||||||||||||||||
42 | version | A release point for the dataset when applicable. | string | 0..1 | SHOULD | WPUC5-p7 | ||||||||||||||||||||||||||||||
43 | access | The information about access modality for the dataset distribution. | Access | 1 | MUST | |||||||||||||||||||||||||||||||
44 | licenses | The terms of use of the data distribution. | License | 0..n | SHOULD | BGUC5-4 | BGUC5-1;BGUC5-4;BGUC5-8 | |||||||||||||||||||||||||||||
45 | curationStatus | The level of curation of the dataset distribution. | Annotation | 0..n | MAY | E.g. manual, automatic or both; other values such as https://wiki.nci.nih.gov/display/CTRPdoc/Curation+Status+Definitions+-+Include+v4.3.1 | ||||||||||||||||||||||||||||||
46 | conformsTo | A data standard whose requirements and constraints are met by the dataset. | DataStandard | 0..n | MAY | BGUC5-7;WPUC9-p7 | ||||||||||||||||||||||||||||||
47 | formats | The technical format of the dataset distribution. Use the file extension or MIME type when possible. (Definition adapted from DataCite) | string | 0..n | MAY | e.g. PDF, XML, MPG or application/pdf, text/xml, video/mpeg | ||||||||||||||||||||||||||||||
48 | qualifiers | One or more characteristics of the dataset distribution (e.g. how it relates to other distributions, if the data is raw or processed, compressed or encrypted). | Annotation or CategoryValuesPair | 0..n | MAY | e.g. indicate if the distribution is isomorphic (corresponds completely with the dataset), a derivative from the dataset, or is a partial distribution of the dataset. These qualifiers can also indicate if the distribution refers to raw, processed or summarised data. It could also refer to the data being encrypted or compressed. | ||||||||||||||||||||||||||||||
49 | size | The size of the dataset. | number | 0..1 | MAY | BGUC5-1 | ||||||||||||||||||||||||||||||
50 | unit | The unit of measurement used to estimate the size of the dataset (e.g. petabyte). Ideally, the unit should come from a reference controlled terminology. | Annotation | 1, if size is reported | (MUST) | |||||||||||||||||||||||||||||||
51 | extraProperties | Extra properties that do not fit in the previous specified attributes. | CategoryValuesPair | 0..n | MAY | |||||||||||||||||||||||||||||||
52 | ||||||||||||||||||||||||||||||||||||
53 | DataStandard | A format, reporting guideline, terminology. It is used to indicate whether the dataset conforms to a particular community norm or specification. | BGUC5-7;UC15;WPUC9-p7 | |||||||||||||||||||||||||||||||||
54 | identifier | Primary identifier for the standard. | IdentifiersInformation | 0..1 | SHOULD | BGUC5 | ||||||||||||||||||||||||||||||
55 | alternateIdentifiers | Alternate identifiers for the standard. | AlternateIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
56 | relatedIdentifiers | Related identifiers for the standard. | RelatedIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
57 | name | The name of the standard (e.g. FASTQ, CDISC SDTM, ISO8601). | string | 1 | MUST | |||||||||||||||||||||||||||||||
58 | type | The nature of the information resource, ideally specified with a controlled vocabulary or ontology (e.g. model or format, vocabulary, reporting guideline). | Annotation | 1 | MUST | WPUC9-p7 | ||||||||||||||||||||||||||||||
59 | description | A textual narrative comprised of one or more statements describing the data standard. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
60 | licenses | The terms of use of the data standard. | License | 0..n | SHOULD | BGUC5-4 | ||||||||||||||||||||||||||||||
61 | version | A release point for the standard, when applicable. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
62 | extraProperties | Extra properties that do not fit in the previous specified attributes. | CategoryValuesPair | 0..n | MAY | |||||||||||||||||||||||||||||||
63 | ||||||||||||||||||||||||||||||||||||
64 | DataRepository | A repository or catalog of datasets. It could be a primary repository or a repository that aggregates data existing in other repositories. | BGUC1-1;UC2;UC15 | |||||||||||||||||||||||||||||||||
65 | identifier | Primary identifier for the data repository. | IdentifiersInformation | 0..n | SHOULD | BGUC5 | ||||||||||||||||||||||||||||||
66 | alternateIdentifiers | Alternate identifiers for the data repository. | AlternateIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
67 | relatedIdentifiers | Related identifiers for the data repository. | RelatedIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
68 | name | The name of the data repository. | string | 1 | MUST | BGUC1-1;UC2 | ||||||||||||||||||||||||||||||
69 | description | A textual narrative comprised of one or more statements describing the data repository. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
70 | dates | Relevant dates for the data repository. | Date | 0..n | MAY | |||||||||||||||||||||||||||||||
71 | scopes | Information about the nature of the datasets in the repository, ideally from a controlled vocabulary or ontology (e.g. transcription profile, sequence reads, molecular structure, image, DNA sequence, NMR spectra). | Annotation | 0..n | 1..n | SPUC1;SPUC7-2 | ||||||||||||||||||||||||||||||
72 | types | A descriptor (ideally from a controlled vocabulary) providing information about the type of repository, such as primary resource or aggregator. | Annotation | 0..n | SHOULD | |||||||||||||||||||||||||||||||
73 | licenses | The terms of use of the data repository. | License | 0..n | SHOULD | BGUC5-4 | ||||||||||||||||||||||||||||||
74 | version | A release point for the repository, when applicable. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
75 | publishers | The person(s) or organization(s) responsible for the repository and its availability. | Person or Organization | 0..n | SHOULD | |||||||||||||||||||||||||||||||
76 | aggregatorOf | The DataRepositories aggregated by this repository. This property will be empty for primary repositories. | DataRepository | 0..n | MAY | |||||||||||||||||||||||||||||||
77 | access | The information about access modality for the data repository. | Access | 1..n | MAY | |||||||||||||||||||||||||||||||
78 | extraProperties | Extra properties that do not fit in the previous specified attributes. | CategoryValuesPair | 0..n | MAY | |||||||||||||||||||||||||||||||
79 | ||||||||||||||||||||||||||||||||||||
80 | Software | A digital entity containing sets of instructions and operations, which allow computation and operation of and by a computer. | SPUC11;SPUC10 | |||||||||||||||||||||||||||||||||
81 | identifier | Primary identifier for the software. | IdentifiersInformation | 0..n | SHOULD | BGUC5 | ||||||||||||||||||||||||||||||
82 | alternateIdentifiers | Alternate identifiers for the software. | AlternateIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
83 | relatedIdentifiers | Related identifiers for the software. | RelatedIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
84 | name | The name of the software. | string | 1 | MUST | |||||||||||||||||||||||||||||||
85 | description | A textual narrative comprised of one or more statements describing the software. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
86 | licenses | The terms of use of the software. | License | 0..n | SHOULD | |||||||||||||||||||||||||||||||
87 | isUsedBy | The data acquisition or data analysis activity that makes use of this software. | DataAcquisition or DataAnalysis | 0..n | MAY | |||||||||||||||||||||||||||||||
88 | manufacturer | The person or organisation that produced the software. | Person or Organization | 0..1 | MAY | e.g. Adobe | ||||||||||||||||||||||||||||||
89 | version | A release point for the software. | string | 0..1 | SHOULD | |||||||||||||||||||||||||||||||
90 | extraProperties | Extra properties that do not fit in the previous specified attributes. | CategoryValuesPair | 0..n | MAY | |||||||||||||||||||||||||||||||
91 | ||||||||||||||||||||||||||||||||||||
92 | Publication | A (digital) document made available by a publisher. | BGUC5-2;WPUC5-p7;WPUC10-p7;UC2 | |||||||||||||||||||||||||||||||||
93 | identifier | Primary identifier for the publication. | IdentifiersInformation | 1..n | SHOULD | BGUC5 | ||||||||||||||||||||||||||||||
94 | alternateIdentifiers | Alternate identifiers for the publication. | AlternateIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
95 | relatedIdentifiers | Related identifiers for the publication. | RelatedIdentifiersInformation | 0..n | MAY | |||||||||||||||||||||||||||||||
96 | title | The name of the publication, usually one sentence or short description of the publication. | string | 1 | SHOULD | |||||||||||||||||||||||||||||||
97 | dates | Relevant dates; the date of the publication must be provided. | Date | 1..n | SHOULD | |||||||||||||||||||||||||||||||
98 | type | Publication type, ideally delegated to an external vocabulary/resource. | Annotation | 0..1 | SHOULD | e.g. book, article, weblog, chapter, review, correspondence | ||||||||||||||||||||||||||||||
99 | publicationVenue | The name of the publication venue where the document is published if applicable. | string | 0..1 | MAY | |||||||||||||||||||||||||||||||
100 | authorsList | The list of authors made available as a string (does not allow disambiguation). | string | 0..1 | SHOULD |