Appendix I - NIH BD2K BioCADDIE DataMed DATS model working version v2.3
License for this material: CC BY-SA 3.0.
BioCADDIE WG: Metadata WG3
Authors: Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, with George Alter, Mary Vardigan, Jeffrey Grethe, Hua Xu and the DataMed development team; and with contributions from the members of the bioCADDIE Metadata WG3 and WG7
Entity | Property | Definition | Value(s) | Cardinality | Requirement Level | Relevant Competency Question(s) | Notes or Example(s) | Presence in Schema(s)/Model(s)
CORE DATS
Dataset | A set of dimensions about an entity being observed. A collection of data, published or curated by a single agent, and available for access or download in one or more formats (from DCAT: http://www.w3.org/TR/vocab-dcat/#Class:_Dataset). A body of structured information describing some topic(s) of interest (from: http://schema.org/Dataset). | BGUC5-2;BGUC5-4;BGUC5-5;UC2;UC15;WPUC5-p7;WPUC7-p7;WPUC8-p7;WPUC10-p7
identifier | Primary identifier for the dataset. | IdentifiersInformation | 0..1 | SHOULD | BGUC5
alternateIdentifiers | Alternate identifiers for the dataset. | AlternateIdentifiersInformation | 0..n | MAY
relatedIdentifiers | Related identifiers for the dataset. | RelatedIdentifiersInformation | 0..n | MAY
title | The name of the dataset, usually one sentence or short description of the dataset. | string | 1 | MUST | DataCite[/resource/titles];DataCite[/resource/titles/title];Schema.org[https://schema.org/headline];HCLS[(dct:title,rdf:langString)]
types | A term, ideally from a controlled terminology, identifying the dataset type or nature of the data, placing it in a typology. | DataType | 1..n | MUST | BGUC1-1;BGUC1-2;BGUC3-2;BGUC3-3;BGUC5;BGUC5-1;WPUC1;WPUC2;WPUC3;WPUC9-p7;UC1 | For example: microscopy imaging, gene expression profile, genomic sequence, fMRI, pathway simulation.
creators | The person(s) or organization(s) which contributed to the creation of the dataset. | Person or Organization | 1..n | MUST | UC2
dates | Relevant dates for the dataset; a date, e.g. the creation date or last modification date, should be added. | Date | 0..n | MAY | equivalent to TemporalCoverage
spatialCoverage | The geographical extension and span covered by the dataset and its measured dimensions/variables. | Place | 0..n | MAY
licenses | The terms of use of the dataset. | License | 0..n | SHOULD | BGUC5-4 | BGUC5-1;BGUC5-4;BGUC5-8
distributions | The distribution(s) by which datasets are made available (for example: MySQL dump). | DatasetDistribution | 0..n | SHOULD
description | A textual narrative comprised of one or more statements describing the dataset. | string | 0..1 | SHOULD | DataCite[/resource/descriptions;/resource/descriptions/description;/resource/descriptions/description/descriptionType]
storedIn | The data repository hosting the dataset. | DataRepository | 0..1 | MAY | BGUC1-1;UC2 | From the DDI perspective every dataset may come from a data repository; the requirement is relaxed here to allow for datasets that are available online but not held in a repository.
dimensions | The different dimensions (granular components) making up a dataset. | Dimension | 0..n | MAY | BGUC2;BGUC5-4
primaryPublications | The primary publication(s) associated with the dataset, usually describing how the dataset was produced. | Publication | 0..n | MAY | BGUC5-2
citations | The publication(s) that cite this dataset. | Publication | 0..n | MAY
citationCount | The number of publications that cite this dataset (enumerated in the citations property). | integer | 0..1 | MAY
producedBy | A study process which generated a given dataset, if any. | Study | 0..1 | SHOULD
isAbout | Different entities (biological entity, taxonomic information, disease, molecular entity, anatomical part, treatment) associated with this dataset. | BiologicalEntity or TaxonomicInformation or Disease or MolecularEntity or AnatomicalPart or Treatment or Annotation | 0..n | SHOULD
hasPart | A Dataset that is a subset of this Dataset. Datasets declaring the 'hasPart' relationship are considered a collection of Datasets; the aggregation criteria could be included in the 'description' field. | Dataset | 0..n | MAY
keywords | Tags associated with the dataset, which will help in its discovery. | Annotation | 0..n | MAY
version | A release point for the dataset when applicable. | string | 0..1 | SHOULD | WPUC5-p7
acknowledges | The grant(s) which funded and supported the work reported by the dataset. | Grant | 0..n | MAY
availability | A qualifier indicating the different types of availability for a dataset (available, unavailable, embargoed, available with restriction, information not available). | string | 0..1 | SHOULD | see CV
refinement | A qualifier to describe the level of data processing of the dataset and its distributions. | string | 0..1 | SHOULD | see CV
aggregation | A qualifier indicating if the entity represents an 'instance of dataset' or a 'collection of datasets'. | string | 0..1 | SHOULD | see CV
extraProperties | Extra properties that do not fit in the previously specified attributes. | CategoryValuesPair | 0..n | MAY
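As a rough illustration of how the requirement levels above might be applied, the sketch below checks a dictionary-shaped Dataset record for the three properties the table marks MUST (title, types, creators). The helper `missing_must_fields`, the dictionary layout, and the inner keys `value` and `name` are assumptions made for this example; they are not defined by DATS.

```python
# Properties of Dataset that the table marks MUST.
DATASET_MUST = ("title", "types", "creators")


def missing_must_fields(record: dict) -> list:
    """Return the MUST properties that are absent or empty in `record`.

    Hypothetical helper: DATS itself does not prescribe this function.
    """
    return [name for name in DATASET_MUST if not record.get(name)]


# Illustrative record; every value (and the inner keys) is made up.
dataset = {
    "title": "Example gene expression profile",
    "types": [{"value": "gene expression profile"}],
    "creators": [{"name": "Jane Doe"}],
    "version": "1.0",  # SHOULD-level property, optional here
}

print(missing_must_fields(dataset))                    # []
print(missing_must_fields({"title": "Only a title"}))  # ['types', 'creators']
```

SHOULD and MAY properties are deliberately ignored by this check; a stricter tool could report missing SHOULD properties as warnings rather than errors.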
DatasetDistribution | A specific available form of a dataset. Each dataset might be available in different forms; these forms might represent different formats of the dataset or different endpoints. Examples of distributions include a downloadable CSV file, an API or an RSS feed. (From DCAT)
identifier | Primary identifier for the dataset distribution. | IdentifiersInformation | 0..1 | SHOULD | BGUC5
alternateIdentifiers | Alternate identifiers for the dataset distribution. | AlternateIdentifiersInformation | 0..n | MAY
relatedIdentifiers | Related identifiers for the dataset distribution. | RelatedIdentifiersInformation | 0..n | MAY
title | The name of the dataset distribution, usually one sentence or short description of the dataset. | string | 0..1 | MAY
description | A textual narrative comprised of one or more statements describing the dataset distribution. | string | 0..1 | SHOULD
dates | Relevant dates for the dataset; e.g. the creation date or last modification date may be added. | Date | 0..n | MAY | equivalent to TemporalCoverage
storedIn | The data repository hosting the dataset distribution. | DataRepository | 0..1 | MAY | BGUC1-1;UC2 | From the DDI perspective every dataset may come from a data repository; the requirement is relaxed here to allow for datasets that are available online but not held in a repository.
version | A release point for the dataset when applicable. | string | 0..1 | SHOULD | WPUC5-p7
access | The information about access modality for the dataset distribution. | Access | 1 | MUST
licenses | The terms of use of the data distribution. | License | 0..n | SHOULD | BGUC5-4 | BGUC5-1;BGUC5-4;BGUC5-8
curationStatus | The level of curation of the dataset distribution. | Annotation | 0..n | MAY | E.g. manual, automatic or both; for other values see https://wiki.nci.nih.gov/display/CTRPdoc/Curation+Status+Definitions+-+Include+v4.3.1
conformsTo | A data standard whose requirements and constraints are met by the dataset. | DataStandard | 0..n | MAY | BGUC5-7;WPUC9-p7
formats | The technical format of the dataset distribution. Use the file extension or MIME type when possible. (Definition adapted from DataCite) | string | 0..n | MAY | e.g. PDF, XML, MPG or application/pdf, text/xml, video/mpeg
qualifiers | One or more characteristics of the dataset distribution (e.g. how it relates to other distributions, if the data is raw or processed, compressed or encrypted). | Annotation or CategoryValuesPair | 0..n | MAY | e.g. indicate if the distribution is isomorphic (corresponds completely with the dataset), a derivative from the dataset, or is a partial distribution of the dataset. These qualifiers can also indicate if the distribution refers to raw, processed or summarised data. It could also refer to the data being encrypted or compressed.
size | The size of the dataset. | number | 0..1 | MAY | BGUC5-1
unit | The unit of measurement used to estimate the size of the dataset (e.g. petabyte). Ideally, the unit should come from a reference controlled terminology. | Annotation | 1, if size is reported | (MUST)
extraProperties | Extra properties that do not fit in the previously specified attributes. | CategoryValuesPair | 0..n | MAY
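The DatasetDistribution rows contain one conditional rule: unit has cardinality 1 only if size is reported. A minimal sketch of that rule, together with the unconditional MUST on access, might look like the following. The function name `check_distribution`, the dictionary layout, and the sample values (including the `landingPage` key) are assumptions for illustration only.

```python
def check_distribution(dist: dict) -> list:
    """Return human-readable problems found in a distribution record.

    Hypothetical helper covering two rules from the table:
    - access: cardinality 1, MUST
    - unit:   cardinality 1 (MUST) only when size is reported
    """
    problems = []
    if not dist.get("access"):
        problems.append("access is required")
    if dist.get("size") is not None and not dist.get("unit"):
        problems.append("unit is required when size is reported")
    return problems


# Illustrative records; all values are made up.
ok = {"access": {"landingPage": "https://example.org/data"}, "size": 2.5, "unit": "gigabyte"}
no_unit = {"access": {"landingPage": "https://example.org/data"}, "size": 2.5}
no_size = {"access": {"landingPage": "https://example.org/data"}}

print(check_distribution(ok))       # []
print(check_distribution(no_unit))  # ['unit is required when size is reported']
print(check_distribution(no_size))  # []
```

The check uses `is not None` for size so that a reported size of 0 still triggers the unit requirement.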
DataStandard | A format, reporting guideline or terminology. It is used to indicate whether the dataset conforms to a particular community norm or specification. | BGUC5-7;UC15;WPUC9-p7
identifier | Primary identifier for the standard. | IdentifiersInformation | 0..1 | SHOULD | BGUC5
alternateIdentifiers | Alternate identifiers for the standard. | AlternateIdentifiersInformation | 0..n | MAY
relatedIdentifiers | Related identifiers for the standard. | RelatedIdentifiersInformation | 0..n | MAY
name | The name of the standard (e.g. FASTQ, CDISC SDTM, ISO8601). | string | 1 | MUST
type | The nature of the information resource, ideally specified with a controlled vocabulary or ontology (e.g. model or format, vocabulary, reporting guideline). | Annotation | 1 | MUST | WPUC9-p7
description | A textual narrative comprised of one or more statements describing the data standard. | string | 0..1 | SHOULD
licenses | The terms of use of the data standard. | License | 0..n | SHOULD | BGUC5-4
version | A release point for the standard when applicable. | string | 0..1 | SHOULD
extraProperties | Extra properties that do not fit in the previously specified attributes. | CategoryValuesPair | 0..n | MAY
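The Cardinality column uses a compact notation (1, 0..1, 0..n, 1..n). One possible way to interpret it programmatically is sketched below, using a few DataStandard rows as input. The functions `parse_cardinality` and `satisfies` and the sample counts are assumptions introduced for this example, not part of the model.

```python
from typing import Optional, Tuple


def parse_cardinality(text: str) -> Tuple[int, Optional[int]]:
    """Turn '1', '0..1' or '1..n' into (minimum, maximum).

    A maximum of None means unbounded ('n').
    """
    low, _, high = text.partition("..")
    if not high:  # a bare number such as '1' means exactly that many
        high = low
    return int(low), (None if high == "n" else int(high))


def satisfies(count: int, cardinality: str) -> bool:
    """Check whether `count` values are allowed by a cardinality string."""
    low, high = parse_cardinality(cardinality)
    return count >= low and (high is None or count <= high)


# Cardinalities copied from the DataStandard rows of the table.
DATA_STANDARD_CARDINALITY = {
    "identifier": "0..1",
    "name": "1",
    "type": "1",
    "description": "0..1",
    "licenses": "0..n",
}

# Made-up counts of values supplied for each property in some record.
counts = {"identifier": 0, "name": 1, "type": 0, "description": 1, "licenses": 3}

violations = [
    prop for prop, card in DATA_STANDARD_CARDINALITY.items()
    if not satisfies(counts[prop], card)
]
print(violations)  # ['type']
```

Conditional entries such as "1, if size is reported" do not fit this notation and would need separate handling.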
DataRepository | A repository or catalog of datasets. It could be a primary repository or a repository that aggregates data existing in other repositories. | BGUC1-1;UC2;UC15
identifier | Primary identifier for the data repository. | IdentifiersInformation | 0..n | SHOULD | BGUC5
alternateIdentifiers | Alternate identifiers for the data repository. | AlternateIdentifiersInformation | 0..n | MAY
relatedIdentifiers | Related identifiers for the data repository. | RelatedIdentifiersInformation | 0..n | MAY
name | The name of the data repository. | string | 1 | MUST | BGUC1-1;UC2
description | A textual narrative comprised of one or more statements describing the data repository. | string | 0..1 | SHOULD
dates | Relevant dates for the data repository. | Date | 0..n | MAY
scopes | Information about the nature of the datasets in the repository, ideally from a controlled vocabulary or ontology (e.g. transcription profile, sequence reads, molecular structure, image, DNA sequence, NMR spectra). | Annotation | 0..n | 1..n | SPUC1;SPUC7-2
types | A descriptor (ideally from a controlled vocabulary) providing information about the type of repository, such as primary resource or aggregator. | Annotation | 0..n | SHOULD
licenses | The terms of use of the data repository. | License | 0..n | SHOULD | BGUC5-4
version | A release point for the repository, when applicable. | string | 0..1 | SHOULD
publishers | The person(s) or organization(s) responsible for the repository and its availability. | Person or Organization | 0..n | SHOULD
aggregatorOf | The DataRepositories aggregated by this repository. This property will be empty for primary repositories. | DataRepository | 0..n | MAY
access | The information about access modality for the data repository. | Access | 1..n | MAY
extraProperties | Extra properties that do not fit in the previous specified attributes. | CategoryValuesPair | 0..n | MAY
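Because aggregatorOf points from a DataRepository to other DataRepository records and is empty for primary repositories, an aggregator can be unfolded into the primary repositories it ultimately draws on. The sketch below shows one way to do that with nested dictionaries; the function `primary_repositories`, the nesting layout, and the repository names are assumptions for illustration, and the code assumes the aggregation graph contains no cycles.

```python
def primary_repositories(repo: dict) -> list:
    """List the names of the primary repositories behind `repo`.

    A repository with an empty or missing 'aggregatorOf' is treated as
    primary, following the property's definition in the table.
    Hypothetical helper; assumes no repository aggregates itself.
    """
    children = repo.get("aggregatorOf") or []
    if not children:
        return [repo["name"]]
    names = []
    for child in children:
        names.extend(primary_repositories(child))
    return names


# Illustrative repositories; all names are made up.
catalog = {
    "name": "Example Aggregator",
    "aggregatorOf": [
        {"name": "Repo A"},
        {"name": "Regional Hub", "aggregatorOf": [{"name": "Repo B"}, {"name": "Repo C"}]},
    ],
}

print(primary_repositories(catalog))  # ['Repo A', 'Repo B', 'Repo C']
```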
Software | A digital entity containing sets of instructions and operations, which allows computation and operation of and by a computer. | SPUC11;SPUC10
identifier | Primary identifier for the software. | IdentifiersInformation | 0..n | SHOULD | BGUC5
alternateIdentifiers | Alternate identifiers for the software. | AlternateIdentifiersInformation | 0..n | MAY
relatedIdentifiers | Related identifiers for the software. | RelatedIdentifiersInformation | 0..n | MAY
name | The name of the software. | string | 1 | MUST
description | A textual narrative comprised of one or more statements describing the software. | string | 0..1 | SHOULD
licenses | The terms of use of the software. | License | 0..n | SHOULD
isUsedBy | The data acquisition activity that makes use of this software. | DataAcquisition or DataAnalysis | 0..n | MAY
manufacturer | The person or organisation that produced the software. | Person or Organization | 0..1 | MAY | e.g. Adobe
version | A release point for the software. | string | 0..1 | SHOULD
extraProperties | Extra properties that do not fit in the previously specified attributes. | CategoryValuesPair | 0..n | MAY
Publication | A (digital) document made available by a publisher. | BGUC5-2;WPUC5-p7;WPUC10-p7;UC2
identifier | Primary identifier for the publication. | IdentifiersInformation | 1..n | SHOULD | BGUC5
alternateIdentifiers | Alternate identifiers for the publication. | AlternateIdentifiersInformation | 0..n | MAY
relatedIdentifiers | Related identifiers for the publication. | RelatedIdentifiersInformation | 0..n | MAY
title | The name of the publication, usually one sentence or short description of the publication. | string | 1 | SHOULD
dates | Relevant dates; the date of the publication must be provided. | Date | 1..n | SHOULD
type | Publication type, ideally delegated to an external vocabulary/resource. | Annotation | 0..1 | SHOULD | e.g. book, article, weblog, chapter, review, correspondence
publicationVenue | The name of the publication venue where the document is published, if applicable. | string | 0..1 | MAY
authorsList | The list of authors made available as a string (does not allow disambiguation). | string | 0..1 | SHOULD