RDA assessment of data WG definitions
1
This is a working document for the RDA WG, "Assessing data fitness for use"
2
These terms will be added to the next cycle of the IRiDiuM WG in January 2018 after which this document will be deprecated.
3
For terms and definitions that do not appear in this list, please consult:
4
http://dictionary.casrai.org/Category:Research_Data_Domain
5
6
Add to IRiDiuM RDM2018
Term
Definition1
Reference
IRiDiuM existing
IRiDiuM RDM2018 forum for new terms and review of existing terms
7
for review
access
The continued, available for use, ongoing usability of a digital resource, retaining all qualities of authenticity, accuracy and functionality deemed to be essential for the purposes the digital material was created and/or acquired for. Users who have access can retrieve, manipulate, copy, and store copies on a wide range of hard drives and external devices.
Research Data Alliance http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page ; http://www.techopedia.com/definition/26929/data-access
http://dictionary.casrai.org/Access
8
new
accessibility
9
anonymization
A rigorous process, carried out in accordance with a standard, to remove personally identifiable information from a dataset to such an extent that the data cannot be traced back to individuals, directly or indirectly. It relates to pseudonymization, where personally identifiable information is replaced with a meaningless number or other identifier with which information on anonymous individuals from different sources can still be linked.
Information Commissioner’s Office Code of Practice on Anonymisation https://ico.org.uk/for-organisations/guide-to-data-protection/anonymisation/
https://forum.casrai.org/c/standards/iridium
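The pseudonymization step described in the anonymization entry can be illustrated with a minimal Python sketch. It is not taken from the cited code of practice; the record fields, the sample names, and the helpers `make_pseudonymizer` and `strip_names` are all hypothetical and exist only for illustration.

```python
import itertools


def make_pseudonymizer():
    """Return a function that maps each identifier to a stable, meaningless number."""
    counter = itertools.count(1)
    mapping = {}  # identifier -> pseudonym; this table itself is re-identifying

    def pseudonymize(identifier):
        if identifier not in mapping:
            mapping[identifier] = next(counter)
        return mapping[identifier]

    return pseudonymize


def strip_names(records, pseudonymize):
    """Replace the hypothetical 'name' field with a pseudonym under the key 'pid'."""
    result = []
    for record in records:
        cleaned = {k: v for k, v in record.items() if k != "name"}
        cleaned["pid"] = pseudonymize(record["name"])
        result.append(cleaned)
    return result


# Two hypothetical sources that mention the same person.
clinic = [{"name": "Alice Example", "diagnosis": "asthma"}]
survey = [
    {"name": "Alice Example", "smoker": False},
    {"name": "Bob Example", "smoker": True},
]

pseudonymize = make_pseudonymizer()
clinic_p = strip_names(clinic, pseudonymize)
survey_p = strip_names(survey, pseudonymize)
```

Records about the same person carry the same `pid` in both outputs, so they can still be linked. Because the mapping table allows re-identification, this sketch is pseudonymization only and does not meet the definition of anonymization given above.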
10
for review
certified product
A product that has been inspected, evaluated, tested, or otherwise determined to be in conformance or compliance with applicable or specified provisions of referenced standards, codes, or other requirements and certified by an authority which is recognized or has the legal power to grant such certification. Certified products imply a guarantee or warranty of product conformance and that the product is under the test and surveillance procedures of a specified certification system.
American National Standards Institute ANSI "Standards Management: A Handbook for Profit"
http://dictionary.casrai.org/Certified_product
11
certification
12
for review
curation
The activity of managing and promoting the use of data from their point of creation to ensure that they are fit for contemporary purpose and available for discovery and reuse. For dynamic datasets this may mean continuous enrichment or updating to keep them fit for purpose. Higher levels of curation will also involve links with annotation and with other published materials.
JISC e-Science Curation Report/TC3+
http://dictionary.casrai.org/Curation
13
data collection
1. In the context of how data are organized: A logical grouping of (research) datasets that share a common aspect or concept. A data collection is the highest in the hierarchy of data groupings (data collection, data set, data granule) and comprises a grouping of datasets that have a strong connection and are organised coherently around a single element or concept (e.g., model, instrument). 2. In the context of how and what data are collected: A data management plan answers the following questions: What types of data will be collected, created, linked to, acquired and/or recorded? What file formats will the data be collected in? Will the file formats allow for data reuse, sharing and long-term access to the data? What conventions and procedures will be used to structure, name and version-control data files to help understand how the data are organized?
https://forum.casrai.org/c/standards/iridium
14
new
data fitness for purpose
Data fitness for purpose is defined by external requirements to a particular end. High quality data may not be fit for a particular purpose. RELATED TERM: data quality
http://www.mocnik-science.net/publications/2017c%20-%20Franz-Benjamin%20Mocnik,%20Alexander%20Zipf,%20Hongchao%20Fan%20-%20Data%20Quality%20and%20Fitness%20for%20Purpose.pdf
15
new
data lineage
A low level, detailed technical audit trail providing a description of the historical record of the data and its origins through the entire data life cycle.
16
for review
data management
The activities of data policies, data planning, data element standardization, information management control, data synchronization, data sharing, and database development, including practices and projects that acquire, control, protect, deliver and enhance the value of data and information. SYNONYM. Data coordination
http://dictionary.casrai.org/Data_management
17
data management plan
Data management plans are living documents consisting of the practices, processes, and strategies that pertain to a set of specified topics related to data management and curation in research. These topics give structure to a DMP and may originate from a particular data policy, for example, data sharing or depositing data, or from an institutional goal, such as data archiving or data stewardship. While in some instances these policies or goals may be implicit, they nevertheless are essential to understand the content that is expected in a DMP.
Portage Network; CCSDS (2017) Data management plans - Information Preparation To Ensure Long Term Use https://www.rd-alliance.org/system/files/documents/InformationPreparationToEnsureLongTermUse-20170420.docx
http://dictionary.casrai.org/Data_management_plan
https://forum.casrai.org/c/standards/iridium
18
new
data provenance
A high level description of where the data came from; a type of historical information or metadata about the origin, location or the source of the data. For example, information about the Principal Investigator who recorded the data. RELATED TERM: data lineage
http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page
19
for review
data quality
The reliability and application efficiency of data. It is a perception or an assessment of a dataset's fitness to serve its purpose in a given context. Aspects of data quality include: Accuracy, Completeness, Update status, Relevance, Consistency across data sources, Reliability, Appropriate presentation, Accessibility. Within an organization, acceptable data quality is crucial to operational and transactional processes and to the reliability of analytics, business intelligence, and reporting. Data quality is affected by the way data are entered, stored and managed. Maintaining data quality requires going through the data periodically and scrubbing it. Typically this involves updating, standardizing, and de-duplicating records to create a single view of the data, even if it is stored in multiple disparate systems. Data quality assurance (DQA) is the process of verifying the reliability and effectiveness of data. RELATED TERM. Data cleaning
http://dictionary.casrai.org/Data_quality
20
data quality
Data quality is defined by internal characteristics such as completeness, logical consistency, positional and thematic accuracy, temporal quality, etc. RELATED TERM: data fitness for purpose
http://www.mocnik-science.net/publications/2017c%20-%20Franz-Benjamin%20Mocnik,%20Alexander%20Zipf,%20Hongchao%20Fan%20-%20Data%20Quality%20and%20Fitness%20for%20Purpose.pdf
http://dictionary.casrai.org/Data_quality
21
for review
data usability
Data that can be understood and used without additional information. Usable data are delivered in a form that meets the needs of different end-user audiences, is ready for the tasks that the end-user needs to accomplish, and that has been adapted to the end-user's needs (not the other way around). Usable data have been cleaned, structured, are in machine readable format, fully documented, and ready for analysis and interpretation.
http://dictionary.casrai.org/Usable_data
22
data quality review
A process that goes beyond quality assurance/quality control (QA/QC) to ensure the quality of data and provide sufficient information to allow all potential users to readily evaluate the "fitness for purpose" of the data.
Peer et al. (2014) Committing to Data Quality Review http://isps.yale.edu/sites/default/files/files/CommitingToDataQualityReview_idcc14-PrePrint.pdf
https://forum.casrai.org/c/standards/iridium
23
for review
dataset
An organized collection of data, defined by a theme or category that reflects what is being measured/observed/monitored. In machine-to-machine interactions, a dataset is a compilation of data that constitutes a programmable data unit that has been collected and organised using one process. It has a single Data Owner, a single license, one set of semantics, ontologies, vocabularies, and has a single data format and internal data convention. A dataset must include its version.
http://dictionary.casrai.org/Dataset
https://forum.casrai.org/c/standards/iridium
24
new
discoverability
25
findable data
Data relating to a given topic or attribute are findable when: (a) metadata are assigned a globally unique and eternally persistent identifier; (b) data are described with rich metadata; (c) metadata are registered or indexed in a searchable resource; and (d) metadata specify the data identifier.
Wilkinson et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship http://www.nature.com/articles/sdata201618
https://forum.casrai.org/c/standards/iridium
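The four conditions in the findable data entry can be read as a checklist over a metadata record. The Python sketch below is illustrative only and is not part of the FAIR principles: the field names, the `REQUIRED_FIELDS` threshold for "rich" metadata, the sample identifiers, and the function `findability_report` are assumptions. Global uniqueness and persistence of an identifier cannot be verified locally, so the sketch checks only that an identifier is present.

```python
# Assumption: what counts as "rich" metadata is community-specific;
# these four fields are a stand-in threshold, not a standard.
REQUIRED_FIELDS = {"title", "creator", "description", "keywords"}


def findability_report(metadata, searchable_index):
    """Check a metadata record (dict) against the four findability conditions.

    `searchable_index` is a set of metadata identifiers standing in for a
    searchable registry or catalogue.
    """
    filled = {key for key, value in metadata.items() if value}
    identifier = metadata.get("identifier")
    return {
        # Presence only; uniqueness and persistence are not testable here.
        "identifier_assigned": bool(identifier),
        "rich_metadata": REQUIRED_FIELDS <= filled,
        "indexed": identifier in searchable_index,
        "data_identifier_specified": bool(metadata.get("data_identifier")),
    }


# Hypothetical record and index; the identifiers are made up.
record = {
    "identifier": "meta:0001",
    "data_identifier": "data:0001",
    "title": "River temperature 2016",
    "creator": "Example Lab",
    "description": "Hourly water temperature readings.",
    "keywords": ["hydrology", "temperature"],
}
report = findability_report(record, searchable_index={"meta:0001"})
is_findable = all(report.values())
```

Returning one flag per condition, rather than a single boolean, shows which condition a record fails.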
26
new
findability
27
new
fitness for purpose
28
new
fitness for use
29
for review
interoperability
1. The capability to communicate, execute programs, or transfer data among various functional units in a useful and meaningful manner that requires the user to have little or no knowledge of the unique characteristics of those units. Foundational, syntactic, and semantic interoperability are the three necessary aspects of interoperability. 2. Legal interoperability occurs among multiple datasets when: the legal use conditions are clearly and readily determinable for each of the datasets, typically through automated means; the legal use conditions imposed on each dataset allow creation and use of combined or derivative products; and users may legally access and use each dataset without seeking authorization from data rights holders on a case-by-case basis, assuming that the accumulated conditions of use for each and all of the datasets are met.
RDA Data Foundation and Terminology Interest Group; http://smw-rda.esc.rzg.mpg.de/exports/tedt.html#Interoperability; RDA-CODATA Legal Interoperability Interest Group; Legal Interoperability of Research Data: Principles and Implementation Guidelines, Zenodo. https://doi.org/10.5281/zenodo.162241; The Open Group TOGAF Documentation. TBS Standard On Metadata (Dublin Core Metadata Initiative); DAMA Dictionary of Data Management; ISO/IEC 2382-01, Information Technology Vocabulary, Fundamental Terms
http://dictionary.casrai.org/Interoperability
https://forum.casrai.org/c/standards/iridium
30
new
licence
31
new
lineage
32
for review
metadata
Literally, "data about data"; data that defines and describes the characteristics of other data, used to improve both business and technical understanding of data and data-related processes. Business metadata includes the names and business definitions of subject areas, entities and attributes, attribute data types and other attribute properties, range descriptions, valid domain values and their definitions. Technical metadata includes physical database table and column names, column properties, and the properties of other database objects, including how data is stored. Process metadata is data that defines and describes the characteristics of other system elements (processes, business rules, programs, jobs, tools, etc.). Data stewardship metadata is data about data stewards, stewardship processes and responsibility assignments.
http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page
http://dictionary.casrai.org/Metadata
33
for review
peer review
A process by which a scholarly work (such as a paper or a research proposal) is checked by a group of experts in the same field to make sure it meets the necessary standards before it is published or accepted.
http://dictionary.casrai.org/Peer_review
34
new
PID
35
for review
PID record
A type of record (and organization) that stores an instance of an executable/understandable PID. The content of a PID record distinguishes a registered digital or data object from other DIOs. A PID record is a type of record that includes property information that characterizes the digital object it is identifying. Important parts of a PID record are location and checksum. However, there is a large variation in usage. In some data models the PID is simply used as a unique label with an empty record. A PID record has a lifecycle including creation, publication, curation and destruction.
http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page
http://dictionary.casrai.org/PID_record
36
for review
PID system
Consists of at least one PID resolver, a name schema and a defined mechanism for issuing PIDs that conform to the name schema. Examples include: DOI, Handle System, URN, ARK, PURL, etc.
http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page
http://dictionary.casrai.org/PID_system
37
for review
preservation
An archiving activity that ensures availability of data and associated metadata over time, regardless of format, so that they can be accessed and understood through changes in technology. Actions include maintaining integrity, discoverability, and accessibility, and facilitating (re)use in the long term. Preservation is one of the tasks of data curation.
http://ceos.org/document_management/Working_Groups/WGISS/Interest_Groups/Data_Stewardship/White_Papers/EO-DataStewardshipGlossary_v1.1.pdf
http://dictionary.casrai.org/Preservation
https://forum.casrai.org/c/standards/iridium
38
for review
provenance
http://dictionary.casrai.org/Provenance
39
new
reusable
40
reusable data
Reusable data are analysis ready, and: (a) include sufficient information and metadata for the data to be independently understandable without having to resort to other resources, assessable, and interpretable; (b) have been cleaned, structured, are in machine readable format, fully documented, and ready for analysis and interpretation; (c) are delivered in a form that meets the needs of different potential end-users, are ready for the tasks that potential end-users need to accomplish, and have been adapted to end-users' needs (not the other way around).
https://www.force11.org/group/fairgroup/fairprinciples; https://public.ccsds.org/pubs/650x0m2.pdf
https://forum.casrai.org/c/standards/iridium
41
new
stewardship
42
new
timestamp
43
for review
usable data
Data that can be understood and used without additional information. Usable data are delivered in a form that meets the needs of different end-user audiences, is ready for the tasks that the end-user needs to accomplish, and that has been adapted to the end-user's needs (not the other way around). Usable data have been cleaned, structured, are in machine readable format, fully documented, and ready for analysis and interpretation.
http://dictionary.casrai.org/Usable_data
44
for review
version control
Control over time of data, computer code, software, and documents that allows for the ability to revert to a previous revision, which is critical for data traceability, tracking edits, and correcting mistakes. Version control generates a (changed) copy of a data object that is uniquely labeled with a version number. The intent is to track changes to a data object, by making versioned copies. Note that a version is different from a backup copy, which is typically a copy made at a specific point in time, or a replica. SYNONYM. Source control; Revision control; Versioning. RELATED TERM. Universal numeric fingerprint; Data citation
Research Data Alliance http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page ; http://www.alliancepermanentaccess.org/index.php/knowledge-base/dpglossary/#B
http://dictionary.casrai.org/Version_control
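The behaviour described in the version control entry (versioned copies labeled with a version number, plus the ability to revert) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular version control system works; the class `VersionedObject` and its methods are hypothetical names chosen for illustration.

```python
class VersionedObject:
    """Keep every version of a data object, labeled 1, 2, 3, ... in order."""

    def __init__(self, content):
        self._versions = [content]  # version 1 is stored at index 0

    @property
    def current_version(self):
        return len(self._versions)

    def get(self, version=None):
        """Return the content of a given version (default: the latest)."""
        if version is None:
            version = self.current_version
        if not 1 <= version <= self.current_version:
            raise ValueError(f"no such version: {version}")
        return self._versions[version - 1]

    def update(self, content):
        """Store changed content as a new version and return its number."""
        self._versions.append(content)
        return self.current_version

    def revert(self, version):
        """Revert by recording the older content as a new version.

        Design choice (an assumption of this sketch): history is never
        rewritten, so the revert itself stays traceable.
        """
        return self.update(self.get(version))


doc = VersionedObject("temperature,12.1")
doc.update("temperature,21.1")   # version 2: a mistaken edit
doc.revert(1)                    # version 3: same content as version 1
```

After these calls the object holds three versions, and the latest one has the same content as version 1 while the mistaken edit remains available as version 2.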