LCRDM Task Group Dutch Data Curation Network "CURATE model"
1
We kindly invite you to add to this adapted version of the US CURATE matrix, to gain more insight into current data curation practices in the Netherlands. All previous input from other institutions is visible to everyone. In our opinion, filling in the matrix and comparing your answers with those of other institutions is very informative. The source from the Data Curation Network can be found here: https://datacurationnetwork.org/home/resources/
Just to be sure, this is the link to this shared document: https://docs.google.com/spreadsheets/d/1U2dzCA4xCyVoZGLyf8hDdPwNEHwmvDgB6AvBALtNXk8/edit?usp=sharing
2
Please pay attention. The questions have been slightly rewritten into open questions. This means that in some cases an answer will not fully correspond to its question (closed answers to open questions). We would highly appreciate it if you answered the questions in column C. Thank you very much!

If you have any questions, please feel free to ask (i.slouwerhof@ubn.ru.nl). Don't forget to inform Inge Slouwerhof that you completed the form, so we can contact you in case of questions.
3
4
Please fill in here | Please fill in here | Please fill in here | Radboud University | 4TU.ResearchData | TU Delft | Groningen | DANS | Hogeschool InHolland | Utrecht University | SURFsara | Utrecht University DataverseNL | Meertens Instituut
5
Question: Provide a short overview of how data curation is set up at your institution (we are interested in the quality assessments in place for data that is archived or published within the institution and/or by its employees). Which data is curated, and in which circumstances?
Radboud University: We curate only the datasets that are sent to us for archiving in the DANS repository. Researchers can deposit their dataset using the RIS system (https://www.ru.nl/research-information-services/). We use a standardized control form to curate the datasets. At least two colleagues from the RDM support team check the dataset independently of each other. For researchers, there is a manual for this process: https://www.ru.nl/research-information-services/manuals/step-archiving-dataset/
4TU.ResearchData: 4TU.ResearchData is an archive for long-term access to and curation of research datasets, with a focus on data from science, engineering and technology. Every researcher, both in the Netherlands and abroad, can upload data to the data archive or access and download data for use in their research. The publication workflow may differ slightly depending on whether we are in direct contact with the researcher or with a front office.
TU Delft: The TU Delft Library (RDS team) acts as a front office for 4TU.ResearchData. This means that all feedback on dataset uploads by TU Delft researchers is provided via the data officer in the RDS team. The data officer receives the metadata quality review from the moderator of 4TU.ResearchData. The RDS team works closely with the faculty data stewards, who provide domain-specific support for RDM.
Groningen: The UG states basic requirements on data curation in its data policy (2015). Most research institutes have a protocol for long-term storage which includes naming conventions, metadata, codebooks, etc. The purpose is to serve research integrity and reuse within the research group. Archaeology curates part of its data in DataverseNL; GELIFES has its own repository (restricted access). The RDO curates data in DataverseNL and supports some other repositories (list recommended repositories). The RDO also curates metadata in Pure (with help of the Pure team).
DANS: Additional information at http://tinyurl.com/y2uwf45p
Hogeschool InHolland: At Inholland we have been working on research support for two years now. Initially the focus has been mainly on the process of publishing in OA and on data management plans. There is no specific focus on datasets yet, but I do see that coming. In the past, well over 1000 publications were entered, for which the metadata is still often incomplete and sometimes incorrect. This was mainly due to researchers' unfamiliarity and inexperience (at Inholland, researchers enter records decentrally and the research supporters check the metadata afterwards). In terms of process and issues I therefore do see common ground with data curation. The sheet has been filled in from the situation we would like to have with regard to data curation.
Utrecht University: Utrecht University has its own repository, YODA. The research output archived in YODA is checked by a data manager. Utrecht University also has an agreement with DataVerse; the datasets published through DataVerse are checked by the local admin. As far as I know, there is no special curation service if the researcher wants to publish at DANS EASY or use any other repository. The RDM team is ready to help with any questions about data publication, but we do not have strict procedures for that. In many cases, it is important for the researcher to use a repository that is well established in the field for the specific type of data.
SURFsara: SURFsara provides multiple data services for long-term preservation, sharing and publication of research data. The Data Archive provides low-cost, large-scale storage for any dataset, while the Data Repository service provides a self-service platform for researchers to share and publish datasets of any size with annotations and persistent identifiers. A separate assisted workflow enables large-scale dataset publications. Data curation is in place only on a technical level, i.e. the user is forced to annotate all data and is limited in the choice of file formats. On request, the researcher is supported in curating new or existing data publications. SURFsara is setting up processes within the RDNL collaboration together with 4TU and DANS. All answers below are for the Data Repository service.
Utrecht University DataverseNL: All researchers have the option to register for a DataverseNL account using their UU credentials. They can add data to a dataset, but these datasets are checked by RDM Support before publication. The check is fairly high-level and would not qualify as curation.
Meertens Instituut: The Meertens Instituut is part of the KNAW. The data curation policy is laid down in the Datanotitie (2018). In this the institute follows the data principles and the data policy of the KNAW (see: https://www.knaw.nl/nl/thematisch/openscience/opendata and http://www.meertens.knaw.nl/cms/nl/collecties/research-data-management). In addition, the Meertens Instituut is certified with the CoreTrustSeal and strives to offer its collections digitally and open access. The Meertens Instituut is a CLARIN-B Centre. We store data for two reasons: for the researchers, to make the research verifiable and reproducible, and for current or future research. For the latter we use an acquisition model (from 2019). We do this in consultation with the researchers (also asking whether the set is complete, whether there is documentation, whether it has been published, etc.). In both cases the researchers take the lead on the content and quality of the dataset. The same applies to code. See also the collection plan: http://www.meertens.knaw.nl/cms/images/publicaties/Collectieplannw.pdf. The metadata of the dataset is generated and checked by the collections department. The metadata follows an in-house standard from which Dublin Core and CMDI metadata are generated.
6
7
8
C. Check files and read documentation
Question: Which checks do you perform to see whether files in familiar formats can be opened? And what do you do with unfamiliar formats (when it is not immediately clear which software is required to open them)?
Radboud University: Yes, we do this for all the files, even if unknown and/or unfamiliar software is needed to open them.
4TU.ResearchData: Yes
TU Delft: No
Groningen: Data curated by the RDO: yes, unless ... for short-term reuse, specific formats are helpful that cannot be read by the software offered on our workstations.
DANS: Yes, we do this for all files. Some details: A) when the original software is too expensive for us (e.g. Stata), the workaround is to convert the file (in this case to SPSS) and check that version. B) We contact the depositor in case of a damaged file.
Hogeschool InHolland: Yes, if possible. I don't suppose that we have all the tools to open all sorts of files.
Utrecht University: Yes
SURFsara: Because of the self-service nature of the service, there are currently no checks other than mimetype determination to see whether a file is what its file extension indicates. The accepted file formats are limited to known and well-established formats (see below).
Utrecht University DataverseNL: We advise sticking to the preferred formats listed by DANS, or using another format that is widely used in the field. Common formats such as MS Office are also accepted.
Meertens Instituut: At the moment we check whether the files have been delivered in the preferred formats (http://www.meertens.knaw.nl/cms/images/stories/data/PreferredFormatsMI.pdf). This is still done manually. We are also working on setting up an automated system with checksums.
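The mimetype determination mentioned in the SURFsara response above (checking whether a file is what its extension claims) could look roughly like the following. This is only a minimal sketch and not the method used by any of the institutions: the `SIGNATURES` table and the function name `extension_matches_content` are assumptions made for illustration, and a real service would rely on a much larger signature database.

```python
from pathlib import Path

# Hypothetical, deliberately tiny table of leading "magic" bytes per extension.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".zip": b"PK\x03\x04",
}


def extension_matches_content(path):
    """Compare a file's extension with its leading bytes.

    Returns True or False when the extension is listed in SIGNATURES,
    and None when the extension is unknown (no verdict possible).
    """
    path = Path(path)
    expected = SIGNATURES.get(path.suffix.lower())
    if expected is None:
        return None
    with path.open("rb") as handle:
        return handle.read(len(expected)) == expected
```

A curator could run such a check over every file in a deposit and review only the files for which it returns False or None.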
9
Question: What do you do if code is provided within the dataset?
Radboud University: Yes, all code provided within the dataset is checked and run to see whether any errors occur.
4TU.ResearchData / TU Delft / Groningen: No; no
DANS: Yes for executables, and yes when we have the required software (if any). Otherwise we only try to open it. This issue also relates to our demand to use so-called "preferred formats".
Hogeschool InHolland: Yes, if possible.
Utrecht University: Yes
SURFsara: There is no specific test for this during the creation of the digital objects on the platform. A user can establish a link to an external source for the code. For large-scale datasets this is done prior to publication.
Utrecht University DataverseNL: We advise focusing on reproducibility and including the required code plus a readme.txt to ensure the data can be reused.
Meertens Instituut: The Meertens Instituut is part of the KNAW. The data curation policy is laid down in the Datanotitie (2018). In this the institute follows the data principles and the data policy of the KNAW (see: https://www.knaw.nl/nl/thematisch/openscience/opendata and http://www.meertens.knaw.nl/cms/nl/collecties/research-data-management). In addition, the Meertens Instituut is certified with the CoreTrustSeal and strives to offer its collections digitally and open access. The Meertens Instituut is a CLARIN-B Centre. We store data for two reasons: for the researchers, to make the research verifiable and reproducible, and for current or future research. For the latter we use an acquisition model (from 2019). We do this in consultation with the researchers (also asking whether the set is complete, whether there is documentation, whether it has been published, etc.). In both cases the researchers take the lead on the content and quality of the dataset. The same applies to code. See also the collection plan: http://www.meertens.knaw.nl/cms/images/publicaties/Collectieplannw.pdf. The metadata of the dataset is generated and checked by the collections department. The metadata follows an in-house standard from which Dublin Core and CMDI metadata are generated.
10
Question: How do you evaluate the richness, accuracy and completeness of the metadata?
Radboud University: Yes
4TU.ResearchData: Yes
TU Delft: No
Groningen: We encourage researchers to describe the data as they would have liked to find it themselves. We sometimes add metadata (important variables, methods, fields, scientific names of species as keywords). We check for the right affiliation in Pure. We do not aim for completeness.
DANS: Yes. When metadata is weak, documentation is often weak too, and we contact the depositor.
Hogeschool InHolland: Yes, this is necessary. Not sure whether we agree on what the required minimal set of metadata is?
Utrecht University: Make sure the fields in the metadata editor are filled.
SURFsara: The only requirement is that all required fields of the metadata schemas are filled in.
Utrecht University DataverseNL: We check whether fields have been left open that could be filled; for example, software can very often be specified.
Meertens Instituut: See 3
11
Question: What do you expect to be present in the documentation? (readme, codebook, data dictionary, other?)
Radboud University: Yes
4TU.ResearchData: Yes
TU Delft: No
Groningen: Yes, but it may not be necessary when the data in the files is self-explanatory.
DANS: Yes. When documentation is weak, metadata is often weak too, and we contact the depositor.
Hogeschool InHolland: n/a
Utrecht University: Yes
SURFsara: We expect the researcher to provide at least a concise description of the dataset in the metadata. Any added documentation is optional.
Utrecht University DataverseNL: Most data is related to a publication; the descriptions should clearly state which data and/or code is included and which is not.
Meertens Instituut: See 3
12
Question: What kind of check do you apply to find out whether human subjects are involved? If human subjects are involved, how do you check for directly and indirectly identifiable data?
Radboud University: Yes, if it is a dataset with a real potential of containing personal data, three separate people perform a privacy check to prevent data leakage.
4TU.ResearchData: Yes
TU Delft: No
Groningen: Yes. We check for directly and indirectly identifiable data, although the latter is done with some rules of thumb rather than analytical tools. We are working on a checklist. We offer a DPIA service for a final check and propose appropriate measures.
DANS: Yes. We also check whether the data is anonymised and, if needed, involve our legal expert. The depositor is responsible, and in principle DANS does not do anonymisation or pseudonymisation.
Hogeschool InHolland: Yes, as part of developing the data management plan.
Utrecht University: Yes
SURFsara: Manual periodic check. There is currently no specific check in place.
Utrecht University DataverseNL: We make clear that if there is (personal) sensitive information in the data, the researcher is responsible, and we offer advice on how to anonymise or exclude personal information.
Meertens Instituut: We handle personal data in accordance with the General Data Protection Regulation (Algemene Verordening Gegevensbescherming, AVG). The privacy statement of the Meertens Instituut can be read on the KNAW website (https://www.knaw.nl/nl/de-knaw/privacyverklaring-knaw).
13
U. Understand the data (or try to), if not…
Question: What usability criteria do you consider? (missing data, ambiguous headings, code execution failures, etc.)
Radboud University: Yes. All of the items described here are part of our data curation process. We check whether data is missing and ask the researcher to document why it is missing. We also execute the code to see whether it runs without problems. The quality check is not on the level of content, but on the level of data quality.
4TU.ResearchData: Yes
TU Delft: No
Groningen: We do check headings and the explanation of headings, but not missing data in the data files. We do check for missing metadata. If a lot is missing, we try to add it ourselves (RDO/Pure team) or ask the researcher for more information. No checks on code execution.
DANS: Yes. Quality assurance: not content-wise, but the DANS data manager checks how the files in a dataset are related or interdependent. Ambiguous headings: we propose (or demand) changes to the depositor. Code execution failures: back to the depositor.
Hogeschool InHolland: n/a
Utrecht University: Make sure that the files mentioned in the documentation are submitted and that there are no files that are not mentioned. Check data presentation.
SURFsara: Check data presentation manually. Depending on the ingest workflow, we can make sure all files are present.
Utrecht University DataverseNL: We do not actively check for missing data or failing code, but we make clear that all data in the dataset will be published and potentially reused.
Meertens Instituut: See 3
14
Question: Which metadata do you extract automatically from the submitted files to facilitate reuse?
Radboud University: Yes, we check whether there is a related result (article) and link it to the dataset.
4TU.ResearchData: Yes, if not already added to the metadata of the dataset, we check for existing related publications.
TU Delft: No
Groningen: Yes, but we do not search "forever". Our validation process in Pure allows for validation, improvements and re-validation.
DANS: Yes. For example, we search for publications and reports related to (or supposed to be in) the dataset; in the case of "known" depositors we search for related datasets in our repository.
Hogeschool InHolland: n/a
Utrecht University: The data manager does not necessarily have the domain-specific knowledge to evaluate what is important for reuse.
SURFsara: Currently no metadata is extracted from submitted files.
Utrecht University DataverseNL: Only the built-in options for size and md5 checksum.
Meertens Instituut: See 3. At collection level, DC and CMDI. At a lower level, it depends on the request and on what has been delivered.
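The size and md5 checksum mentioned in the DataverseNL response above are the kind of file-level metadata that can be derived without any domain knowledge. A minimal sketch using only the Python standard library is given below; the function name `fixity_record` and the shape of the returned dictionary are assumptions for illustration and do not describe how DataverseNL computes these values.

```python
import hashlib
from pathlib import Path


def fixity_record(path, chunk_size=1024 * 1024):
    """Return the name, size in bytes and md5 hex digest of one file.

    The file is read in chunks so that large datasets do not have to
    fit in memory.
    """
    path = Path(path)
    digest = hashlib.md5()
    size = 0
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {"name": path.name, "size": size, "md5": digest.hexdigest()}
```

Storing such a record at ingest makes it possible to verify later that a file has not changed, which is also the idea behind the automated checksum system mentioned by the Meertens Instituut.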
15
Question: What ways do you have to determine whether the documentation of the data is sufficient for a user with qualifications similar to the author’s to understand and reuse the data?
Radboud University: Yes, the quality of the documentation is one of the most important aspects we check. We make recommendations to the researcher, but we do not create any documentation ourselves.
4TU.ResearchData: Yes
TU Delft: No
Groningen: Yes. We advise, but do not ask for more if the policy and the researcher indicate that the given information is good enough. We are not experts in every field.
DANS: Yes, this is a major goal of our data curation. We provide rich information on the website ("pre-ingest"): https://dans.knaw.nl/en/deposit/information-about-depositing-data?set_language=en We do not create additional documentation ourselves.
Hogeschool InHolland: n/a
Utrecht University: The data manager does not necessarily have domain-specific knowledge; the data manager's qualifications are not the same as those of the researcher.
SURFsara: The repository and data manager do not have domain-specific knowledge of file formats and contents.
Utrecht University DataverseNL: We do not check for this at the moment.
16
Question: Which parameters of the tabular data do you check? (structure, definitions of headers, codebook…)
Radboud University: Yes.
4TU.ResearchData: Yes
TU Delft: No
Groningen: Headings are checked. Structure is only an issue if things are really bad. This may be the case with legacy data.
DANS: Yes, and we expect a codebook explaining headers, variables, etc.
Hogeschool InHolland: n/a
Utrecht University: Yes, form and completeness are checked as far as possible.
SURFsara: There are no checks in place for tabular data.
Utrecht University DataverseNL: We only do some spot checks at the moment.
Meertens Instituut: See 3
17
R. Request missing information or changes
Question: How do you communicate to the researcher which changes need to be made and which issues or errors need to be fixed? (Orally, by email, by creating a list of necessary changes, by implementing the changes yourself and discussing the results?)
Radboud University: Yes, after we complete our standardized control form, we compile a list of questions and suggestions and email it to the depositor of the dataset.
4TU.ResearchData: Yes
TU Delft: The list of questions is provided by 4TU.ResearchData but sent by the data officer of the RDS team (front office).
Groningen: Yes, with the reasons why we ask for more information or improvements.
DANS: Yes, if needed. See http://tinyurl.com/y2uwf45p
Hogeschool InHolland: Yes, this is necessary.
Utrecht University: The YODA environment allows the data manager to collaborate with the researcher on the same dataset before it is published. Questions and issues can be discussed. I am not sure whether questions are recorded in the form of a "list".
SURFsara: Depending on the ingest workflow, the contents and structure of the files will be changed as necessary. This will be communicated in person or via email. For self-service ingest there is no such process, but there might be communication on this topic after publication.
Utrecht University DataverseNL: We usually reply via email and offer further support by mail or phone. We clearly state which changes need to be made and which are optional, and we mention the CC license.
Meertens Instituut: See 3
18
A. Augment metadata for findability
Question: Describe how you enhance metadata to facilitate findability (correcting errors, adding keywords, linkages to related datasets, etc.)
Radboud University: We only enhance metadata when we do not change the content. For example, if keywords are not separated by semicolons, we add them for the researcher. However, if we think that there should be more keywords, we suggest this to the researcher.
4TU.ResearchData: Yes, every deposited dataset undergoes a metadata quality review. Suggestions for improving the metadata are returned to the depositor.
TU Delft: No
Groningen: Yes, especially keywords. Topics often need to be included. The general description or the method may also need more information. In Pure we try to relate the dataset to other research output. This requires checking several times, since workflows and timing may differ. We still need NARCIS to read our Pure metadata on datasets.
DANS: Yes. For example, we add extra Subject terms (= keywords) and Location (if we're sure!). We also add Relations to e.g. external websites, related datasets, publications, etc. In addition to what we do ourselves (if needed), we may request that unclear file names be renamed and/or recommend zipping files into the desired, clear folder structure.
Hogeschool InHolland: No
Utrecht University: The metadata forms are provided by the YODA environment or by DataVerse.
SURFsara: The metadata fields are structured according to a metadata schema which enforces certain formats and allows linking to other data sources. Some fields are tied to (controlled) vocabularies.
Utrecht University DataverseNL: We look for additional identifiers and ask the researcher to add them.
Meertens Instituut: Certainly. We do that in consultation, both orally at the intake and in writing.
19
Question: In which cases do you structure and present metadata in domain-specific schemas to facilitate interoperability with other systems?
Radboud University: No, not for specific datasets.
4TU.ResearchData: No
TU Delft: No
Groningen: Yes
DANS: Not on the level of datasets. In specific cases we adapt metadata for community harvesters, so that they can aggregate the metadata of all datasets relevant to their community. This costs money ;-)
Hogeschool InHolland: n/a
Utrecht University: Various YODA environments follow their own standards, which correspond to what has been agreed upon in a particular research community.
SURFsara: Communities and domains can define their own metadata schemas to make their data interoperable within their domain.
Utrecht University DataverseNL: We use the domain-specific fields in Dataverse.
Meertens Instituut: See 3
20
Question: How do you evaluate whether linkages are sufficient? (link to report/paper, to related datasets, to source data, etc.)
Radboud University: Yes. We check whether the researcher has made a link to the corresponding research paper. If existing data is used, we also check whether there is a proper reference.
4TU.ResearchData: Yes. We check for related publications.
TU Delft: No
Groningen: Yes, especially in Pure. We do not have sufficient capacity to do the same for DataverseNL. This does not happen in local research group archives.
DANS: Yes, this is a major goal of our data curation. See other fields above.
Hogeschool InHolland: n/a
Utrecht University: We try to ask the researchers whether there are related publications.
SURFsara: There is no such procedure in place, but inclusion is encouraged during the ingest workflows.
Utrecht University DataverseNL: This is not actively checked.
Meertens Instituut: We check that at the intake.
21
T. Transform file formats for reuse
Question: Which criteria do you have for specialized file formats and their restrictions? (e.g., is the software freely available? Do you link to it or archive it alongside the data?)
Radboud University: Yes. We check whether we (the data curation team) can open all the files. This includes checking whether the required software is available and can be easily installed. Required software should be mentioned in the documentation, together with information on where to download it and how to open the files with it.
4TU.ResearchData: Yes, we check whether the dataset is provided in a preferred file format. See our list of preferred formats: https://researchdata.4tu.nl/fileadmin/user_upload/Documenten/preffered_file_formats.pdf
TU Delft: No
Groningen: Yes. Sometimes we add generally readable formats.
DANS: Yes, this is why we have so-called preferred formats. The documentation should contain the software, when possible, or else a good description of what is needed to access and use the data. https://dans.knaw.nl/en/deposit/information-about-depositing-data/before-depositing/file-formats is the current version.
Hogeschool InHolland: n/a
Utrecht University: The researchers are free to archive the formats they prefer. Notes on software should be in the documentation.
SURFsara: The repository distinguishes between accepted and preferred formats. Any other file formats are not accepted upon ingest in the self-service portal. For the massive ingest workflow, any file format can be considered in careful consultation with the data producer. Generally, file formats that are open, free to use, considered standard in the relevant communities and commonly used are preferred. Accepted formats can be stored, but are not curated beyond bitwise preservation.
Utrecht University DataverseNL: We advise using the DANS preferred formats, or another well-known format in the field.
Meertens Instituut: See 2
22
Question: Which criteria do you have for preferred file formats and for transformation into open, non-proprietary file formats that broaden the potential audience for reuse?
Radboud University: Yes, we use the list of the DANS archive. Often, we also store the original files so that no data is lost.
4TU.ResearchData: We don't do this ourselves.
TU Delft: No
Groningen: Not as a standard procedure. If we add transformed open-source data, we usually upload both versions, as some information may get lost in the transformation process.
DANS: Yes, if needed we convert e.g. Word, Excel, DBF and audiovisual data to preferred formats. We retain the originals.
Hogeschool InHolland: n/a
Utrecht University: So far, no such transformations are performed, but we are considering deriving copies in preferred formats when possible.
SURFsara: Transformations can be performed as part of a data curation project for one or more datasets. The transformation should always be to a preferred format.
Utrecht University DataverseNL: We advise using the preferred formats, but also accept commonly used formats, mainly because DataverseNL is not intended for very long-term storage.
Meertens Instituut: No specific criteria yet, except that a researcher may want to use a specific format or software that is not sustainable. In such cases we store the data as well. After all, new research can by definition also be done with new datasets and tools that are not (yet) sustainable or have not (yet) been qualified as sustainable.
23
Question: Which criteria do you have for the availability of the software needed to open the dataset?
Radboud University: Yes. If not, we advise storing it with the dataset.
4TU.ResearchData: Yes.
TU Delft: No
Groningen: Definitely within the UG domain, and preferably in general.
DANS: Yes; see also "check if code runs". If the software version is unclear, or the software itself is unclear, we ask for conversion to a better documented and specific version of a specific application.
Hogeschool InHolland: n/a
Utrecht University: For some datasets (e.g. Geo-labs) only specific software can deal with the data. So far this has not been treated as a problem, and it is assumed that it should be possible to archive those datasets as well.
SURFsara: There is currently no such specific criterion; for the preferred formats this is not considered a problem. For accepted formats, the software should be broadly used and accepted in the community.
Utrecht University DataverseNL: No criteria at the moment.
Meertens Instituut: See 2
24
E. Evaluate for FAIRness (*)(%)
Question: Name some of the metadata fields you expect to find next to author/title/date.
Radboud University: Yes. We check whether most of the metadata fields have been filled in by the researcher. We also encourage researchers to extend the metadata when we find it too brief.
4TU.ResearchData: Yes.
TU Delft: No
Groningen: Yes, see also several answers above. Even in the local research institutes' archives more is required (as stated in their protocols/policies).
DANS: OVERALL COMMENT about E=FAIR: DANS *provides* several FAIR qualities from this list, so we don't *check* for them in the datasets that we receive. Yes, we check that all mandatory fields are filled in and recommend that all Dublin Core fields be filled.
Hogeschool InHolland: Yes, with regard to publications.
Utrecht University: Yes, there are more mandatory metadata fields.
SURFsara: By default, all Dublin Core and DataCite fields can be filled in. They are mandatory or recommended according to the DataCite guidelines.
Utrecht University DataverseNL: author / author identifier / institute / title / related publication
Meertens Instituut: DC and geographical characteristics.
25
Question: How do you make sure the dataset is findable with a PID?
Radboud University: All datasets curated by us are archived in the DANS EASY archive and will have a DOI.
4TU.ResearchData: Yes, every dataset is given a DOI upon publication.
TU Delft: No
Groningen: We check the references.
DANS: We provide a DOI.
Hogeschool InHolland: n/a
Utrecht University: Yes, we add PIDs for authors and contributors as much as possible.
SURFsara: DOIs and EPIC PIDs are automatically assigned to the digital object and its files upon completion and publication of the object.
Utrecht University DataverseNL: All datasets in DataverseNL get a Handle.
Meertens Instituut: As a CLARIN B centre, we add PIDs (handles) to the datasets that are made available for CLARIN.
26
Question: How do you make sure that the dataset will be discoverable via web search engines?
Radboud University: All datasets curated by us are archived in the DANS EASY archive, and discoverability is therefore guaranteed.
4TU.ResearchData: We support the OAI-PMH protocol to allow our metadata to be harvested for integration in search engines. We have also embedded schema.org metadata in the dataset landing pages, so that they can be indexed by Google Dataset Search.
TU Delft: No
Groningen: All final versions of research data should be described in Pure (which is not the case yet). The metadata may be readable by the public, within the UG domain, or by the back office only. General access levels are specified too (open, restricted, closed, etc.). Public metadata in Pure is indexed by e.g. Google. We would very much like to be harvested by NARCIS.
DANS: We provide all metadata of all datasets via the OAI-PMH protocol to search engines and aggregators.
Hogeschool InHolland: n/a
Utrecht University: YODA and DataVerse are indexed by most data services.
SURFsara: An OAI-PMH endpoint is provided for harvesting, and the site can be indexed by any web crawler.
Utrecht University DataverseNL: DataverseNL is indexed by Google Data Search.
Meertens Instituut: We are currently working on a new system in which these matters are also taken into account.
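The 4TU.ResearchData response above mentions schema.org metadata embedded in dataset landing pages. As an illustration only, the following sketch builds a small schema.org `Dataset` description as JSON-LD, the form that is typically placed in a `script` element of type `application/ld+json` on a landing page. The function name `dataset_jsonld`, the choice of properties and the example values are assumptions and do not reflect the actual 4TU.ResearchData markup.

```python
import json


def dataset_jsonld(name, description, identifier, keywords=()):
    """Serialise a minimal schema.org Dataset description as JSON-LD."""
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,  # e.g. the DOI of the dataset as a URL
    }
    if keywords:
        record["keywords"] = list(keywords)
    return json.dumps(record, indent=2)


# Hypothetical example values, not a real dataset or DOI.
print(dataset_jsonld(
    "Example measurements",
    "Illustrative description of an example dataset.",
    "https://doi.org/10.1234/example",
    keywords=["example", "curation"],
))
```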
27
Question: How do you make sure that the dataset is retrievable via a standard protocol (e.g., HTTP)?
Radboud University: All datasets curated by us are archived in the DANS EASY archive and are therefore by definition accessible.
4TU.ResearchData: Yes, the HTTPS protocol is used.
TU Delft: No
Groningen: Yes.
DANS: We provide all metadata of all datasets via the OAI-PMH protocol to search engines and aggregators.
Hogeschool InHolland: n/a
Utrecht University: YODA and DataVerse comply with this requirement.
SURFsara: The website is fully compliant. Only HTTPS connections are accepted. A REST API allows for automated interaction.
Utrecht University DataverseNL: This is handled via DANS-Dataverse.
Meertens Instituut: See 20
28
Question: How do you assess whether the dataset is free and open? How do you make sure that it can be downloaded?
Radboud University: Researchers can choose between open access and restricted access. As the policy of Radboud University states that datasets should be published open access when possible, we encourage researchers to publish open access. When we see that researchers have chosen restricted access, we try to persuade them otherwise, unless there are good reasons to publish under restricted access.
4TU.ResearchData: Currently, only Open Access is provided, and an embargo date can be set on request.
TU Delft: No
Groningen: We check all possible access levels and the information on them, and copy this into Pure. We may get in touch with the researcher to ask for further information and discuss options.
DANS: The DANS motto "Open when possible, restricted when needed" has been widely adopted... However, so far the choice is left to the depositor. It is also possible to combine Open and Restricted Access files within one dataset.
Hogeschool InHolland: Yes, with regard to publications.
Utrecht University: The created landing page is checked for mistakes.
SURFsara: All digital objects have a dedicated landing page which displays all open metadata. Depending on the share level, the files can be downloaded.
Utrecht University DataverseNL: By default all datasets are licensed with CC0; we encourage the use of CC0 or CC-BY.
Meertens Instituut: See 20. In addition, the Meertens Instituut has datasets that are available web-based: http://www.meertens.knaw.nl/cms/nl/collecties/databanken.
29
Question: How do you evaluate the chosen metadata format? How do you check whether it follows a standard schema?
Radboud University: The datasets curated by us are registered and deposited using the RIS system. Here, we have a fixed metadata scheme (a combination of DataCite and Dublin Core). Therefore, we do not specifically check the metadata scheme.
4TU.ResearchData: All metadata in 4TU.ResearchData is stored as RDF and makes use of standard ontologies and vocabularies, e.g. Dublin Core (dcterms), foaf, owl, wsg84 (coordinates), geonames, and our own ontology for topics that are not covered by the standard ones.
TU Delft: No
Groningen: Not really impressed by Dublin Core, which is the bare minimum. It is more a question of whether we would advise using a repository: if it does not meet DC, then please do not use it.
DANS: Yes
Hogeschool InHolland: Yes, with regard to publications.
Utrecht University: YODA and DataVerse have their own standards.
SURFsara: Dublin Core and DataCite are expected and enforced.
Utrecht University DataverseNL: Default by DataverseNL.
Meertens Instituut: See 3
30
Question: How do you make sure that metadata is provided in a machine-readable format (OAI feed)?
Radboud University: No
4TU.ResearchData: Yes
TU Delft: No
Groningen: ... not sure...
DANS: We provide all metadata of all datasets via the OAI-PMH protocol to search engines and aggregators. Moreover, the metadata can be downloaded as .csv and .xml files.
Hogeschool InHolland: ?
Utrecht University: YODA and DataVerse comply with this requirement.
SURFsara: An OAI-PMH endpoint is provided for harvesting. A REST API allows for automated interaction using the JSON format.
Utrecht University DataverseNL: Default by DataverseNL.
Meertens Instituut: See 20
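Several responses above refer to an OAI-PMH endpoint for harvesting. To make concrete what a harvester does with such an endpoint, here is a minimal sketch that builds a `ListRecords` request URL and extracts Dublin Core titles from a response. `verb`, `metadataPrefix` and `set` are standard OAI-PMH request arguments and `oai_dc` is the standard Dublin Core prefix; the base URL `https://repository.example.org/oai`, the function names and the sample response are hypothetical and do not correspond to any of the repositories in this matrix.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of the simple Dublin Core elements used inside oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def titles_from_response(xml_text):
    """Return all dc:title values found in an OAI-PMH response document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC_NS + "title")]


# Hand-written sample response for illustration, not real repository output.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords><record><metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Example dataset</dc:title>
    </oai_dc:dc>
  </metadata></record></ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.org/oai"))
print(titles_from_response(SAMPLE_RESPONSE))
```

The sketch leaves out the actual HTTP request and the handling of resumption tokens, which a real harvester needs for large result sets.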
31
Question: Which contact information do you expect to be displayed (in case the direct assistance of the author is needed)?
Radboud University: No. The author name and affiliation are mentioned by default. We do not provide an email address or other contact details. In most cases the researcher adds his/her contact details to the dataset documentation. However, we do not check whether contact information is provided.
4TU.ResearchData: No, but we support ORCID iDs for author and contributor names.
TU Delft: No
Groningen: Yes! The general contact point is the Research Data Office, so we must be able to find specialists. Responsibilities are described in the RDM policy of the UG.
DANS: No, although a depositor can provide this and is also encouraged to add their DAI (and in the near future also their ORCID).
Hogeschool InHolland: n/a
Utrecht University: Different labs have different agreements on that.
SURFsara: The author is registered as a user at SURFsara, and contact details are to be filled in during metadata annotation.
Utrecht University DataverseNL: DataverseNL has the mail address; contact goes via the 'contact' button on the website itself.
Meertens Instituut: The address details of the Meertens Instituut.
32
Which indicators of who created, owns, and stewards the data do you expect in the metadata? | Within the metadata of the datasets, the rightsholder of the dataset is always mentioned. In most cases, this is Radboud University. Authors and co-authors are also a required field. We always check whether the authors of the dataset and the authors of the corresponding article match. If not, we ask the dataset depositor. The depositor of the dataset is the one who stewards the data and handles, for example, access requests. However, this is not made clear in the metadata. | There is still some work to be done here: different roles of researchers, affiliation (many repositories make a pig's ear of that). Ownership is difficult in Dutch law: a researcher has rights of use, as has the university. An issue in curation is data that belongs to third parties. Stewards are the RDO and then according to the policies of UG and the research institute. | Mainly yes: Creator is a mandatory metadata field, Rightsholder is an optional metadata field. A depositor can optionally add a data steward as a Contributor. | Yes, with regard to publications, but no information about stewards | Creator is a mandatory field. Rightsholder is optional. As a default, affiliation is interpreted as rightsholder | Creator is a mandatory field. Rightsholder is optional. As a default, affiliation is interpreted as rightsholder | The account used to create a dataset is considered the main contact and creator. | At present, all but a small part of the collections are owned by the Meertens Instituut. The collections that are not have been deposited under licence. The owner can be contacted via the Meertens Instituut.
33
How do you approach evaluation of usage terms (e.g., a CC License)? | Data published open access in the DANS EASY archive is accompanied by a standard license. Within our curation process we do not check whether additional license files are used (when the data is published under restricted access). | Depositors can choose a licence from a predefined list. The full range of Creative Commons licences is supported for datasets and, specifically for software and code, three popular open source licences. | No | This is something to work on, especially if CC-zero, CC-by, CC-NC is not usable. | we provide a usage licence. | Yes, with regard to publications | License is a mandatory field | License is a mandatory field and can be chosen via a license selector tool | The default is CC-0; we inform and offer advice if a researcher wants to change this or add a data availability statement. | See 1
34
D | Document your curation activities (*)(%) | Which provenance information do you record (who did what to the dataset and when)? | Yes. We keep a standardized control form in which we capture all of our findings and actions. | No | This is something we are trying to implement in our new research workspace, so that this type of data is more or less automatically generated during the research process. | yes, from the moment the dataset is submitted | N/A | Information about the approval of submissions is saved internally by the YODA system | Any changes to metadata are automatically logged in the system. File changes are not allowed. | From creation until publication the dataset has the status 'DRAFT'. The Handle is assigned, but nothing is findable yet. Once published, it gets version 1.0, and every change after that gets a new version, including a log and a new curation/checking round. | See question 1; beyond that, we add as much information as possible
35
What is included in accessioning & deposit records (names, dates, contact information, submission agreements, etc.)? | Yes, we have an overview of the deposited datasets within our CRIS | partly, incomplete. | yes, for keeping provenance information | Yes, part of the data management plan | Some of this information is saved internally by the YODA system; to my knowledge there is no separate database | The publication is linked to a data owner who is known to SURFsara | All is in the dataset description. | At minimum DC, and beyond that as much as we can (incl. language, geography, period, owner)
36
Which provenance logs do you keep? | Yes. We keep track of different versions of the datasets. | Yes. | In our new research workspace, in DataverseNL (background) and in Pure. | yes | No | Automatically created by YODA | Automated logging | Is kept online in DataverseNL | Yes, insofar as it occurs, we do version control.
37
Do you have a service workflow to follow the curation process? | Yes, we follow a standardized control form in which the service workflow is described. | Yes, there is an internal workflow in place. | Yes for Pure and DataverseNL | yes, see http://tinyurl.com/y2uwf45p | Not yet described | The workflow is in development | In development. | Not officially. There is an RDM mailbox where all requests end up. This is handled by three RDM staff. | Yes, there is an internal workflow
38
Describe any other relevant requirements for the data curation process at your institution | No | Data policy: https://d1rkab7tlqy5f1.cloudfront.net/Library/Themaportalen/RDM/researchdata-framework-policy.pdf | Summary: Archive according to UG and research institute policies first, then think of how to share data more openly and register in Pure. | no | No | The researcher needs to be employed by the institution to make use of the data curation services | Data curation can be requested before and after publication and is offered as a separate package. | Within DataverseNL, UU users can get their own folder and manage and curate their own datasets. By default everything is checked by RDM Consultants; this mainly involves informing the researcher about the default license, whether or not to include personal information, and the need to be clear when describing the dataset. | See question 3