| Initial questions | Datasharing checklist | Additional details for specific cases | Other common questions | For code |
|---|---|---|---|---|
| What kind of data is being shared? File size? Compression? Updatable? | Description of the project/dataset: What was the project about, why share this data, and who should use it? Who are the individuals involved (if more than just yourself), and what are their contact details? Any plans/schedule for updating the data, and how can users submit issues/requests (GitHub has this built in)? | Dataset builders seeking collaboration: To formalize the dataset as a publication you may also want to assign a Digital Object Identifier (DOI). If you use GitHub, one of the recommended ways to add a DOI is via Zenodo, see https://guides.github.com/activities/citable-code/. DataCite (https://datacite.org/) also provides many options for assigning a DOI to a dataset. | What upstreams contribute to your work / what toolchain is used? How do you note when they change, and when does a change trigger a recompilation? Do you have a process-dependency tree? | Repository structure: start a GitHub or GitLab repository, add a description, choose a license. |
| Where should it be hosted? For how long? Do I require registration info for others to use my data? Do I need a download counter? | Describe the data: Data files + datasets (size, relevance to what sort of analysis, etc.). Data schemas and/or field descriptions. How was the data produced? Method + data cleaning. Was there any preregistration of statistical methods? How is the data updated (if relevant)? | On Datasharing: Matt Marx's writeup on his sharing of data + process for Reliance on Science | Reuse: What sorts of reuse are expected? What derivative datasets or recombinations with other data layers are collaborators interested in making? | Reuse: What sorts of reuse do you expect? This includes modules used with other code, or plugins developed by others to make your code work for their use cases. |
| Affiliation: any requirements or resources associated with that affiliation? Do I plan to maintain the data? Do I plan to respond to queries? | Describe the code: Document the source code and process used to create the dataset. How to use and build on the data (e.g. tunable parameters in key steps, explicit nods to replication, etc.). | To formalize the dataset as a publication: you may also want to assign a Digital Object Identifier (DOI). If you use GitHub, an easy way to add a DOI is via Zenodo, see https://guides.github.com/activities/citable-code/. DataCite (https://datacite.org/) also provides many options for assigning a DOI to a dataset. | How are dumps provided: name, format, versioning? Is there a feed of updates to dumps? | Formalize the code as a publication: you can get a DOI via Zenodo, see https://guides.github.com/activities/citable-code/ |
| | External links: URL / link to other datasets/software/code used; URL / link to related papers. | Examples of datasets in GitHub: https://github.com/lmatthia/publisher-oa-portfolios · https://github.com/CSSEGISandData/COVID-19 | What sources are used; drawn from what set? Are the sources versioned? Regularly crawled? | |
| | Terms of use: How should it be cited ("If you use the data, please cite…")? Any license details (code, other software or third-party data and associated copyright or terms of use, etc.). | Sharing a Public Dataset, RonS example: https://zenodo.org/record/3685972 | What downstreams are using this work? Is that visible to other reusers, via pingbacks or otherwise? | |
| | | Sharing a Dataset in-progress, example: IProduct: http://www.iproduct.io/ | Is this used in any metastudies? What schema or other mappings, fuzzing or anonymization, or other processing are used to make each one work? Is the metastudy map encoded in a named package that others could use? | |
| | | | How was data chosen for measurement/inclusion? How is it noted when this changes? | |
| | | | What data cleaning, noise correction, or other feedback loops did you use in compiling the data? | |
| | | | What similar efforts or alternatives exist? | |
| | | | What is the whole tale of your work: what is needed to replicate it? Is this articulated in a [whole tale] file? Does that include workflow + usage notes? | |
| | | | Has your process been replicated in practice? By how many independent parties? | |
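
Several checklist items (file size, data files, and "How are dumps provided: name, format, versioning?") can be answered in one place by shipping a small manifest alongside the data. The following is only a minimal sketch of that idea, not a prescribed format: the function name `build_manifest`, the manifest keys, and the choice of SHA-256 checksums are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir, version):
    """Describe every file under data_dir: name, format, size, checksum.

    The key names below are hypothetical; adapt them to your own schema.
    """
    root = Path(data_dir)
    files = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        payload = path.read_bytes()  # fine for a sketch; stream very large files
        files.append({
            "name": path.relative_to(root).as_posix(),
            "format": path.suffix.lstrip(".") or "unknown",
            "size_bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        })
    return {"version": version, "files": files}
```

Calling `build_manifest("data", "2020.1")` (both arguments are placeholders) returns a dictionary that can be written out with `json.dump` and committed next to the dataset, so downstream users can check which dump version they hold and whether their copy is intact.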