1 of 19

Preparing Your Deposit for Data Sharing

Part of the Creating an NIH Data Management and Sharing Plan with ICPSR Workshop

Katherine Baisden, Data Project Manager

Aubrey Garman, Lead Data Curator

October 2025

2 of 19

Why Should Data Be Shared?

Sharing data is critical to the advancement of science.

  • Maximizes the value of investments in data collection.
  • Demonstrates good stewardship of the effort offered by individuals participating in research.
  • Provides opportunities for testing theories and using novel methods to solve problems.
  • Allows for replication and extension of research results.

3 of 19

Preparing to Share/Deposit with ICPSR

Sharing data using ICPSR = Depositing

At a minimum, ICPSR requires that deposits include:

  • data files
  • documentation files (codebooks, user guides, questionnaires, etc.)
  • descriptive information about the study and methodology

Confirm that the informed consent or IRB documentation allows for data sharing.

Self-publication or curation?

4 of 19

Curation at ICPSR

ICPSR Curation uses the files and information provided in deposits to:

  • Make a data collection (study) discoverable.
  • Create preservation copies of the materials.
  • Create multiple data file formats for equitable access and secondary use.
  • Ensure the data are being made available at the appropriate use level (public, restricted).

*This presentation primarily focuses on QUANTITATIVE data, which represent the majority of data housed at ICPSR.

5 of 19

Curation at ICPSR, continued

  • Checks for issues of incompleteness in the data and documents them accordingly.
  • Standardizes data so that they can be converted to multiple file formats.
  • Enhances variable-level metadata with options to apply “question text” to variables.
  • Reviews for and resolves issues related to maintaining respondent confidentiality.
  • Creates study-level metadata, which result in study pages on the ICPSR sites.
  • Compiles documentation related to studies.

6 of 19

Getting all your ingredients together before Depositing

As with baking, the experience is more enjoyable if you gather all your ingredients before starting the process.

The same is true of depositing your data. Look at the deposit form and gather your information. If you have questions about any specific items, please contact our support staff at ICPSR.

7 of 19

Helping ICPSR Understand the Deposit

Provide ICPSR with information on data use, access, and dissemination.

  • Are there publication or promotion deadlines?
  • Is the intended access public or restricted?
  • Are there terms of use agreements?
  • Is this deposit related to data that have previously been distributed by ICPSR?
  • Are there any measures that are proprietary?

8 of 19

Helping ICPSR Understand the Files

Ideal data formats accepted by ICPSR:

  • SAS (with formats catalog), SPSS, Stata, or ASCII (with setup syntax)

Common documentation formats include Word, PDF, Excel, etc.

Use clear file naming and folder structure with the deposited materials.

9 of 19

File Inventory
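
One way to produce a file inventory is to walk the deposit folder and record each file's path, extension, and size. The sketch below is only an illustration: the function name `build_file_inventory` and the three recorded fields are assumptions, not an ICPSR requirement or format.

```python
from pathlib import Path


def build_file_inventory(deposit_dir: str) -> list[dict]:
    """List every file under deposit_dir with its relative path, extension, and size.

    The field names are hypothetical; adapt them to whatever inventory
    layout your project uses.
    """
    root = Path(deposit_dir)
    inventory = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            inventory.append(
                {
                    # Path relative to the deposit folder, with forward slashes
                    "relative_path": path.relative_to(root).as_posix(),
                    # Lower-cased extension, e.g. ".dta" or ".pdf"
                    "extension": path.suffix.lower(),
                    "size_bytes": path.stat().st_size,
                }
            )
    return inventory
```

Calling `build_file_inventory("my_deposit")` on a hypothetical folder returns one dictionary per file, which can then be written to a spreadsheet or read-me file and checked against the data and documentation files you intend to deposit.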

10 of 19

Data Preparation Tips

  • Clear variable names
    • Avoid using periods or non-alphanumeric characters (@, #, $), or starting names with a number (these are not supported consistently across statistical software)
    • For longitudinal collections, plan for consistency in variable naming across time and/or develop a variable crosswalk
    • Understand that some software packages will also have limits on variable name length
    • Types of variable names
      • One-up - sequentially numbering the variables starting with an alpha character
      • Question Numbers - align the variable to the question number on the instrument
      • Mnemonic names - short variable names linked to the concept of the variable
      • Prefix, root, suffix systems - for example, all variables that deal with education start with ED
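
The naming tips above can be checked automatically before depositing. The following is a minimal sketch, not an ICPSR tool: the function name `check_variable_name` is hypothetical, the 32-character limit is an assumed common denominator across packages, and allowing underscores is also an assumption.

```python
import re

# Assumed limit; check the documentation of the software you plan to support.
MAX_NAME_LENGTH = 32


def check_variable_name(name: str, max_length: int = MAX_NAME_LENGTH) -> list[str]:
    """Return a list of problems found in a variable name (empty if none)."""
    if not name:
        return ["name is empty"]
    problems = []
    if name[0].isdigit():
        problems.append("starts with a number")
    # Letters, digits, and underscores are treated as safe (an assumption).
    if re.search(r"[^A-Za-z0-9_]", name):
        problems.append("contains a period or other non-alphanumeric character")
    if len(name) > max_length:
        problems.append(f"longer than {max_length} characters")
    return problems
```

For example, `check_variable_name("ED_LEVEL")` returns an empty list, while a name such as `1st.income` is flagged both for starting with a number and for containing a period.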

11 of 19

Data Preparation Tips, Continued 1

  • Variable Labels
    • Label all response options in categorical variables
      • Avoid duplicate value labels in the same variable
    • Consider variable order and variable groupings
    • Descriptive and unique variable labels
      • The item or question number from the original instrument
      • Clear indication of the variable’s content
      • Indication if the variable is a composite (created from a combination of other variables)
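
Duplicate value labels within a single variable are easy to miss by eye. As a rough sketch, assuming the value labels for one categorical variable are available as a mapping from codes to label text (the function name and the case-insensitive comparison are illustrative choices):

```python
from collections import defaultdict


def find_duplicate_value_labels(value_labels: dict[int, str]) -> dict[str, list[int]]:
    """Return labels that are attached to more than one code in the same variable."""
    codes_by_label = defaultdict(list)
    for code, label in value_labels.items():
        # Compare labels ignoring case and surrounding whitespace.
        codes_by_label[label.strip().lower()].append(code)
    return {label: codes for label, codes in codes_by_label.items() if len(codes) > 1}
```

With the hypothetical labels `{1: "Yes", 2: "No", 3: "yes "}`, the function reports that the label "yes" is attached to both code 1 and code 3.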

12 of 19

Data Preparation Tips, Continued 2

  • Designate missingness or clearly label invalid responses
  • De-identify sensitive data
    • Remove direct identifiers or Personally Identifiable Information (PII)
    • Consider using only month/year components of date variables
    • For public-use data, limit open-ended responses or string variables
  • Perform consistency checks
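
Two of the tips above lend themselves to a short illustration: reducing dates to month/year, and running a simple consistency check. This is a sketch under stated assumptions. The variable names `employed` and `hours_worked`, the missing-value code `-9`, and the rule being checked are all hypothetical.

```python
from datetime import date

# Hypothetical missing-value code; use whatever code your study documents.
MISSING = -9


def to_month_year(d: date) -> str:
    """Keep only the year and month of a date, dropping the day."""
    return f"{d.year:04d}-{d.month:02d}"


def find_inconsistent_rows(rows: list[dict]) -> list[int]:
    """Return positions of rows where a respondent coded as not employed
    (employed == 0) still reports a non-zero, non-missing number of hours worked."""
    return [
        i
        for i, row in enumerate(rows)
        if row["employed"] == 0 and row["hours_worked"] not in (MISSING, 0)
    ]
```

Flagged rows are candidates for review, not automatic corrections; whether an inconsistency is fixed or documented is a decision for the study team.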

13 of 19

Documentation is Important

It is helpful to provide complete study documentation:

  • Codebooks
  • Original data collection instruments, interview guides, etc.
    • Ensure compliance with copyright for any copyrighted materials.
  • User guides, read-me files, manuals
    • Include applicable details on using weight variables, linking or merging data, and other important information needed to reuse the data.
  • Variable crosswalks, especially for longitudinal data
  • Consent forms
  • Syntax or programming code
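
A variable crosswalk, listed above, can be as simple as a table with one row per concept and one column per wave. The sketch below assumes that layout; the wave labels, variable names, and function names are invented for illustration.

```python
# Hypothetical crosswalk: one row per concept, one column per wave.
CROSSWALK = [
    {"concept": "Highest education level", "wave1": "ED1", "wave2": "ED_LEVEL"},
    {"concept": "Household income", "wave1": "INC1", "wave2": "INC_HH"},
]


def unmapped_variables(variables: list[str], wave: str, crosswalk: list[dict]) -> list[str]:
    """Return variables from the given wave that have no crosswalk entry."""
    mapped = {row[wave] for row in crosswalk if row.get(wave)}
    return [name for name in variables if name not in mapped]


def rename_to_wave(record: dict, source: str, target: str, crosswalk: list[dict]) -> dict:
    """Rename a record's variables from the source wave's names to the target wave's.

    Variables without a crosswalk entry keep their original names.
    """
    mapping = {
        row[source]: row[target]
        for row in crosswalk
        if row.get(source) and row.get(target)
    }
    return {mapping.get(name, name): value for name, value in record.items()}
```

Running `unmapped_variables` over each wave's variable list is a quick way to find variables the crosswalk does not yet document.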

14 of 19

Metadata Makes Data FAIR

The fields you complete in the ICPSR deposit form map directly to study metadata fields. These fields populate the study page, which secondary researchers can view and search on the ICPSR site.

15 of 19

Elements of Metadata for Deposit Manager and Study Home Pages

Project Description

  • Title
    • Title, Time Period, and Location
  • Project Description
  • Principal Investigators
    • Listed in the preferred order
  • Funding Source

Scope of Project

  • Subject Terms
  • Geographic Coverage
  • Time Periods
  • Collection Dates
  • Universe
  • Collection Notes

Methodology

  • Response Rates
  • Sampling
  • Collection Modes
  • Units of Observation
  • Geographic Unit

Related Publications

Note: Our Data Deposit Manager form will be changing in the near future.

16 of 19

Deposit to Study Page Fields – Study Level Metadata

[Figure panels: Deposit Workspace; Study Page]

17 of 19

Summary

When you are preparing your research project for sharing, whether through curation or self-publication, consider the question:

“What would someone else need to know in order to use these materials?”

18 of 19

Resources on Preparing Data for Sharing

Resources from some NIH-funded archives at ICPSR

19 of 19

Thank You!