1 of 14

Ethics & Expectations of Sharing Network Data

2 of 14

3 of 14

Why share data?

Ethical & Scientific: Key to the shared community of science

    • Open science goals of transparency and access
    • Reproducibility requires ability to see data & methods of prior work
    • Shared resources facilitate productivity
      • Data collection is a costly public investment, so the resulting data should be a common public good.
    • Community legacy – the history of the field is built on the data underlying our research.

4 of 14

Why share data?

Legal: Funders, including the NIH and NSF, say you should.

    • NIH 2023 Policy: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html
      • “NIH expects that … researchers will maximize the appropriate sharing of scientific data, acknowledging certain factors … may affect the extent to which scientific data are preserved and shared.”
        • Scientific Data: The recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.
        • Data Sharing: The act of making scientific data available for use by others (e.g., the larger research community, institutions, the broader public), for example, via an established repository.
        • Metadata: Data that provide additional information intended to make scientific data interpretable and reusable.
        • Data Management and Sharing Plan (Plan): A plan describing the data management, preservation, and sharing of scientific data and accompanying metadata.
          • Now required in NIH grant applications for research that generates scientific data.

5 of 14

Why share data?

Why wouldn’t you share data?

Good Reasons:

    • Safety: If releasing the data could cause harm.
      • To respondents
      • To investigators
      • To security/public
    • Ethical: release has to meet ethical standards
      • Promises to respondents / institutions have to be kept
      • Privacy must be respected:
        • minimally to the level promised at data collection/consent
        • ideally, completely
    • Naive use might cause harm
      • Data complexity is such that users could reasonably be expected to do more harm than good if data are used inappropriately
      • Future computational & matching tools may expose releases that are safe today.

6 of 14

Why share data?

Why wouldn’t you share data?

Understandable, but not necessarily honorable, reasons:

    • Too much work
      • Preparing the data for sharing is costly and time consuming
    • Getting scooped
      • You are not done with the data; early release means others might publish your finding first.
    • Intellectual property/ownership rights
      • Data collection is hard work: why should you give away for free what you worked so hard to produce?
    • Covering up mistakes/embarrassment
      • Perhaps you pushed the envelope on the modeling, and different choices would make your finding disappear.

7 of 14

Accepted data sharing principles

Data sharing should be FAIR:

Findable: Data should be available in a searchable repository with a persistent identifier (such as a DOI).

Accessible: Data storage should be open and free to access.

Interoperable: Data should be in a standard format.

Reusable: Data should have sufficient documentation and a license to allow responsible reuse.

Data should provide for CARE of those at risk in their reuse:

Data sharing should provide collective benefit to researchers and research subjects, and subjects should have authority over how their data are used. Researchers have a responsibility to cultivate respectful relationships with subjects, and sharing should comport with established ethical standards.

8 of 14

Data sharing Goals & Recommendations

Neal et al., 2024

  • Publication-centric model: the ideal is broader sharing, but at minimum, share enough to reproduce & interrogate published findings.
  • Protects authors, readers & journals

9 of 14

Fine. You win. How do I do that?

Why are network data special and what do I do about it?

  • Identifiability and “deductive disclosure”
    • High-density, setting-based sampling means that knowledge of who is in a study is likely, particularly for sociocentric studies
    • Rich cross-linked information about multiple people increases risk
    • Many of the relations we are asking about are particularly sensitive
    • Not all of our data points are consenting subjects

  • Formats are not standardized (the “I” in FAIR), and the metadata/instructions necessary for reuse are not trivial.
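A minimal sketch of why network structure raises deductive disclosure risk, assuming a toy edge list and a single node attribute (all names and values here are hypothetical, not from any study). It flags nodes whose combination of degree and attribute is shared with no one else, so that anyone who knows those two facts about a person could pick them out of a "de-identified" release.

```python
from collections import Counter

def unique_signatures(edges, attribute):
    """Return nodes whose (degree, attribute) pair is shared with no one else."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Counter returns 0 for nodes with no ties, so isolates are handled.
    signature = {n: (degree[n], attribute[n]) for n in attribute}
    counts = Counter(signature.values())
    return sorted(n for n, s in signature.items() if counts[s] == 1)

# Hypothetical friendship ties and school grade for five students.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
grade = {"a": 9, "b": 9, "c": 9, "d": 10, "e": 10}

at_risk = unique_signatures(edges, grade)
print(at_risk)
```

Real data carry many more attributes and richer structure than degree alone, so the share of uniquely identifiable nodes is typically higher than a check like this suggests.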

10 of 14

Fine. You win. How do I do that?

What good data sharing looks like

The SARA model: focus on balancing four data-sharing goals:

    • Security: Protection of privacy for research subjects
    • Accessibility: openness of data use to other researchers
    • Reproducibility: maintain the data fidelity necessary for open science goals
    • Adaptability: ability of researchers to extend data to new questions/analyses

Pasquale et al 2025: Focus on epidemiologically relevant network data and whether accepted data archiving strategies address the challenges posed by network data.

11 of 14

Fine. You win. How do I do that?

What good data sharing looks like

Pasquale et al 2025: Focus on epidemiologically relevant network data and whether accepted data archiving strategies address the challenges posed by network data.

Common strategies in non-network data:

Aggregate data: report summary statistics about the network data & metrics on networks

      • Covariance matrix for all variables in a model
      • Distributional summaries, moments.
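As a rough illustration of the distributional-summary idea, the sketch below reduces a toy undirected edge list to degree statistics that name no individual node. The edge list and the `degree_summary` helper are hypothetical; a covariance matrix of model variables would be released in the same spirit but is not shown here.

```python
from collections import Counter
from statistics import mean, pvariance

def degree_summary(edges):
    """Return aggregate degree statistics that do not name any node."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Note: nodes with no ties never appear in an edge list,
    # so isolates would need to be counted separately.
    values = sorted(degree.values())
    return {
        "n_nodes": len(values),
        "n_edges": len(edges),
        "mean_degree": mean(values),
        "degree_variance": pvariance(values),
        "degree_distribution": dict(Counter(values)),
    }

# Hypothetical toy network, not data from any study.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
summary = degree_summary(edges)
print(summary)
```

Summaries like these support replication of descriptive results, but they limit adaptability because new questions usually need the underlying ties.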

Partial reporting: exclude some individuals to create ambiguity in potential deductive disclosure risk.

      • Randomly remove 10% of nodes or ties.
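One way the node-removal step might look, assuming an undirected edge list. The `drop_nodes` helper, its seed, and the ring network are all hypothetical choices made for illustration.

```python
import random

def drop_nodes(edges, fraction=0.10, seed=0):
    """Remove a random fraction of nodes and every tie that touches them."""
    rng = random.Random(seed)  # seeded so the release can be regenerated internally
    nodes = sorted({n for edge in edges for n in edge})
    n_drop = round(fraction * len(nodes))
    dropped = set(rng.sample(nodes, n_drop))
    kept = [(u, v) for u, v in edges if u not in dropped and v not in dropped]
    return kept, dropped

# Hypothetical ring network of 20 nodes.
edges = [(i, (i + 1) % 20) for i in range(20)]
released, dropped = drop_nodes(edges, fraction=0.10, seed=42)
print(len(dropped), "nodes removed;", len(released), "ties released")
```

Removing a node also removes all of its ties, so structural measures computed on the released network will differ from those in the published analysis. The list of dropped nodes stays with the data holder.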

Reducing data sensitivity: Change the specificity of data to remove identifiability.

      • Blur dates, aggregate codes, etc.
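A small sketch of coarsening, assuming a single respondent record with an exact date, an exact age, and a hierarchical code. The field names, the example record, and the three helper functions are hypothetical.

```python
from datetime import date

def blur_date(d):
    """Keep only year and month by snapping to the first of the month."""
    return d.replace(day=1)

def bin_age(age, width=10):
    """Replace an exact age with a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def truncate_code(code, keep=3):
    """Aggregate a hierarchical code to its first `keep` characters."""
    return code[:keep]

# Hypothetical record, not data from any study.
record = {"interview_date": date(2021, 3, 17), "age": 34, "diagnosis_code": "E11.65"}

released = {
    "interview_month": blur_date(record["interview_date"]).strftime("%Y-%m"),
    "age_group": bin_age(record["age"]),
    "diagnosis_group": truncate_code(record["diagnosis_code"]),
}
print(released)
```

How coarse to go is a judgment call: wider bins lower identifiability but also lower the fidelity available for reanalysis.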

Data perturbation:

      • Systematically add random noise to the data.
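A minimal sketch of perturbing a numeric attribute, assuming Gaussian noise with a chosen standard deviation. The `perturb` helper, the noise distribution, and the example ages are assumptions for illustration; perturbing the ties themselves (for example, by rewiring) is a separate problem not shown here.

```python
import random

def perturb(values, sd=1.0, seed=0):
    """Add independent Gaussian noise (mean 0, standard deviation sd) to each value."""
    rng = random.Random(seed)  # seeded so the same release can be regenerated
    return [v + rng.gauss(0.0, sd) for v in values]

# Hypothetical respondent ages.
ages = [23, 35, 41, 58, 62]
noisy = perturb(ages, sd=2.0, seed=1)
print([round(x, 1) for x in noisy])
```

Documenting the noise distribution and its scale lets secondary users account for the added measurement error in their own analyses.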

Full data sharing via repositories: Make the data available behind a secure login to regulate access and minimize exposure.

12 of 14

Fine. You win. How do I do that?

What good data sharing looks like

Complete Networks

13 of 14

Fine. You win. How do I do that?

What good data sharing looks like

Ego Networks

14 of 14

Conclusions

  • Data sharing is good for the scientific community, and at minimum a plan for data management and sharing is required by funders and, increasingly, journals.

  • Sharing network data poses risks due to the unique nature of network data. Deductive disclosure and violations of secondary consent are legitimate concerns.

  • Current best options are:
    • Aggregate for publication and easy access
    • Find a suitable repository or secure access options.
      • We have one in the works…coming soon we hope!