1 of 11

Manish Parashar

Director, Scientific Computing & Imaging (SCI) Institute & Chair in Computational Science and Engineering

Presidential Professor, School of Computing

University of Utah

NSDF AHM, SLC, Utah, USA

October 11, 2022

What do we mean by democratizing data?

2 of 11

3 of 11

Event Horizon Telescope: Black Hole Image (Sagittarius A*), May 2022

Laser Interferometer Gravitational-Wave Observatory (LIGO): Observation of Gravitational Waves, 2017 Nobel Prize

Observatory, Instrument, Repository: Search data, Gather data, Process data

Scientific discovery

4 of 11

OSTP (Nelson) Memo (08/25/2022)

Ensuring Free, Immediate, and Equitable Access to Federally Funded Research

  • Remove the 12-month publication embargo on peer-reviewed articles resulting from Federally funded research; 
  • Require that scientific data supporting the results reported in peer-reviewed articles be made available at the time of publication, and that plans be made for sharing other research data as well, while maintaining privacy and security considerations;
  • Create new precedents for ensuring scientific and research integrity through transparency of Federally funded research with the use of persistent digital identifiers for all aspects of the research life-cycle (with extended deadlines); 
  • Identify and address issues of inequality in Federally funded scholarly publishing; and, 
  • Apply public access requirements to all Federal agencies with research and development expenditures, not just those originally subject to the 2013 guidance. 

5 of 11

Democratizing Science => Democratizing Data/Compute

  • Broad, fair, and equitable access to data/compute is essential to democratizing science

  • Significant barriers
    • Knowledge: Awareness, discovery, expertise, support
    • Technical: Allocation, access, on-ramps
    • Social: Awareness of the importance of access to CI, rewards structure

  • Open/equal access 🡺 democratized/equitable access
    • Data Discovery: Knowledge networks, intelligent data delivery
    • Data Access: Realtime, streaming, on-demand
    • Data Integration: Integration & interoperability across data sources
    • Data-driven Science: Data processing, knowledge extraction

M. Parashar, "Democratizing Science Through Advanced Cyberinfrastructure" in Computer, vol. 55, no. 09, pp. 79-84, 2022. doi: 10.1109/MC.2022.3174928

https://www.computer.org/csdl/magazine/co/2022/09/09869577/1GeVwbmLj6U

6 of 11

7 of 11

8 of 11

The Virtual Data Collaboratory

Goals/features:

  • Dedicated high-speed network, compute, and storage resources federated across the participating sites
  • Data discovery and access: a set of Data Services, including indexing, cataloging, sharing, and metadata management, following FAIR principles
  • An Internet-scale execution platform that allows complex distributed analytics to run close to the data source
  • A network of high-performance data transfer nodes (DTNs) equipped with middleware for implementing smart data discovery, delivery, and management strategies
  • Training for the next generation of scientists with deep disciplinary expertise in leveraging data, cyberinfrastructure, and tools to address research problems
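
To make the data discovery and access goal above more concrete, here is a minimal sketch of a metadata catalog with a keyword index. It is not the Virtual Data Collaboratory's implementation: `DatasetRecord`, `Catalog`, and all field names are hypothetical, and the mapping of fields to FAIR principles in the comments is only an illustrative assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str      # persistent identifier (Findable)
    title: str
    access_url: str      # where the data can be retrieved (Accessible)
    data_format: str     # a standard, open format (Interoperable)
    license: str         # usage terms (Reusable)
    keywords: tuple = ()


class Catalog:
    """In-memory catalog with an inverted index from terms to identifiers."""

    def __init__(self):
        self._records = {}
        self._index = defaultdict(set)

    def register(self, record):
        """Store a record and index its title words and keywords."""
        self._records[record.identifier] = record
        terms = set(record.title.lower().split())
        terms |= {k.lower() for k in record.keywords}
        for term in terms:
            self._index[term].add(record.identifier)

    def search(self, query):
        """Return sorted identifiers of records matching every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        matches = set.intersection(*(self._index.get(t, set()) for t in terms))
        return sorted(matches)
```

A caller would register records as datasets are published and then call `search("ocean temperature")` to find records that match both terms. A production catalog would add persistence, richer metadata schemas, and access control.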

M. Parashar et al., "The Virtual Data Collaboratory: A Regional Cyberinfrastructure for Collaborative Data-Driven Research," in Computing in Science & Engineering, vol. 22, no. 3, pp. 79-92, 1 May-June 2020, doi: 10.1109/MCSE.2019.2908850.

https://www.computer.org/csdl/magazine/cs/2020/03/08686134/1hXdyNXrgt2

9 of 11

The Virtual Data Collaboratory (See Ivan Rodero’s Talk)

10 of 11

Towards Intelligent Data Discovery and Delivery

Goal: Explore how an understanding of the data and how they are used can help democratize access to large facility (LF) data and data products

Data Delivery:

  • Analyse LF data and data access
  • In-network data caching, prefetching

Data Discovery:

  • Data/data product recommendations
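
One simple way to picture data product recommendations is co-access counting: products that users often request in the same session are suggested together. The sketch below is only an illustration under that assumption, not the method of the cited framework; the function names and the session format (a list of product names per session) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often two data products are accessed in the same session."""
    co = defaultdict(Counter)
    for session in sessions:
        # De-duplicate within a session so repeats do not inflate counts.
        for a, b in combinations(sorted(set(session)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, product, k=3):
    """Return up to k products most often co-accessed with `product`."""
    neighbors = co.get(product, {})
    # Sort by descending count, then by name for a deterministic order.
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]
```

Given access logs grouped into sessions, `build_cooccurrence` is run once and `recommend` is then called per request. Products with no recorded co-accesses yield an empty list.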

Builds on edge and in-network (DTN) capabilities
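
The in-network caching and prefetching idea can be sketched as a small cache sitting between users and the data origin. This is a minimal sketch, not the cited framework: `PrefetchingCache` and its `fetch` callback are hypothetical, and the prefetch rule (always pull the next sequential chunk) is a deliberately simple stand-in for a policy derived from analysis of real access patterns.

```python
from collections import OrderedDict


class PrefetchingCache:
    """LRU cache that also prefetches the next sequential chunk."""

    def __init__(self, fetch, capacity=4):
        self._fetch = fetch          # callable: integer chunk id -> data from the origin
        self._capacity = capacity
        self._store = OrderedDict()  # chunk id -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def _put(self, chunk_id, data):
        self._store[chunk_id] = data
        self._store.move_to_end(chunk_id)
        while len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the least recently used chunk

    def get(self, chunk_id):
        if chunk_id in self._store:
            self.hits += 1
            self._store.move_to_end(chunk_id)
            data = self._store[chunk_id]
        else:
            self.misses += 1
            data = self._fetch(chunk_id)
            self._put(chunk_id, data)
        # Assumed policy: access is mostly sequential, so fetch the next chunk now.
        next_id = chunk_id + 1
        if next_id not in self._store:
            self._put(next_id, self._fetch(next_id))
        return data
```

With a sequential reader, only the first request goes to the origin on the critical path; later requests are served from the cache because each one was prefetched by the request before it. The trade-off is extra origin traffic when access is not sequential.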

Y. Qin, I. Rodero and M. Parashar, "Toward Democratizing Access to Facilities Data: A Framework for Intelligent Data Discovery and Delivery," in Computing in Science & Engineering, vol. 24, no. 3, pp. 52-60, 2022, doi: 10.1109/MCSE.2022.3179408.

11 of 11