1 of 41

Data Strategies Part 4: How can Federal agencies harness partnerships and innovation to maximize the impact of Earth science data resources?

January 23, 2025

Raleigh Martin, NSF

Leslie Hsu, USGS

2 of 41

What are we discussing today?

  • Federal Data Strategies: Recap why we are here and what has happened
  • Learn about data investments, gaps, and collaboration examples
  • Explore future steps

3 of 41

How do we continue to enable and support collaborations that span organizations, including agencies, even if we need to implement data strategies in different ways?

4 of 41

Recap

  • Who are you?
  • What data does your agency provide? Who are your data customers?
  • What are some challenges?
  • Do you have a data strategy?
  • Links to share
  • Do you have a question for other agencies?

In July and November 2023, and January 2024, we heard short presentations from: USGS, USDA, EPA, NASA, NOAA / NOAA Fisheries, DOE, NIH (NIEHS), and NSF

5 of 41

July 2023: Sharing challenges, plans, and priorities

Introduction to Federal Data Strategies

U.S. Federal Data Strategy

OSTP Memos

Public Access Plans

Agency Data Strategies

Agency presentations

Viv Hutchison, USGS

Cyndy Parr, USDA

Ann Vega, EPA

Joel Scott, NASA

Monica Youngman, NOAA

Karen Sender, NOAA Fisheries

Topics

What are your biggest barriers to implementing data strategy?

Take-aways and next steps.

6 of 41

November 2023: Part 2

Compilation of Challenges

Q&A from July questions

New Agency presentations

Jay Hnilo, DOE

Michelle Heacock, NIH (NIEHS)

Sourced New Topics

Data discovery

Hosting files to support Public Access

Paths to data.gov

Policy for data analyzed with generative AI

7 of 41

January 2024: Part 3

Compiled links from participating agencies

Discussion of trust and ethics in relation to our data strategies

New Agency presentations

Raleigh Martin, NSF

Sourced New Topics

Ideas for partnership

Don’t forget OSTP coordination groups

Formality versus smaller conversations

8 of 41

Around the table

9 of 41

Today's Speakers

Nancy Ritchey, NOAA National Centers for Environmental Information

Katie Baynes, National Aeronautics and Space Administration

Leslie Hsu, United States Geological Survey

Raleigh Martin, National Science Foundation

10 of 41

NOAA/NCEI

NESDIS

Provide secure and timely access to global environmental data and information from satellites and other sources to promote and protect the nation's security, environment, economy, and quality of life

Dept of Commerce

Create the conditions for economic growth and opportunity for all communities

NOAA

Science (climate, weather, oceans, coasts, and sun), service (share knowledge and information), and stewardship (conserve and manage marine ecosystems)

NCEI

Provide environmental data, products, and services to help drive resilience, prosperity, and equity for current and future generations

National Oceanic and Atmospheric Administration (NOAA)

National Environmental Satellite, Data, and Information Service (NESDIS)

National Centers for Environmental Information (NCEI)

11 of 41

NCEI Mission

NCEI provides environmental data, products, and services covering the depths of the ocean to the surface of the sun to drive resilience, prosperity, and equity for current and future generations.

NCEI Vision

A tenacious and trusted leader in environmental information for a rapidly changing world with a focus on driving lasting good across our partnerships, our economy, around the U.S. and the world through generations.

September 2024

National Centers for Environmental Information - NCEI

12 of 41

DOC/NOAA/NCEI: Key data investments

Leadership and Guidance:

Infrastructure:

  • NESDIS is actively moving to the cloud for processing, archiving and providing access to NOAA’s data

Data Partners:

National Oceanic and Atmospheric Administration ⎸National Centers for Environmental Information

13 of 41

Building an Innovative Infrastructure

  • NOAA/NCEI is building the Open Information Stewardship Service (OISS), our new archive in the cloud
  • The cloud environment is critical in allowing us to operate at the speed and scale needed to meet today’s demands
  • Utilizing technologies of the semantic web and linked open data to enable open science, AI-based applications, etc.
  • This new approach will gather full provenance of our data and products, which facilitates transparency and scientific reproducibility.

14 of 41

NCEI’s Value to the Nation

NCEI Stakeholders by Sector

[Chart: share of NCEI stakeholders by sector. Sectors: Science, Technology, and Engineering; Ecosystems (Agriculture/Aquaculture); Transportation and Infrastructure; Energy; Insurance, Finance, and Legal; Health and Emergency Management; Higher Education. Shares shown, in descending order and not matched to sectors: 23.5%, 21.1%, 17.6%, 12.4%, 10.9%, 7.6%, 6.9%.]

15 of 41

Authoritative Information and Services

In the context of authoritative products and services, the notion of “authoritative” means…

… conferred by users

  • Community/partner use and impact
  • Proof is in their use
  • Reliable, valuable

… credibly represent earth system

  • Accuracy, rigor
  • Scientific credibility

… carefully sourced and transparent

  • Discoverability
  • Provenance
  • Preservation

NCEI: Aim here

“science”

“service”

“stewardship”

16 of 41

Providing Climate Information to Inform the Future

Tornado Climatology

Climate Extremes Index

Regional Snowfall Index

U.S. Drought Monitor

Blended Sea Winds

Monthly U.S. & Global Climate Reports

U.S. Billion-Dollar Weather & Climate Disasters Report

Hourly Precipitation Data

17 of 41

Coast, Oceans, and Geophysics

Providing data and information from the Sun to Earth’s seafloor

Deep Sea Corals Data Portal

Model Reanalysis

Ocean Exploration Digital Atlas

World Ocean Database

Bathymetry and Global Relief

Enhanced Magnetic Model

Gulf of Mexico Data Atlas

Passive Acoustics

18 of 41

Imagine a world of interconnected Data & Services!?

A linked and open “knowledge network”, with many doors of entry

EVERYTHING needed to use and understand the contents of the NOAA archive is embedded in the knowledge graph, and readable by both people and machines

19 of 41

NOAA/NCEI

5. Ideas for the future (e.g., “It would be great if we could…”)

20 of 41

Advancing Earth System Science End-to-end

03.19.2024

Technology

Flight

Research and Analysis

Data and Modeling

Earth Action

21 of 41

NASA

22 of 41

NASA

23 of 41

NASA

24 of 41

25 of 41

U.S. Geological Survey

Created by an act of Congress in 1879, the USGS provides science for a changing world, which reflects and responds to society’s continuously evolving needs. As the science arm of the Department of the Interior, the USGS brings an array of earth, water, biological, and mapping data and expertise to bear in support of decision-making on environmental, resource, and public safety issues.

Contributors

Leslie Hsu, Coordinator, USGS Community for Data Integration

Mike Frame, Chief Data Officer

Greg Gunther, Chief, Science Data Management Branch

Department of the Interior

U.S. Geological Survey

Core Science Systems

Science Data Management

Science Analytics and Synthesis

26 of 41

Selection of USGS Data Investments

Support throughout the science data lifecycle, mostly focused on USGS scientists and their partners

1. Digital repositories - ScienceBase, other disciplinary repositories (e.g., National Water Information System)

2. Identifiers - Asset Identifier Service (previously the USGS DOI Tool)

3. Tools and Resources for data - e.g., Data management page, Metadata Wizard, sciencebasepy, ScienceBase Data Release Tool and Team

4. Trainings and Communities - Data and Software Carpentries Trainings, Data Release Trainings, Community for Data Integration (including seed funding for scalable data projects)

5. Advanced Scientific Computing - HPC Training, Hybrid Computing Solutions

27 of 41

USGS Selected Opportunities

Need ⟶ Opportunity

  • Increase data discovery and determination of fitness for use ⟶ Encourage cross-agency and cross-community disciplinary standards and analysis/use of existing metadata
  • Increase accessibility of tools, working code, and authoritative data that are currently restricted ⟶ Explore, design, improve, and encourage platforms that allow multi-organization access
  • Evolve toward more integratable and analysis-ready data across the USGS disciplines ⟶ Collaborate to provide more data in support of AI adoption and training
  • Increase USGS data availability ⟶ Informed participation in Cloud Vendor Open Data programs

28 of 41

USGS Data Collaboration examples

Many of our collaborations are at the discipline level (water, earthquakes, fire, invasive species, critical minerals, astrogeology).

Collaboration with other Department of the Interior bureaus is also common, since USGS is the science agency providing science for DOI decisions and we fall under the same Department of the Interior policies.

There is an opportunity to aggregate data collaboration examples at a USGS-wide level and use them as models for further progress.

29 of 41

USGS - ideas for the future

Map the major agency-supported systems and capabilities between agencies holding similar data types, building in linkages between separate systems and demonstrating integration between data holdings. (Has this been done?)

How can we avoid multiple copies of key Agency datasets being rehosted and shared within Agencies? Some of this is due to network challenges, size, versioning, awareness, etc.

Agency Open Data collaborative efforts with Cloud Vendors (e.g., NASA, NOAA): Explore how government agencies can more effectively deliver, analyze, and save through a more government-wide approach to open data interactions with Cloud Vendors. Right now, each Agency interacts independently with Vendors, often getting mixed messages, agreements, and benefits.

Communicate more on AI training data, leveraging NAIRR, catalogs.

30 of 41

USGS - additional info

Leadership and Guidance:

  • USGS Decadal Data Strategy (2024)
  • Each Bureau within DOI has an associate Chief Data Officer (aCDO)
  • USGS has a CDO.
  • USGS has a Data Advisory Board

31 of 41

The National Science Foundation (NSF) was established in 1950 by Congress to:

  • Promote the progress of science.
  • Advance the national health, prosperity and welfare.
  • Secure the national defense.

NSF Directorate for Geosciences (GEO):

  • Expands educational opportunities for students across the geosciences.
  • Oversees diverse, state-of-the-art infrastructure and facilities around the world.
  • Builds impactful partnerships both within the agency and with external groups to leverage resources and enable scientific innovations.
  • Scientific domains: Atmospheric and Geospace Science, Earth Science, Ocean Science, Polar Science

The vast majority of NSF’s budget is allocated through financial assistance awards (grants, cooperative agreements) to external entities (e.g., universities, non-profits)

32 of 41

4 layers of NSF investment:

  1. NSF-managed data systems (e.g., Research.gov, NSF Public Access Repository, Enterprise data tools)
  2. NSF-wide research computing support (e.g., ACCESS, CloudBank, National AI Research Resource)
  3. GEO domain-specific data repository investments (see below)
  4. GEO cross-cutting data initiatives (see below)

33 of 41

Gaps and unrealized innovation opportunities

Need ⟶ Opportunity

  • Sustaining long-term funding for data resources within short-term grant cycles ⟶ Diversify cross-agency and cross-sector funding sources for data resources
  • Managing scope (i.e., data repository funding limited to serving a specific user community) ⟶ Foster collaboration and integration across data resources to broaden impact
  • Adapting to change (e.g., data volumes, new methodologies) ⟶ Develop communities of practice that enable coordination and diffusion of leading practices

34 of 41

The NSF-funded OpenTopography facility is harnessing cross-agency data products (e.g., 3DEP) and developing cross-sector partnerships (e.g., via IAA with USGS)

The NASA-funded Astromat builds on NSF investment in sample data curation through SESAR

NSF leads the National AI Research Resource (NAIRR) pilot, which is coordinating access to data, computing, software, and training across 12 federal agencies and many private partners

35 of 41

Ideas for the future

Foster further partnerships and interoperability across agency-supported data repositories

Streamline pathways for transitioning research-oriented data innovation into enterprise data capabilities

Identify use cases around which to accelerate cross-agency and cross-sector partnership

36 of 41

Discussion and audience input

37 of 41

What interagency partnerships do you benefit from or are aware of?

What Federal agency data innovations do you wish were more broadly applied?

38 of 41

4 Panel Questions to rank/choose from

39 of 41

Closing reflections

40 of 41

How do we continue to enable and support collaborations that span organizations, including agencies, even if we need to implement data strategies in different ways?

41 of 41

Thank You! Let’s Connect.

Sign up for the ESIP Update and get a weekly newsletter for the #EarthScienceData community: