1 of 24

EarthCube: A Look Back from NSF’s Perspective

Eva Zanzerkia, Program Director GEO/EAR

Amy Walton, Deputy Division Director CISE/OAC

  

2 of 24

EarthCube’s Timeline

3 of 24

Elements of EarthCube

People

Solicitation

Technology

Partnerships

4 of 24

People

5 of 24

Organizational Studies of Information

History & Theory of Infrastructure: Lessons for New Scientific CI, Paul Edwards, 2007

People, routines, forms, and classification systems are as integral to information handling as computers, Ethernet cables, and Web protocols. The boundary between technological and organizational means of information processing is mobile. It can be shifted in either direction, and technological mechanisms can only substitute for human and organizational ones when the latter are prepared to support the substitution.

6 of 24

NSF People

•Gabrielle Allen

•Irfan Azeem

•Ed Bensman

•Lisa Boush

•Bob Chadduck

•Almadena Chtchelkanova

•Eric DeWeaver

•Cliff Jacobs

•Tom Janecek

•Sean Kennan

•Irene Lombardo

•Raleigh Martin

•Peter Milne

•Shree Mishra

•Lenny Pace

•Rob Pennington

•Allen Pope

•Barbara Ransom

•Ilia Roussev

•Jennifer Schopf

•Jack Sharp

•Dane Skow

•Mike Sieracki

•Dena Smith-Nufio

•Marc Stieglitz

•Colleen Strawhacker

•Alejandro Suarez

•Mark Suskin

•Marco Tedesco

•Mete Uz

•Amy Walton

•Herb Wang

•Maria Womack

•Eva Zanzerkia

7 of 24

27 End-User Workshops: ~2,000 participants, multiple agencies (NOAA, NASA, USGS, USDA, NRL, +)

Earth ~70%

Ocean ~60%

Atmosphere ~30%

Polar - distributed

Atmosphere (4)

Earth (7)

Ocean (5)

Earth and Ocean (5)

Atmosphere, Earth and Ocean (6)

8 of 24

Climate Modeling

EarthScope

‘Omics

Sedimentology

Discrete; Samples; Desktop

Streaming; HPC

Real-Time Data

Geochemistry

Rock Physics

Deep Sea

Hydrology

Community Modeling

Structural Geology

Critical Zone

Ensemble Forecasts

River/BioGeo

Unstructured data is collected

Knowledge of resources lacking

Principal Investigators work in isolation

Difficulty discovering and vetting dark/legacy data

Scaling algorithms for Big Data

Lack of training in CI

Ingesting heterogeneous data

Quantifying uncertainty/quality

Colocation of data and computation

On-demand computing

Interoperable streaming protocols

Questions of ownership, credit, provenance

Data integration and visualization

Collaboration with computer scientists

Lack of community standards

Hard to discover and use data outside discipline

9 of 24

We Did Not Know

  • Community members and groups were more isolated than we suspected
  • The broad collaborative activities we envisioned for EarthCube needed help
  • Assuming the role
    • Did not fully appreciate its importance in advancing community dialog
    • Simple tools proved effective, but time consuming

10 of 24

Community and Governance

  • EarthCube is a community of geoscientists and computer/cyberinfrastructure scientists
  • Governance committees lead the way
    • Leadership Council
    • TAC, Science, Funded Projects, Engagement, etc.
    • Council of Data Facilities
  • In-Person and virtual forum
    • Connections to ESIP, AGU & other prof societies, agencies

11 of 24

Governance Challenges and Rewards

  • Volunteer efforts drove most of EarthCube Governance
  • NSF – Community Communications
  • 2016 Reverse Site Visit
  • EarthCube Offices played major and changing role

EarthCube Test Enterprise Governance (ECTEG)

EarthCube Science Support Office (ESSO)

EarthCube Office (ECO)

12 of 24

Strategy/Solicitation

13 of 24

The EarthCube Strategy

An alternative approach to respond to daunting science and cyberinfrastructure challenges

EarthCube is an outcome and a process

EarthCube: next generation CI to transform the conduct of geosciences

Unidata

IRIS

IEDA

NCAR

OOI

CUAHSI

The process must

  • Engage all stakeholders:
    • Geosciences end-users
    • Geosciences and CI facilities
    • CI and Computer Science specialists

  • Build EarthCube iteratively, with community input and assessment in yearly intervals

  • Build EarthCube on existing resources, understanding that different geosciences communities are not uniformly served

DataOne

14 of 24

The Changing Landscape

    • OSTP memo on Public Access
    • FAIR Principles
    • 2 Initiatives (CIF21 and HDR)

NSF and Federal Priorities:

    • 4 GEO ADs
    • OCI->ACI->OAC

Leadership

EC was ahead of the game

15 of 24

EarthCube Solicitation

  • Research Coordination Networks (2013-2020)
  • Building Blocks
  • Conceptual Designs
  • Integrative Activities
  • Prototypes
  • Capabilities (2016-2021)
  • Integration

  • Test Enterprise Governance
  • First EarthCube Office Call
  • Second EarthCube Office Call

Umbrella + Amendment structure - NSF Pilot

Goal – Take input from EarthCube to craft calls

Key Documents and Project Standards and Specifications

16 of 24

EarthCube Solicitation: Evolution

Domain End-User Workshop

CI for Paleogeosciences

Research Coordination Network

CI for Paleogeosciences

Integrative Activity

Building Interoperable Cyberinfrastructure (CI) at the Interface Between Paleogeoinformatics and Bioinformatics

EC solicitation moves geosciences domains towards data-enabled research and advances interoperability

17 of 24

Technology

18 of 24

  • 215 awards since 2013
    • 12 RCNs
  • Crossing AGS, EAR, OCE, PLR
  • Diverse size and scope
  • CI from data resources to workflows to advanced model coupling tools, etc.
  • Engagement of Geoscientists from year 1
  • Consider Sustainability

19 of 24

EarthCube Technical Focus Evolved

  • Preconceived notions for EC CI
    • (single infrastructure; social network)

  • Common standards and best practices
  • Schema.org
  • P418 and P419
  • GEOCODES and the Resource Registry

  • Cloud-mediated data-computation; Notebooks
  • Data science; AI/ML

20 of 24

Partnership

21 of 24

Productive Collaborations: The EarthCube Example

EarthCube and OAC

  • A longitudinal experiment
  • A long-term partnership
  • Social evolution
    • Communication
    • The critical role of training
  • A range of CI resources needed to make progress

Future Options

22 of 24

Highly Accessible Resources

Shared Campus Resources

Leadership-class

Frontera (Austin)

Cloud Resources

CloudBank (San Diego)

CloudLab (Salt Lake City)

Chameleon Lab (Chicago)

NCAR

Cheyenne (Cheyenne)

Services

PATh/OSG (Madison)

Innovative systems

Stampede 2, Wrangler (Austin)

Bridges-2, Neocortex (Pitt)

Jetstream, Jetstream2 (Bloomington)

Ookami (Stony Brook)

Expanse, Voyager, National Research Platform (San Diego)

Anvil (W. Lafayette)

Delta (Urbana-Champaign)

ACES (College Station)

ACCESS PIs

An Advanced CI Ecosystem for All

Learn how to access resources at access-ci.org

23 of 24

Anticipated (and Unanticipated) Outcomes

  • Transform scientific enterprise
  • Substantial increase in scientific productivity and capability
  • Integrate and sustain connections among multiple modes of support
  • An engaged community with a common vision
  • Iterative discovery process leading to consensus on the best approach

  • EarthCube changed the NSF

24 of 24

Thank You