1 of 90

Crossing the chasm:

Amplifying success stories about co-creating across institutions

Eli Holmes, NOAA Fisheries, NMFS Openscapes

Julie Lowndes, Openscapes

Jen Schopf, Texas Advanced Computing Center, UT Austin

Amy Steiker, NASA National Snow & Ice Data Center (NSIDC DAAC), NASA Openscapes

Yuvi, 2i2c / Project Jupyter

And many other contributors!

Slides: https://openscapes.org/media

ESIP Meeting, July 2025

Why are we here

  • Since our keynote at the CNG conference, we’ve been retelling this story across NOAA
  • Power of stories, of sharing successes. Prioritizing, pushing ourselves. Real examples to amplify & fork – just as you do code.
  • 4 stories of crossing the chasm
  • Plenary-style discussion
  • Collaborative notes doc

2 of 90

Diffusion of Innovation Theory & Crossing the Chasm

Eli Holmes

NOAA Fisheries

3 of 90

About us:

We have a long history of actionable science & teaching

Julia Lowndes, PhD

Openscapes Lead

15+ years marine ecologist

Eli Holmes, PhD

NOAA Fisheries Open Science Lead

25+ years at NOAA Fisheries as a statistician/NMFS PARR rep

Why are we here today?

  • Share what workforce modernization can look like & how it improves our science
  • Let you know what is happening now (and since 2021) at NOAA Fisheries

We’re representing the work of many people across NOAA Fisheries and beyond!

4 of 90

Chasm between your great idea and wider adoption

Early Adopters

5 of 90

Early Adopters

Sharing success stories

Target Population

Crossing the chasm is really hard. Sharing successes helps keep us moving forward.

6 of 90

Why is the chasm such a common phenomenon?

A sheepdog parable…

7 of 90

“Crossing the Chasm” – selling a sheepdog

8 of 90

The idea: initial inspiration

The innovator watches wolves hunting and has a crazy out-of-the-box idea: What if I could take a wolf and train it to use this behavior to herd sheep!

9 of 90

Problem is, the innovator is not really suited to turning this vision into reality.

The innovator is fascinated by the wolf and is not bothered by the complete impracticality of the wolf.

We need something considerably more tractable and practical.

10 of 90

Early adopters take a wild idea and, with relentless drive, turn it into a workable product.

Take a puppy

Patience and skill, actually obsession, to do this

11 of 90

The early adopter

The target population

Then it is time to ‘sell’ the idea to those who raise sheep (the target audience)

The early adopter can’t help trying to get the sheep raiser to love puppies: “Ok, so it’s a puppy and needs training!! JUST LEARN TO TRAIN THE PUPPY LIKE I DID! It’s FUN!”

I raise SHEEP! I don’t care about puppies*. I don’t want to train puppies**. I want to raise sheep.

* The early adopter dies a little at this statement… ** Depression sets in

12 of 90

Social & operational chasm

Institutional barriers

Early Adopters

13 of 90

Crossing the Chasm at NOAA Fisheries

Eli Holmes

NOAA Fisheries

14 of 90

Early Adopters

How do you build the bridge to cross the chasm?

Target Population

Adoption of Reproducible Science and Data Workflows at NOAA Fisheries

15 of 90

Where we are today

16 of 90

Community Building and Skill Building

Openscapes

Support for ‘mentors’, organizers, leads

Data Science academy

1:1 and Team Support

Infrastructure

Training JupyterHub

Weekly Trainings and Hack Events

Training Page

  • GitHub
  • Quarto Reports
  • R
  • Python
  • Cloud Computing
  • Remote Sensing Data

Google Spaces for Posit Connect, GitHub, R, Python, Open Science

AMA Help Desk with Jon Peake every Wednesday.

The asar team is also holding a weekly help desk

Training Page has sign-up links

Team trainings. Email Jon and Eli.

Weekly Coworking

Fall Openscapes Champions cohorts

Help for on-boarding to Cloud Computing

Posit Connect server for Shiny applications and data dashboards

GitHub Enterprise server for code management & preservation. Governance Team

2 FT staff

E Holmes & J Peake

“Mentor cohort at each FMC”

2025

J Lowndes, S Butland,

A Teucher, I Fenwick

17 of 90

How did we get to today?

Coming up with a strategy using Diffusion of Innovation theory

in 2021

18 of 90

“Diffusion of Innovation” Theory (EM Rogers 1962)

Predictable progression as an idea diffuses through a population.

Important to understand so you don’t put your energy in the wrong place.

Time

I have a few talks on this idea on my website:

eeholmes.github.io

Target Population

19 of 90

Early Adopters are critical to diffusion of innovation

  1. Early Adopters develop the innovation into something of value
  2. Their energy and effort is what drives the initial diffusion process, but that is a slow process.

Time

Target Population

20 of 90

2020-2021

2024-2026

2022-2023

First 4 Champions Cohorts

Roll out of Agency-wide training program

Developing the Mentor Cohort + co-leading 6 Champions cohorts

21 of 90

How do we get started? People don’t want to try new ideas – in fact, they are kind of actively resistant to our new ideas.

Start with the eager

and then move to the willing.

22 of 90

You need to find the “eager” (the Early Adopters), but they are hard to reach and many are isolated

Early Adopter

23 of 90

Developing and growing the Early Adopter Community at NOAA Fisheries

2021-2023

Find Allies

Help people/orgs/teams with their “pain points” and goals

Contribute

Community

24 of 90

Start with the eager and then move to the willing.

Remove barriers to joining your group.

Cultivate culture of openness and helpfulness.

You have to do 10x more comms than you think.

25 of 90

Early Adopters

What’s your success story?

Target Population

26 of 90

Crossing the Chasm at NASA Earthdata, 2021-2022

Julie Lowndes

Openscapes

27 of 90

Date: Tuesday, July 29, 2025 << please join us!!

Time: 9:30 - 11:00 am PT

Speakers: Amy Steiker, Luis Lopez, Danny Kaufman, Joe Kennedy, Chris Battisto

Register (free) via Zoom:

https://openscapes.org/events/2025-07-29-community-call-earthaccess

A Python library that enables authentication, search, & access for NASA Earth science data with just a few lines of code.
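As a rough illustration of what "a few lines of code" can look like, here is a minimal sketch built on three earthaccess calls: `login`, `search_data`, and `download`. The wrapper function `fetch_granules` and its parameter names are hypothetical and not part of the library or of the speakers' material. Running it assumes the `earthaccess` package is installed and that you have Earthdata Login credentials.

```python
def fetch_granules(short_name, temporal, bounding_box, local_path, count=5):
    """Authenticate, search, and download a few granules (illustrative only).

    short_name   -- dataset short name chosen by the caller
    temporal     -- (start, end) date strings
    bounding_box -- (west, south, east, north) in decimal degrees
    local_path   -- folder to download into
    count        -- maximum number of granules to request
    """
    # Imported inside the function so this sketch can be loaded
    # even where the package is not installed.
    import earthaccess

    # Authentication: uses stored Earthdata Login credentials or prompts for them.
    earthaccess.login()

    # Search: returns a list of granules matching the query.
    results = earthaccess.search_data(
        short_name=short_name,
        temporal=temporal,
        bounding_box=bounding_box,
        count=count,
    )

    # Access: download the matching files and return their local paths.
    return earthaccess.download(results, local_path)
```

A call might look like `fetch_granules("ATL06", ("2022-01-01", "2022-01-31"), (-50.0, 65.0, -45.0, 70.0), "./data")`. The dataset name, dates, and bounding box here are placeholder values chosen for illustration.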

28 of 90

NASA Earthdata & Openscapes

Supporting NASA Earth science research teams’ migration to the cloud

The overarching vision is to support scientific research teams using NASA EOSDIS data as they migrate their workflows to the cloud. We are doing this by working with NASA Distributed Active Archive Centers (DAACs) over three years to:

  1. Develop a cross-DAAC Mentor community of collaborative cloud data instructors who co-create, curate, and use shared resources (“make once, use often”)
  2. Empower science teams through the Champions program to migrate their download-intensive data analysis workflows to the cloud and open, kinder science
  3. Scale the Openscapes Champions program with DAAC Mentors to support more teams transforming their workflows towards open, kinder science and the cloud

29 of 90

Earthdata Cloud Cookbook

In active, modular, open development

nasa-openscapes.github.io/earthdata-cloud-cookbook

Create place:

GitHub Org + Quarto

Purposeful documentation

  • Create-once-use-often, importer workflow
  • Researcher-focused: tutorials & how-to guides

Common

Cloud

Foundation

NASA Specific Science Tutorials

30 of 90

Mentors’ early co-creation and teaching was key

* four other workshops

Time

Target Population

  • Mar 2021: Kickoff
  • Nov 2021: Cloud Hackathon
  • Apr-Jun 2022: Champions Cohort
  • Oct 2022: Luis Lopez demos earthaccess (then called earthdata)

Designed as part of their paid jobs, ~20%

*

31 of 90

Building the mechanism:

Openscapes Flywheel

(Robinson & Lowndes 2022)

(an open source tool to fork & reuse!)

32 of 90

The Openscapes Flywheel: an open source tool to facilitate and scale inclusive Open science practices

Robinson & Lowndes 2022

Fork common workflows, skills, tools, ideas

Work Openly

Invest in learning and trust

Inspire

Flywheel

Create space and place

Welcome

Empower

Learning culture

Engage

A Future Us mindset

Amplify

Open leaders

Flywheel concept

transformations occur from consistently doing key activities that add up over time

(Collins)

33 of 90

Coworking

Regularly scheduled, with people not part of your normal work team

Cross-team awareness; cross-org learning

Solve problems; turn around and share

Onboarding

Work Openly

Invest in learning and trust

Inspire

Flywheel

Create space and place

Welcome

Forking as a worldview

34 of 90

“It was a really great week. The tutorials were AMAZING. Everyone did a great job, and everyone was very nice. I really appreciated welcoming environment. I don't have a strong python background. But i was supported in learning all around”

65 Openscapes 2i2c JupyterHub AWS instances

50 forks of the Cloud Hackathon GitHub repo

8 hack-team projects presented on Day 5

Empower Research Teams

35 of 90

Hackathons

Tutorials + peer-to-peer learning + project team-based work (eScience definition, hackweeks)

A lot of generative ideas – identifying a problem so painful

Invest in learning and trust

Work Openly

Inspire

Flywheel

Welcome

Forking as a worldview

Create space and place

36 of 90

Previously unseen barrier

Core functionality for data access – Earthdata Login authentication, efficient search and query via CMR, bulk download, and direct access from AWS object stores – was insurmountable for many users via the code-based workflows required for Cloud.

https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/04_NASA_Earthdata_Authentication

37 of 90

Partnering

Early adopters can’t do it alone. Must partner, bring new creativity & resources

Many forms of contributors:

Advocating for time

2i2c; RStudio/Posit

Cheatsheets, Art

Work Openly

Invest in learning and trust

Inspire

Flywheel

Create space and place

Welcome

Fork common workflows, skills, tools, ideas

38 of 90

Luis Lopez presenting earthaccess (then called earthdata), Oct 2022!

39 of 90

Co-creation

Hackathons

Teaching

Coworking

YES!

Reusing what works in new places

40 of 90

Creating our own certainty

  • Time to learn & contribute
  • Seeing the problem so painful
  • Partnering - asking for help

41 of 90

Date: Tuesday, July 29, 2025 << please join us!!

Time: 9:30 - 11:00 am PT

Speakers: Amy Steiker, Luis Lopez, Danny Kaufman, Joe Kennedy, Chris Battisto

Register (free) via Zoom:

https://openscapes.org/events/2025-07-29-community-call-earthaccess

42 of 90

Crossing the Chasm at NASA Earthdata

Amy Steiker & Julie Lowndes

NASA NSIDC DAAC & Openscapes

To be presented Tuesday, July 29, 2025 << please join us!!

Time: 9:30 - 11:00 am PT

Register (free) via Zoom:

https://nasa-openscapes.github.io/news/2025-07-29-community-call-earthaccess/

43 of 90

Time

Hard to even define the chasm

44 of 90

Early Adopters

How do we help users leverage the awesomeness of NASA Earthdata?

Target Population

A bridge: earthaccess.

Shift now is in who helps build this bridge

45 of 90

“earthaccess is amazing”

- NASA SMAP Mission PI from Univ. Montana.

earthaccess integral to his open climate science for ag course:

openclimatescience.github.io

We built earthaccess to support users migrating search, access, and analysis workflows to the AWS Cloud.

But it is so much more.

200+ dependent projects [1]

[1] https://github.com/nsidc/earthaccess/network/dependents

500+ GitHub stars [2]

[2] https://github.com/nsidc/earthaccess


Enables reproducibility of workflows, collab & transparency.

Now on earthdata.nasa.gov!! [3]

[3] https://earthdata.nasa.gov/data/tools/earthaccess

46 of 90

We built with intentional culture & community – critical to its success so far & its future growth.

Hackathons & Workshops: cross-learning between developers and users.

Open community ownership: the ways we support our contributors & maintainers

Community partnerships & collaborations

Xarray • Pangeo • HDF • OPeNDAP • Openscapes

47 of 90

Early Adopters

Widening the bridge, as the chasm changes

Target Population

Ongoing work: new features & support.

Shift now is in who helps build.

48 of 90

Vision for the Majority: “batteries included”.

Time

49 of 90

Enabling real world analysis at scale: On-demand virtual data cubes

Community governance evolution: “Repotting” earthaccess repo to further enable partnerships and “mature the bazaar”

Continuing to lower barriers to library contributions

What does widening the bridge look like?

50 of 90

Early Adopters

How do you build the bridge?

Target Population

We’ve already built this foundation with earthaccess; now we can look at how our community model can support the bigger cause

How to increase partnerships, external funding?

How do *we* support ESDIS, rather than the other way around?

Hitting this chasm allows us to look further; part of a bigger, continued need.

Learning from and furthering stories that already exist

51 of 90

Crossing the Chasm with TACC

Saving 50 years of astronomy data with Arecibo

Jen Schopf

Texas Advanced Computing Center / University of Texas at Austin

52 of 90

WHAT IS TACC

  • The University of Texas at Austin and UT System Research Computing Facility
  • Largest NSF-funded national computing center for open science
    • Also NIST, NASA, NIH, DARPA, DOD, etc.
  • 200+ Dedicated staff
  • Altogether, ~20k servers, >1M CPU cores, 1k GPUs
  • About seven billion core hours over several million jobs per year – for 4,000 projects and ~95,000 users per year.

5/15/25

Federal Investments in TACC are over $1B in last 10 years; and over $1B slated for next 10 years.

While we are a national provider, we have *by far* the most computing resources of any University in the country (and often the world), and will continue to through the 2020s.

53 of 90

We help Bridge the CI Chasm for Users

  • Provide researchers with:
    • Computing, Data, AI, Software capabilities to support their research
    • The expert help to use it
    • In the ways they want to consume it
    • Help with grants/strategy
  • We also like to help Departments and Colleges:
    • Recruit top faculty and top students.
    • Win large grants, leveraging our capabilities.
  • Computation, AI, Data almost ubiquitous across the sciences

54 of 90

We have Computers. Lots. All kinds. Especially big ones

  • From single node to 500,000 processors.
  • Accelerated computing with GPUs (and a few DPUs, FPGAs, Vector Accelerators, etc.)
  • Batch scheduled, managed;
  • Bare metal; Kubernetes/Virtual Machines/Containers
  • Interactive (Web, API, Jupyter. . .).
  • Large memory
  • Simulation, AI/ML/DL, Analytics.
  • Visualization walls, VR/AR/Displays

55 of 90

We have Storage - Lots. All kinds.

  • A few Petabytes of solid state
  • A few hundred Petabytes of spinning disk
  • >Half an Exabyte of tape
  • Some optimized for speed, some for reliability, some for cost
  • We have some that you can use for a few days, and some in service since 1986

56 of 90

WHO USES TACC

  • ~4,000 active projects, 150+ fields
  • ~15,000 annual command line users
  • ~80,000 annual via web portals, APIs
  • Partnerships with UT System, NSF ACCESS, industry, international
  • K-12 programs, college classes, institutes and other training (1,000+ users trained annually)
  • 7 billion compute hours annually
  • 5 billion files processed annually

57 of 90

  • We have begun construction of the Leadership-Class Computing Facility (LCCF)
  • A >$500M new facility with a long term investment in computing, data, people, and facilities for simulation, analytics, and AI.
  • Funded by the same “Major Facilities” fund as NCAR, LIGO, LHC, Vera Rubin, McMurdo, etc.
    • This is a promotion for the role of computing

58 of 90

ARECIBO - A HISTORY OF (PLANNED) CHALLENGES

  • 1963 - Arecibo in Operation https://www.naic.edu/ao/legacy-discoveries
  • 2006 - NSF 15% Budget cut across Astronomy
  • 2007 - Arecibo budget cut from $10.5M to $8M
    • (NASA adds ~$2.6M to help ops budget)
  • 2015 - Facilities director Kerr quits due to funding clashes
  • 2018 - University of Central Florida takes on stewardship

59 of 90

UNPLANNED CHALLENGES (1)

  • Hurricane Maria ( Sept 20, 2017 )
    • Category 4
    • Significant damage to facility
    • R&E (I2) network connection lost and not replaced
      • Too expensive to repair due to budget cuts
  • Core operations resumed when the power came back...

60 of 90

UNPLANNED CHALLENGES (2)

  • Magnitude 5.0 - 6.4 earthquakes (January 7-11, 2020)
  • Operations still going
    • shaken not stirred

61 of 90

UNPLANNED CHALLENGES (3)

  • July 30, 2020
    • Tropical Storm Isaias
    • SCIENCE vs operations means that some hard choices are made
    • Maintenance deferrals start adding up

62 of 90

UNPLANNED CHALLENGES (4)

  • Aug 10, 2020
    • First cable [auxiliary] snaps
  • Nov 6, 2020
    • Second cable [primary] snaps
  • 2024 NASEM report:
    • Zinc “creep” slowly deformed the wire support sockets

63 of 90

DECEMBER 1, 2020 - RESULTING FACILITY PROFILE

Data Center Building 1.

64 of 90

NEED TO MOVE 2+ PB OF GOLDEN COPY DATA

  • Option 1: Move data to the cloud
    • Est time to transfer: 42 years
    • Est cost: Millions for downloads

  • Option 2: Move data to partner site at UCF
    • Partner site says they can’t support this option

  • Option 3: TACC steps up to save science
    • Workflow developed
    • 9 months later, all data from spinning disk is at TACC
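The slide does not say what bandwidth sits behind the 42-year estimate, so the following is only a back-of-the-envelope sketch. It back-calculates the sustained rate that "2 PB in 42 years" would imply and compares it with an idealized 10 Gbps link. The function names are hypothetical, and the calculation assumes decimal petabytes, a 365.25-day year, and no protocol overhead.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def transfer_days(size_pb: float, link_gbps: float) -> float:
    """Days needed to move size_pb petabytes over a link sustaining link_gbps."""
    bits = size_pb * 1e15 * 8  # decimal petabytes -> bits
    return bits / (link_gbps * 1e9) / SECONDS_PER_DAY


def implied_rate_mbps(size_pb: float, years: float) -> float:
    """Sustained rate in Mbps implied by moving size_pb petabytes in `years`."""
    bits = size_pb * 1e15 * 8
    return bits / (years * SECONDS_PER_YEAR) / 1e6


# Rate implied by the slide's estimate, and the idealized 10 Gbps case.
print(f"Implied rate for 2 PB in 42 years: {implied_rate_mbps(2, 42):.1f} Mbps")
print(f"2 PB over a sustained 10 Gbps link: {transfer_days(2, 10):.1f} days")
```

Under these assumptions the 42-year figure corresponds to roughly 12 Mbps sustained, while a fully utilized 10 Gbps link would need about 18.5 days. Real transfers rarely sustain the full line rate.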

65 of 90

MOVING THE DATA

  • Network Attached Storage (NAS) “Appliances”
    • 100+ Terabytes at a time (Full capacity of NAS)
  • Onsite team hand carries NAS to the closest 10Gbps links (coastal - IRNC funded AMPATH)
    • University of Puerto Rico - Mayaguez
    • Engine-4 Commercial Collaboration Space
  • Device tuned to be more effective, but still a slow process

66 of 90

WORKFLOW TO MOVE DATA ON SPINNING MEDIA

67 of 90

DATA AT TACC

  • ~9 months later and 2.5PB of data moved
    • Arecibo team set up web interface
    • Maintained on a volunteer basis
  • Some science projects also made local copies – from TACC, not PR

68 of 90

BUT WHAT ABOUT THE TAPES?

69 of 90

6,600+ TAPES

70 of 90

AO DATA MOVEMENT PART 2

  • Team from TACC deployed May 2023 to save the Data on Tapes
    • Approx 60 years of data history was condensed down to 4 pallets
    • Estimate of about 3.8PB of data (empirically a PB of data weighs 664# …)
  • Data Currently at TACC
    • ~3 PB Online spinning disk
    • Rest of Tape data available on demand
  • Arecibo closed August 2023

71 of 90

72 of 90

ACKNOWLEDGEMENTS

  • EPOC was funded by
    • US NSF award #1826994 through 2023
    • US NSF award #2328479 through 2025

73 of 90

TAKE AWAYS

  • TACC has resources available to help you bridge your personal data chasm - or collapsing infrastructure…
    • and interest in helping science, however we can
    • (although we’d rather not ship pallets of tapes again)
  • Reach out
    • Jennifer Schopf, jms@tacc.utexas.edu

74 of 90

Finding Chasms to Cross

JupyterHubs at Scale from Wikimedia to UC Berkeley to your Research Institution

Yuvi

2i2c / Project Jupyter

75 of 90

76 of 90

77 of 90

78 of 90

79 of 90

English Wikipedia

296,000,000,000 pageviews in 2024

Data8

~200 users in Fall 2015

80 of 90

200 -> 1500 students within 2 years

81 of 90

Crossing Chasms always requires a multitude of skills

82 of 90

Extract z2jh out, so it can serve a broader community

83 of 90

Build community around z2jh

84 of 90

Becomes a tool others can use to cross their own chasms!

85 of 90

2i2c.org, a cloudy non-profit to contribute cloud-y skills to many chasm crossings

86 of 90

You have skills and tools that you have collected through your unique life path

You may not know what all you have!

87 of 90

Crossing Chasms always requires a multitude of skills

88 of 90

Find people whose life paths didn’t look like yours to cross chasms together with

89 of 90

A group somewhere may be missing exactly your skillset to cross a chasm

Explore the unfamiliar

90 of 90

Together we can go where none of us could go alone