1 of 31

NASA Openscapes Lessons from Year 1:

Erin Robinson • Julia Lowndes • Luis Lopez

and the NASA Openscapes Mentors

ESDSWG, April 21, 2022

All artwork by Allison Horst; @allison_horst

slides: https://nasa-openscapes.github.io/about

Supporting NASA Earth science research teams’ migration to the cloud

We believe Open Science can accelerate data-driven solutions and increase diversity, equity, inclusion, and belonging in research and beyond.

2 of 31

Open science as part of the climate movement

Recurring themes

  • Create space & place
  • Focus on the common
  • Reuse existing
  • Contribute to fill gaps - and do so openly

“ ‘We’ speaks to the collective, to collaboration, to community, to the relational work at hand. Addressing the climate crisis...will take everyone.

‘We’ speaks to justice, to how we do the work that needs doing and whose contributions are valued. We cannot, we must not, go it alone”

- Ayana Elizabeth Johnson & Katharine Wilkinson

3 of 31

4 of 31

  • Researcher-centered, focused on teams.

Practice and feel safe working openly with yourself and your team; then ease into more.

  • Create space & place to explore & learn.

Cohort Calls, Seaside Chats, Co-Working; GitHub, R, Python, Quarto, Google Drive, Slack; Efficiency Tips & Inclusion Tips.

  • Cultivate relationships & real connections.

Welcoming folks with diverse backgrounds; meeting where they are; skills to empower immediate work; kinder science.

  • Open culture: Learning, teaching, iterating.

Not a checklist - a continual practice. Imperfect, messy. Takes time.

Openscapes approach

5 of 31

NASA Openscapes Framework Design

Supporting NASA Earth science research teams’ migration to the cloud

The overarching vision is to support scientific researcher teams using NASA EOSDIS data as they migrate their workflows to the cloud. We are doing this working with NASA Distributed Active Archive Centers (DAACs) over three years by:

  1. Developing a cross-DAAC Mentor community of collaborative cloud data instructors who co-create, curate, and use shared resources (“make once, use often”)
  2. Empowering science teams through the Champions program to migrate their download-intensive data analysis workflows to the cloud and open, kinder science
  3. Scaling the Openscapes Champions program with DAAC Mentors to support more teams transforming their workflows towards open, kinder science and the cloud

6 of 31

Focus on finding the common, teaching culture, contributing as part of their jobs

  • Carpentries Instructor Training
  • NASA-Openscapes GitHub Org
  • 2i2c JupyterHub & AWS Credits
  • Quarto collections of notebooks

Support Mentors towards establishing a common set of NASA Earthdata tutorials that they can then build on with their DAAC-specific and science examples

Develop Mentors

Scale Open Science Leaders

7 of 31

“It was a really great week. The tutorials were AMAZING. Everyone did a great job, and everyone was very nice. I really appreciated welcoming environment. I don't have a strong python background. But i was supported in learning all around”

65 Openscapes 2i2c JupyterHub AWS instances

50 forks of the Cloud Hackathon GitHub repo

8 hack-team projects presented on Day 5

Empower Research Teams

Teaching opportunity helped focus Mentor engagement and aligned with PO.DAAC’s priorities

8 of 31

9 of 31

“I still get kudos from those that attended…a participant shared what she learned and liked with another early adopter group she is in and the organizer is looking to implement or share those practices at their own workshops - yay for spreading open science”

- Catalina Oaida, PO.DAAC

Reusing, collaborating, & role-modeling Open Science

10+ talks & workshops led by Mentors, reusing tutorials

>> Including AGU, UWGs, SWOT Ocean, ECOSTRESS, Train, internal DAAC staff

Scale Open Science Leaders

Develop Mentors

Mentor support is external and internal facing: teaching researchers and DAAC staff, hands-on in the Hub

10 of 31

Pathway presentations next week! - sharing work in progress

10 Researcher teams planning their transition

DAAC Mentors assisting teams, co-working and Slack

https://nasa-openscapes.github.io/champions

Empower Research Teams

2022 Science Champions Program

March-April

https://nasa-openscapes.github.io/2022-nasa-champions

10 Cohorts since 2019, with NASA, NOAA, academic, non-profit, gov, and tribal groups

Openscapes Champions Program

Cohorts of 7 research teams meet virtually 5x over 2 months to explore open data science & apply to their own shared workflows (Pathway)

11 of 31

So many other wins to measure & share

I am championing open data science in faculty meetings • I said we’re doing our students a disservice • I’m asking to hire lecturer positions • I enabled labs to have seaside chats • we are developing onboarding plans • I support my students starting hacky hours • we now take lab meeting notes in google docs, openscapes style! • our lab tech will teach a new data science course • we have a code of conduct! • us too! • all this is something I’ve worried about a lot and feeling better about it • all our lab protocols are now on GitHub • I reorganized my teaching materials so it’s better next year • all my capstone students are using GitHub for their analyses • I'm now pushing GitHub on all my collaborators • we have a more open culture • I organized and put all my teaching materials online • I didn’t realize there would be so much to talk about • we have a shared lab Google Drive • we developed a way to offboard our undergrads • my lab has put all our scripts online • we are going to have a lab hackathon • thank you for saying the code of conduct on each call • I didn’t know about #rstats twitter • these efficiency tips save me so much time • I taught a course from R4DS, learning the night before • I want to do everything a bit better • I reran my collaborator’s code! • we improved our documentation • we improved our metadata • this saved me so much time • everyone in our lab now has a GitHub account • I led a discussion on data sharing practices for our lab • we are now officially an open science lab • I’m going to put all my teaching materials online • I am starting an RLadies chapter • I organized our lab’s code on GitHub • I feel a part of something bigger • I will start teaching R in my classes • I thought I was utterly incapable of learning GitHub • this happened just at the right time • this isn’t just about coding & GitHub, it’s about changing the way we do science

12 of 31

Skills & collaborative infrastructure:

  • Coding - to access and analyze
  • Notebooks - literate programming to combine prose/narration with code
  • GitHub - to move code to Cloud & collaborate
  • Reuse - comfort reusing existing code and packages
  • Asking for help - confidence, relationships
  • Developing NASA Earthdata vocabulary - granule, Level 2, s3 bucket, Earthdata Login, concept_id…

A lot required to “get to the science”

Through tooling and culture, how can we lower barriers and support scientists with empathy?

Transition to Cloud: alignment for DAACs & researchers they support

13 of 31

NASA Openscapes

Cloud Infrastructure

Luis López

Software Engineer @ NSIDC

14 of 31

A tale of two workflows

"This story is based on actual events. Characters and timelines have been changed for dramatic purposes."

  1. Coffee
  2. Open a data provider website
      • Only works on Chrome
  3. Click here, click there, order data
      • Not very reproducible
  4. Download and analyze data using GIS tools
      • If the data required is not that big (terabytes? Forget it!)
      • maybe compile some code
      • Install a library… oh wait what? A compiler error.
  1. Coffee
  2. Go to openscapes.2i2c.cloud
      • No need to install anything
      • Science!

15 of 31

Starting up my 2i2c JupyterHub instance…on live TV

16 of 31

2i2c cloud infrastructure

Right to replicate, Cloud Agnostic

My Cloud

17 of 31

But what is the experience for the user?

2 things we saw a need for, early on:

  1. JupyterHub customization, shared environment
  2. Streamlining NASA Earthdata Access

…Origin stories of corn environment and earthdata Python package

Observations from the Mentors Cohort

Observations working with researchers

Responsiveness of this group to user feedback: Eli Holmes (NOAA) had requests that we could meet -> increase how productive researchers can be

18 of 31

Openscapes environment

Scientists don’t need to install anything

https://github.com/NASA-Openscapes/corn

  • GitHub authentication
  • Session persistence
  • Deployed to us-west-2
  • Reproducible Conda environment
  • Extensibility
  • Multiple kernels are supported
  • Dask-kubernetes!

19 of 31

How to access NASA datasets and not die trying

`earthdata` Python package removes barriers for auth and access

  • CMR
    • CMR-STAC
  • EDL
    • Tokens
  • AWS credentials
    • One per DAAC
  • S3 Buckets
    • Boto3?
  • List goes on…
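To make the list above concrete, here is a minimal sketch of the manual steps that a helper package can hide from researchers. It is not the `earthdata` package's API. It only uses the Python standard library against the public CMR granule search endpoint. The collection short name `EXAMPLE_SHORT_NAME`, the function names, and the credential field names (`accessKeyId`, `secretAccessKey`, `sessionToken`) are assumptions for illustration and should be checked against each DAAC's documentation.

```python
"""Sketch of the manual steps: CMR search, then per-DAAC S3 credentials."""
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CMR_GRANULES_URL = "https://cmr.earthdata.nasa.gov/search/granules.json"


def build_granule_query(short_name, temporal=None, bounding_box=None, page_size=10):
    """Return a CMR granule-search URL for one collection short name."""
    params = {"short_name": short_name, "page_size": page_size}
    if temporal is not None:
        # CMR expects "start,end" as ISO 8601 timestamps.
        params["temporal"] = ",".join(temporal)
    if bounding_box is not None:
        # Order is west, south, east, north in decimal degrees.
        params["bounding_box"] = ",".join(str(v) for v in bounding_box)
    return f"{CMR_GRANULES_URL}?{urlencode(params)}"


def search_granules(url):
    """Fetch and parse a CMR JSON response (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def extract_links(cmr_response, scheme="s3"):
    """Collect granule data links; scheme is "s3" (in-cloud) or "https"."""
    suffix = "/s3#" if scheme == "s3" else "/data#"
    links = []
    for granule in cmr_response.get("feed", {}).get("entry", []):
        for link in granule.get("links", []):
            # Skip links inherited from the parent collection record.
            if link.get("rel", "").endswith(suffix) and not link.get("inherited"):
                links.append(link["href"])
    return links


def to_s3fs_kwargs(credentials):
    """Map temporary credentials (assumed field names) to s3fs-style kwargs."""
    return {
        "key": credentials["accessKeyId"],
        "secret": credentials["secretAccessKey"],
        "token": credentials["sessionToken"],
    }


if __name__ == "__main__":
    # Hypothetical short name; prints the query URL without using the network.
    print(build_granule_query("EXAMPLE_SHORT_NAME", bounding_box=(-130, 30, -110, 45)))
```

Even in this reduced form, a researcher has to know the search endpoint, the link conventions, and a separate credentials step for each DAAC before reading a single file, which is the barrier the slide describes.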

20 of 31

Show me

21 of 31

Observations (mine)

  • Ready-to-use cloud environments are very useful and valued by scientists (and developers).

  • We need to learn from researchers how we can lower barriers to science - both tech and teaching

  • There is a need for open infrastructure on a permanent basis.
  • There are tradeoffs, the cloud is not a silver bullet.
    • Costs: who pays, and how? https://www.cloudbank.org/
    • Complexity

22 of 31

https://nasa-openscapes.github.io

https://github.com/nasa-openscapes

https://github.com/nsidc/earthdata

THANKS!

Artwork by Allison Horst.

23 of 31

Cole Krehbiel

Aaron Friesz

Catalina Oaida

Jack McNelis

Makhan Virdi

Matt Tisdale

Vishal Bagadia

Amy Steiker

Luis Lopez

Andy Barrett

Christine Smit

Jennifer Adams

Alexis Hunzinger

Kumar Ramasubramanian

Shubhankar Ghalot

Iksha Gurung

Thank you to the DAAC Mentors!

24 of 31

DAAC Mentors 2022!

Join us: Nominate DAAC colleagues!

Benefits of joining:

  • Aligns with DAAC priorities
  • Cross-DAAC community of expertise, tutorials, open science leadership
  • Carpentries instructor training
  • Access to 2i2c JupyterHub

2021 Mentors aren’t leaving, we’re growing the community. Welcoming Mentors from new and existing DAACs!

25 of 31

Thank you!

We’re looking forward to working together!

Julia Stewart Lowndes, PhD

Co-Director, Openscapes

National Center for Ecological Analysis & Synthesis (NCEAS), UC Santa Barbara (UCSB)

lowndes@nceas.ucsb.edu; @juliesquid

Erin Robinson

Co-Director, Openscapes

Metadata Game Changers

erin@metadatagamechangers.com

Luis Lopez

Software Engineer, NSIDC

luis.lopezespinosa@colorado.edu

26 of 31

Deliver Champions Program for a Cohort of research teams

Transform research teams’ workflows towards kinder, inclusive open science

Flywheel

Attract research teams interested in better practices for data-intensive science

Inspire broader scientific communities through visible examples and leaders – Open science shift

Invest in Champions Program curriculum

Develop Champions Mentor – professional development and leadership skills

Empower research teams

Amplify leaders

Engage Mentors

Invest in Champions Program curriculum: NASA Earthdata cloud-specific materials

27 of 31

Action shot: collaboration

28 of 31

Slide from Mike Little - iterated since his GSFC talk, April 2022

NASA Openscapes Framework: Experience from Year 1

PI: Julia Stewart Lowndes, NCEAS/UCSB and Co-Lead: Erin Robinson, Metadata Game Changers

Collaborators: 2i2c, Carpentries and DAAC Mentors from: LP DAAC, NSIDC, PO.DAAC, GES DISC, ASDC, IMPACT

Key Milestones

Objective

To support scientific researchers using NASA Earthdata as they migrate their workflows to the cloud. We are doing this working with NASA Distributed Active Archive Centers (DAACs) over three years by:

  • Developing a cross-DAAC Mentor community that supports growth into confident cloud data instructors who create, curate, and use shared resources and have a tutorial review process
  • Empowering science teams to experiment migrating their download-intensive data analysis workflows to the cloud through a partnership with Carpentries, hosted cloud environments like 2i2c & SMCE
  • Scaling the Openscapes Framework with DAAC Mentors to support science cohorts and amplify as many open science leaders as possible, transforming their workflows towards open, kinder science and the cloud

Benefits of Hosted Cloud Hub for Project:

Rapid Startup (minutes for users/a few weeks for initial hub setup)

Easy cross-DAAC collaboration in shared, common environment

Allowed for cross-DAAC creation of common, curated NASA Earthdata cloud tutorial material

Low barrier to entry: Researchers can experience cloud environment and access NASA Earthdata in the cloud before having the cost discussion

Low maintenance for project

Multiple separate projects can operate independently

Easy to provide non-NASA colleagues and cross-DAAC colleagues with access for workshops and hackathons

  • Project kick-off at ESDSWG Feb/21
  • DAAC Mentors Accessed Jun/21
  • 2i2c JupyterHub Launch Jun/21
  • 1st cross-DAAC Cloud Hackathon Nov/21
  • AGU cross-DAAC Cloud Workshop Dec/21
  • Multiple DAAC-specific workshops Mar/22
  • NASA Champions (10 teams) Kickoff Mar/22
  • Decision to experiment w/ SMCE Mar/22
  • First login to SMCE Apr/22

29 of 31

30 of 31

31 of 31