1 of 16

Education and Outreach - CARE & FAIR Principles

HDR² From Harnessing the Data Revolution to Harvesting the Data Revolution - PI Conference - Oct. 26-27, 2022

Nathan Quarderer (CU Boulder/CIRES/Earth Lab/ESIIL), Paula Mabee (Battelle/NEON), Valerie Barr (Bard College)

Michelle Holko (Google)

2 of 16

Education and Outreach - CARE & FAIR Principles

  • Intro (Nate Quarderer) 11-11:05
  • Guest Speaker (Michelle Holko) 11:05-11:30
  • Scenario Discussion 11:30-11:50
  • Reporting Out 11:50-12:00

3 of 16

NSF DSC: Earth Data Science Corps (EDSC)

  • Award #1924337
  • 3rd yr of funding + 1 yr extension
  • 5 partner institutions including 2 Tribal Colleges and Universities (TCUs)
  • 60 students; 8 faculty
  • 12-week paid internship
  • Earth & Environmental Science; GIS; Python
  • Immersive, online, project-based learning environment
  • COVID-19 data (2020); LiDAR to discover geomorphological & culturally sensitive areas on Tribal lands (2021-2022)
  • Indigenous Data Sovereignty; CARE/FAIR

COVID-19 Positive Rate on Mandan Hidatsa Arikara Nation (2020)

4 of 16

Guest Speaker: Michelle Holko


Michelle Holko, PhD, PMP

Strategic Business Executive & Scientist, Google Cloud

Technology for healthcare and life sciences for the public sector @ Google Public Sector

PhD in Genetics from Case Western Reserve University; MS in Clinical Investigation from Northwestern University

5 of 16

Google’s mission and commitment to education

To organize the world’s information and make it universally accessible and useful.

Reinvent education and research everywhere, so that everyone has access to the quality education they deserve and can answer bigger questions faster

Confidential & Proprietary

6 of 16

Google grew out of AI/ML research

7 of 16

What are the benefits of using the cloud in research?

  • Reproducibility: IaaS model makes it easy to reproduce research and workflows
  • Collaboration: Share your architectures and workflows with your collaborators
  • Continuity: Continuity of research even when researchers change organizations
  • Continuous improvement: Continuous development to improve and simplify workflows
  • Community: Contribute to the RAD Lab community ecosystem
  • One Google: Leverage the best of Google across your organization


8 of 16

Listening is a key value for inclusion


9 of 16

If we knew what we were doing, it wouldn’t be called research, would it?

– Einstein


10 of 16

Data come from many different sources


11 of 16

The FAIR principles encourage open data, but what about the CARE principles?

How can technology be used to achieve native data sovereignty?

12 of 16

Guiding principles for native data sovereignty

  • Location of the data
    • Bring compute to the data
    • Data centers on native land

  • Security
    • IAM (Identity and access management)
    • MFA (multi-factor authentication)
    • Monitoring, etc.

  • Data and research governance to enable research and discovery
    • Process for research approvals
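The security and governance bullets above can be pictured as a single access check. The following is a minimal sketch only, assuming a hypothetical policy in which a request must come from a permitted role, pass MFA, and belong to a project the community has approved. The class names, role names, and project IDs are illustrative and do not come from any real IAM product.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    user: str
    role: str            # hypothetical role name, e.g. "approved_researcher"
    mfa_verified: bool   # whether multi-factor authentication succeeded
    project_id: str      # hypothetical identifier of the research project


@dataclass
class GovernancePolicy:
    allowed_roles: set
    approved_projects: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def approve_project(self, project_id):
        """Record that the governing body approved this project."""
        self.approved_projects.add(project_id)

    def check(self, req):
        """Return (allowed, reasons); every decision is logged for monitoring."""
        reasons = []
        if req.role not in self.allowed_roles:
            reasons.append("role not permitted")
        if not req.mfa_verified:
            reasons.append("MFA not verified")
        if req.project_id not in self.approved_projects:
            reasons.append("project not approved")
        allowed = not reasons
        self.audit_log.append((req.user, req.project_id, allowed, tuple(reasons)))
        return allowed, reasons
```

In this sketch, access is denied by default and the audit log stands in for the "Monitoring" bullet; a real deployment would rely on the cloud provider's own IAM and logging services rather than hand-written checks.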


13 of 16

Topics for discussion

  • What data types?
    • Access to data collected by other orgs
    • What about internet and other digital data?
  • Where should data live, and how can they be connected?
    • On-premises for data, connected for analysis via cloud?
    • Data center near tribal land - existing or new?
  • Frameworks for review / approval of research
    • More granular consent that can change over time
  • What about resources and working across tribes / nations?
    • Different policies for different users / participants
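As a prompt for the discussion of granular consent that can change over time, one possible representation is an append-only list of consent events, each scoped to a purpose and an effective date. This is a sketch under stated assumptions: the `ConsentEvent` type, the purpose strings, and the default-deny rule are hypothetical choices, not an established framework.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ConsentEvent:
    effective: date   # date the decision takes effect
    purpose: str      # hypothetical purpose label, e.g. "health_research"
    granted: bool     # True = consent given, False = consent withdrawn


def consent_at(events, purpose, when):
    """Return the latest decision for `purpose` on or before `when`.

    If no decision has been recorded for that purpose, deny by default.
    """
    relevant = [e for e in events if e.purpose == purpose and e.effective <= when]
    if not relevant:
        return False
    return max(relevant, key=lambda e: e.effective).granted
```

Because events are never overwritten, the history of who consented to what, and when, stays available for review; different users or nations could each hold their own event list with their own purposes.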


14 of 16

CARE/FAIR Scenarios (30 min)

  • See QR codes on next slide
  • Walk through your group’s scenario; or choose wildcard (#4)
  • Use Google Doc for notes
  • 20 min for small group discussion
  • 10 min for sharing

15 of 16

Scenario QR Codes

#1

#2

#3

#4

16 of 16

Education & Outreach Breakout Session

Tonight 19:30 - 20:30

Discussion Topics:

  • Undergraduate data science education including pedagogy, project-based learning, and data science ethics

  • Sustainable and reproducible data science research, including open source code, collaborative shared community building, and training programs

We hope to see you there! :)