1 of 59

How the Ocean Health Index enables better science in less time

Julia Stewart Lowndes, PhD

Marine Data Scientist & Mozilla Fellow

National Center for Ecological Analysis & Synthesis, UC Santa Barbara

slides: jules32.github.io

twitter: @juliesquid

April 15, 2019

Bren School Seminar

UC Santa Barbara

2 of 59

Allison Horst

Public

3 of 59

Allison Horst

Public

4 of 59

Allison Horst

5 of 59

6 of 59

Data science is the discipline of turning raw data into understanding

Hadley Wickham

Statistician, Professor, Developer

Chief Scientist, RStudio

7 of 59

Open data science involves mindsets and skillsets emphasizing efficiency, reproducibility, transparency, collaboration, communication, and kindness

Me

8 of 59

Outline

9 of 59

oceanhealthindex.org

Halpern et al. 2012

10 of 59

11 of 59

A scientific method, tool, and community for channeling the best available scientific information into marine policy.

  • Captures coupled system health, incorporates sustainability
  • Boils down to easy-to-understand metrics
  • Is flexible to different contexts
  • Stimulates actions to improve ocean health
  • Is repeatable to track progress through time

Halpern et al. 2012

12 of 59

A scientific method, tool, and community for channeling the best available scientific information into marine policy.

Important because in marine management there is a need for:

  • Science- and data-driven methods to measure what people care about
  • Standardized but flexible methods to assess different geographies
  • Streamlined assessments from year to year to track change through time

13 of 59

Global

  • Repeated annually 2012-2019+
    • 220 coastal nations & territories
  • “Toolbox” software dev & maintenance
  • Global Fellows program
  • United Nations indicators:
    • Convention on Biological Diversity
    • Sustainable Development Goal 14
  • Case studies
  • Stakeholder engagement

Smaller scales

Assessments we lead

Assessments we enable

OHI+

  • ~20 completed & ongoing
  • Usually at smaller spatial scales w/ stakeholder engagement
  • We provide Toolbox & guidance
    • Code, files, support
    • ohi-science.org
      • Books, blogs, trainings, forum, publications

14 of 59

15 of 59

OHI+

16 of 59

A healthy ocean sustainably delivers a range of benefits to people now and in the future.

17 of 59

18 of 59

Employment

Cultural Identity and Sense of Place

Food Provision

19 of 59

OHI framework

A healthy ocean sustainably delivers a range of benefits to people now and in the future.

[Figure labels: goals, scores (71), models, data, inform]

20 of 59

Repeatable OHI assessment process

[Figure labels: goals, scores, models, data, inform]

21 of 59

Combining datasets

Documenting decisions & methods

Asking for & incorporating feedback

Collaborating

Revising and comparing

Analyzing and summarizing data

Reevaluating past decisions

Critically evaluating results

Communicating methods and results

Planning

Designing figures

Gathering data

Reading the literature

OHI, aka modern science

22 of 59

ohi-science.org

@ohiscience

23 of 59

data_final_final.xls

Re:FWD: data question

vs.

scripts/species_count.R

Issue: species count

Allison Horst

24 of 59

25 of 59

How we work

  • Culture of sharing, teaching, learning
  • Shared practices & norms
  • Seaside Chats
  • Resilience
  • Onboarding & offboarding

connection to broader communities

  • listen
  • learn
  • contribute

open coding language

  • shared practices
  • streamlined tools

collaboration platform

  • bookkeeping
  • display, publication, & distribution

docs, slides, sheets

  • coordinate
  • co-develop
  • share

26 of 59

Streamlined workflow

Figure adapted from Teucher 2018

27 of 59

Shared tools & practices

efficiency & reproducibility: coding and version control are the keystone

28 of 59

Shared tools & practices

efficiency & reproducibility: coding and version control are the keystone

but also for game-changing collaboration & communication

29 of 59

Shared practices for reproducibility

30 of 59

RStudio for R, text editing, GitHub sync, and more

Shared coding practices; convenient interface for coding and syncing

R code (scripts and console)

File navigation, help, plots, packages

GitHub connection, environment, build

31 of 59

GitHub for archiving & bookkeeping

Convenient sharing with yourself and others

See what changed line-by-line

...and plot-by-plot

32 of 59

GitHub for discussion & project management

Individual to-dos

Shared & archived conversations

Project management and institutional memory

33 of 59

R + GitHub for documentation & communication

Website

OHI-Science.org

Protocols & methods

*made with RMarkdown*

Hands-on Training Books

34 of 59

R + GitHub for publication

Interactive applications

*made with shiny*

ohi-science.org/ohi-global

35 of 59

Shared tools & practices

Our workflow is more streamlined; efficient onboarding & offboarding

36 of 59

Enabling better science in less time

  • Culture of sharing, teaching, learning
  • Shared practices & norms
  • Focus on training & building community

37 of 59

Learning with online communities

38 of 59

https://blog.mozilla.org

39 of 59

40 of 59

openscapes.org

@openscapes

41 of 59

Empower

Amplify

Engage

Allison Horst

We champion open practices to help uncover data-driven solutions faster.

Build champions and communities

Build confidence and skills

Build awareness and excitement

42 of 59

Openscapes Champions

We help Champions & their labs:

  • Engage through culture of sharing, teaching, learning

  • Be empowered through guidance, pathways & agency for skill-building

  • Amplify their efforts

Mentorship program that empowers environmental scientists with open data science tools and grows the community of practice

43 of 59

Openscapes Champions

Lessons based on Lowndes et al. 2017

Openscapes.org/series

Early lessons:

  • Data science as a discipline
  • Open data science tools exist
  • Open as a way to work
  • Lab members as a team
  • Collaborators and community (redefined) as a way to learn
  • The internet as an underleveraged tool for science

44 of 59

Openscapes

Progress and what’s next:

  • Halfway through inaugural cohort with seven labs!
  • Seaside, Bluffside, Bayside, Fishbowl Chats
  • Openscapes.org resources, blogs, series
  • Champions Summit!
  • Long-term plan – annual concurrent cohorts, in-person workshops

45 of 59

What can you do to engage with open data science?

Allison Horst

46 of 59

What can you do to engage with open data science?

1. Promote/enable the culture of open data science – even if you don’t code

2. Create/join communities, locally & online

3. Use existing online resources to learn & skillshare

  • ohi-science.org/betterscienceinlesstime
  • openscapes.org/resources

4. Ask for open data science skills to be formally taught

How about TODAY:

- Talk about your data challenges with colleagues

- Share your next presentation online

- Use Twitter for science

- Follow selectively, listen & learn (e.g. #rstats, @nceas)

47 of 59

So what can you do?

A few of UCSB’s many learning communities

eco-data-science.github.io

@ecodatasci

meetup.com/rladies-santa-barbara

@RLadiesSB

#TidyTuesday Hacky Hours

ESM 206 & 244

library.ucsb.edu/software-carpentry

NCEAS opportunities: internships, postdocs, research scientists:

nceas.ucsb.edu/employment

48 of 59

We can get to environmental solutions faster if we are more efficient & collaborative with how we do science.

Let’s do better science in less time together.

49 of 59

Julia Stewart Lowndes, PhD

http://jules32.github.io

lowndes@nceas.ucsb.edu

@juliesquid

openscapes.org ohi-science.org

@openscapes @ohiscience

slides: jules32.github.io

Thank You

50 of 59

51 of 59

Extra slides

52 of 59

Using Twitter for science

My internal monologue:

  1. Cool visualization!
  2. I want to represent my data this way
  3. He includes his code!
  4. Package from @sckottie at rOpenSci
  5. rnoaa is a package making NOAA data more accessible!

53 of 59

Using Twitter for science

54 of 59

Being open with your science

  • Learning & getting feedback
  • Connecting with people
  • Less reinventing the wheel
  • Open mindset: expecting better ways and sharing your approaches

55 of 59

Twitter to learn and connect

  • Follow selectively & deliberately, listen & learn
    • Learn R (e.g. #rstats)
    • Follow colleagues. Tell someone you like their paper.
    • Find fellowships & jobs.

56 of 59

Shared practices for reproducibility

Data wrangling: 50–80% of a data scientist’s time (Lohr 2014)

57 of 59

Tidy data

What if you needed D. opalescens 2017?

Untidy :(

Good for data entry, not good for data analysis because:

  • Data are in column headers
  • What are the values?
  • Variable is spread across multiple columns

  Species        2016  2017
  D. gigas        398   139
  D. opalescens   663   447
  O. rubescens    423   739

Tidy !!

Great for data analysis because:

  • Each variable has its own column.
  • Each observation has its own row.
  • Each value has its own cell.

  species        year  count
  D. gigas       2016    398
  D. gigas       2017    139
  D. opalescens  2016    663
  D. opalescens  2017    447
  O. rubescens   2016    423
  O. rubescens   2017    739

58 of 59

Tidy data

Examples from tidyr

tidyr::gather()

separate()

gather()
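To make the reshape concrete, here is a minimal plain-Python sketch of the kind of wide-to-long conversion that tidyr::gather() performs, applied to the squid and octopus counts from the Tidy data slide. It is an illustration only, not the tidyr implementation: the `gather` helper, its parameter names, and the dictionary layout are all assumptions made for this example.

```python
# Untidy (wide) table from the slide: one row per species, one column per year.
wide = [
    {"species": "D. gigas", "2016": 398, "2017": 139},
    {"species": "D. opalescens", "2016": 663, "2017": 447},
    {"species": "O. rubescens", "2016": 423, "2017": 739},
]


def gather(rows, id_col, key, value):
    """Hypothetical helper: turn every non-id column into a (key, value) row."""
    long_rows = []
    for row in rows:
        for column, cell in row.items():
            if column == id_col:
                continue
            # Assumption: the gathered column headers are years, so cast to int.
            long_rows.append({id_col: row[id_col], key: int(column), value: cell})
    return long_rows


# Tidy (long) table: one row per species-year observation.
tidy = gather(wide, id_col="species", key="year", value="count")

# With tidy data, "D. opalescens 2017" is a plain filter on two columns.
answer = [
    r["count"]
    for r in tidy
    if r["species"] == "D. opalescens" and r["year"] == 2017
]
print(answer)  # [447]
```

Once the year lives in its own column, the slide's question ("What if you needed D. opalescens 2017?") no longer requires knowing which header holds which year.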

59 of 59

Our Ocean Health Index story

Ocean management is complex

Need for science- and data-driven methods to measure what people care about

Need for standardized but flexible methods to assess different geographies

Need to streamline assessments from year to year to track change through time