1 of 53

Daniel Nüst, Institut für Geoinformatik
d.n@wwu.de | @nordholmen | ORCID: 0000-0002-0024-5046

Slides: http://bit.ly/hangout21-repro


Practical reproducibility and reproducibility vs. peer review

Spatial Data Science Hangout, spatial@ucsb, Spring ‘21

2 of 53


CC-BY 3.0, Sebastian Bertalan, Wikimedia Commons

Claerbout’s claim:

“An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.”

3 of 53

Crisis? Crisis of what?

Credibility crisis?
Replicability crisis?
Reproducibility crisis?
Robustness crisis?
Generalisability crisis?

4 of 53

Practical Reproducibility in Geography and Geosciences

Daniel Nüst & Edzer Pebesma (2020) Practical Reproducibility in Geography and Geosciences, Annals of the American Association of Geographers, DOI:10.1080/24694452.2020.1806028
PDF: http://nuest.staff.ifgi.de/N%C3%BCst-and-Pebesma_2020_AAM_Practical-Reproducibility-in-Geography-and-Geosciences.pdf

Creating reproducible workflows

Computing environment: hardware + software, containers/virtualisation (Binder), freezing/pinning
Script-based workflows: no point-and-click GIS, notebooks (Jupyter, R Markdown)
> Research compendium
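To make freezing/pinning concrete, here is a minimal sketch (not from the talk; the package names are placeholders) that records the exact versions of the packages a script-based workflow uses, so a requirements.txt can be committed next to the notebook and later consumed by tools such as Binder to rebuild the environment:

```python
"""Minimal sketch (not from the talk): pin the exact versions of the packages
a script-based workflow imports, so the computing environment can be rebuilt
later, e.g. by Binder reading the resulting requirements.txt.
The package names below are placeholders."""
from importlib import metadata

used_packages = ["numpy", "pandas", "matplotlib"]  # hypothetical dependencies

pins = []
for name in used_packages:
    try:
        pins.append(f"{name}=={metadata.version(name)}")  # exact version pin
    except metadata.PackageNotFoundError:
        print(f"warning: {name} is not installed, cannot pin it")

# Write the frozen environment next to the analysis code
with open("requirements.txt", "w") as file:
    file.write("\n".join(pins) + "\n")

print("Pinned environment:", ", ".join(pins))
```

Committing a file like this alongside the notebook or script is one small building block of a research compendium; a container recipe or Binder can then rebuild the same environment from it.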

Challenges

Education, publishing practices, SDIs, GIS, proprietary software, lack of rewards/pressure, sensitive data, time, ...
> all solvable

5 of 53

Reproducible Research & research software

Peer Review

Reproducible research and peer review are cornerstones of science. But are they getting along?

6 of 53

CODECHECK

7 of 53


The inverse problem in reproducible research. Figure 1 of https://doi.org/10.12688/f1000research.51738.1

The left half of the diagram shows a diverse range of materials used within a laboratory. These materials are often then condensed for sharing with the outside world via the research paper, a static PDF document. Working backwards from the PDF to the underlying materials is impossible. This prohibits reuse and is not only non-transparent for a specific paper but is also ineffective for science as a whole. By sharing the materials on the left, others outside the lab can enhance this work.

8 of 53


The CODECHECK example process implementation. Figure 2 of https://doi.org/10.12688/f1000research.51738.1


9 of 53


Independent execution of computations underlying research articles.

One re-execution by codechecker during peer review


  1. Codecheckers record but don’t investigate or fix.
  2. Communication between humans is key.
  3. Credit is given to codecheckers.
  4. Workflows must be auditable.
  5. Open by default and transitional by disposition.


10 of 53


Nüst D and Eglen SJ. CODECHECK: an Open Science initiative for the independent execution of computations underlying research articles during peer review to improve reproducibility [version 1; peer review: awaiting peer review]. F1000Research 2021, 10:253 (https://doi.org/10.12688/f1000research.51738.1)

11 of 53

Reproducible AGILE

12 of 53

AGILE Reproducible Paper Guidelines 🇬🇧 🇪🇸

https://doi.org/10.17605/OSF.IO/CB7Z8

Created by AGILE Initiative in 2019, see report at https://osf.io/hupxr/

Transparency & Reproducibility in GIScience: https://osf.io/phmce/wiki/home/

Promotion
Acknowledge the spectrum of reproducibility

13 of 53

The guidelines

Author guidelines
  • Data in Research Papers
  • Computational workflows in Research Papers
  • Pre-submission checklist
  • Writing a Data and Software Availability (DASA) section

Rationale/Motivation/Vision

Reviewer guidelines (what not to worry about)

Reproducibility reviewer guidelines


14 of 53

AGILE conference review process


Reproducibility review after accept/reject decisions, triggered by regular reviewer

Reproducibility review & communication

Community conference

Badges on proceedings page

Presentation at conference

Read full report at https://osf.io/7rjpe/

15 of 53

Reproducibility review results

6 reproducibility reports published in 2020
🔥 9 more coming for 2021

For 16 submissions (2020), reproduction was not possible or not attempted (5 of these only after communication with the authors):

  • no starting point in the paper
  • documentation insufficient for third party
  • sensitive/confidential/commercial data
  • proprietary software
  • software paper
  • conceptual papers


16 of 53

What can you do today?


17 of 53

Reproducible Research & Open Science

18 of 53


Quintana, D. S. (2020, November 28). Five things about open and reproducible science that every early career researcher should know. https://doi.org/10.17605/OSF.IO/DZTVQ

19 of 53

What can scientists do?

Take one step at a time.

Create and publish Research Compendia (Your code is good enough!): https://research-compendium.science/
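As an illustration only (this layout is one common convention, not a prescribed standard from the slides), a research compendium can start as a small directory skeleton that keeps data, analysis code, licence, and the environment description together:

```python
"""Sketch of a minimal research compendium skeleton (one common convention,
not a prescribed standard): data, analysis code, figures, licence, and the
environment description live together with a README that says how to run it."""
from pathlib import Path

def create_compendium(root: str) -> None:
    base = Path(root)
    for folder in ("data", "analysis", "figures"):
        (base / folder).mkdir(parents=True, exist_ok=True)
    # Top-level files documenting content, licence, and computing environment
    (base / "README.md").write_text(
        "# Research compendium\n\nTo reproduce: run the scripts in analysis/.\n")
    (base / "LICENSE").write_text("(choose a licence, e.g. CC-BY or MIT)\n")
    (base / "requirements.txt").write_text("# pinned package versions go here\n")

if __name__ == "__main__":
    create_compendium("my-compendium")
```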

Become a codechecker or reprohacker.

Strive to be an open science champion, especially if you’re junior in your field. We need to be the change; find communities.
[RIOT Science Club talk by Gavin Buckingham; preprint by Sam Westwood]

20 of 53

Thanks!

Daniel Nüst, Institut für Geoinformatik
d.n@wwu.de | @nordholmen | ORCID: 0000-0002-0024-5046

Slides: http://bit.ly/hangout21-repro

21 of 53

Bonus slides for discussion

22 of 53

Reproducible AGILE and CODECHECK: Highlights of lessons learned

Spectrum or layers of reproducibility

Effect of guidelines at AGILE: improved reproducibility

Reproducibility reports/CODECHECK certificates full of recommendations for improvement, well received by authors, even included in revision before publication

Good practices spread slowly, establishing a process is tedious

Challenges for reproducibility reviewers: inconsistencies and disconnects (e.g. figures), lack of documentation, unknown runtimes vs. no subsets of data, lack of reproducibility guidance

Reproductions are rewarding and educational, but matching expertise is tricky

Safety net (👀), not security


Read full report at https://osf.io/7rjpe/

23 of 53

What does the future of reproducible research in peer review look like?

Reproducibility is possible, but disciplines/communities must agree on what “peer review” entails and acknowledge the efforts (ECRs, RSEs) in a positive way.

Help each other! Move together as a community through disruptive changes.
Then reproducible research and peer review will get along just fine.

24 of 53


Independent execution of computations underlying research articles.

25 of 53


26 of 53


J. Leek’s tidypvals

“Notice anything funny?”

27 of 53

The many problems of science

Publish or perish
Broken metrics (citations, JIF)
Structural change not considering senior academics
Publication bias
Long-term funding for tools & infrastructure
HARKing
p-Hacking
Scholarly communication 1.0
Lack of reusability
Lack of transparency
Lack of reproducibility
Reinventing the wheel
Retraction practices
Not invented here syndrome
Fraud
Imposter syndrome
No “negative” citation
...

Open Science (OER, OA, OS, OPR)
Registered reports/preregistration
Altmetrics
Preprints
Leiden Manifesto, DORA, Vienna Principles
Citing data and software
Software papers
Data and software as products of research
RSEng & RSEs (software sustainability)
CRediT
Research Compendia
Ten Hot Topics Around Scholarly Publishing
Code review (PyOpenSci, rOpenSci, JOSS)
...

28 of 53


  1. reproducibility helps to avoid disaster
  2. reproducibility makes it easier to write papers
  3. reproducibility helps reviewers see it your way
  4. reproducibility enables continuity of your work
  5. reproducibility helps to build your reputation

29 of 53

30 of 53

Traditional and modern scientists


T-shaped and Π-shaped scientists:

Deep knowledge: expertise and skills within a single field
Broad knowledge: across disciplines; collaborate with other experts, apply outside of own field
Computer & method skills (the additional leg of the Π): statistics, reproducibility, programming, data science

31 of 53

Code Review


Boettiger, C., Chamberlain, S., Hart, E., & Ram, K. (2015). Building Software, Building Community: Lessons from the rOpenSci Project. Journal of Open Research Software, 3(1), e8. doi:10.5334/jors.bu

32 of 53

Reproducible computational research in journals & conferences


33 of 53

Findings

Overall

  • Saw full spectrum of reproducibility
  • Compared to previous years’ submissions, the guidelines and increased community awareness markedly improved reproducibility
  • 5 of 6 reproduced papers have a DASA section; all embrace the guidelines
  • Reproducibility reports with many recommendations for improvement, well received by authors, even included in revision before publication > reward!
  • Good practices spread slowly
  • Process


Read full report at https://osf.io/7rjpe/

34 of 53

Findings

Challenges for reproducibility reviewer:

  • Inconsistencies (identifiers, links) between paper and code
  • Lack of connections between artefacts (code <> figure)
  • Workspaces layout: no documentation, absolute paths
  • Unknown runtime and no demo subsets of data
  • No guidance on efforts and stop points

All efforts beyond mere workflow execution
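Two of these pain points can be reduced on the author side; here is a hedged sketch (file names and the --demo flag are invented for illustration) that resolves data paths relative to the script instead of hard-coding absolute paths, and offers a small demo subset with a short, known runtime:

```python
"""Sketch (file names invented): avoid absolute paths and provide a demo-sized
run so a reproducibility reviewer can execute the workflow quickly."""
import argparse
from pathlib import Path

# Resolve data locations relative to this script, never via absolute paths
HERE = Path(__file__).resolve().parent
DATA_DIR = HERE / "data"

def main() -> None:
    parser = argparse.ArgumentParser(description="Example analysis workflow")
    parser.add_argument("--demo", action="store_true",
                        help="run on a small data subset with a known, short runtime")
    args = parser.parse_args()

    input_file = DATA_DIR / ("sample_small.csv" if args.demo else "sample_full.csv")
    print(f"Reading {input_file.relative_to(HERE)}",
          "(demo subset)" if args.demo else "(full data)")
    # ... the actual analysis would go here ...

if __name__ == "__main__":
    main()
```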


Read full report at https://osf.io/7rjpe/

35 of 53


How to put your community on a path towards more reproducibility in 5 easy hard steps

  1. Build a team of enthusiasts (workshop, social events)
  2. Assess the current state and raise awareness (workshop, paper)
  3. Institutional support (🙏 AGILE Council 🙏 + committee chairs)
  4. Positive encouragement (no reproduction != bad science)
  5. Keep at it!

36 of 53

What can communities and institutions do?

Introduce reproducibility reviews - CODECHECK (or not)!

Workshops on RCR, ReproHacks

Provide support (R2S2, Anja)

Rewards and incentives

Awareness > Change

37 of 53

Reproducibility review reports


38 of 53

Reproducibility review reports


39 of 53

Reproducibility review reports


40 of 53

The guidelines for reproducibility reviewers (WIP)

Ideal vs. realistic

Role

Skills

Do’s & don’ts

41 of 53


How to put your community on a path towards more reproducibility in 5 easy hard steps

  • Build a team of enthusiasts (workshop, social events)
  • Assess the current state and raise awareness (workshop, paper)
  • Institutional support (🙏 AGILE Council 🙏 + committee chairs)
  • Positive encouragement (no reproduction != bad science)
  • Keep at it!

42 of 53

The guidelines for data

“What if…” and Examples (not shown)


43 of 53

The guidelines for workflows

Examples (not shown)


44 of 53

The guidelines for reproducibility reviewers (WIP)

Examples for “Do’s and Don’ts”:

  • Do shift burden to author
  • Do encourage and set examples
  • Do not accept private data sharing
  • Document your work in report (impact)
  • Be kind (career stage, knowledge, privileges)
  • No rummaging


45 of 53

Structural challenges

Metrics for acknowledging/measuring impact in science are broken (impact factor, ...), and they lead to publication bias, HARKing, p-Hacking, lack of transparency, and lack of reproducibility

Leiden Manifesto: http://www.leidenmanifesto.org
DORA: https://sfdora.org
Vienna Principles: https://viennaprinciples.org

Acknowledging data and software as valuable products of research (instead of shoehorning software into papers)

46 of 53

47 of 53

Traditional and modern scientists

T-shaped and Π-shaped scientists:

Deep knowledge: expertise and skills within a single field
Broad knowledge: across disciplines; collaborate with other experts, apply outside of own field
Computer & method skills (the additional leg of the Π): statistics, reproducibility, programming, data science

48 of 53

Professionalisation

49 of 53

Motivation

The Software Sustainability Institute (SSI, UK), founded in 2010, ran a study of 1,000 randomly chosen researchers …

“It's impossible to conduct research without software, say 7 out of 10 UK researchers”

https://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers

50 of 53

Motivation

A study of Nature papers from Jan-March 2016 reveals that

“32 of the 40 papers examined mention software, and the 32 papers contain 211 mentions of distinct pieces of software, for an average of 6.5 mentions per paper.”

Nangia, Udit; Katz, Daniel S. (2017): Understanding Software in Research: Initial Results from Examining Nature and a Call for Collaboration. doi:10.1109/eScience.2017.78

51 of 53

52 of 53

[Diagram relating research software developers and researchers using software; recoverable fragments read “conditions for”, “lead to”, “education of”.]

53 of 53

“Software is 95% human and only 5% code” *

* Eric Albers, CCC2019, https://media.ccc.de/v/thms-49-ber-die-nachhaltigkeit-von-software
Images © H. Seibold, S. Janosch, OSD2019

RSEng = Research Software Engineering: creating research software

RSEs = Research Software Engineers: the people behind research software

RSEs ≠ IT !!!

Researcher uses scripts for data analysis and needs working stable software for her work. She learns what is necessary to achieve her research goals.

Reproducibility guru dives deeply into manifold software and tools to make his research reproducible and develops his own software in a sustainable way.

Person for tough problems knows how to solve all kinds of computer-related issues; he was not hired for that, but enjoys helping and spends time getting to the bottom of other people’s challenges.

Geek writes software as part of her research project and would like to code more, but must keep an eye on her career in science and needs to write papers.

Software developer was hired to implement software for a research project and contributes to large collaborative software projects to realise the next generation of digital infrastructure for science.