1 of 43

  • What should be taught to curb the reproducibility crisis?

Jean-Baptiste Poline

MNI, Ludmer Center, BIC, McGill

HWNI, UC Berkeley

1

2 of 43

Part I: What should be taught?

Part II: How, when, to whom?


3 of 43

Part I: What should be taught depends on why irreproducibility occurs

Part II : Training challenges and solutions


4 of 43

Cause 1: Statistics

Cause 4: Publishing culture

Cause 2: Software

Cause 3: Data


5 of 43

Statistics: The One problem

See also: Mier, 2009: COMT and DLPFC

Molendijk, 2012: BDNF and hippocampal volume

Motivation


6 of 43

Power issues

Button et al., NRN, 2013
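Button et al.'s point can be sketched numerically: underpowered designs both miss true effects and, via the "winner's curse", inflate the effect sizes that do reach significance. A minimal simulation (all numbers illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
true_effect = 0.3           # true standardized effect size (Cohen's d)
n_small, n_large = 15, 100  # per-group sample sizes
n_sims = 2000

def run_studies(n):
    """Simulate two-group studies; return power and mean |d| among significant results."""
    significant_d = []
    for _ in range(n_sims):
        a = rng.normal(true_effect, 1.0, n)
        b = rng.normal(0.0, 1.0, n)
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        d = (a.mean() - b.mean()) / pooled_sd
        t = d * np.sqrt(n / 2)       # equal-n two-sample t statistic
        if abs(t) > 1.96:            # ~alpha = 0.05 (large-sample cutoff)
            significant_d.append(abs(d))
    return len(significant_d) / n_sims, float(np.mean(significant_d))

power_small, d_small = run_studies(n_small)
power_large, d_large = run_studies(n_large)
# Low power: few detections, and the detected effects overestimate the truth.
```

With n = 15 per group, only significant samples with |d| well above the true 0.3 survive the threshold, so the published literature overestimates the effect.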


7 of 43

Feeling the Future

Poldrack et al., PNAS, 2016


8 of 43

Feeling the Future


9 of 43

Cause 1: Statistics

Cause 4: Publishing culture

Cause 2: Software

Cause 3: Data


10 of 43

Software issues – misuse

  • 1990s: the software industry realized that untested code is broken code
  • Unit and integration testing frameworks started to be developed; code coverage was introduced
  • Neuroimaging software has bugs – many of them unknown?
  • How do you test the script you inherited from the previous PhD student or postdoc?
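A cheap way to start testing that inherited script is to pull its logic into small functions and pin each one down with hand-checkable assertions; a minimal sketch (the function and its behavior are hypothetical, not from any named package):

```python
# Hypothetical refactor of an inherited preprocessing script:
# extract the logic into small functions, then assert known input/output pairs.

def percent_signal_change(ts):
    """Convert a time series to percent signal change around its mean."""
    mean = sum(ts) / len(ts)
    if mean == 0:
        raise ValueError("zero-mean series: percent change undefined")
    return [100.0 * (x - mean) / mean for x in ts]

# Regression tests: tiny cases you can verify by hand, rerun on every change.
assert percent_signal_change([10, 10, 10]) == [0.0, 0.0, 0.0]
assert percent_signal_change([9, 11]) == [-10.0, 10.0]
```

Once such assertions exist, any later "small fix" to the script that silently changes its behavior fails loudly instead of corrupting downstream results.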


11 of 43

Software issues

D. Donoho, On software issues


12 of 43

Software, version, OS

  • Change from FSL to SPM?
  • Change from v1.12 to v2.1?
  • Change from cluster A to cluster B? Glatard et al., Front. Neuroinform., 2015

G. Katuwal, Frontiers in Neuroscience (Brain Imaging Methods), 2016
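Since tool, version, and cluster can each change results, a first defensive habit is recording the execution environment next to every output; a minimal standard-library sketch:

```python
import json
import platform
import sys

def environment_record():
    """Capture basic facts about the machine and interpreter that produced a result."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
    }

record = environment_record()
print(json.dumps(record, indent=2))  # store this next to the derived data
```

This does not make results identical across systems, but it makes "which cluster and which version produced this number?" answerable after the fact.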


13 of 43

ANTS – FS5.1 – FS5.3: size of the left caudal anterior cingulate


14 of 43

Estimating the analytic flexibility of fMRI

  • A single event-related fMRI experiment subjected to a large number of unique analysis procedures
  • Ten analysis steps for which multiple strategies appear in the literature: 6,912 pipelines
  • Plotting the maximum peak

J. Carp, Frontiers in Neuroscience, 2012
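Carp's count follows from multiplying the number of strategies available at each step; with ten hypothetical per-step option counts (illustrative only, not Carp's actual ones), the product reaches the same total:

```python
from math import prod

# Hypothetical numbers of alternative strategies for ten analysis steps
# (illustrative only -- not Carp's actual per-step counts).
options_per_step = [4, 2, 2, 2, 2, 2, 2, 3, 3, 3]

n_pipelines = prod(options_per_step)
print(n_pipelines)  # 6912 unique end-to-end pipelines
```

The point of the arithmetic: even a handful of defensible choices per step multiplies into thousands of equally "reasonable" pipelines, each with its own maximum peak.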


15 of 43

“Cluster failure”? Or RFT misuse?

  • An estimated 3,500 papers affected by the low-threshold issue?
  • But ~13,000 without multiple-comparison correction?

Eklund et al., PNAS, 2016:

- Low-threshold issue

- High-threshold issue with paradigm E1?

- Ad hoc procedures lead to around 70% FPR
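The uncorrected-comparisons problem can be made concrete with a simulation under the global null: the chance of at least one false positive across m tests grows as 1 - (1 - alpha)^m (numbers below are illustrative, not Eklund et al.'s):

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, n_tests, n_experiments = 0.05, 50, 2000

# Under the global null, each experiment runs n_tests independent tests;
# count how often at least one is "significant" with no correction.
p = rng.uniform(size=(n_experiments, n_tests))
fwer_uncorrected = np.mean((p < alpha).any(axis=1))

# Analytic familywise error rate: 1 - (1 - alpha)^m  (~0.92 for m = 50)
fwer_expected = 1 - (1 - alpha) ** n_tests

# A Bonferroni-corrected threshold brings the rate back near alpha.
fwer_bonferroni = np.mean((p < alpha / n_tests).any(axis=1))
```

Real fMRI voxels are spatially correlated, which is exactly why methods like RFT exist; the simulation only shows how fast uncorrected error rates inflate.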


16 of 43

Cause 1: Statistics

Cause 4: Publishing culture

Cause 2: Software

Cause 3: Data


17 of 43

Cause: bugs in data

  • A less rare case than usually thought!
  • Databases not containing what they say they do
  • Wrong QC – QC performed several times
  • Headers of files are not correct (cf. the left/right issue)
  • Provenance of data lost
  • Example: the ABCD Philips scanner batch
  • Example in the UK Biobank


18 of 43

Cause 1: Statistics

Cause 4: Publishing culture

Cause 2: Software

Cause 3: Data


19 of 43

Misplaced incentives

  • Publication = the only “currency” for researchers and universities
  • Intense competition incites researchers to keep data and code as “assets” and to collect as many authorships as possible
  • The current incentive system promotes poorly reproducible research


20 of 43

Publication model

  • Evidence that the publication culture is at the heart of the research reproducibility issue

  • Publication dictates:
    • Short- and long-term project choices
    • Collaborations
    • Grants
    • Jobs

“Today I wouldn't get an academic job. It's as simple as that. I don't think I would be regarded as productive enough.”

Peter Higgs, 2014


21 of 43

Mistakes in papers are easy to find but hard to fix

D. Allison, A. Brown, B. George, K. Kaiser, Nature, 2016


22 of 43

Changing the publication model: good luck with that.

  • Publishing research products beyond the narrative:
    • Data
    • Code
  • Initiatives:
    • TOSI
    • OHBM-TOPIC
    • CONP
    • Jupyter, Binder
    • OSF, etc.

Reproducibility: A tragedy of errors – Allison et al., Nature, 2016


23 of 43

Part I: What should be taught depends on why irreproducibility occurs

Part II : Training challenges and solutions


24 of 43

ReproNim Training program

  • Comprehensive Content (What)
    • ReproNim modules
    • The ReproNim “How To”s
  • Reaching out (When, Where, How)
    • Training workshops
    • Hackathons
    • University courses
  • How do we scale?
    • MOOCs
    • Train-the-Trainer program
  • How do we keep material up to date?


25 of 43

NIH P41 ReproNim


26 of 43

Training principles

  • In depth (but time?)
  • Invest in tools
  • FAIR material
  • Teaching with tools:
    • Notebooks
    • Out of the notebook?
  • Give feedback
  • Help improve the material on GitHub


27 of 43


28 of 43

Software and code training


29 of 43

Fundamental tools

  • Git / GitHub
  • Shell and tooling (ssh, etc.)
  • Python / R / Octave
  • Databases – data models – linked data
  • Docker / Singularity
  • DataLad


30 of 43

FAIR Data


31 of 43

Communicating data: a common language, please?

[Figure: word cloud of vocabularies and ontologies – Cognitive Atlas, NIDM, DCT, OBO, RDFS, HCSI, NCIT, STATO, NIF]


32 of 43

  • Legal and ethical aspects
  • FAIR (indexing, licensing)
  • FAIR metadata:
    • Vocabulary/ontology re-use
    • Tagging
  • Provenance & Versioning
  • Longevity and sustainability of repositories
  • Data organization standards:
    • for yourself
    • for others
  • Large data handling
  • Checking data – hashing
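The last item, checking data via hashing, can be sketched with the standard library: record a digest per file at ingest time and re-verify it before analysis (the file name below is hypothetical):

```python
import hashlib

def sha256_digest(data: bytes) -> str:
    """Hex SHA-256 digest of a blob; for large files, feed chunks to one hash object."""
    h = hashlib.sha256()
    h.update(data)
    return h.hexdigest()

# Record digests when data arrive; compare before each analysis run.
manifest = {"subject-01_T1w.nii.gz": sha256_digest(b"...file bytes here...")}

# Any later change (corruption, silent re-QC, a header "fix") changes the digest.
assert sha256_digest(b"hello") == (
    "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
)
```

A digest manifest like this is what catches "the database does not contain what it says it does" before, rather than after, a paper is written.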


33 of 43

Module Stats


34 of 43

The fundamentals

    • Sampling – distributions
    • Prediction
    • Model comparison
    • P-value issues

Logical thinking: not (only) recipes

    • Non-parametric methods
    • Re-sampling

How to check

    • Sensitivity analyses
    • Simulations
    • Test-retest

35 of 43

MOOC on Moodle


36 of 43

Training at workshops

  • Hands-on – navigating the installation issues
    • Local installs?
    • Download a VM?
    • Set up Amazon instances?
  • 1 or 2 days are good but not sufficient: full courses needed
    • Summer schools
    • Official university courses
  • Evaluation of impact
    • should be done weeks or months later?


37 of 43

Training at workshops

Nov 2017, George Washington University: the first training materials were beta-tested during a 1½-day reproducible research workshop (25 attendees).

May 2017, NAMIC Summer programming Week used ReproNim content.

April 2017, DataDiscovery InterTRD “Codathon”, UCSD had a "Data" practical theme with ReproNim subtopics of discovery.

June 2017, 2018, 2019, OHBM TrainTrack: ReproNim secured a parallel training track held during the three-day OHBM Brainhack hackathon. 20 attendees in 2017, 30 attendees in 2018 and 2019. “Repro hours”: a one- or two-hour teaching session during the hackroom hours at the main conference.

Nov. 2017, 2018, and 2019, SFN: a 2017 educational satellite training event for 25 attendees, a 2018 SFN training event for 30 attendees, and a 2019 Schizconnect workathon.

Fall 2018, Aug 2019, McGill, Montreal. An official McGill course “Reproducibility in Neuroscience” to teach neuro data science, which used some of the ReproNim material (30 students). A 2020 course for 30 students is planned.

Jan 10-16, 2019, Miami, “Coastal Coding”: ReproNim materials were used for teaching at the hackathon.


38 of 43

Train the Trainer program

  • In partnership with the INCF
  • Allows scaling to a much larger community
  • Second edition in 2020


39 of 43

Building the community?

  • New material development

  • Updating current material

  • A network of trainers

  • A network of future reproducible researchers


40 of 43

Our goal: a reproducible publication

Ghosh SS, Poline JB, Keator DB et al. A very simple, re-executable neuroimaging publication. F1000Research 2017, 6:124 (doi: 10.12688/f1000research.10783.2)

  • Words, as usual, PLUS the following supplemental information:
  • Data
  • Workflow Specification
  • Execution Environment Specification
  • Complete Results

In other words, given the data, workflow specification and execution environment specification, a third party can generate (and validate) the exact results independently and explore generalizability.
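Those four ingredients can be tied together in a small machine-readable manifest; the sketch below is illustrative only (all field names are hypothetical, not the actual paper's NIDM records or container specifications):

```python
import json

# Hypothetical manifest bundling the four ingredients of a re-executable
# publication (field names are illustrative).
publication = {
    "data": {"url": "https://example.org/dataset", "sha256": "…"},
    "workflow": {"script": "analysis.py", "version": "v1.0"},
    "environment": {"container": "example/analysis:1.0", "engine": "docker"},
    "results": {"table1": "results/table1.csv"},
}

manifest = json.dumps(publication, indent=2)
# A third party re-runs: fetch the data, start the container, execute the
# workflow, then compare the regenerated outputs against published results.
restored = json.loads(manifest)
assert restored == publication
```

The manifest is the contract: if each field resolves to a fetchable, verifiable artifact, re-execution by a third party stops being archaeology.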


41 of 43

Ethical Scholarly Communications

  • Standards and principles : COPE
  • Citation of previous work
  • Fair reuse of objects
  • Authorships and Acknowledgments
  • Verifiability
  • Preprints

[Figure: Epistemology & sociology – Popper, Kuhn; hypothesis testing & refutability; science in society; causes of irreproducibility; internal & external bias]


42 of 43

Thanks !

Thanks to ReproNim:

    • Dave Kennedy (UMMS)
    • Maryann Martone (UCSD)
    • Jeff Grethe (UCSD)
    • Al Crowley (TCG)
    • Christian Haselgrove (UMMS)
    • Satra Ghosh (MIT)
    • David Keator (UCI)
    • Yaroslav Halchenko (Dartmouth)
    • Matt Travers (TCG)
    • Nina Preuss (TCG)
    • Dorota Jarecka (MIT)
    • Sanu Abraham (MIT)
    • Kyle Meyer (Dartmouth)
    • Peer Herholz (McGill)
    • Our fellows
    • Many others!


43 of 43
