Jean-Baptiste Poline
MNI, Ludmer Center, BIC, McGill
HWNI, UC Berkeley
1
Part I: What should be taught?
Part II: How, when, and to whom
2
Part I: What should be taught depends on why irreproducibility occurs
Part II: Training challenges and solutions
3
Cause 1: Statistics
Cause 2: Software
Cause 3: Data
Cause 4: Publishing culture
4
Statistics: The One problem
See also: Mier, 2009: COMT and DLPFC
Molendijk, 2012: BDNF and hippocampal volume
Motivation
5
Power issues
Button et al., NRN, 2013
6
Feeling the Future
Poldrack et al., PNAS, 2016
7
Feeling the Future
8
Cause 1: Statistics
Cause 2: Software
Cause 3: Data
Cause 4: Publishing culture
9
Software issues – misuse
10
Software issues
D. Donoho, on software issues
11
Software, version, OS
G. Katuwal, Frontiers in Neuroscience (Brain Imaging Methods), 2016
12
ANTS – FS5.1 – FS5.3
Size of the left caudal anterior cingulate
13
Estimating analytic flexibility of fMRI
J. Carp, Frontiers in Neuroscience, 2012
14
“Cluster failure”? Or RFT misuse?
Eklund et al., PNAS, 2016:
- Low threshold issue
- High threshold issue with paradigm E1?
- Ad hoc procedure leads to around 70% FPR
15
Cause 1: Statistics
Cause 2: Software
Cause 3: Data
Cause 4: Publishing culture
16
Cause: bugs in data
17
Cause 1: Statistics
Cause 2: Software
Cause 3: Data
Cause 4: Publishing culture
18
Misplaced incentives
19
Publication model
“Today I wouldn't get an academic job. It's as simple as that. I don't think I would be regarded as productive enough.”
Peter Higgs, 2014
20
Mistakes in papers are easy to find but hard to fix
D. Allison, A. Brown, B. George, K. Kaiser, Nature 2016
21
Changing the publication model: Good luck with that.
Reproducibility: A tragedy of errors, Allison et al., Nature 2016
22
Part I: What should be taught depends on why irreproducibility occurs
Part II: Training challenges and solutions
23
ReproNim Training program
24
NIH P41 ReproNim
25
Training principles
26
27
Software and code training
28
Fundamental tools
29
FAIR Data
30
Communicating data: language, please?
Cognitive Atlas, NIDM, DCT, OBO, RDFS, HCSI, NCIT, STATO, NIF
31
32
Module Stats
33
The fundamentals
Logical thinking: not (only) recipes
How to check
34
MOOC on Moodle
35
Training at workshops
36
Training at workshops
Nov 2017, George Washington University: first training materials beta-tested during a 1½-day reproducible research workshop (25 attendees).
May 2017, NAMIC Summer programming Week: used ReproNim content.
April 2017, DataDiscovery InterTRD “Codathon”, UCSD: had a "Data" practical theme with ReproNim subtopics of discovery.
June 2017, 2018, 2019, OHBM TrainTrack: ReproNim secured a parallel training track held during the three-day OHBM Brainhack hackathon (20 attendees in 2017, 30 attendees in 2018 and 2019), plus “Repro hours”, a one- or two-hour teaching session during the hackroom hours at the main conference.
Nov. 2017, 2018, and 2019, SfN: a 2017 educational satellite training event for 25 attendees, a 2018 SfN training event for 30 attendees, and a 2019 SchizConnect workathon.
Fall 2018, Aug 2019, McGill, Montreal: an official McGill course, “Reproducibility in Neuroscience”, teaching neuro data science with some of the ReproNim material (30 students). A 2020 course for 30 students is planned.
Jan 10-16, 2019, Miami, “Coastal Coding”: ReproNim materials were used for teaching at the hackathon.
37
Train the Trainer program
38
Building the community?
39
Our goal: a reproducible publication
Ghosh SS, Poline JB, Keator DB et al. A very simple, re-executable neuroimaging publication. F1000Research 2017, 6:124 (doi: 10.12688/f1000research.10783.2)
In other words, given the data, workflow specification and execution environment specification, a third party can generate (and validate) the exact results independently and explore generalizability.
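One way a third party might check that a re-execution produced "the exact results" is to compare checksums of the output files against a reference set. The sketch below is only an illustration of that idea, not the procedure used in the cited publication: the function names (sha256_of, build_manifest, compare_manifests) and the directory layout are hypothetical, and a bitwise comparison is stricter than what many neuroimaging outputs can satisfy, so numerical results may instead need a tolerance-based check.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(output_dir: Path) -> dict[str, str]:
    """Map each file under output_dir (as a relative POSIX path) to its checksum."""
    return {
        p.relative_to(output_dir).as_posix(): sha256_of(p)
        for p in sorted(output_dir.rglob("*"))
        if p.is_file()
    }


def compare_manifests(
    reference: dict[str, str], rerun: dict[str, str]
) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or different in the re-run."""
    shared = reference.keys() & rerun.keys()
    return {
        "missing": sorted(set(reference) - set(rerun)),
        "unexpected": sorted(set(rerun) - set(reference)),
        "different": sorted(k for k in shared if reference[k] != rerun[k]),
    }
```

In use, one would call build_manifest on the published outputs and on the re-run outputs (both paths are assumptions of this sketch), then pass the two manifests to compare_manifests; three empty lists mean the re-run matched byte for byte.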
40
Ethical Scholarly Communications
Popper, Kuhn
Hypothesis Testing & Refutability
Science in society
Causes of irreproducibility
Internal & external bias
Epistemology & Sociology
41
Thanks!
Thanks to ReproNim:
42
43