ENIGMA and COINSTAC
Theo G.M. van Erp, Ph.D.
Department of Psychiatry and Human Behavior
University of California Irvine, Irvine, CA, USA
ENIGMA Chairs Meeting 2020
ENIGMA
Enhancing NeuroImaging Genetics through Meta-Analysis
1400+ scientists
43 countries
50 working groups
Paul Thompson
Jason Stein
ENIGMA’s first consortium paper
Stein et al., Nature Genetics, vol. 44, no. 5, May 2012
51 working groups (95 chairs), 31 disorders; 1,400 members, 400 institutions, 43 countries
33 clinical working groups
ENIGMA Disorder Working Group
Conduct prospective meta-/mega-analyses to:
Schizophrenia
Jess & Theo
Bipolar Disorder
Chris, Derrek, & Ole
Major Depressive Disorder
Lianne & Dick
ENIGMA’s Analysis Approach
Cortical thickness and surface area are thought to be influenced by separate sets of genes
(Panizzon et al. 2009; Winkler et al. 2010).
9 groups published their cortical papers
ENIGMA initial meta-analysis: Dependent on FreeSurfer common structure, and human agreement on covariates of interest
Standard CSV files of variables
R scripts
zip/tar file of results returned to central analysis team for meta-analysis
Automation via Linux Shell Scripts
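The workflow above (each site runs a script on its own CSV file and returns summary results to the central team for pooling) can be sketched as follows. This is a minimal illustration and not the ENIGMA scripts themselves: the `SiteResult` format, the function names, and the choice of a fixed-effect inverse-variance model are all assumptions made for the example, since the slides do not specify which meta-analysis model the central team uses.

```python
import math
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class SiteResult:
    """Summary a site would return to the central team (hypothetical format)."""
    site: str
    d: float      # Cohen's d, patients minus controls
    var_d: float  # approximate sampling variance of d


def cohens_d(site, patients, controls):
    """Site-level step: effect size and its approximate variance."""
    n1, n2 = len(patients), len(controls)
    pooled_sd = math.sqrt(
        ((n1 - 1) * stdev(patients) ** 2 + (n2 - 1) * stdev(controls) ** 2)
        / (n1 + n2 - 2)
    )
    d = (mean(patients) - mean(controls)) / pooled_sd
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return SiteResult(site, d, var_d)


def fixed_effect_meta(results):
    """Central step: inverse-variance weighted pooled effect and its SE."""
    weights = [1.0 / r.var_d for r in results]
    pooled = sum(w * r.d for w, r in zip(weights, results)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se


if __name__ == "__main__":
    # Made-up numbers for illustration only; not real ENIGMA data.
    site_a = cohens_d("A", [2.4, 2.5, 2.3, 2.6], [2.6, 2.7, 2.5, 2.8])
    site_b = cohens_d("B", [2.2, 2.5, 2.4], [2.5, 2.6, 2.9])
    pooled, se = fixed_effect_meta([site_a, site_b])
    print(f"pooled d = {pooled:.3f}, SE = {se:.3f}")
```

Only the per-site effect size and its variance cross site boundaries in this sketch; the subject-level values stay local, which is the property the zip/tar exchange relies on.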
Automating ENIGMA: COINSTAC
Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC)
- enables federated data analysis
A. D. Sarwate, S. M. Plis, J. A. Turner, M. R. Arbabshirani, and V. D. Calhoun, "Sharing privacy-sensitive access to neuroimaging and genetics data: a review and preliminary validation," Front Neuroinform, vol. 8, p. 35, 2014, 3985022.
S. M. Plis, A. D. Sarwate, D. Wood, C. Dieringer, D. Landis, C. Reed, S. R. Panta, J. A. Turner, J. M. Shoemaker, K. W. Carter, P. Thompson, K. Hutchison, and V. D. Calhoun, "COINSTAC: A Privacy Enabled Model and Prototype for Leveraging and Processing Decentralized Brain Imaging Data," Frontiers in Neuroscience, vol. 10, Aug 19 2016.
COINSTAC: Local analysis, globally aggregated
The process begins when the lead site defines the local and global analyses.
Each site downloads the local analysis (a Docker/Singularity image within COINSTAC) and runs it on its own data. The intermediate results it generates are combined in the global analysis, which runs at the lead site.
1
Lead Site
From: Turner, MD., Gazula, H., et al., ENIGMA COINSTAC: Increasing neuroimaging data diversity with managed privacy, https://doi.org/10.13140/RG.2.2.18471.37285
Participating sites (collaborating sites with similar data) can see how their data will be used: all of the proposed analyses are visible to them.
If they agree, they may join the consortium to participate in the meta-analysis, and map their data to the analysis.
2
Participating Sites
3
Once the participating sites have joined the consortium and mapped their data, the lead site launches all of the local analyses.
Analyses at the participating sites run automatically and return the intermediate results to the lead site.
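The three steps above (local analysis at each site, intermediate results returned, global aggregation at the lead site) can be illustrated with a small sketch. It is not COINSTAC's implementation or API: `LocalStats`, `local_analysis`, and `global_analysis` are hypothetical names, and a one-predictor least-squares regression is chosen only because its sufficient statistics are simple sums, so the aggregated fit matches a fit on the pooled data.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    """Intermediate results a site returns (hypothetical format)."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def local_analysis(x, y):
    """Runs at a participating site; only these sums leave the site."""
    return LocalStats(
        n=len(x),
        sum_x=sum(x),
        sum_y=sum(y),
        sum_xx=sum(v * v for v in x),
        sum_xy=sum(a * b for a, b in zip(x, y)),
    )


def global_analysis(stats):
    """Runs at the lead site; combines the sites' intermediate results."""
    n = sum(s.n for s in stats)
    sx = sum(s.sum_x for s in stats)
    sy = sum(s.sum_y for s in stats)
    sxx = sum(s.sum_xx for s in stats)
    sxy = sum(s.sum_xy for s in stats)
    # Closed-form least-squares solution for y = intercept + slope * x.
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return slope, intercept


if __name__ == "__main__":
    # Made-up data for two sites; the raw values never leave local_analysis.
    site_a = local_analysis([1.0, 2.0], [3.0, 5.0])
    site_b = local_analysis([3.0, 4.0], [7.0, 9.0])
    print(global_analysis([site_a, site_b]))
```

The design point is that each site shares only aggregate sums, never subject-level rows, while the lead site still recovers the same coefficients it would get from the combined dataset.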
ENIGMA COINSTAC cross-diagnosis collaborations
ENIGMA SZ
rsfMRI
DTI
Structural
Regressions
Multivariate
Multimodal ML
ENIGMA BD
ENIGMA MDD
ENIGMA-COINSTAC: Advanced Worldwide Transdiagnostic Analysis Of Valence System Brain Circuits
Neural Circuitry Associated with Negative Symptom Domains
Strauss et al. 2019; JAMA Psychiatry
MPIs: Turner, Van Erp, & Calhoun
1R01MH121246
Overview of COINSTAC Analysis
COINSTAC ENIGMA Negative Symptom Domain Analysis Pipeline
Using COINSTAC
Where to download:
https://github.com/trendscenter/coinstac/releases
Currently v4.7.5 is the latest binary release:
After downloading and uncompressing, double-click the “coinstac” icon
COINSTAC Login Screen
Computations
…
Download Docker/Singularity Image to local compute node
Remove Docker/Singularity Image from local compute node
Add a New Docker/Singularity Image
Pipeline Menu
R code to run analyses at individual sites + R code to run meta-analysis
Create a new pipeline
Consortia Menu
Start new consortium
Leave a consortium you joined
Join a consortium
Map data to a consortium you joined
Map Local Data
Launch the Pipeline
See the Remote and Local Pipeline Progress and get your results.
See that all sites have mapped their data successfully
International multisite collaborations
Power - Statistical power to tackle new kinds of questions (‘cracking the brain’s genetic code’), impossible to solve before
Computational efficiency - Massive distributed computing that leverages infrastructure and data at 400 institutions in 40+ countries; beginning new cooperative machine learning, deep learning, new mathematics
Open Science & Reproducibility - Studies are designed to ensure replication (as seen from forest plots). Proposals = registered reports. Analyses run are public info.
Team Science - Experts from all fields working together bring more expertise and produce much stronger publications, and everyone learns and gains!
Advancing Science - Rather than trying to answer the same question independently, merge data already collected to drive new investigations
Acknowledgments
NSF: 1631819
NIH COINSTAC: R01DA040487
NIH COINSTAC ENIGMA: 1R01MH121246
NIH COINSTAC GIGA/ABCD: R01DA049238
Turner Lab