Powering open science and collaboration with Invenio
Northwestern University Invenio Team
03 March 2020
@inveniosoftware
OHDSI: open, collaborative science
BENEFICENCE
We seek to protect the rights of individuals and organizations within our community at all times.
COLLABORATION
We work collectively to prioritize and address the real world needs of our community’s participants.
COMMUNITY
Everyone is welcome to actively participate in OHDSI, whether you are a patient, a health professional, a researcher, or someone who simply believes in our cause.
OPENNESS
We strive to make all our community’s proceeds open and publicly accessible, including the methods, tools and the evidence that we generate.
REPRODUCIBILITY
Accurate, reproducible, and well-calibrated evidence is necessary for health improvement.
INNOVATION
Observational research is a field which will benefit greatly from disruptive thinking. We actively seek and encourage fresh methodological approaches in our work.
VALUES
MISSION: To improve health by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care.
Benefits of opening science...
Invenio software powers open science
How did this collaboration start (and what about Zenodo?!)
What motivated the InvenioRDM project?
All these groups came together to create a collaborative open source project and grow a sustainable community.
Zenodo will also run on InvenioRDM by the end of the project period.
RDM platforms
are critical to help preserve and share research, enable reproducibility, and empower reuse of datasets, protocols, engagement or study materials, & a wide range of other research products.
*Draft NIH Policy for Data Management and Sharing https://osp.od.nih.gov/wp-content/uploads/Draft_NIH_Policy_Data_Management_and_Sharing.pdf
We’re leveraging Invenio as a strong foundation. Here’s why.
The InvenioRDM project has two goals:
The platform
A few highlights...
InvenioRDM stack
Invenio is JSON-native and provides RESTful APIs to make it easy to build apps on top of the framework
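As a rough illustration of what building on the REST API could look like, the sketch below assembles a records search URL and reads titles out of a JSON response. The base URL, the `/api/records` path with `q`, `size`, and `page` parameters, and the `hits.hits[].metadata.title` response shape are assumptions made for this example; the sample response is invented and no network call is made.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, size=10, page=1):
    """Build a records-search URL (assumed endpoint: /api/records)."""
    params = urlencode({"q": query, "size": size, "page": page})
    return f"{base_url.rstrip('/')}/api/records?{params}"


def extract_titles(response_text):
    """Return the title of each hit in a JSON search response.

    Assumes hits are nested under hits.hits and that each hit
    carries its title at metadata.title.
    """
    payload = json.loads(response_text)
    hits = payload.get("hits", {}).get("hits", [])
    return [hit.get("metadata", {}).get("title") for hit in hits]


# Hypothetical response body, for illustration only.
sample_response = json.dumps(
    {
        "hits": {
            "total": 1,
            "hits": [
                {
                    "id": "example-id",
                    "metadata": {"title": "Phenotype definitions for XYZ Clinical Study"},
                }
            ],
        }
    }
)

url = build_search_url("https://localhost", "phenotype definitions")
titles = extract_titles(sample_response)
print(url)
print(titles)
```

Against a running instance, the URL could be fetched with any HTTP client and the response body passed to `extract_titles`.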
InvenioRDM roadmap
February
March
Standing up InvenioRDM
1- Install invenio-cli
pip install invenio-cli
2- Initialize your project
invenio-cli init --flavour=RDM
3- Run it
cd <project name>
invenio-cli containerize
4- Visit https://localhost
firefox https://localhost
System requirements
Invenio can run in Docker, on virtual machines, or on physical machines. Invenio can run on a single machine or a cluster of 100s of machines.
It all depends on exactly how much data you are handling and your performance requirements.
Small installation:
Medium installation:
Large installation:
Search and retrieve datasets using standards-based documentation
Robust search enhanced by:
Data management for reproducibility and
Open Access: study-focused resource types
InvenioRDM helps you store, manage and, if needed, share your study’s outputs:
Communities & Collections
Phenotype Definitions
XYZ Clinical Study
Community: Define your research group or other collaborative unit
Collection: Create multiple Collections under the umbrella of the Community. Within Collections, deposit and describe your:
Phenotype Definitions
Definitions
Characterizations
Evaluations
Metadata
Dissemination Strategy
Clinical Studies
Research Proposals
Protocols
Data Management Plans
Methods Descriptions
Measures
Case Reports
Datasets and Analyses
Collections bring together related groupings of documentation to communicate process, enable sharing of results, and support publication, compliance, and reproducibility
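The Community and Collection grouping described above can be pictured as a small nested data model. This is only a conceptual sketch: the class names, fields, and example titles are hypothetical and do not reflect InvenioRDM's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    title: str
    resource_type: str  # e.g. "Protocols", "Definitions"


@dataclass
class Collection:
    name: str
    records: List[Record] = field(default_factory=list)


@dataclass
class Community:
    name: str
    collections: List[Collection] = field(default_factory=list)

    def resource_types(self):
        """List the distinct resource types deposited across all collections."""
        return sorted(
            {rec.resource_type for col in self.collections for rec in col.records}
        )


# Hypothetical community with two collections, mirroring the slide's grouping.
community = Community(
    name="Example Research Group",
    collections=[
        Collection(
            name="Phenotype Definitions",
            records=[
                Record("Example definition", "Definitions"),
                Record("Example evaluation", "Evaluations"),
            ],
        ),
        Collection(
            name="Clinical Studies",
            records=[Record("Example protocol", "Protocols")],
        ),
    ],
)

print(community.resource_types())
```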
Collections & Clinical Studies
Store multiple datasets, with large numbers of detailed results from each analysis, and reuse data generated by a single study.
Results presented in InvenioRDM are:
Home in on the results you seek using InvenioRDM’s robust metadata of subject and resource type terms.
InvenioRDM incorporates contributor roles for all records. Deposit your SQL code, statistical analysis plan, database code, and other study documentation; receive credit; and group all documents in a Collection.
Biostatistician
Developer
Data Analyst
Gupta, Simran
Properly attribute all contributors to research
Gonzales, S., O’Keefe, L., Gutzman, K., Viger, G., Wescott, A., Farrow, B., . . . Holmes, K. (n.d.). Personas for the Translational Workforce. Journal of Clinical and Translational Science, 1-27. doi:10.1017/cts.2020.2
Collaborators: Work with them and discover new ones
InvenioRDM will allow private record sharing, so researchers can:
User 1’s files
User 2’s files
InvenioRDM will have a social component, allowing researchers to:
The community
Visit https://inveniosoftware.org/ and click on “RDM”
InvenioRDM collaborators
How can Invenio support the OHDSI community?
We’re managing a large multi-site project, harmonizing data from numerous sources and managing research projects. We want to create communities of practice to integrate theories, data, techniques, and tools.
I lead a large basic science research group. We use InvenioRDM to support reproducible science through packaging, combined with big data mining and a desire to process collected data using the latest bioinformatics tools.
I am a clinical researcher. I need a way to pre-register protocols or research proposals, search on demographics of participants in similar studies, get insights into recruitment, and share portions of a study for compliance.
Our multi-institution health equity project uses InvenioRDM to collaborate with our community-based partners and credit these partnerships. We can share materials from community health events, project materials, training materials, annual reports, and lay summaries of research. InvenioRDM helps us to be better partners, accountable to collaborators and the community.
I’m an early career researcher just getting started on my research career. I need to “put my best foot forward” to showcase my work and demonstrate my expertise and collaborations. Invenio gives me a way to make all of my research efforts findable, and the metrics are helpful for reporting and highlighting my impact to my leadership.
My team wants to find out about clinical trial opportunities to offer patients all options for treatment. It is important to us to openly share the latest research with patients. InvenioRDM communities give us a way to make these materials openly available and packaged in a cohesive and attractive manner. As resources are updated, we can upload the new versions and track access.
Some Use Cases
Our institute wants a way to publish and disseminate content such as our handbook, lay summaries, and more. We want to credit all contributors and produce an attractive and interactive resource that can be easily updated.
FAIR: OHDSI & InvenioRDM
InvenioRDM’s records are findable because each is issued a Digital Object Identifier (DOI) and its metadata is indexed and made searchable immediately.
OMOP database summaries can be published in InvenioRDM as findable descriptor records to reference the database for reproducibility and citation.
Metadata in InvenioRDM are accessible because they are retrievable using a standardized communications protocol which is free and universally implementable.
OMOP data can be mapped through similar open protocols via SQL interfaces, though largely for secure querying. Results of analyses in multiple OMOP databases can be cataloged in InvenioRDM, and these records can be retrieved through the open protocol OAI-PMH.
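To make the OAI-PMH retrieval concrete, here is a minimal sketch that builds a standard `ListRecords` request and reads record identifiers from a response. The `verb` and `metadataPrefix` parameters and the XML namespace come from the OAI-PMH 2.0 specification; the endpoint path `https://localhost/oai2d` and the sample response are assumptions for illustration, and no request is actually sent.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(endpoint, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{endpoint}?{urlencode(params)}"


def extract_identifiers(xml_text):
    """Return the OAI identifier of every record header in a response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical, heavily trimmed response used only to exercise the parser.
sample_xml = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords>"
    "<record><header><identifier>oai:example.org:12345</identifier></header></record>"
    "</ListRecords>"
    "</OAI-PMH>"
)

# Assumed endpoint path; check your instance's configuration.
harvest_url = build_list_records_url("https://localhost/oai2d")
identifiers = extract_identifiers(sample_xml)
print(harvest_url)
print(identifiers)
```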
InvenioRDM leverages metadata encoding (JSON) and vocabulary (FundRef, OpenAIRE, COAR Resource Types, etc.) standards to ensure maximum interoperability for records describing digital assets.
OMOP similarly ensures interoperability through its CDM and standardized vocabulary, and the OHDSI community goes beyond this work by providing a platform to enable an interoperable understanding of the analysis methods for healthcare data.
Ensuring the reusability of digital assets deposited in InvenioRDM is key and is achieved through assigning licenses and establishing provenance through registering users.
OHDSI’s Metadata Working Group is actively working toward attaching provenance information to OMOP records.
Links
Northwestern's Proof of Concept: http://bit.ly/inveniordm-at-nu
Install your own instance! https://inveniordm.docs.cern.ch/
With thanks…
Teams
Support
Work presented here is supported in part by:
NUCATS: UL1TR001422 (NCATS)
all of the InvenioRDM project partners
Guillaume Viger
Sara Gonzales
Lisa O’Keefe
Matt Carson