Visibility of the research process, e.g., online lab notebooks
Why is it good?
Enables reproducibility, verification and validation.
Publications cannot contain all of the data required for reproducibility – provenance data provides a valuable supplement to the publication.
Provides proof rather than relying on trust.
Online lab notebooks:
act as an academic portfolio and reflect the scientist’s output, quality and abilities;
enable funders to see progress on a day-to-day basis;
widen the social networking opportunities for the chemist;
make it more difficult to fake results;
provide an improved method for mentoring research teams/students.
Why is it bad?
exposes messiness and failures – e.g. sloppy pipetting.
time and effort required to capture the process – expensive, perceived as a waste of time
generates too much data that is not searchable and never accessed – huge cost of curation and preservation
competitors can steal ideas and experimental designs, or publish first.
industry may profit from access to online lab notebooks without putting the profits back into research
Who are the players and what are their roles?
Chemists (PI, post-docs, PhD students) – changes driven by PIs. The IP owner decides openness of access to research process.
their collaborators,
technologists (software developers)
database custodians (e.g., PDB)
chemical societies
university/university commercialization arm,
publishers,
competitor researchers
chemical industry (e.g., Pfizer, Merck)
instrument manufacturers (JEOL, Bruker – need to extract metadata/instrument settings from spectrometers, X-ray diffractometers and synchrotrons in standardized formats)
What are the technical drivers and enablers?
Online lab notebooks – wireless access 24/7
Indexing, mining, visualization, reporting of data in online lab notebooks
High bandwidth required – moving large amounts of data
Google Docs and Google Apps – e.g., processing docking data via cloud computing
LIMS – higher level view across multiple individual lab notebooks
Wikis, blogs
Workflow registries
Persistent URLs for experiments, compounds, images etc – identifier services
Remote access to labs
Web cams
remote control of lab instruments, robots
Automatic logging services -> generate log files
RSS feeds
Annotation and semantic tagging services
What are the risk factors?
Users are very protective of their data, hesitant to expose ideas.
Users/research institutions are not confident of the legal validity of data in lab notebooks – still prefer paper copies with signatures
Publishers won’t want to support links back to the lab notebooks. They need assurance of persistent URLs.
Many publishers don’t like papers already visible/published on a Wiki.
Risks of not doing it
Group A adopts open access – communicates data faster
Group B is closed – slower to communicate results/data
Scenario
A distributed collaboratory – 3 different research labs/universities collaborating.
Some of them are using online lab notebooks.
Using Google Docs to share lists of compounds;
Using blogs, wikis and online LIMS to share and monitor/track the results of all of the experiments of the collaboratory.
Lab #1 – modelling group
Monitor publications, blogs etc in the area;
Check databases;
Perform computational modelling to identify a potential target protein for malaria
Lab #2 – synthetic group
Experimental design
Prepare the compound
Lab #3 – biology group
Perform assays
Provide feedback but don’t provide access to the lab notebook
The data being produced have different levels of error and confidence. Feedback and commenting tools on notebook data can be used to perform quality assurance.
Lab notebooks – backed up by uploading to a central indexed database from which further data can be extracted.
Write a paper via the wiki
Send to a pre-print server like Nature Proceedings (open access and commenting tools),
Send to a peer-reviewed journal e.g. JoVE, Nature Chemistry, PLOS – impact factors of these?
Paper links back to the online data describing the research process, available through the online lab notebook
The choice of publisher is restricted because many publishers won’t accept papers that have already been published via the wiki.
Nature journals are not open access but the pre-print can be made available via open access.
Challenges/Issues
Open notebooks
Many different platforms, software and terminologies – can you define the set of data and metadata required for each type of experiment (e.g., cloning) to enable reproducibility?
Lack of interoperability between lab notebooks
Most chemists are not interested in recording details of the process because it is tedious and of little or no value to them.
We need to make it easier for them to record the process – e.g., automatically or semi-automatically.
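One way to picture a defined metadata set per experiment type is a simple completeness check run when a notebook entry is saved. The sketch below is illustrative only: the experiment types and field names in REQUIRED_FIELDS are assumptions made for the example, not an agreed standard, and missing_metadata is a hypothetical helper.

```python
# Hypothetical required-metadata sets per experiment type. The field names
# are illustrative assumptions, not a community-agreed standard.
REQUIRED_FIELDS = {
    "cloning": {"vector", "insert", "restriction_enzymes", "host_strain"},
    "nmr": {"instrument", "solvent", "frequency_mhz", "temperature_k"},
}


def missing_metadata(experiment_type, record):
    """Return the sorted required fields that are absent or empty in record."""
    required = REQUIRED_FIELDS[experiment_type]
    # A field counts as present only if it has a non-empty value.
    present = {key for key, value in record.items() if value not in (None, "")}
    return sorted(required - present)


entry = {"vector": "pUC19", "insert": "gene X", "host_strain": ""}
print(missing_metadata("cloning", entry))
# prints ['host_strain', 'restriction_enzymes']
```

A check like this could prompt the chemist for the missing fields at capture time, or be filled in semi-automatically from instrument output.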
Lab information systems are different from online notebooks – they provide a method for tracking all experiments and the experimental design plan of a lab.
Issue of publishing research proposals/grant applications and budgets.
Different levels of access:
Real-time access to lab notebooks versus asynchronous access after publishing versus access after a set period (e.g., 2 years after publishing) versus restricted/authenticated access only.
Openness depends on the nature of the research and the opportunities for commercialization, e.g., malaria versus Viagra. It also depends on the topic, lab staff, funders and clients.
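The access levels listed above could be expressed as a per-notebook policy that is checked on each read request. This is a minimal sketch under stated assumptions: the Access enum, the can_read function and the 730-day default (approximating the 2-year example) are hypothetical names and values chosen for illustration, not part of any existing notebook system.

```python
from datetime import date, timedelta
from enum import Enum


class Access(Enum):
    """Hypothetical access levels mirroring the four options in the notes."""
    REAL_TIME = "real-time"
    AFTER_PUBLICATION = "after publication"
    EMBARGOED = "set period after publication"
    RESTRICTED = "authenticated only"


def can_read(policy, today, published_on=None, embargo_days=730,
             authenticated=False):
    """Decide whether a notebook entry is readable under the given policy.

    embargo_days=730 approximates the 2-year example; it is an assumption.
    """
    if policy is Access.REAL_TIME:
        return True
    if policy is Access.RESTRICTED:
        return authenticated
    if published_on is None:
        # Not yet published: both publication-based policies deny access.
        return False
    if policy is Access.AFTER_PUBLICATION:
        return today >= published_on
    # Access.EMBARGOED: readable once the embargo period has elapsed.
    return today >= published_on + timedelta(days=embargo_days)


print(can_read(Access.EMBARGOED, date(2010, 1, 1), published_on=date(2009, 1, 1)))
# prints False
```

The policy itself would be set by whoever owns the IP, and could differ between projects within the same lab.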