1 of 54

Striving for Reproducibility in Research

BASIC LABORATORY METHODS IN A REGULATED ENVIRONMENT

2 of 54

LECTURE OVERVIEW

Introduction; reproducibility in science, a contemporary hot topic
Possible causes of irreproducibility
Making our work reproducible – the focus of this course

4 of 54

NOTE ABOUT TERMINOLOGY

Note: there are some differences in how certain words are used, particularly “reproducibility” and “replicability.”
We will use the terms “reproducibility” and “irreproducibility”

5 of 54

HEADLINES

What is being discussed, the problem

11 of 54

WHAT ARE THESE HEADLINES ABOUT?

Pharmaceutical companies scour scientific literature for leads
When they found a promising study, Amgen scientists would try to replicate it, but seldom could
In 2012, C. Glenn Begley decided to study this formally
Selected 53 papers that could have led to ground-breaking drugs and tried to replicate them in house

Scientific findings confirmed in only 6 cases

So they asked the original scientists to help, occasionally in their own labs, but using a blinded methodology

12 of 54

RESULT

Even original authors could not replicate most of the work
Bayer scientists had published the results of a similar study in 2011, in which they replicated only 25% of studies

13 of 54

14 of 54

WHY IS THIS A BIG DEAL?

Consequences for patient treatments/prevention

Patient advocacy groups are discouraged and angry

Huge financial implications; money is wasted
Affects the public’s view of science

Political ramifications

Affects careers
Strikes at the heart of what science is all about – “truth” should be reproducible

15 of 54

2023 SUMMARY FROM REPRODUCIBILITY PROJECT WEBSITE

The Reproducibility Project: Cancer Biology was an 8-year effort to replicate experiments from high-impact cancer biology papers published between 2010 and 2012
The project was a collaboration between the Center for Open Science and Science Exchange
All papers and data freely available

16 of 54

From Reproducibility Project Website

18 of 54

STOP AND CONSIDER THESE FINDINGS

19 of 54

TWO IMPORTANT TAKEAWAYS

Idea of transparency; essential information was missing about procedures and data
Many results could not be reproduced, although some findings were reproducible in this study
What else do you take away from the findings?

20 of 54

SCIENTIFIC COMMUNITY RESPONSE

Some scientists think that it is reasonable to expect problems in reproducibility when cutting edge research is involved:

Academic scientists work at the edge of knowledge
We expect that many of their ideas will be wrong
This is part of science
Experimentation is always indirect
Biological systems are inherently variable

21 of 54

Nonetheless, we can still expect avoidable errors to be reduced and transparency to be improved – we can do better!

22 of 54

MORE ABOUT “TRANSPARENCY”

Important – if the details of a procedure are lost, it cannot be reproduced
If the raw (original) data are lost/undisclosed, the scientific community cannot properly evaluate a study
Transparency has everything to do with DOCUMENTATION, a huge topic that is the subject of another unit in this course

23 of 54

LECTURE OVERVIEW

Introduction; reproducibility in science, a contemporary hot topic
Possible causes of irreproducibility
Making our work reproducible – the focus of this course

24 of 54

CAUSES OF IRREPRODUCIBILITY COMMONLY TALKED ABOUT

Problems with using animal models

Mice are not small people with four legs

Problems with cell lines

“Studies should not be published using a single cell line or model”

Problems with antibodies
Poor use of statistics
Poor experimental design
Flawed culture in academic science

25 of 54

RODENT MODELS

Mice are affected by dozens of things that are difficult to control
For example, height of cage in room affects mice
Presence of male handlers

Even a man’s sweaty t-shirt in room affects mice

Bedding
Food
Etc.

26 of 54

RODENT MODELS

From Harris book:

“Imagine that I was testing a new drug to help control nausea in pregnancy and I suggested to the FDA that I tested it purely in 35 year old white women all in one small town in Wisconsin with identical husbands, identical homes, identical diets, which I formulate, identical thermostats that I’ve set, and identical IQs. And incidentally they all have the same grandfather.” That would be recognized as a terrible experiment, “but that’s exactly how we do mouse work. And fundamentally that’s why I think we have this enormous failure rate.” (Joseph Garner)

27 of 54

CELL LINES

Cultured cells are often used in research
Must not be mixed up; for example, if studying liver enzymes, don’t want to accidentally use cervical cancer cells
The most famous example of mixed-up cell lines is the HeLa story
But there are many other examples, and thousands of studies used cell lines that were not what the researchers thought they were
Almost impossible to clean up the literature
Fortunately, now it is possible to have cell lines authenticated -- but many scientists are still not doing this

28 of 54

ANTIBODIES

Antibodies are used in research to seek out and bind to targets, for example, cell receptors
But antibodies may bind to the wrong molecules; researchers may be unaware that an antibody is not binding to what they think it is
To some extent, good experimental design can help with this problem

29 of 54

MUCH WORK IS BEING DONE TO IMPROVE ANTIBODIES

30 of 54

COMMON PROBLEMS IN STUDY DESIGN

One study showed that only 17% of studies used blinded experimental design and random assignment of mice to groups
Blinding means researchers do not know which are experimental and which are control animals

Without blinding, researcher beliefs and attitudes can impact results

Random assignment of individuals to the control or experimental group is essential to help ensure that the two groups are not different to begin with
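
The two ideas above, random assignment and blinding, can be sketched in a few lines of Python. This is a minimal illustration, not a validated randomization procedure. The function name `assign_blinded`, the animal IDs, and the code format `S001` are all hypothetical. The sketch assumes equal-sized groups and a third party who holds the unblinding key.

```python
import random

def assign_blinded(animal_ids, groups=("control", "treatment"), seed=None):
    """Randomly assign animals to groups and issue blind codes.

    Returns (labels, key):
      labels: animal ID -> blind code (what the experimenter sees)
      key:    blind code -> group (held by a third party until unblinding)
    """
    rng = random.Random(seed)
    ids = list(animal_ids)
    rng.shuffle(ids)  # random order, so group membership is left to chance

    codes = [f"S{n:03d}" for n in range(1, len(ids) + 1)]
    rng.shuffle(codes)  # code numbers carry no hint about the group

    labels, key = {}, {}
    for i, (animal, code) in enumerate(zip(ids, codes)):
        labels[animal] = code
        key[code] = groups[i % len(groups)]  # deal shuffled animals round-robin
    return labels, key

# Hypothetical example: eight mice, two groups of four.
labels, key = assign_blinded([f"mouse-{n}" for n in range(1, 9)], seed=42)
```

The experimenter works only from `labels`. The `key` is consulted after the measurements are recorded.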

31 of 54

VARIOUS PROBLEMS IN COMMONLY USED METHODS OF STATISTICAL ANALYSIS

“Batch effect”: experimental and control samples differ subtly in conditions (e.g., they are run on different days when instrument performance differs), so reported differences may be unrelated to the parameter of interest
Common belief that an experiment needs to be repeated only three times before it is reported; this rule is not based on statistical analysis
P-hacking

Re-analyzing data until the results are significant
This is a practice that is often done but should not be
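
A short calculation suggests why re-analysis is risky. The sketch below assumes there is no real effect, that each analysis is independent, and that each uses a significance threshold of 0.05. Real re-analyses of the same data are not independent, so treat the numbers as a rough illustration only. The function name is hypothetical.

```python
def chance_of_false_positive(num_analyses, alpha=0.05):
    """Probability that at least one of several independent analyses
    comes out 'significant' when there is no real effect."""
    return 1 - (1 - alpha) ** num_analyses

for k in (1, 3, 5, 10, 20):
    print(f"{k:2d} analyses: {chance_of_false_positive(k):.2f}")
```

Under these assumptions, ten independent analyses give about a 40% chance of at least one spurious "significant" result.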

32 of 54

“HARKING”

HARKing (Hypothesizing After the Results are Known) means creating a hypothesis after the results are known

Barn analogy:

Suppose someone shoots at a barn for a while and then draws the target around the holes in the barn
Person will look skilled, but is not

A hypothesis must always be generated before a study is performed

33 of 54

TO AVOID “HARKING”

FDA now requires that scientists running clinical trials register their hypotheses before beginning the trials

Robert Kaplan and Veronica Irvin reviewed major studies of drugs and dietary supplements supported by the National Heart, Lung, and Blood Institute between 1970 and 2012. Before the law, 57% showed efficacy; afterwards, only 8% did.

34 of 54

FLAWED CULTURE

Science career reward system

Competition
Pressure to publish
Incentive to achieve new and exciting results
No incentive to publish negative results -- but they should be reported

Retractions for honest mistakes are often viewed as evidence of fraud, so scientists are reluctant to report errors
True fraud is often not corrected

35 of 54

SUGGESTIONS FOR IMPROVEMENT

Fall 2015 conference at Stanford:

Get individual scientists to change their ways
Get journals to change incentives and practices, publish negative results, publish retractions easily, include statistical review for papers using statistics

Use online methods to evaluate work, such as open pre-publication
Require more transparency in publications

36 of 54

RECOMMENDATIONS FROM BEGLEY AND ELLIS

More opportunities to present negative data

Preclinical studies must be required to report all findings
Funding agencies, reviewers and journal editors must agree that negative data is just as informative as positive
Journal editors must play an active role

Greater dialogue between physicians, scientists, patient advocates and patients
Get universities on board

More credit for teaching and mentoring
Reward quality vs quantity
Rely on more than publications in top-tier journals as benchmark

They made specific recommendations for cancer research tools

37 of 54

MORE ABOUT TRANSPARENCY

Journals have already responded by demanding much more detail in research reports, which leads to more transparency
Not unusual to lose documentation when a grad student or post-doc leaves the lab; systems must be created to save all raw data
Transparency is related to documentation and traceability, two ideas very familiar to those working in quality systems

We will talk about documentation and traceability more in later discussions

38 of 54

RESPONSE FROM FUNDING AGENCY AND SOCIETIES

NIH Training Modules to enhance Data Reproducibility

Experimental Design
Laboratory Practices
Analysis and Reporting
Culture of Science

Society for Neuroscience

Training Modules to Enhance Data Reproducibility

41 of 54

LECTURE OVERVIEW

Introduction; reproducibility in science, a contemporary hot topic
Possible causes of irreproducibility
Making our work reproducible – the focus of this course

42 of 54

ENHANCING REPRODUCIBILITY REQUIRES REDUCING VARIABILITY

Enhancing reproducibility is closely related to reducing variability
Reducing variability is understood to be fundamental in all production settings including

Biomanufacturing
Medical devices

Formal quality systems in any production environment aim to control/understand/reduce variability

43 of 54

UNDERSTANDING VARIABILITY

A chart is sometimes used to monitor variability in a process
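
One common form of such a chart is a control chart, which compares each new measurement to limits calculated from earlier, in-control data. The sketch below assumes limits at the mean plus or minus three standard deviations, a common convention but not the only one. The readings are hypothetical values invented for illustration, and the function names are not from any particular quality system.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from baseline measurements."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - k * spread, center, center + k * spread

def out_of_control(values, lower, upper):
    """List (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if not lower <= v <= upper]

# Hypothetical pH readings of a routinely prepared buffer.
baseline = [8.01, 7.98, 8.03, 7.99, 8.00, 8.02, 7.97, 8.00]
lower, center, upper = control_limits(baseline)

new_readings = [8.01, 7.96, 8.40, 8.02]
flagged = out_of_control(new_readings, lower, upper)
```

Here the third new reading falls outside the limits and would prompt an investigation before the buffer is used.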

44 of 54

VARIABILITY AND IRREPRODUCIBILITY

Just as you must truly understand and control variability throughout manufacturing, so you must control it in the lab

Otherwise, cannot achieve reproducibility

45 of 54

VARIABILITY IN THE LAB

“Small changes in experimental design, such as buffer conditions, pH, slight differences in cell line, reagents used in studies, cell culture changes and even differences in tubing and labware suppliers could change the outcome of experiments”

Quote from Biopharm article, Jan 19, 2017: “Reproducibility Project only Partially Able to Validate Findings of Prominent Cancer Studies”

46 of 54

TO REDUCE VARIABILITY, BEGIN AT THE BEGINNING

Beginning with:

Making reagents
Ordering supplies
Running routine (and nonroutine) assays
Making measurements

Scientists often miss basic causes of variability

Assume reagents are made properly and consistently
Assume instruments are properly calibrated
Forget to document everything that might be required in future

47 of 54

CONSIDER THESE DATA

Data obtained when six teams, all with at least a BS degree in a biological science, prepared 1 M Tris buffer, pH 8.0

Conductivity is used to assess a buffer solution

How might the variability in these solutions affect future results?

48 of 54

The range of pH for Tris buffer made by this group of students, all with at least a BS degree in a biological science, is 7.16 to 8.4.

How might the variability in these solutions affect future results?
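
To get a feel for the size of this range, recall that pH is a logarithmic scale (pH is the negative base-10 logarithm of hydrogen ion activity), so each pH unit corresponds to a tenfold change. The sketch below is a worked calculation using the two reported extremes; the function name is hypothetical.

```python
def hydrogen_ion_fold_difference(ph_a, ph_b):
    """Fold difference in hydrogen ion activity between two pH values."""
    return 10 ** abs(ph_a - ph_b)

spread = 8.4 - 7.16  # width of the reported range, in pH units
fold = hydrogen_ion_fold_difference(7.16, 8.4)
print(f"Range: {spread:.2f} pH units, about {fold:.0f}-fold in hydrogen ion activity")
```

A range of 1.24 pH units corresponds to roughly a 17-fold difference in hydrogen ion activity between the most acidic and most basic preparations.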

49 of 54

REDUCING VARIABILITY REQUIRES ATTENTION AND TRAINING

Reducing variability requires:

Methods of assessing variability, for example, checking conductivity of buffer solutions
Training: simply possessing a degree in biological science is not sufficient
Attention: consistency must always be a goal

50 of 54

INSTRUCTORS’ EXPERIENCE WITH pH

Two instructors spent a day and a half playing with pH
Consistency achieved was ± 0.14 pH units on the same buffer solution on two different days, using five pH meter systems

51 of 54

INSTRUCTORS FOUND THAT:

Many factors affected pH measurements
3-point vs 2-point calibration modes:

With our meter model, 2-point calibration mode was more accurate than 3-point mode

Difficult to detect faulty electrode

We checked efficiency of each electrode in the lab by plotting mV vs pH readings
Found one faulty electrode – would have been difficult to detect without checking efficiency

Obtained most consistent results by calibrating every two hours
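
The electrode check described above can be sketched as follows. The code fits a straight line to mV readings taken in calibration buffers and compares the slope to the theoretical Nernst slope of about -59.16 mV per pH unit at 25 °C. The readings are hypothetical, and the function names are invented for illustration. Acceptable efficiency ranges vary by manufacturer, so consult the electrode documentation rather than relying on a fixed cutoff.

```python
IDEAL_SLOPE_MV_PER_PH = -59.16  # theoretical Nernst slope at 25 °C

def electrode_slope(ph_values, mv_values):
    """Least-squares slope of mV versus pH."""
    n = len(ph_values)
    mean_ph = sum(ph_values) / n
    mean_mv = sum(mv_values) / n
    sxy = sum((p - mean_ph) * (m - mean_mv) for p, m in zip(ph_values, mv_values))
    sxx = sum((p - mean_ph) ** 2 for p in ph_values)
    return sxy / sxx

def electrode_efficiency(ph_values, mv_values):
    """Measured slope as a percentage of the ideal slope."""
    return 100 * electrode_slope(ph_values, mv_values) / IDEAL_SLOPE_MV_PER_PH

buffers = [4.0, 7.0, 10.0]          # standard calibration buffers
healthy = [174.0, 0.0, -174.0]      # hypothetical readings, in mV
suspect = [150.0, 0.0, -150.0]      # hypothetical readings, in mV

print(f"healthy: {electrode_efficiency(buffers, healthy):.1f}%")
print(f"suspect: {electrode_efficiency(buffers, suspect):.1f}%")
```

In this example the suspect electrode responds with a noticeably flatter slope than the healthy one, which is the kind of fault that is hard to see from a single pH reading.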

52 of 54

BOTTOM LINE REGARDING pH MEASUREMENTS

It is difficult to reduce variability in pH measurements if you do not pay attention to numerous factors
If pH measurements are inaccurate, how can you expect solutions to be consistent?

53 of 54

TAKEAWAYS FOR THIS COURSE

We will spend a lot of time on:

How to document work in the lab
How to make accurate and consistent measurements
How to prepare biological reagents

Consistently, properly
With attention to monitoring the quality of those reagents

How to perform assays, that is, tests of samples

Consistently, properly
With attention to monitoring the quality of those results

We will always strive to understand/control/reduce variability in our work

54 of 54

TO DELVE DEEPER INTO THE TOPICS IN THIS LECTURE

Chapter 5 in Basic Laboratory Methods for Biotechnology: Textbook and Laboratory Reference, 3rd Edition has more information about reproducibility in labs, including case studies.
