1 of 35

Preprinting a pandemic: trends, dissemination, and regulation of COVID-19 preprints

Jonny A Coates

Voisin lab

William Harvey Research Institute

(Resigned yesterday)

Institute for Globally Distributed Open Research and Education (IGDORE)

Jonny.coates@igdore.org

@JACoates

2 of 35

Traditional publishing is slow, requires more data than ever and hinders ECRs

Sekara et al. PNAS 2018 https://doi.org/10.1073/pnas.1800471115

We need to improve speed of knowledge distribution and our means of quality assessment

Preprints are manuscripts shared online before the completion of journal-organized peer review.

3 of 35

Preprints

Preprints are manuscripts shared online before the completion of journal-organized peer review.

Permanent

Versioned

Citable

Fraser et al, 2019 10.1101/673665v1

Sources: ASAPbio; Puebla et al, in prep.

  • 2020 survey: 42% posted, 81% read

Not peer reviewed ≠ poor quality

Peer reviewed ≠ good quality

4 of 35

Months - years

~2 days

Months - years

5 of 35

Preprints were almost poised for a pandemic…

In contrast to the slow, laborious traditional publishing methods

Are preprints being used more than normal to communicate COVID-19 science?

What usage are preprint servers experiencing?

What do COVID-19 preprints look like?

How are preprints being shared?

Can we comment on the quality of preprints?

6 of 35

June 2019: medRxiv launched

Dec 2019: First cases

2020: Chaos

now

7 of 35

Are preprints being used more than normal to communicate COVID-19 science?

Scientific community has rapidly responded to the pandemic

  • 30,260 COVID-19 preprints

  • = 25% of all COVID-19 research

  • 10,232 posted to medRxiv + bioRxiv

  • = 23% of all preprints on medRxiv + bioRxiv
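As a rough sanity check on the scale behind these bullets, the denominators implied by each count and its percentage can be backed out. This is only a sketch: the percentages on the slide are rounded, the helper name `implied_total` is made up for illustration, and the time window behind each figure is not stated here.

```python
# Figures quoted on the slide.
covid_preprints = 30_260          # COVID-19 preprints across all servers
share_of_covid_research = 0.25    # "= 25% of all COVID-19 research"
covid_on_rxiv = 10_232            # posted to medRxiv + bioRxiv
share_of_rxiv = 0.23              # "= 23% of all preprints on medRxiv + bioRxiv"


def implied_total(part: int, share: float) -> int:
    """Back out the denominator implied by a count and its (rounded) share."""
    return round(part / share)


# Approximate totals only, because the shares are rounded percentages.
total_covid_outputs = implied_total(covid_preprints, share_of_covid_research)
total_rxiv_preprints = implied_total(covid_on_rxiv, share_of_rxiv)

print(f"Implied COVID-19 research outputs: ~{total_covid_outputs:,}")
print(f"Implied medRxiv + bioRxiv preprints: ~{total_rxiv_preprints:,}")
```

On these rounded inputs, the slide implies on the order of 120,000 COVID-19 research outputs overall and roughly 44,000 preprints on medRxiv + bioRxiv in the period counted.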

8 of 35

COVID-19 preprints were published at accelerated rates

  • More likely to be published (21.1% vs 15.4%) (Chi-square test, p < 0.001)

  • Published more quickly (median publishing time, 68 days vs 116 days) (Mann-Whitney test, p < 0.001)
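For readers unfamiliar with the first test, the sketch below shows how a chi-square test compares two publication proportions. The counts are hypothetical: they are chosen only so that the proportions match the 21.1% and 15.4% on the slide, and the real sample sizes are not given here. The function uses the plain Pearson statistic for a 2x2 table (no continuity correction), which may differ from what the authors ran.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test for a 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the chi-square
    survival function is erfc(sqrt(x / 2)), so only the stdlib is needed.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# HYPOTHETICAL counts (1,000 preprints per group), not the study's data.
covid_published, covid_unpublished = 211, 789   # 21.1% published
other_published, other_unpublished = 154, 846   # 15.4% published

chi2, p = chi_square_2x2(
    covid_published, covid_unpublished, other_published, other_unpublished
)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

With larger groups the same difference in proportions yields a smaller p-value, which is why the sample sizes matter when reading the result.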

9 of 35

  • How much more quickly depends on publisher!

  • Greatest estimated difference for AAAS (Science), = 102 days

(two-way ANOVA, preprint type × publisher interaction, F(9, 5273) = 6.6, p < 0.001)

10 of 35

The scientific response to the pandemic was rapid – within 1 month of first case

Preprints represent a significant proportion of the COVID-19 literature

Especially early on…

Summary I

11 of 35

COVID-19 preprints are being accessed and downloaded at unprecedented levels

  • Early COVID-19 preprints were viewed 18.2 times and downloaded 27.1 times more than non-COVID-19 preprints (rate ratios from time-adjusted negative binomial GLMs, p < 0.001)

  • Did early COVID-19 preprints slowly accumulate usage? Or have less competition? Or COVID fatigue?

12 of 35

How are COVID-19 preprints being shared?

COVID-19 preprints are cited, tweeted and covered by news organisations

13 of 35

Preprints represent a significant proportion of the COVID-19 literature

Labs are posting preprints for the first time directly as a response to the pandemic

COVID-19 preprints are being accessed, downloaded and shared at unprecedented levels

Summary II

14 of 35

  • Top hashtags associated with the 100 most tweeted preprints show audiences beyond scientific ones – including conspiracy theory and nationalist ideologies

But there is a danger to all this sharing as science is hijacked by right-wing media and conspiracy groups

But! Not just preprints: this is also seen in published, peer-reviewed articles.

Which is more dangerous?

15 of 35

Can we trust preprints?

16 of 35

~100 articles

Jan – April 2020 (initial phase of the pandemic)

~100 COVID & 100 Non-COVID article pairs (total of 400 manuscripts)

17 of 35

COVID-19 preprints show little change in figure content upon publication

Over 70% of preprints (COVID or non-COVID) have only figure rearrangements or no changes upon publication

18 of 35

Over 85% of COVID-19 (>94% of non-COVID-19) abstracts have no significant changes upon publication

15% of COVID-19 (6% of non-COVID-19) abstracts undergo a discrete change in key conclusions

19 of 35

  • Variation is difficult to capture!

  • Following study of reporting standards¹, examine change in preprint → publication

  • Large scale (>30,000) NLP analysis of bioRxiv corpus¹

  • Separately investigated our subset

  • Couldn’t readily separate our data from the rest of the corpus – suggesting our dataset is applicable to the wider preprint literature

20 of 35

1. What are the rates of preprint posting?

100 times the rate of past epidemics; ¼ of COVID-19 articles

2. Who is posting COVID-19 preprints?

Authors from the UK, US and China who are new to preprinting

3. How are COVID-19 preprints accessed and shared?

18 times more views and 27 times more downloads than non-COVID preprints

4. Are COVID-19 preprints of sufficient “quality”?

Quality not detectably different to non-COVID; >85% have no significant changes to key conclusions upon publication

sharing work/code on Twitter & posting preprints can lead to collabs!

Preprints have experienced a cultural shift during COVID-19

21 of 35

From bottom left clockwise:

Dr Nicholas Fraser, Leibniz Information Centre for Economics

Dr Liam Brierley, University of Liverpool

Dr Máté Pálfy, Company of Biologists

Dr Gautam Dey, EMBL

Dr Federico Nanni, Alan Turing Institute

Dr Jessica Polka, ASAPbio

Thanks, Grazie, Gracias, danke, Kiitos, Kea leboga, Merci, ob-ree-gah-doh, Asante

22 of 35

Why preprint?

Gives you more visibility & more citations

Fraser et al, 2019 10.1101/673665v1

Steve Royle, https://quantixed.org/2020/03/30/screenager-screening-times-at-biorxiv/

Can use altmetrics in cover letter to journals to show impact

Screening takes ~1 day

Makes academia more equitable and supports ECRs

https://ecrlife.org/why-you-should-publish-your-work-as-a-preprint-a-conversation-with-dr-prachee-avasthi/

Prachee Avasthi

23 of 35

  • COVID-19 rate: 39.5 preprints/day
  • Ebola, Zika rates: < 0.3 preprints/day

This significant use of preprints is unique to the COVID-19 pandemic
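The gap between these posting rates can be made concrete with a one-line ratio. Because the slide gives the Ebola and Zika rates only as an upper bound ("< 0.3"), the result below is a lower bound on the true ratio, not an exact figure; the variable names are illustrative.

```python
# Posting rates quoted on the slide (preprints per day).
covid_rate = 39.5
past_epidemic_rate_upper_bound = 0.3   # Ebola and Zika were *below* this

# Dividing by an upper bound gives a lower bound on the ratio.
min_rate_ratio = covid_rate / past_epidemic_rate_upper_bound

print(f"COVID-19 preprints were posted at least {min_rate_ratio:.0f}x faster")
```

This is consistent with describing the COVID-19 posting rate as more than 100 times that of past epidemics.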

24 of 35

So who is publishing all these preprints?

  • Most COVID-19 corresponding authors from US, UK or Chinese institutions
  • First preprints posted close to first cases (Spearman’s rank, rho = 0.54, p < 0.001)

25 of 35

Labs shifted expertise to best help with pandemic research

26 of 35

Usage metric: rate ratio

  • Blogs: 3.7
  • Wikipedia articles: 4.5
  • Tweets: 7.6
  • Comments: 11.0
  • Citations: 13.7
  • News articles: 92.8

(all correlations Spearman’s rank rho estimates)

COVID-19 preprints are being widely shared across multiple platforms

27 of 35

Scientific messages are getting through but significant “hijacking” by right-wing conspiracy groups – and this can be linked to specific preprints

Aerosol and surface stability of HCoV-19 (SARS-CoV-2) compared to SARS-CoV-1

COVID-19 Antibody Seroprevalence in Santa Clara County, California

NB: These data were sampled earlier than those on the previous slide

28 of 35

What do COVID-19 preprints look like?

COVID-19 preprints are shorter than non-COVID-19 preprints

29 of 35

  • More authors posting preprint for first time among COVID-19 authors (Bonferroni-adjusted Chi-square tests, *** p < 0.001, ** p < 0.01, * p < 0.05)

Dark bar = previously posted preprints

Light bar = first time posting preprints

30 of 35

COVID-19 articles have less data availability and less transparency in peer-review

31 of 35

COVID-19 preprints are published more often and in a wide array of journals

32 of 35

The degree of change is not associated with any specific location or type of change

33 of 35

Major conclusion changes do not associate with a longer time to publication

34 of 35

The degree of change does not appear to be impacted by final published journal

35 of 35

Impact

Altered processes for revisions of articles

Cited in policy documents

Recommended by expert faculty