1 of 15

Preprint discoverability

coverage, indexing and implications for engagement

ASAPbio Community Call, 2022-08-25

Bianca Kramer, Sesame Open Science

@MsPhelps with input from Jeroen Bosman (Utrecht University Library)

2 of 15

COVID-19 preprint diversity

3 of 15

Preprint data from Fraser, Nicholas; Kramer, Bianca (2020): covid19_preprints. figshare. Software. https://doi.org/10.6084/m9.figshare.12033672.v40; news data from Nexis Uni, checked 2020-12-16 by Jeroen Bosman

Yet, in newspapers …

December 2020

4 of 15

Not all preprints are seen equally

“All preprints are freely available, but there are persistent disparities in the visibility and attention paid to preprints according to the authors’ institutions, geographical area, language and other backgrounds” (text ASAPbio Community Call)

Effects of visibility / discoverability of different preprint servers?

5 of 15

Diversity …

6 of 15

Preprint archives - types

disciplinary

linked to publisher

regional

with varying governance and ownership: non-profit / community-based / commercial

7 of 15

Preprint archives - characteristics

  • Persistent identifiers
    • DOI (Crossref, DataCite, other registrars)
    • Handle
    • OAI-PMH
    • none
  • Platform
    • Open Science Framework
    • Open Preprint Systems (PKP)
    • Janeway
    • publisher-specific

consequences for the availability, types, and indexing of preprint metadata
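
As one illustration of how these characteristics affect indexing: an archive that exposes OAI-PMH can be harvested with plain HTTP requests. The sketch below only builds the request URL and sends nothing. The base URL is a hypothetical placeholder, the function name is invented for this example, and `oai_dc` is the Dublin Core format that the OAI-PMH protocol requires every repository to support.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc",
                         from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it may only be combined with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Selective harvesting: only records changed since this date.
            params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical archive endpoint, used only to show the URL shape.
url = oai_list_records_url("https://preprints.example.org/oai",
                           from_date="2022-01-01")
print(url)
```

An archive without such an endpoint (and without DOIs registered at Crossref or DataCite) leaves a search engine with little more than web crawling to find its preprints.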

8 of 15

Preprint archives - dynamics

  • Changes in platform / publisher
    • EarthArXiv (OSF → California Digital Library, Janeway)
    • INA-Rxiv (OSF) → RINarxiv (Indonesian government)
    • IndiaRxiv (OSF → PKP)
    • ChemRxiv (Figshare → Cambridge Open Engage)
  • Multiple platforms / publishers
    • AfricArXiv

9 of 15

10 of 15

Publisher capture of preprint archives/sharing

  • ACS: ChemRxiv (first at Figshare, now at Cambridge Open Engage)
  • AGU: ESSOAr
  • Copernicus (PPPR publishing)
  • CUP: Cambridge Open Engage
  • Elsevier: SSRN
  • IEEE: TechRxiv
  • MDPI: Preprints.org
  • RSC: ChemRxiv (first at Figshare, now at Cambridge Open Engage)
  • Sage: Sage Advance
  • Springer Nature: Research Square (but RS “publisher neutral”)
  • Taylor & Francis: F1000Research (PPPR publishing)
  • Wiley: Authorea

11 of 15

Preprint coverage & retrieval in search engines

Full live version: https://tinyurl.com/searchenginecomparison (tab G)

12 of 15

Preprint coverage & retrieval in search engines - summary (with lots of caveats!)

Full live version: https://tinyurl.com/searchenginecomparison (tab G)

                  Preprint archives     Archives with        Total preprints
                  present               >80% coverage        indexed
Search engine     Dec 2020  Aug 2022    Dec 2020  Aug 2022   Aug 2022
Dimensions              48        51          40        40   2,552,692
Europe PMC              12        12           9         9     478,134
Google Scholar          38        47           4         6   1,943,282
LENS                    50        58          40        41     724,662
OpenAIRE                30        38           5        17   3,351,569
OSF                     32        32          25        25   2,362,360
ScienceOpen             18        19           2         2   2,233,307
All 7 together          62        69          47        47
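
To make the ">80% coverage" columns concrete, here is a minimal sketch of how such counts could be derived. It assumes coverage means the number of an archive's preprints retrievable in a search engine divided by the number the archive itself reports; the talk does not spell out the exact method, and the archive names and counts below are invented for illustration, not taken from the comparison spreadsheet.

```python
def coverage(found_in_engine, total_in_archive):
    """Share of an archive's preprints retrievable in a search engine."""
    if total_in_archive <= 0:
        return 0.0
    # Assumption: cap at 1.0, since duplicates or versions can inflate hits.
    return min(found_in_engine / total_in_archive, 1.0)


def summarize(engine_counts, archive_totals, threshold=0.8):
    """Return (archives present, archives with coverage above threshold)."""
    present = [name for name, n in engine_counts.items()
               if n > 0 and name in archive_totals]
    well_covered = [name for name in present
                    if coverage(engine_counts[name], archive_totals[name]) > threshold]
    return len(present), len(well_covered)


# Invented example data: preprints per archive, and hits in one search engine.
archive_totals = {"archive_a": 1000, "archive_b": 500, "archive_c": 200}
engine_counts = {"archive_a": 950, "archive_b": 100, "archive_c": 0}

present, well_covered = summarize(engine_counts, archive_totals)
print(present, well_covered)
```

The gap between the two numbers is the point of the summary: an archive can be "present" in a search engine while most of its preprints remain unfindable there.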

13 of 15

Discovery depends on

  • Preprint archives:
    • Availability of DOIs and standardized, machine-readable metadata
    • Completeness of metadata (e.g. abstracts)
  • Search engines / bibliographic databases:
    • Broad inclusion of preprint archives
    • Processing of metadata (esp. publication type, source, publisher)
    • Availability of search fields / filters
  • Researchers, science journalists:
    • Awareness and choice of search engines
    • Optimal search methods
    • Awareness of preprint landscape

information literacy!
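
On the archive side, metadata completeness can be checked with very little code. The sketch below assumes records are available as simple dictionaries; the field names (`doi`, `abstract`, `type`) and the sample records are hypothetical and do not follow any particular archive's schema.

```python
def completeness(records, fields=("doi", "abstract", "type")):
    """Fraction of records with a non-empty value for each field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if str(r.get(field) or "").strip()) / total
        for field in fields
    }


# Invented sample records; 10.9999 is a placeholder DOI prefix.
records = [
    {"doi": "10.9999/example.1", "abstract": "Some text", "type": "preprint"},
    {"doi": "10.9999/example.2", "abstract": "", "type": "preprint"},
    {"doi": None, "abstract": "Other text", "type": "preprint"},
    {"doi": "10.9999/example.4", "type": "preprint"},
]

report = completeness(records)
print(report)
```

Low scores for fields such as abstract or publication type are the kind of gap that makes preprints harder to filter and retrieve downstream.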

14 of 15

Potential biases

  • Indications of discoverability biases by discipline, language, country, size and type (platform, ownership)? Work in progress
  • Awareness depends not only on discovery but also on integration in workflows, branding and visibility:
    • easier for larger archives to integrate their services with those of publishers
    • easier for publisher-owned preprint archives to get visibility
  • Devil’s advocate: diversity vs. consolidation?
