1 of 111

2 of 111

The Metascience Lab

Day 1

https://researchonresearch.org/project/a-f-i-r-e/

3 of 111

Welcome From Tom Stafford

Professor of Cognitive Science

& University Research Practice Lead

University of Sheffield

https://tomstafford.github.io/

Senior Research Fellow,

Research on Research Institute

https://researchonresearch.org/

4 of 111

Metascience Lab @ MS2025

- in partnership with Open Philanthropy and RoRI’s AFiRE programme

- three linked sessions will facilitate matchmaking and networking for experimentation

- all areas of metascience, with a focus on interventions to support higher quality, lower cost and more impactful research.

- Each session will showcase metascience principles, methods or examples of experimentation, as well as providing a platform for co-developing new project ideas by participants. Researchers, funders, universities, publishers and other actors in the research ecosystem are invited to propose experiments and matchmake with potential collaborators.

- The Abundance and Growth Fund at Open Philanthropy is happy to consider proposals that emerge from this process

- Topics you’d like considered? Please get in touch

5 of 111

Three days, three themes, three formats

Why and How to experiment

Funder experiments

Building institutional capacity

6 of 111

Welcome From Matt Clancy

Senior Program Officer, Abundance & Growth, Open Philanthropy

https://www.openphilanthropy.org/about/team/matt-clancy/

Senior fellow, Institute for Progress

7 of 111

Today’s plan (DAY ONE)

1400 Chair’s introduction

1405 Matt Clancy, Open Philanthropy: Why metascience needs new experiments

1410 Fiona Booth: “T0255: What is the appropriate ethics and governance framework for meta-research?”

1420 Albert Bravo-Biosca. “T0485 Exploring the use of randomised experiments in metascience”

1430 Response from Misha Teplitskiy

1440 Activity: matchmaking (facilitators: George Richardson, Amanda Kvarven, Youyou Wu, Albert Bravo-Biosca)

1510 Plenary: new idea pitches and challenge suggestions

8 of 111

What is the appropriate ethics and governance framework for meta-research?

Fiona Booth (University of Bristol)

Neil Jacobs (UK Reproducibility Network)

Marcus Munafò (University of Bath)

Pen-Yuan Hsing (University of Bristol)

9 of 111

Local Community

Wider society

Anticipated harm

Unanticipated harm

Kaupapa Māori Research^1,2

Who defined the research problem?
For whom is the study worthy and relevant?
Who says so?
What knowledge will the community gain from this study?
What are some likely positive outcomes from this study?
What are some possible negative outcomes?
How can the negative outcomes be eliminated?
To whom is the researcher accountable?
What processes are in place to support the research, the researched and the researcher?

10 of 111

Wambua et al. 2024²

11 of 111

“….our story also illustrates the harm and challenges that can occur when researchers do not prioritize developing a foundation of trust with their participants.”

Joel Wambua (Busara), Anisha Singh (London School of Economics and Political Science), Kelvin Kihindas (Common Goal Research Center), Irene Gachungi (DIME, The World Bank) and Patrick S. Forscher (Busara) (2024). Manage relationships when starting and ending research with human participants. In P.S. Forscher & M. Schmidt (eds), A better how: notes on developmental meta-research (pp 161-166). Busara. DOI: doi.org/10.62372/ISCI6112

12 of 111

Trialling narrative CVs

Use of LLMs to screen conference abstracts versus manual review

Non-native speakers may be disadvantaged
Increased workload for applicants & reviewers
Bias against those without institutional support to adapt to a different style of CV³

Potential for bias
Transparency of decision-making
Risk that authors write in styles adapted to LLMs rather than styles which are optimised for humans

CONSENT TO PARTICIPATE, FREEDOM TO WITHDRAW WITHOUT PENALTY

Unintended consequences

13 of 111

Safeguards for Meta-Research

Foundation of Trust

Consent (and withdrawal of it)

Cautious interpretation

“For whom is the study worthy & relevant?”

“Who says so?”

“To whom is the researcher accountable?”

14 of 111

References

1. Guidelines for Researchers on Health Research Involving Māori 2010 Version 2, Health Research Council of New Zealand, ISBN 978-9-908700-86-5
2. Walker, S., Eketone, A., & Gibbs, A. (2006). An exploration of kaupapa Maori research, its principles, processes and applications. International Journal of Social Research Methodology, 9(4), 331–344. https://doi.org/10.1080/13645570600916049
3. Joel Wambua (Busara), Anisha Singh (London School of Economics and Political Science), Kelvin Kihindas (Common Goal Research Center), Irene Gachungi (DIME, The World Bank) and Patrick S. Forscher (Busara) (2024). Manage relationships when starting and ending research with human participants. In P.S. Forscher & M. Schmidt (eds), A better how: notes on developmental meta-research (pp 161-166). Busara. DOI: doi.org/10.62372/ISCI6112

15 of 111

16 of 111

Matchmaking activity

1. Are you a problem owner or a researcher?

2. Take three post-its

Problem owners: Yellow

Researcher: Green

3. Ask yourself this question: “if you are a problem owner, what is the most urgent problem you need a solution to? if you are a researcher, what metascientific area are you most excited about researching?”

4. Now write a two-word phrase on a post-it along with your name, replicating the same thing on three other post-its

5. Form a pair, ideally with someone with a different colour post-it

6. Explain your post-it note in a short pitch, then hand it to the other person in your pair

17 of 111

Suggestions ->

1 minute on this activity

Volunteer to pitch!

18 of 111

Peer review workbench

19 of 111

Plenary

Where are the biggest gains in improving efficiency or effectiveness in the research system?

What is something you learnt about someone else’s problem or perspective?

Topics for tomorrow?

Got an idea? https://forms.gle/E7hBDqwnbtHW9Bki9

20 of 111

Art of Funding @ MS2025

Come join a small group of funders to discuss the “Art of Funding”. Topics may include advancing new ideas within your organization, overcoming bottlenecks, efficiencies, and logistics of making and monitoring awards.

Bring your questions and ideas to share. Drinks will be served.

Please RSVP https://forms.gle/aHhpVWWc2gs7Fpux8

Discussions will continue afterwards at a restaurant of your choice.

21 of 111

Desk Rejection EoI

Funders! We are interested in speaking to research funders who use, or are considering using, quality-review based desk rejection

https://forms.gle/fWL4sa2ZdkU9tEwt8

22 of 111

TOMORROW: Funder experiments

23 of 111

2

24 of 111

Suggestions ->

The Metascience Lab

Day 2

https://researchonresearch.org/project/a-f-i-r-e/

27 of 111

Three days, three themes, three formats

Suggestions

Why and How to experiment

Funder experiments

Building institutional capacity

28 of 111

What is an experiment?

PURITY

PLURALISM

RCTs

Planned

Principled

Public

More: https://researchonresearch.org/project/a-f-i-r-e/

29 of 111

Theodore Hodapp

Program Director, Science

Gordon and Betty Moore Foundation

co-chair

AFIRE Programme

Research on Research Institute

researchonresearch.org/project/a-f-i-r-e/

30 of 111

Today’s plan (DAY TWO)

1400 Chair’s introduction

1405 Ted Hodapp, Gordon and Betty Moore Foundation: What a research funder wants

1410 Stephen Pinfield “T0362 Evaluating Distributed Peer Review at the Volkswagen Foundation”

1420 Rhys Thomas and Adrian Barnett. “T0408 Did the switch to using partial randomisation at The British Academy change the characteristics of applicants?”

1430 Eric Brewe ”T0388 Evaluating scientific impact: A control group study at the Gordon and Betty Moore Foundation”

1440 Activity : topic discussions (facilitators: George Richardson, Amanda Kvarven, Youyou Wu, Albert Bravo-Biosca)

1510 Plenary: new idea pitches and challenge suggestions

31 of 111

Evaluating Distributed Peer Review at the Volkswagen Foundation

Anna Butters, Melanie Benson Marshall, Tom Stafford & Stephen Pinfield (Research on Research Institute and University of Sheffield)

Hanna Denecke, Alexander Bondarenko, Barbara Neubauer, Robert Nuske & Pierre Schwidlinski (Volkswagen Foundation)

32 of 111

Distributed Peer Review (DPR)

Potential (being tested)
Builds on accepted mechanism: peer review
Solves reviewer recruitment
Incentivises timely submission by reviewers
Aligns reviewer understanding of call criteria
Trains participants in grant reviewing (and by extension grant writing)
Provides more feedback to applicants
Diversified and democratised grant review
Scalable: more applicants, more reviewers
Accelerated process – time saving
Cost savings

Concerns (being tested)
Lack of expertise
Bias
Gaming the system
Scooping
Time commitment for applicants
Confidence of applicants

Applicants review other applications submitted for the same funding call
Has been used at the European Southern Observatory (ESO), Netherlands Research Council (NWO) and more recently by UK Research and Innovation (UKRI)

33 of 111

DPR Experiment at the Volkswagen Foundation

Experiment at the Volkswagen Foundation for the “Open Up” programme – focus on innovation in the Humanities and Social Sciences
Parallel implementation of DPR and established panel review
Additional funding provided: funding recommendations from both panel review and DPR
Mixed methods analysis of results: quantitative analysis of data from submissions and surveys of participants, and qualitative analysis of interviews with a sample of participants
Rich datasets to gain insight into dynamics of grant peer review e.g.:
Comparisons between review processes
Reviewer uncertainty
Consistency between reviewers
Stability of funding decision
Attitudes of actors

34 of 111

Distributed Peer Review and Panel Review - Parallel Processes

140 proposals submitted

Panel Review: Internal Shortlisting (70 shortlisted) -> Quick Assessment (45 with 1+ A-, A, A+) -> Panel discussion (42 discussed) -> 11 proposals recommended for funding

DPR: Proposal Matching (323 reviewers) -> Peer Review (1387 reviews) -> Proposal ranking (Trimmed mean method) -> 10 proposals recommended for funding

18 proposals funded

3 recommended by both processes

60% overlap

47% overlap
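
The slide names a trimmed mean as the DPR ranking method but does not state the trimming rule. A minimal sketch, assuming the single lowest and single highest score are dropped per proposal before averaging; the proposal IDs and scores are hypothetical, not data from the study:

```python
def trimmed_mean(scores, trim=1):
    """Mean of scores after dropping the `trim` lowest and `trim` highest values."""
    if len(scores) <= 2 * trim:
        raise ValueError("need more scores than the number trimmed")
    kept = sorted(scores)[trim:len(scores) - trim]
    return sum(kept) / len(kept)


# Hypothetical review scores (higher is better); not data from the study.
reviews = {
    "P1": [6, 5, 6, 1, 6],   # one outlying low score
    "P2": [5, 5, 4, 5, 5],
    "P3": [3, 6, 2, 3, 3],   # one outlying high score
}

# Rank proposals from highest to lowest trimmed mean.
ranking = sorted(reviews, key=lambda p: trimmed_mean(reviews[p]), reverse=True)
```

In this toy data P1 and P2 have the same plain mean (4.8), but trimming removes P1's single outlying score and separates them, which is the kind of robustness a trimmed mean is usually chosen for.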

35 of 111

Panel selected proposals are found across the full range of DPR scores

36 of 111

DPR selected proposals are found across all Panel stages

37 of 111

Some headlines and moving forward

In DPR, more time is spent reviewing but distributed more equally between more people (each applicant completed 4 or 5 reviews)
DPR could reduce the duration of the funding allocation process
DPR and panel reviewers used criteria similarly
Stability increases with more reviews per proposal but no optimal number of reviews
The majority of DPR participants felt positive about the process but positivity higher amongst those who were funded
Comparisons difficult – conventional systems often seen as “tried and tested” but commonly a “black box” (with little feedback) compared with more transparent DPR (each applicant received 9 or 10 review reports)
Important not to see one system as normative but recognise trade-offs
Implications for peer review more widely: From the ‘wisdom of the gatekeeper’ to the ‘wisdom of the (expert) crowd’

Areas of concern, particularly:

Gaming
Workload
Review quality
…

Our work is focusing on how these concerns can be addressed

38 of 111

Please let us know your thoughts!

researchonresearch.org

@RoRInstitute

t.stafford@sheffield.ac.uk a.l.butters@sheffield.ac.uk

s.pinfield@sheffield.ac.uk m.benson-marshall@sheffield.ac.uk

https://doi.org/10.6084/m9.figshare.29270534.v1

39 of 111

Did the switch to using partial randomisation at The British Academy change the characteristics of applicants?

Metascience Lab (II): Brokering experiments

Rhys.Thomas@dph.ox.ac.uk

1st July 2025

Presented by Rhys Llewellyn Thomas

Dr Rhys Llewellyn Thomas (University of Oxford), Dr Ken Emond (The British Academy), Professor Philip Clarke (University of Oxford), Professor Adrian Barnett (Queensland University of Technology)

40 of 111

Background

In 2022, The British Academy began trialling a conditional lottery to allocate research funding,
Aim was to assess the benefits of receiving research funding and assess whether the lottery allocation resulted in fewer ex-ante and ex-post biases,

Small Grant Scheme

Awards of up to £10,000,
Tenable for up to 24 months,
Cover the costs associated with a defined research project,
Open to postdoctoral scholars, resident in the United Kingdom.

Conditional Lottery

Two-stage application:

Applicants are required to pass a high-quality threshold, which is assessed by expert academics,
Grants are then randomly allocated to those who pass the threshold.
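
The two-stage process above can be sketched as follows. This is an illustration only: the scoring scale, threshold value, number of awards and function name are assumptions, not the British Academy's actual procedure.

```python
import random


def conditional_lottery(scores, threshold, n_awards, seed=None):
    """Stage 1: keep applications at or above the quality threshold.
    Stage 2: draw winners uniformly at random from those that passed."""
    eligible = sorted(app for app, score in scores.items() if score >= threshold)
    rng = random.Random(seed)  # seeded so the draw can be audited and repeated
    n = min(n_awards, len(eligible))
    winners = rng.sample(eligible, n)
    return eligible, winners


# Hypothetical assessor scores; the scale and threshold are illustrative only.
scores = {"A": 8.5, "B": 6.0, "C": 9.1, "D": 7.4, "E": 5.2, "F": 8.0}
eligible, winners = conditional_lottery(scores, threshold=7.0, n_awards=2, seed=1)
```

Every application that passes the threshold has the same chance of an award, regardless of how far above the threshold it scored.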

41 of 111

Data

Anonymised data from the British Academy for the grant rounds 2020-21 to 2023-24 on all applicants to the:

British Academy Mid-Career Fellowships
British Academy Postdoctoral Fellowships
British Academy/Leverhulme Senior Research Fellowships
British Academy/Leverhulme Small Research Grants (two rounds per year)

Data includes information on current institution, current position, academic discipline, and comparative equal opportunities data,
Data aggregated to the scheme-grant period level for analysis.

42 of 111

Empirical Strategy

Statistical analysis uses a two-way fixed effect difference-in-differences estimator,

Basic idea: compare the change in outcomes from before and after the partial randomisation with the change in outcomes of a control group,
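
As a worked illustration of the basic idea: in the simplest case of two groups and two periods, the two-way fixed effects estimate reduces to a difference of two differences. The counts below are hypothetical and are not figures from the study.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences: change in treated minus change in control."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean applicant counts per round; not figures from the study.
effect = diff_in_diff(
    treated_pre=400, treated_post=600,   # scheme that switched to partial randomisation
    control_pre=300, control_post=330,   # comparison schemes that did not switch
)
```

Here the treated scheme grew by 200 and the control schemes by 30, so 170 of the increase is attributed to the switch, under the usual assumption that both would otherwise have followed parallel trends.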

Control Group

We only use Postdoctoral Fellowship (PDF) and Mid-Career Fellowship (MCF) as a control group, because applicants to these schemes are more comparable than those to the Senior Research Fellowship,

Outcomes

Total number of applicants
Either total number or proportion of applicants that are:

Female, Male, Asian or Asian British, Black or Black British, White, Mixed/Multiple Ethnic Groups, Other Ethnic Group, from Golden Triangle universities, or Russell Group Universities.

43 of 111

44 of 111

45 of 111

46 of 111

47 of 111

Discussion and Conclusion

Mechanism of the effect

Introduction of partial randomisation increased applicants’ perceived likelihood of receiving funding, leading to an increase in application rates.
This change in perceived likelihood of funding was likely heterogeneous across potential applicants.
Despite higher perceived chances, the unconditional probability of being awarded funding remained roughly the same (from 25% to 26%).
Applicants from minority backgrounds appeared disproportionately more encouraged to apply, possibly due to reduced concerns about bias in evaluation.

Conclusion

The change to lottery led to large increases in the number of applicants to the British Academy’s Small Grant Scheme.
The lottery likely contributed to more diverse applicants and research topics.

48 of 111

Evaluating scientific impact: A control group study at the Gordon and Betty Moore Foundation

Eric Brewe, Meagan Sundstrom, Theodore Hodapp, Catherine Mader, Manolis Antonoyiannakis, Heidi Williams, Sheen S. Levine

49 of 111

Acknowledgements

Drexel PER Network

Meagan Sundstrom

Justin Gambrell

Maxwell Franklin

Colin Green

Ibukun Bukola

Ian Olivant

Gordon and Betty Moore Foundation

Tess Labbe

Richard Margoluis

50 of 111

Background

Experimental Physics Investigator Initiative (EPI)

Goal: Fund transformative science
5 year grants, $1.25 M
Post Tenure

Pre-proposal - feedback
Full proposal - reviews
3 groups

Red - flawed in some way
Yellow - fund if money were no object
Green - fund

51 of 111

Research Question

Do people who receive grant funding have more scientific impact than their equal-potential counterparts who do not get funded?

52 of 111

Measuring Scientific Impact

Citation indexes - author level

number of citations
h-index, eigenfactor, Erdős number…

Issues

time
field
collaboration.

53 of 111

Network Normalized Citation Index

Ke, Q., Gates, A. J., & Barabási, A. L. (2023)

Citation Network

Normalize by average citations of papers in same year.

54 of 111

Random Assignment of Participants

Red - Some randomly assigned to Comparison
Yellow - Random assignment to Investigators / Comparison
Green - Investigators

56 of 111

Preregistering study of Ĉ

Cohort 1 - 2022

16 Investigators, 8 Comparison
OpenAlex data pull
Calculation of Ĉ₅ for all papers
Bayes Factor t-test to compare <Ĉ₅>

Measure (by role)	Investigator	Comparison
Number of papers	734	434
Average Ĉ	1.86	2.09
Standard deviation of Ĉ	1.98	2.22

Bayes Factor = 0.34: the null model is ~3x as likely as the alternative
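
The "~3x" figure follows from taking the reciprocal of the reported Bayes factor. A small sketch, assuming 0.34 is the Bayes factor in favour of a difference between groups (often written BF10) and, for the posterior probability, assuming equal prior odds; the function name is illustrative:

```python
def null_support(bf_10, prior_null=0.5):
    """Convert a Bayes factor for the alternative into support for the null.

    Returns (BF01, posterior probability of the null model)."""
    bf_01 = 1.0 / bf_10
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = bf_01 * prior_odds
    return bf_01, posterior_odds / (1.0 + posterior_odds)


# 0.34 is the value reported on the slide; equal prior odds are an assumption.
bf_01, posterior_null = null_support(0.34)
```

With these assumptions the data are about 2.9 times more likely under the null than under the alternative, which is conventionally read as weak-to-moderate evidence rather than a decisive result.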

57 of 111

Thank You!

58 of 111

59 of 111

Activity - until 1510

Suggestions

1. Find a table according to the “science production” process stage you are interested in

(ideally spread yourselves out across all tables)

2. The mission: identify an opportunity for improvement and agree on an IF..THEN.. sentence which captures an intervention (the IF) and the outcome measure (the THEN) in a simple sentence.

3. We will be sharing these at the end.

60 of 111

Suggestions

Suggestion form:

1. Have your details circulated to all attendees (and receive these details + the slides)

2. Suggest topics

3. Record your IF THEN idea

4. Volunteer to pitch

61 of 111

Plenary

Suggestions

1. Sharing our IF - THEN ideas

2. Suggestions for topics for tomorrow / pitches?

Got an idea? https://forms.gle/E7hBDqwnbtHW9Bki9

64 of 111

TOMORROW:

Building institutional capacity

65 of 111

3

66 of 111

Suggestions ->

The Metascience Lab

Day 3

https://researchonresearch.org/project/a-f-i-r-e/

69 of 111

Three days, three themes, three formats

Suggestions

Why and How to experiment

Funder experiments

Building institutional capacity

70 of 111

McKenzie Leier

Policy Manager

Abdul Latif Jameel Poverty Action Lab | MIT

Science for Progress Initiative

71 of 111

Science for Progress Initiative (SfPI)

McKenzie Leier

72 of 111

J-PAL Has Funded Over 2,200 RCTs Across the Globe

Agriculture

Crime, Violence, & Conflict

Education

Environment & Energy

Finance

Firms

Political Economy & Governance

Social Protection

Gender

Health

Labor Markets

GROWTH IN J-PAL RCTS OVER TIME

Year	RCTs
2003	103
2008	327
2013	792
2018	1,367
2024	2,223

J-PAL | Metascience 2025

For those who are unfamiliar with J-PAL, we’re an economics research center based at MIT that specializes in randomized controlled trials. J-PAL was founded by economists, so we tend to be better known among economists, but we’re happy to have a lot of disciplines represented here and to share with those who come from other backgrounds.

Our mission is to reduce poverty by ensuring that policy is informed by evidence, and we do it through research: funding RCTs, sharing evidence with policy makers, and offering training and education programs on RCTs. On the research front, J-PAL affiliated researchers have conducted over 2,000 randomized evaluations spanning 95 countries. We work in many fields, ranging from agriculture to finance to social protection programs and metascience, which I’ll talk a little bit about today.

Our small but mighty Science for Progress Initiative team is not yet featured on our map here, but we’ve funded 4 travel proposal development projects, 4 pilot RCTs, and 2 full RCTs since we started around two and a half years ago.

73 of 111

Relevant research questions for SfPI

What contracts, incentives, and institutions work best when funding scientific research?

How can we ensure that the most talented individuals – including younger researchers, those entering science from non-traditional career paths, and those from underrepresented groups – are not discouraged from pursuing science?

How best should we encourage the diffusion of socially valuable scientific discoveries out of labs and academic papers, so as to encourage innovation and economic growth?

74 of 111

High-skilled immigration RCT: A case study in failure

High-risk/high-reward project:

Young researchers
Lack of causal evidence in an area with potential for high policy impact
Implementing partner was a startup

The project does end up failing – the researchers realize their design is not feasible after the pilot. However, this was a worthy failure in our eyes and worth taking a risk on.

Oftentimes researchers and funders only share out success stories when talking to external audiences or at conferences, but reflecting on projects that didn't work out can be a really useful exercise and provide a lot of fodder for discussion.

As an initiative, we want to take chances on projects that are high-risk, high-reward; we knew there was a chance that this project might not succeed, but we believed the potential payoff was worth that risk. We had the opportunity to “matchmake” a pair of younger researchers with a startup firm that offered visa application assistance for STEM PhD students in the U.S. and was interested in randomizing some aspect of their work, to see if we could learn what the marginal value of having an additional high-skilled immigrant in the U.S. was.

So why was this project high-risk, high-reward?

One factor was having younger researchers. As an initiative, we make an effort to give opportunities to young researchers in the hopes of fostering an up-and-coming cohort of researchers interested in metascience. While this is a principle and commitment we stand by, we also know that it adds some risk when less experienced researchers work on a project.

The project also had some risk in the beginning due to the fact that the implementing partner was a startup and hadn’t done an RCT before.

Why did this project have the potential to be high impact? The lack of causal evidence in the high-skilled immigration literature meant that results from this project would be hugely valuable. The issue was/is also timely and results could have had real, immediate policy impact.

So ultimately, this project does end up failing – I’m not going to get into the details of the design for time reasons, but I’m happy to talk with anyone about the project afterwards. The researchers realized after piloting that their intervention was not going to significantly impact the outcomes they were looking for in a detectable way. Further, they realized that while the startup had a really promising setup and data collection infrastructure, they were still a young organization and didn’t have the capacity to support the research team enough to get the project done, nor were they operating at the scale needed to power the experiment.

In the end, we think of this case study as an example where if we had the chance to go back and fund it again - we probably would. We could offer some potential tweaks and guidance to see if it would make a difference, but ultimately we think it’s worth it to take a chance on some high-risk, high-reward projects in your funding portfolio.

The partner was a startup in the process of building capabilities. They weren't ready to operate at the scale we needed and didn't have the bandwidth to support the project, even though the tech they had built would have recorded ideal data for us.

We learned, in piloting, that the likely take-up rate for any low-cost intervention would be too low to move outcomes of economic interest. That is, it's cheap to send emails, but they will induce very few people to stay in the USA who would not have stayed otherwise, which creates severe challenges for looking at downstream outcomes like earnings or innovation.

One valuable lesson I learned was: even if a partner is eager to take on a project, it's important for researchers to ensure they have not just the interest, but also the capacity and capabilities to execute it successfully.

75 of 111

Call for Researchers + Proposals + Contact information

We have a rolling call for proposals for RCTs in metascience
J-PAL is currently accepting invited researcher nominations until August 1st, 2025
Feel free to reach out with ideas/questions:

McKenzie Leier – mleier@povertyactionlab.org

76 of 111

Today’s plan (DAY THREE)

1130 Chair’s introduction

1135 McKenzie Leier, Poverty Action Lab: Effective partnerships

1145 Tom Stafford “T0354 Can AI be used for better matching of proposals to reviewers? Feasibility and formal evaluation with the Metascience 2025 conference”

1155 Hannelore Vanhaverbeke “T0164: Leveraging Success: How KU Leuven’s Internal Grants Boost External Funding Acquisition”

1205 Pitch consultancy (facilitators: George Richardson, Amanda Kvarven, Youyou Wu, James Phipps)

1220 Plenary: new idea pitches and challenge suggestions

1255 Jordan Dworkin, Open Philanthropy: Closing remarks

77 of 111

Can AI be used for better matching of proposals to reviewers? Feasibility and formal evaluation with the Metascience 2025 conference

Tom Stafford, Amanda Kvarven & The MS2025 Programme Committee

2025-06-27

https://researchonresearch.org/project/a-f-i-r-e/

78 of 111

Meta-metascience

Observation is not enough - we have to try things

- feasibility

- causal inference

Finding (enough, good) reviewers is a conceptual and practical problem

79 of 111

The “shadow” experiment

Consent from those submitting and reviewers

All analyses done after final programme decisions

All analyses local - no data left the conference

441 submissions: Title, Abstracts

25 reviewers: assigned to submissions via keywords

1323 reviews: scores & suitability

(each proposal seen by 3 reviewers)

Research Questions

1. Can language models help match proposals to reviewers?

2. Is it feasible for something like a conference to adopt/adapt this technology?

3. Can it be done securely/privacy respecting?

80 of 111

Average Suitability was good

81 of 111

Matching - via embedding

Reviewer keywords & proposal title+abstract -> embedding space

Code from SNSF: https://github.com/snsf-data/snsf-grant-similarity

- thanks Gabriel Osaka and SNSF data team!

Model: SPECTER2: BERT model pre-trained on scientific texts and augmented by a citation graph
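
Once reviewer keywords and proposal text sit in the same embedding space, a matching score can be computed as a similarity between vectors. A minimal sketch, assuming cosine similarity as the score and using toy 3-dimensional vectors in place of real embeddings; this is not the SNSF code and does not call SPECTER2, and all names and values are hypothetical:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors standing in for real embeddings (hypothetical values).
reviewers = {"R1": [1.0, 0.0, 0.0], "R2": [0.0, 1.0, 0.0], "R3": [0.6, 0.8, 0.0]}
proposal = [0.8, 0.6, 0.0]

# Score every reviewer against the proposal and list the closest first.
scores = {name: cosine_similarity(vec, proposal) for name, vec in reviewers.items()}
best_first = sorted(scores, key=scores.get, reverse=True)
```

Real embeddings have hundreds of dimensions, but the ranking step is the same: the reviewer whose vector points in the most similar direction to the proposal's is treated as the closest match.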

82 of 111

Actual proposal-reviewer matching far outperformed random matching

83 of 111

Optimal proposal-reviewer matching outperforms actual matching
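
The slide does not say how the optimal matching was computed. One way to read "optimal" is as the assignment that maximises the total matching score; the sketch below assumes that reading, simplifies to one reviewer per proposal, and uses exhaustive search, which is only practical for tiny examples. The scores and the "actual" assignment are hypothetical.

```python
from itertools import permutations


def best_one_to_one(score):
    """Exhaustively search one-to-one assignments; score[p][r] is the match score."""
    n = len(score)
    best_total, best_assignment = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(score[p][perm[p]] for p in range(n))
        if total > best_total:
            best_total, best_assignment = total, perm
    return best_assignment, best_total


# Hypothetical matching scores: rows are proposals, columns are reviewers.
score = [
    [0.9, 0.8, 0.1],
    [0.8, 0.3, 0.2],
    [0.2, 0.4, 0.7],
]
actual = (0, 1, 2)   # hypothetical keyword-based assignment: proposal i -> reviewer i
actual_total = sum(score[p][actual[p]] for p in range(3))
optimal, optimal_total = best_one_to_one(score)
```

In this toy case the keyword-style assignment gives each proposal a plausible reviewer, yet swapping the first two reviewers raises the total score, which mirrors the gap between actual and optimal matching reported on the slide.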

84 of 111

You can predict suitability from matching score

85 of 111

…and from this you can predict gain in suitability from using the optimal match

86 of 111

Research Questions

1. Can language models help match proposals to reviewers? Maybe - evidence for meaningful improvements beyond human matching

2. Is it feasible for something like a conference to adopt/adapt this technology? Definitely yes

3. Can it be done securely/privacy respecting? Definitely yes

Caveats:

- restricted range: just metascience & metascientists

- are there better models?

- will predicted gains in suitability pan out in metrics like p(accepts review) or review quality?

Thanks to all participants!

87 of 111

Join the conversation - sign up to the RoRI mailing list for updates on AFIRE projects

researchonresearch.org

@RoRInstitute

Funder Peer learning workshop (online):

Practicalities of implementing

language models locally

8th of July, 2pm BST / 3pm CEST

t.stafford@researchonresearch.org

88 of 111

Leveraging Success

KU Leuven’s Internal Grants Boost External Funding Acquisition

KU Leuven – Research Office

Hannelore Vanhaverbeke, Klara Gijsbers & Levent Bingöl

89 of 111

Research Coordination Office

KU Leuven (Belgium) - Research Office – Data Management & Analysis Unit

Reorganisation – broadening scope to metascience

Showcase

Network/learn

92 of 111

Government

Vice-rector

Policy makers

RMA colleagues

Team (3.2 FTE)

Reports & Lists

Analyses & dashboards

Workflow optimisation

New ideas

External panel review of internal funding mechanisms

93 of 111

Funds

Allocation

Mixed sources

Flemish government (80%)

Directly to institutions

Special Research Fund

Industrial Research Fund

Indirectly & competitively to researchers

Government-subsidized funding agencies

94 of 111

assist KU Leuven researchers in strengthening scientific CVs & developing their research strategies
enable researchers to attract external funding or initiate new collaborations

projects

96 of 111

Counterfactual analysis

Matched pairs/Difference in difference

Significance of observed difference

Approach

97 of 111

Known issues

Control group: formation

Matthew Effect: how to avoid/reduce?

Causality: how to prove?

98 of 111

Nearest Neighbour

Control group formation

Data: 2015-2023; 9 cohorts based on start of the C1/C2 grant

Age

Gender

Nationality

% Employment

Years since PhD

Years tenured

Science group

Internal funding

No internal funding
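
The control group formation on this slide could be sketched roughly as follows: each internally funded researcher is paired with the closest unfunded researcher on the listed covariates. This is an illustrative greedy 1:1 nearest-neighbour match, not KU Leuven's actual procedure. The field names, the split into numeric and exact-match covariates, and the sample records are all assumptions.

```python
import math


def standardise(rows, keys):
    """Rescale each numeric covariate to mean 0 and standard deviation 1."""
    stats = {}
    for k in keys:
        values = [r[k] for r in rows]
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
        stats[k] = (mean, sd)
    return [{**r, **{k: (r[k] - stats[k][0]) / stats[k][1] for k in keys}} for r in rows]


def nearest_neighbour_pairs(funded, unfunded, numeric_keys, exact_keys):
    """Greedy 1:1 matching without replacement.

    Candidates must agree on every exact-match covariate; among those, the
    one with the smallest Euclidean distance on the standardised numeric
    covariates is chosen and removed from the pool.
    """
    everyone = standardise(funded + unfunded, numeric_keys)
    treated, pool = everyone[:len(funded)], everyone[len(funded):]
    pairs = []
    for t in treated:
        candidates = [c for c in pool if all(c[k] == t[k] for k in exact_keys)]
        if not candidates:
            continue  # no comparable control: this researcher stays unmatched
        best = min(
            candidates,
            key=lambda c: math.dist(
                [t[k] for k in numeric_keys], [c[k] for k in numeric_keys]
            ),
        )
        pool.remove(best)
        pairs.append((t["id"], best["id"]))
    return pairs


# Hypothetical records; science group labels are placeholders.
funded = [
    {"id": "F1", "age": 45, "years_since_phd": 15, "science_group": "group_a"},
    {"id": "F2", "age": 38, "years_since_phd": 9, "science_group": "group_b"},
]
unfunded = [
    {"id": "U1", "age": 39, "years_since_phd": 10, "science_group": "group_b"},
    {"id": "U2", "age": 46, "years_since_phd": 16, "science_group": "group_a"},
    {"id": "U3", "age": 60, "years_since_phd": 30, "science_group": "group_a"},
]
pairs = nearest_neighbour_pairs(
    funded, unfunded, ["age", "years_since_phd"], ["science_group"]
)
```

Greedy matching depends on the order in which funded researchers are processed; it is used here only because it is short.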

99 of 111

Some researchers have obtained prior funding - others have limited/no budget = difference at the start

5 budget classes (amount € per person per year)

< 5 k€

5 - 55 k€

55 - 120 k€

120 - 250 k€

>= 250 k€

only pairs with research budgets of similar magnitude retained

Why not do this before making pairs: tremendous amounts of data & calculations

Matthew effect reduction

198 researcher pairs
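
A minimal sketch of the budget-class filter described on this slide, applied after pairing: a pair is retained only when both researchers fall in the same class. The treatment of class boundaries (upper bounds exclusive), the function names and the sample budgets are assumptions.

```python
def budget_class(amount_keur):
    """Map a per-person yearly budget (in k€) to one of the five classes.

    Assumption: upper bounds are exclusive, so exactly 55 k€ falls in
    "55 - 120 k€".
    """
    if amount_keur < 5:
        return "< 5 k€"
    if amount_keur < 55:
        return "5 - 55 k€"
    if amount_keur < 120:
        return "55 - 120 k€"
    if amount_keur < 250:
        return "120 - 250 k€"
    return ">= 250 k€"


def same_class_pairs(pairs, budgets):
    """Keep only pairs whose two members share a budget class."""
    return [
        (f, c) for f, c in pairs
        if budget_class(budgets[f]) == budget_class(budgets[c])
    ]


# Hypothetical matched pairs and starting budgets in k€ per person per year.
example_pairs = [("F1", "U2"), ("F2", "U1")]
example_budgets = {"F1": 3.0, "U2": 4.5, "F2": 60.0, "U1": 20.0}
retained = same_class_pairs(example_pairs, example_budgets)
```

In this toy example only the first pair survives, because F2 and U1 start from budgets of different magnitude.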

100 of 111

Are these really matches?

paired t-tests (Wilcoxon) with Bonferroni correction to account for multiple testing
p-values > 0.05 = not significantly different, therefore comparable

Budget size	Nr of matched pairs	p value
< 5 k€	74	0.54
5 - 55 k€	52	0.36
55 - 120 k€	28	0.03
120 - 250 k€	16	0.68
>= 250 k€	28	0.23

101 of 111

Leverage effect: expectation

internal

external

102 of 111

Leverage effect: when in evidence?

+ 1 year

start date = (duration C1 or C2)/2

end date = ((duration C1 or C2)/2 + (duration C1 or C2) + 1 year)

Start project

End project

Mid-term project

2015

2018

2017

2020

2021

103 of 111

Leverage effect: results

Null hypothesis: there is no difference between the funded target group and the control group in terms of acquiring external funding

Budget size	Nr of matched pairs	p value	Effect size
< 5 k€	74	< 0.0001	0.7468 (medium)
5 - 55 k€	52	0.0067	0.5492 (large)
55 - 120 k€	28	0.0001	0.9067 (large)
120 - 250 k€	16	0.3942
>= 250 k€	28	0.0480

Difference-in-Difference: Mann-Whitney U hypothesis testing confirms significant difference between both groups with a medium effect size
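
One way to read the difference-in-difference comparison above is sketched here. Each researcher gets a change score (external funding after minus before), the estimate is the gap between the mean changes of the two groups, and the Mann-Whitney U statistic compares the two sets of change scores. The rank-biserial correlation is used as the effect size purely as an assumption, since the slides do not say which measure was reported. All numbers are made up.

```python
def mean(values):
    return sum(values) / len(values)


def did_estimate(funded_changes, control_changes):
    """Difference-in-differences: mean change (funded) minus mean change (control)."""
    return mean(funded_changes) - mean(control_changes)


def mann_whitney_u(x, y):
    """U statistic for x: count of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def rank_biserial(x, y):
    """Effect size in [-1, 1]; 0 means neither group tends to exceed the other."""
    return 2 * mann_whitney_u(x, y) / (len(x) * len(y)) - 1


# Hypothetical change in external funding (k€) per researcher, after minus before.
funded_changes = [40, 25, 10, 30]
control_changes = [5, 15, 0, 20]

estimate = did_estimate(funded_changes, control_changes)
u_stat = mann_whitney_u(funded_changes, control_changes)
effect = rank_biserial(funded_changes, control_changes)
```

A p-value is omitted from the sketch; in practice it would come from a statistics library rather than hand-written code.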

104 of 111

Leverage effect: results

Significant leverage effect for researchers with starting budgets under 120 k€

No significant differences observed for those with starting budgets between 120 – 250 k€

however: smallest group in the analysis, caution

Starting budget over 250 k€ initially showed a significant leverage effect, but this dissolved after Bonferroni correction

For budget classes < 5 k€ and 55 - 120 k€, the effect size was large

Negligible or very weak correlations between the amount of external funding and the initial budget size
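
The Bonferroni step behind these conclusions can be illustrated with the p-values from the results table. The sketch assumes the correction is applied across the five budget classes at an overall alpha of 0.05, which gives a per-test threshold of 0.01. The slides do not state the number of tests, so that is an assumption.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag each test as significant only if p < alpha / number_of_tests."""
    threshold = alpha / len(p_values)
    return {label: p < threshold for label, p in p_values.items()}


# p-values from the results table; "< 0.0001" is represented by its upper bound.
p_values = {
    "< 5 k€": 0.0001,
    "5 - 55 k€": 0.0067,
    "55 - 120 k€": 0.0001,
    "120 - 250 k€": 0.3942,
    ">= 250 k€": 0.0480,
}
significant = bonferroni(p_values)
```

Under this assumption the three classes below 120 k€ remain significant, while the >= 250 k€ class (p = 0.0480) passes the uncorrected 0.05 level but not the corrected one, in line with the summary above.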

105 of 111

Looking for input on collaboration

Expertise on data structure, semantics, limitations, …

Experience with ‘standard’ analyses

Input needed esp. on novel approaches

Towards an RMA – researcher collaboration: advice?

106 of 111

Pitch consultancy activity - until 1220

Suggestions

1. Groups of Three People: A (pitching), B & C (consultants)

2. A pitches for <2 minutes, B & C don’t interrupt!

B & C take notes on how to improve the pitch

3. B & C discuss pitch idea, A doesn’t interrupt!

A takes notes on how pitch landed

4. A shares what they learnt

Notes: https://learninginnovation.ca/wp-content/uploads/2020/05/3WayPitch.pdf

107 of 111

Pitches

Mandated external partners: Maria Aleksandrova

Research capacity training: Habeeb Kolade

Communicating Robustness: Alexandra Sarafoglou

Redefining Significance: Jack Fitzgerald

Sharing marginal near hits between funders: Noam Tal-Parry

Funders of Clinical Trials: Maia Salholz-Hillel

Open Research Sabbaticals: Corinne Jola

108 of 111

Targeted Capacity Development and Mentorship for Researchers in the Global South: Interventions to drive employability and higher quality research outputs

Habeeb Kolade | ResearchRound Institute | habeeb@researchround.com

Metascience Conference 2025 | UCL | July 2, 2025

IF

TARGETED TRAINING

Design and deliver various research classes on foundational research skills and interdisciplinary topics.

MENTORING

Provide hands-on practice with real-time feedback during mentorship sessions.

RESEARCH PROJECTS

Support researchers to complete research projects for critical thinking development and empowering researchers to design and analyze

THEN

PIPELINES OF MOTIVATED SCIENTISTS WITH TRANSFERABLE SKILLS

STRONGER CONFIDENCE IN DOING RESEARCH

HIGHER QUALITY RESEARCH

109 of 111

Suggestions

Suggestion form:

1. Have your details circulated to all attendees (and receive these details + the slides)

4. Volunteer to pitch

110 of 111

Seed Grants

Who we fund

Open to researchers at any university
Priority for studies that generate insights applicable to the U.S. context or other OECD countries
Must be members of the IGL Research Network.
Must be submitted by a Principal Investigator (PI) affiliated with an academic institution.

Examples of activities we’re looking to fund

Activities Leading to RCTs
Pilot Studies
Feasibility Assessments
Intervention Design
Data Collection Methods Development
Preliminary Data Analysis

Funded by the Alfred P. Sloan Foundation, IGL Seed Grants support researchers in piloting innovative experimental ideas and activities that yield the potential to carry out Randomised Controlled Trials (RCTs) that generate high-quality evidence for innovation, science, and productivity.

Funding Range: Awards ranging up to $8,000 USD.

Timeline: Call Opens: 15 September 2025; Deadline for Proposals: 15 October 2025

https://www.innovationgrowthlab.org/seed-grants

111 of 111

Jordan Dworkin

Suggestions

Senior Program Associate, Innovation Policy

Open Philanthropy

jordan.dworkin@openphilanthropy.org