1 of 50

Economics 148 - Reproducibility continued

Data Science for Economics

Spring 2025 - UC Berkeley

Eric Van Dusen

LECTURE 18

Econ 148 - UC Berkeley

2 of 50

Outline

Lecture 18, Econ 148, Spring 2025

Reproducibility continued - e.g. data science

Journal Impact Factor

Author example

Controversy - “replication crisis”

Reinhart Rogoff

Brad Lyons - UCOP

3 of 50

Journal Impact Factor - Data Science Measure of Journal Importance

Citation Approach

Previous 2 years - the average number of citations per article

2020 - the average number of citations received in 2020 by articles published in 2018 and 2019

You need a database of all articles and all citations, and you need to link every citation back to the original journal
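
As a rough illustration of the calculation above, the sketch below computes a two-year impact factor for 2020 from a toy citation table. The journal names, article IDs, and citation links are invented for illustration; a real calculation would draw on a full citation database.

```python
from collections import Counter

# Hypothetical toy data: (article_id, journal, publication_year)
articles = [
    ("a1", "Journal A", 2018),
    ("a2", "Journal A", 2019),
    ("a3", "Journal A", 2019),
    ("b1", "Journal B", 2018),
    ("b2", "Journal B", 2019),
]

# Hypothetical citation links: (year_of_citing_paper, cited_article_id)
citations = [
    (2020, "a1"), (2020, "a1"), (2020, "a2"), (2020, "a3"),
    (2020, "a3"), (2020, "a3"), (2020, "b1"), (2019, "b1"),
    (2020, "b2"),
]


def impact_factors(articles, citations, year):
    """Citations made in `year` to articles from the two prior years,
    divided by the number of those articles, per journal."""
    window = {year - 1, year - 2}
    # Link each article to its journal, keeping only the two-year window.
    journal_of = {aid: j for aid, j, pub in articles if pub in window}
    n_articles = Counter(journal_of.values())
    # Count only citations made in the target year to in-window articles.
    n_citations = Counter(
        journal_of[aid]
        for cite_year, aid in citations
        if cite_year == year and aid in journal_of
    )
    return {j: n_citations[j] / n_articles[j] for j in n_articles}


jif_2020 = impact_factors(articles, citations, 2020)
print(jif_2020)
```

In this toy table, Journal A gets 6 citations in 2020 to 3 articles and Journal B gets 2 citations to 2 articles; the 2019 citation to b1 is ignored because it falls outside the target year.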

How to evaluate the “importance” of a journal

What journal should a library subscribe to?

Who is on the board of a journal

Who is the editor of a journal

4 of 50

https://www.nature.com/news/beat-it-impact-factor-publishing-elite-turns-against-controversial-metric-1.20224

Journal Impact Factor - How many times is a Journal cited

(originally a tool to help libraries decide which journals to subscribe to)

5 of 50

6 of 50

Sub Category - development economics

Trends in impact factors

7 of 50

8 of 50

Databases track Citations

Web of Science

Google Scholar

Can this be manipulated?

Review articles / meta-analyses? (more citations)

Case reports? (fewer citations)

9 of 50

Emi Nakamura in RePEc

10 of 50

Emi Nakamura - Google Scholar

11 of 50

NBER

12 of 50

Papers on Nakamura website - Replication Files

13 of 50

An Arc to the Story

There are some people who do review other people’s work

There are methods to make things shareable and reproducible

Just knowing that you will have to deposit a copy of your data and code changes how you approach the data

There is high pressure to get published

Reviewing other people’s data is hard

Mistakes get through

Incentives to cheat are strong

14 of 50

David Broockman - Political Science professor - PS 3

https://www.thecut.com/2015/05/how-a-grad-student-uncovered-a-huge-fraud.html

NYT and This American Life

15 of 50

Broockman goes on to do the science right!

16 of 50

Academic studies: RCTs about dishonesty!

Papers retracted

On Leave

Co-authors on papers!

https://www.thecrimson.com/article/2023/6/23/alleged-data-fraud-gino/

17 of 50

https://www.newyorker.com/magazine/2023/10/09/they-studied-dishonesty-was-their-work-a-lie

18 of 50

19 of 50

Francesca Gino - Formerly of HBS - Suing HBS for $25m

20 of 50

https://poetsandquants.com/2023/08/03/harvard-business-school-professor-sues-the-school-for-25-million/

21 of 50

She also sued Data Colada!

22 of 50

23 of 50

24 of 50

Reinhart and Rogoff - Growth in a Time of Debt

Why is it a big deal?

Fiscal austerity as a response to a recession / economic crisis

Spend more or spend less?

Does it matter if government debt rises above 90% of GDP?

Does having this huge amount of government debt stifle growth?

(Reinhart and Rogoff disavow responsibility for this, but other people cite their work when advocating austerity)

25 of 50

26 of 50

27 of 50

Where is the US now? What about COVID?

28 of 50

29 of 50

30 of 50

Reinhart and Rogoff

Three places to look at

  1. Mysteriously dropped 5 countries because of a spreadsheet error
  2. Selectively dropped years of high debt and positive growth (notably for New Zealand)
  3. Averaged country averages instead of country-year observations, treating a summary statistic as a unit of observation

31 of 50

32 of 50

Microsoft Excel

The code is not really reviewable as a body of code

  • You have to dig into each individual cell
  • You can’t rerun the whole analysis on a body of data - the code is inseparably embedded in the document

Comments or explanations of the code are not visible, at least not in one overall view of the document

Motivation for Lecture NB - think about Jupyter vs Excel

33 of 50

Selective use of years

Excluded: Australia (1946-1950), New Zealand (1946-1949), and Canada (1946-1950)
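
One advantage of code over a spreadsheet is that an exclusion rule like this sits in one visible, reviewable place. A minimal sketch, assuming the year ranges listed above are inclusive; the names `EXCLUDED`, `is_excluded`, and `excluded_country_years` are hypothetical, and this is not the authors' actual procedure.

```python
# Exclusion windows as listed on the slide (assumed inclusive).
EXCLUDED = {
    "Australia": (1946, 1950),
    "New Zealand": (1946, 1949),
    "Canada": (1946, 1950),
}


def is_excluded(country, year):
    """Return True if this country-year falls inside an exclusion window."""
    if country not in EXCLUDED:
        return False
    start, end = EXCLUDED[country]
    return start <= year <= end


def excluded_country_years():
    """List every (country, year) pair removed by the exclusion rule."""
    return [
        (country, year)
        for country, (start, end) in EXCLUDED.items()
        for year in range(start, end + 1)
    ]


dropped = excluded_country_years()
print(len(dropped))  # number of country-year observations removed
```

Counting the pairs makes the size of the exclusion explicit: 5 + 4 + 5 = 14 country-years.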

34 of 50

Mysteriously dropped 5 countries

Dropped 5 countries apparently by alphabetical order

Australia

Austria

Belgium (only Belgium is in the over-90% category)

Canada

Denmark
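
The effect of a formula range that stops short can be sketched in a few lines. The growth figures below are made up purely for illustration (they are not the Reinhart-Rogoff data), and the slice `rows[5:]` is only a stand-in for a spreadsheet range that misses the first five rows.

```python
from statistics import mean

# Made-up growth rates (percent); NOT the actual Reinhart-Rogoff data.
growth = {
    "Australia": 3.0,
    "Austria": 2.0,
    "Belgium": 2.5,
    "Canada": 3.5,
    "Denmark": 4.0,
    "Finland": 1.0,
    "France": 2.0,
    "Germany": 0.0,
}

# Rows in alphabetical order, as they might sit in a spreadsheet.
rows = sorted(growth.items())

# Average over every row.
full_average = mean(value for _, value in rows)

# Average over a range that silently skips the first five rows.
truncated_average = mean(value for _, value in rows[5:])
dropped = [name for name, _ in rows[:5]]

print(dropped)
print(full_average, truncated_average)
```

Nothing in the truncated result signals that rows are missing, which is why printing or asserting the list of included countries is a cheap safeguard.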

35 of 50

Averaging across averages

Looking at country interval averages, not country-years, as the unit of analysis

Weighting - a country with one year of high debt is weighted equally to a country with several years of high debt

Perhaps they didn’t think about weighting at all, but just ran the analysis

36 of 50

Unconventional Weighting. Reinhart-Rogoff divides country years into debt-to-GDP buckets. They then take the average real growth for each country within the buckets. So the growth rate of the 19 years that the U.K. is above 90 percent debt-to-GDP are averaged into one number. These country numbers are then averaged, equally by country, to calculate the average real GDP growth weight.

In case that didn’t make sense, let’s look at an example. The U.K. has 19 years (1946-1964) above 90 percent debt-to-GDP with an average 2.4 percent growth rate. New Zealand has one year in their sample above 90 percent debt-to-GDP with a growth rate of -7.6. These two numbers, 2.4 and -7.6 percent, are given equal weight in the final calculation, as they average the countries equally. Even though there are 19 times as many data points for the U.K.
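
The two weighting schemes can be compared directly using only the U.K. and New Zealand figures quoted above. This is a two-country sketch of the arithmetic, not a replication of the full sample, and the variable names are hypothetical.

```python
from statistics import mean

# Figures quoted in the text: years above 90% debt-to-GDP and the
# average real growth rate (percent) over those years.
countries = {
    "U.K.": {"years": 19, "avg_growth": 2.4},
    "New Zealand": {"years": 1, "avg_growth": -7.6},
}

# Equal weight per country: average of the country averages.
by_country = mean(c["avg_growth"] for c in countries.values())

# Equal weight per country-year: weight each average by its year count.
total_years = sum(c["years"] for c in countries.values())
by_country_year = (
    sum(c["years"] * c["avg_growth"] for c in countries.values()) / total_years
)

print(round(by_country, 2), round(by_country_year, 2))
```

Averaging by country gives (2.4 - 7.6) / 2 = -2.6 percent, while averaging by country-year gives (19 × 2.4 - 7.6) / 20 = 1.9 percent, so the choice of unit of observation flips the sign of the result in this example.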

37 of 50

Lecture NB

38 of 50

Reproducibility Policies

Journals now require the submission of data and code

39 of 50

AEA website - all papers must have a replication package

40 of 50

41 of 50

https://www.icpsr.umich.edu/files/deposit/dataprep.pdf

42 of 50

43 of 50

https://aeadataeditor.github.io/

44 of 50

https://www.bitss.org/education/bitss-textbook/

45 of 50

46 of 50

https://www.openicpsr.org/openicpsr/search/aea/studies

47 of 50

48 of 50

49 of 50

50 of 50

Opportunity Insights
