If you have any datasets you'd like to add, please add a comment to this sheet and tag me.
Free and publicly available datasets
Dataset | Unit | Frequency | Website | Description
Discern 2.0 | Patent | 1980 to 2021 | https://zenodo.org/records/13153196 | Links patents and scientific publications to U.S. publicly listed firms, taking subsidiaries into account
American Stories | Article | Daily | dell-research-harvard/AmericanStories · Datasets at Hugging Face | Two centuries of newspaper articles; easy Python import
ACS | Household | Yearly | https://usa.ipums.org/usa/ | Large U.S. household survey, 2006 onward
IPUMS | Various | Various | https://www.ipums.org/ | Census and survey data from around the world integrated across time and space. IPUMS integration and documentation make it easy to study change, conduct comparative research, merge information across data types, and analyze individuals within family and community contexts.
FRED | National | Daily | https://fred.stlouisfed.org/ | Macro time series. Pandas can import.
Zillow research data | Zipcode | Monthly | https://www.zillow.com/research/data/ | Housing prices, sales, etc.
Realtor.com data | Zipcode | Monthly | https://www.realtor.com/research/data/ | Housing prices, sales, etc.
American Housing Survey | Household | Annual | https://www.census.gov/programs-surveys/ahs/data.html | Housing characteristics (costs, physical condition, with household characteristics)
Home Mortgage Disclosure Act | Mortgage | Annual | https://www.consumerfinance.gov/data-research/hmda/historic-data/ | Mortgage rates, LTV, borrower characteristics
National Survey of Mortgage Originations | Household | Quarterly | https://www.fhfa.gov/DataTools/Downloads/Pages/National-Survey-of-Mortgage-Originations-Public-Use-File.aspx | Mortgages
Consumer Expenditure Survey | Household | Yearly | https://www.bls.gov/cex/ | Consumer expenditures in very fine categories
Survey of Consumer Finances | Household | Yearly | https://www.federalreserve.gov/econres/scfindex.htm | Consumer expenditures and finance data
American Time Use Survey | Household | Yearly | https://www.bls.gov/tus/ and https://timeuse.ipums.org/ | Consumer time use
Panel Study of Income Dynamics | Household | Yearly (panel) | https://psidonline.isr.umich.edu/ | Tracks consumers over time, so it can be used to study career progression, etc.
Call reports | Bank | Quarterly | https://cdr.ffiec.gov/public/ | Panel data on banks' balance sheets
American Fact Finder | Geography | Yearly | https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml | Various firm and household characteristics
NHGIS | Geography | Yearly | https://data2.nhgis.org/main | Various geographic characteristics (demographics, etc.)
General Social Survey | Household | Yearly | https://gss.norc.org/ | Demographics, social characteristics, attitudes
CFPB data | Various | Various | https://www.consumerfinance.gov/data-research/ | Various
Data.gov | Various | Various | Data.gov | Various
data.census.gov | Various | Various | data.census.gov | Various census datasets
data.world | Various | Various | data.world | Various
Our World in Data | Various | Various | https://ourworldindata.org | Many data series, plus a Python package for easy data loading: https://github.com/owid/owid-importer. See their other repos for tools that make analysis easier.
CongressData | Member | Yearly | https://cspp.ippsr.msu.edu/congress/ | Data from 1789-2021 on characteristics of congressional districts, the members of Congress themselves, and the behavior of those members in policymaking
Household Pulse Survey | Household | Weekly | https://www.census.gov/programs-surveys/household-pulse-survey/datasets.html | Weekly COVID survey
I³ Open Innovation Dataset Index | Various | Various | https://iiindex.org/datasets | A collection of innovation datasets and related tools, platforms, and resources
Harvard Dataverse | Various | Various | https://dataverse.harvard.edu/dataverse/harvard | Includes replication kits for publications in some journals. Many finance and econ journals now require replication code for recent publications, and it can be found on the journal sites. Also search for the authors' personal sites (like mine!); code and data are often there even if not on the journal site.
Management Science | Various | Various | https://pubsonline.informs.org/toc/mnsc/current | MS is an A+ journal and has required replication data and code for years. Go to the page of an article and look for a link to "Supplementary Materials" to get the code.
Pandas-datareader | N/A | N/A | https://pydata.github.io/pandas-datareader/remote_data.html | Data importer
Ken French data library | Portfolio | Various | https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html | Portfolio returns. Pandas can import.
Open Asset Pricing | Portfolio | Various | https://www.openassetpricing.com/ | Portfolio returns, asset pricing code for tests
Assaying Anomalies | N/A | N/A | https://sites.psu.edu/assayinganomalies/ | Immediately test whether your asset pricing signal produces alpha
OECD | Country | Yearly | https://stats.oecd.org/ | Country statistics. Pandas can import.
World Bank | Various | Various | https://data.worldbank.org/ | Panel data from around the world on many subjects
Quandl | Firm | Daily | https://www.quandl.com/ | Stock prices. Pandas can import.
Yahoo Finance | Firm | Daily | https://finance.yahoo.com/ | Stock prices. Pandas can import.
Notre Dame Software Repository for Accounting and Finance | N/A | N/A | https://sraf.nd.edu/ | Textual analysis resources designed for business text: business-domain sentiment word lists, code, and other resources
Parsing messy names (people, addresses, firms) | N/A | N/A | https://github.com/datamade?q=&type=all&language=&sort=stargazers | Repos: usaddress and probablepeople
Subnational ideology and presidential vote estimates | Various | Yearly | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BQKU4M | 2006-2021 estimates of the ideology and presidential voting behavior for ~1 million survey respondents, ZCTAs, cities, counties, sub-county units, school districts, state legislative districts, and congressional districts
US Panel Study of Income Dynamics (PSID) | Household | Yearly | https://psidonline.isr.umich.edu/default.aspx | Longitudinal household survey
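
Several rows above note that pandas can import the data (FRED, the Ken French data library, OECD, and others). The sketch below shows one way this might look, assuming the pandas-datareader package listed in the table is installed and the network is reachable. The helper name `fetch`, the `EXAMPLES` mapping, and the specific series chosen are illustrative assumptions, not part of the sheet.

```python
# Illustrative (data_source, name) pairs for pandas-datareader.
# The series below are examples chosen for this sketch, not taken from the sheet.
EXAMPLES = {
    "us_gdp": ("fred", "GDP"),
    "ff_factors": ("famafrench", "F-F_Research_Data_Factors"),
}


def fetch(key, start="2010-01-01", end="2020-12-31"):
    """Download one example dataset with pandas-datareader (needs network access)."""
    # Look up the request first so an unknown key fails before any import or download.
    source, name = EXAMPLES[key]

    # Imported lazily so this module still loads if the package is not installed.
    import pandas_datareader.data as web

    return web.DataReader(name, source, start, end)
```

Calling `fetch("us_gdp")` should return a DataFrame indexed by date. For the Ken French library, the reader returns a dictionary of DataFrames (one per table in the file) plus a "DESCR" entry describing them, so `fetch("ff_factors")[0]` would be the first table. pandas-datareader can lag behind new pandas releases, so a pinned environment may be needed.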
Highly useful but not free
Dataset | Unit | Frequency | Website | Description
CRSP | | | | Stock and asset price/return/trading data
Compustat | | | | Accounting data on public firms
I/B/E/S | | | |