1
UUID | Title | Record Creation Timestamp | Shortname | Location | Cost | Description | Terms of use | Timeframe | Documentation | Error Metrics | Citation | Code | Versioning | API or Bulk downloads | Tags | DOI | Related Publications | Related Datasets
2
bd8a562a-ce58-4a61-925d-88f0d0695974PatCit11/17/2020 10:38:00patcithttps://doi.org/10.5281/zenodo.3710993NoneIn-text and front page citations to non-patent literature and in-text patent citations, extracted and parsed. Open source projectCC-BY 4.0 International1836-2018https://cverluise.github.io/PatCit/yesCyril Verluise, Gabriele Cristelli, Kyle Higham, Lucas Violon, & Gaétan de Rassenfosse. (2020). PatCit: A Comprehensive Dataset of Patent Citations (Version 0.3.1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4391095https://github.com/cverluise/PatCitYesBulkCitations, In-text, Front page, Patent, Science, Database, Wikipedia, Standardhttps://doi.org/10.5281/zenodo.3710993https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3754772
3
e65da1db-6608-4246-98a7-c260dfc28e45Chilean IP and firm data11/13/2020 17:20:46chilean_iphttps://eml.berkeley.edu//~bhhall/Chile_ipdata.htmlNoneThese data are a public release from a joint WIPO-INAPI project. not specified1995-2005https://eml.berkeley.edu//~bhhall/Chile_ipdata/chile_inno_ip.txtAbud, M.J., Fink, C., Hall, B. and Helmers, C., 2013. The use of intellectual property in Chile (Vol. 11). WIPO.Bulk
4
50fbdb5a-1288-46e9-b93d-27ac99cd4eb2The scientific knowledge base of low carbon energy technologies (updated and extended version)6/13/2021 20:55:50low_carbon_knowledgehttps://doi.org/10.4119/unibi/2950291NoneThis data publication offers updated data about low-carbon energy technology (LCET) patents and citation links to the scientific literature. Compared to a previous version, it also contains data on biofuels and fuels from waste technologies. The updated version also contains the code (R scripts) used to (1) compile the data and (2) reproduce the statistical analysis, including figures and tables, presented in the final paper Hötte, Pichler, Lafond (2021): "The rise of science in low-carbon energy technologies", RSER. DOI: 10.1016/j.rser.2020.110654. CC BY 4.0 license. See: https://creativecommons.org/licenses/by/4.0/legalcode 1836-2019https://doi.org/10.4119/unibi/2950291NoHötte, Pichler, Lafond (2021): "The rise of science in low-carbon energy technologies", RSER. DOI: 10.1016/j.rser.2020.110654Included in the bulk downloadNoBulkcitations to scholarly literature, low-carbon energy technologieshttps://doi.org/10.4119/unibi/2950291
5
2a0949bb-2f36-45a7-b4cf-109456cec21dChinese Patent Data Project11/14/2020 17:20:46chinese_patent_datahttps://sites.google.com/site/sipopdb/cpdp-homeNoneIn this project, patents from China's State Intellectual Property Office (SIPO) are matched to various types of companies. Matching SIPO patents to firms in the Annual Survey of Industrial Enterprises (ASIE) of China's National Bureau of Statistics.Bulk
6
53f2e34b-8088-42a3-a763-f471c26b5ac6Reliance on Science in Patenting11/16/2020 17:20:46ronshttps://zenodo.org/record/3575146#.XfQZMWRKiUkNoneThis contains citations from the front pages of worldwide patents to articles in the Microsoft Academic Graph (MAG) from 1800-2018. Open Data Commons Attribution License v1.01834-2019https://zenodo.org/record/4235193#.X6Fgb5CSm38YesMarx, Matt and Aaron Fuegi, "Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles"https://github.com/mattmarx/reliance_on_scienceYesBulkerror marginshttps://doi.org/10.5281/zenodo.3575146
7
07ec4549-2429-4e8e-9ee3-6deefca0b075Japanese Patent Office11/15/2020 17:20:46japanese_patent_officehttp://www.iip.or.jp/e/index.htmlNonePatent database of the IIPOnly for use by academic research institutions and other institutions for academic research purposes, cannot be used for commercial purposes.1964-9/2019State that you used: IIP Patent DBBulk
8
bfc3892d-2170-47ed-b056-a573c845efa5MIT Scholarly Works 1950-201811/17/2020 17:20:46mit_scholarlyhttps://lens-public.s3-us-west-2.amazonaws.com/sloan/scholarly/201932/mit_scholarly.zipNoneScholarly works produced by MIT 1950-2018
9
265a814e-a4a5-4302-9cc0-0f78cf1c70fcMIT Scholarly Works Cited by Patents11/18/2020 17:20:46mit_scholarly_citationshttps://lens-public.s3-us-west-2.amazonaws.com/sloan/scholarly/201932/mit_scholarly_cited_by_patents.zipNoneMIT Scholarly Works Cited by Patents 1950-2018
10
6476ac03-71ee-4480-b2aa-e25871179689Patents Citing MIT Publications11/19/2020 17:20:46patents_citing_mithttps://www.lens.org/lens/search/patent/list?collectionId=22790&p=0&n=10NoneThis collection encompasses patents that cite the scholarly works of Massachusetts Institute of Technology.
11
a238826e-8135-4b6d-8b59-615fc9769f03Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database5/14/2022 14:41:04co_authorship_disambiguationhttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/5F1RRINoneName disambiguation of US inventors, 1975-2010CC0 - "Public Domain Dedication"Ronald Lai; Alexander D'Amour; Amy Yu; Ye Sun; Lee Fleming, 2011, "Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database (1975 - 2010)", https://doi.org/10.7910/DVN/5F1RRI, Harvard Dataverse, V5, UNF:5:RqsI3LsQEYLHkkg5jG/jRg== [fileUNF] https://github.com/funginstitute/downloadscoauthor networkhttps://doi.org/10.7910/DVN/5F1RRI
12
3e2ed123-d6c0-46af-8683-e23d64b04efcThe careers and co-authorship networks of U.S. patent-holders, since 197511/21/2020 17:20:46co_authorship_careershttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YJUNUNNoneThe identification enables construction of social networks based on patent co-authorship. We will eventually provide descriptive statistics of individual and collaborative variables and illustrated examples of networks for an individual, an organization, a technology, and a region. The data and code will be publicly available for community use and improvement and will enable updating as frequently as new patents are issued. CC0 - "Public Domain Dedication" Ronald Lai; Alexander D'Amour; Lee Fleming, 2010, "The careers and co-authorship networks of U.S. patent-holders, since 1975", https://doi.org/10.7910/DVN/YJUNUN, Harvard Dataverse, V3, UNF:5:daJuoNgCZlcYY8RqU+/j2Q== [fileUNF] coauthor networkhttps://doi.org/10.7910/DVN/YJUNUN
13
00c6f78f-f689-4d50-a965-812bfd528477Penn World Tables11/22/2020 17:20:46pwthttps://www.rug.nl/ggdc/productivity/pwt/?lang=enNonePWT version 10.0 is a database with information on relative levels of income, output, input and productivity, covering 183 countries between 1950 and 2019. Access to the data is provided in Excel, Stata and online formats.CC 4.01950-2017https://www.rug.nl/ggdc/docs/pwt100-user-guide-to-data-files.pdfFeenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), "The Next Generation of the Penn World Table" American Economic Review, 105(10), 3150-3182, available for download at www.ggdc.net/pwthttps://doi.org/10.15141/S50T0R
14
068fb03e-642a-4896-b61c-ff6a16251e08Worldwide Count of Priority Patents11/23/2020 17:20:46priority_patentshttp://www.gder.info/download_wwc_excel.htmlNoneThe goal of the project was to produce a dataset of priority patent applications filed across the globe, allocated by inventor and applicant location.De Rassenfosse, G., Dernis, H., Guellec, D., Picci, L., & van Pottelsberghe de la Potterie, B. (2013). The worldwide count of priority patents: A new indicator of inventive activity. Research Policy, 42(3), 720–737. doi:10.1016/j.respol.2012.11.002 http://www.gder.info/download_wwc_mysql.html
15
6fe3b5e5-93a8-4f07-9331-d9998b9000b8Geocoding of worldwide patent data11/24/2020 17:20:46geocoding_patentshttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/OTTBDXNoneCC0 - "Public Domain Dedication" 30 yearsA detailed data description can be found in de Rassenfosse, Kozak, Seliger 2019: Geocoding of worldwide patent data, published in 'Scientific Data' and available at https://doi.org/10.1038/s41597-019-0264-6Seliger, Florian; Kozak, Jan; de Rassenfosse, Gaétan, 2019, "Geocoding of worldwide patent data", https://doi.org/10.7910/DVN/OTTBDX, Harvard Dataverse, V5 https://github.com/seligerf/Imputation-of-missing-location-information-for-worldwide-patent-dataYgeographyhttps://doi.org/10.7910/DVN/OTTBDX
16
d76b71a1-2f43-447d-b296-a1b52db6e3d7On the price elasticity of demand for patents11/25/2020 17:20:46patent_price_elasticityhttp://www.gder.info/download_OBES_data.htmlNoneFees since 1980 at the European (EPO), the US and the Japanese patent offices.Rassenfosse, G. de, & Potterie, B. van P. de la. patent demand
17
c66bdabd-a80c-4a7e-b9b9-f706e4ed7395Patents arising from U.S. government funding11/26/2020 17:20:46us_gov_patentshttps://zenodo.org/record/3369582NoneDataset of patents arising from government funding since the year 2000. CC-BY 4.0 International2000-2019de Rassenfosse Gaétan, & Emilio Raiteri. (2019). 3PFL: Database of Patents and Publications with a Public-Funding Linkage (Version 1.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3369582https://doi.org/10.5281/zenodo.3369582
18
e390a212-3a92-4d8f-ac4d-ca2c960a36d3PATSTAT11/27/2020 17:20:46patstathttps://www.epo.org/searching-for-patents/business/patstat.html#tab3€975.00 - € 1460.00PATSTAT contains bibliographical and legal event patent data from leading industrialised and developing countries. This is extracted from the EPO’s databases and is either provided as bulk data or can be consulted online. Requires a subscription to accessPATSTATpatstat cookbook' by Gaétan de Rassenfosse https://onlinelibrary.wiley.com/doi/full/10.1111/1467-8462.12073
19
c39f4844-5ae2-4dcb-bf2c-d6b957125704Lens.org11/28/2020 17:20:46lenshttps://lens.org/NoneLens serves nearly all of the patent documents in the world as open, annotatable digital public goods that are integrated with scholarly and technical literature along with regulatory and business data. The Lens will allow document collections, aggregations, and analyses to be shared, annotated, and embedded to forge open mapping of the world of knowledge-directed innovation. Cambia grants you a non-exclusive, non-transferable, revocable, limited license to access and personally use the features of the Service. The conditions by which The Lens data may be used are intended to resonate with the principles of Creative Commons Attribution licenses with a public benefit element.Please use the expression 'Enabled by The Lens' or 'Data Sourced from The Lens' and the Lens.org URL.
20
9c4124ed-5337-4b36-a1c9-7cf256a3384bMicrosoft Academic Graph11/29/2020 17:20:46maghttps://academic.microsoft.com/homeNoneThe Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals, conferences, and fields of study. ODC-BYArnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MA) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246. DOI=http://dx.doi.org/10.1145/2740908.2742839 K. Wang et al., “A Review of Microsoft Academic Services for Science of Science Studies”, Frontiers in Big Data, 2019, doi: 10.3389/fdata.2019.00045Microsoft Academic
21
233d7290-f32f-46bb-8a6d-8837e59d9ffbCrios‐Patstat Database11/30/2020 17:20:46crios_patstathttps://www.icrios.unibocconi.eu/wps/wcm/connect/Cdr/Icrios/Home/Resources/Databases/PATENTS-ICRIOS+database/NoneDisambiguated inventor's and applicant's names for EPO records.EPO LicenseFor a detailed description of the algorithm please refer to Coffano, Monica and Tarasconi, Gianluca, Crios - Patstat Database: Sources, Contents and Access Rules (February 1, 2014). Available at SSRN: http://ssrn.com/abstract=2404344Coffano, M., & Tarasconi, G. (2014). CRIOS - Patstat Database: Sources, Contents and Access Rules. SSRN Electronic Journal. doi:10.2139/ssrn.2404344
22
d9cf4e57-a90e-4d18-8a3b-08fea43a2f49NBER US Patent Data Project12/1/2020 17:20:46nber_citationhttps://sites.google.com/site/patentdataproject/Home/downloadsNoneThe main dataset extends from Jan 1, 1963, through December 30, 1999, and includes all the utility patents granted during that period. The citations file includes all citations made by patents granted in 1975-1999.1963-1999The main dataset extends from Jan 1, 1963, through December 30, 1999, and includes all the utility patents granted during that period. The citations file includes all citations made by patents granted in 1975-1999.Bronwyn H. Hall, Jim Bessen, Grid Thoma
23
cf1780b1-e265-4e49-8d1d-83b9cfe0fd9aUSPTO PatentsView12/2/2020 17:20:46patentsviewhttps://www.patentsview.org/download/NonePatentsView provides US patent data, including raw data, disambiguations of inventors and assignees, and inventor gender.Creative Commons Attribution 4.0 International License.1963-1999Provided at linkAttribution should be given to PatentsView for use, distribution, or derivative works.https://github.com/CSSIP-AIR/PatentsView-Code-Snippets/
24
6f3605ad-5edb-4a73-8b3b-6d6d35064d4cMicrosoft Academic Knowledge Graph12/3/2020 17:20:46makghttp://ma-graph.org/NoneA large RDF data set with over eight billion triples with information about scientific publications and related entities, such as authors, institutions, journals, and fields of study. The data set is based on the Microsoft Academic Graph and licensed under the Open Data Attributions license. Furthermore, we provide entity embeddings for all 210M represented scientific papers.Open Data Commons Attribution License (ODC-By) v1.0@inproceedings{DBLP:conf/semweb/Farber19, author = {Michael F{\"{a}}rber}, title = "{The Microsoft Academic Knowledge Graph: {A} Linked Data Source with 8 Billion Triples of Scholarly Data}", booktitle = "{Proceedings of the 18th International Semantic Web Conference}", series = "{ISWC'19}", location = "{Auckland, New Zealand}", pages = {113--129}, year = {2019}, url = {https://doi.org/10.1007/978-3-030-30796-7\_8}, doi = {10.1007/978-3-030-30796-7\_8} }https://github.com/michaelfaerber/makg-linkingMicrosoft Academic
25
303ce18b-f411-4752-9fe6-d4fcc369f43cIPRoduct12/4/2020 17:20:46iproducthttps://iproduct.io/appNoneThe IPRoduct project seeks to link innovative goods to the patents upon which they are based. By directly linking products to patents, this project tracks innovation to the point where it meets consumers, the true commercial end point of investments in Science & Technology. The output of the project is a database of linked product-patent pairs that is made publicly available.Products
26
50c1e32c-d2f5-4328-be8e-b7f172772a26Replication Data for: Government-funded research increasingly fuels innovation12/05/2020 17:20:46gov_research_fuels_innovationhttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRCNoneThis includes patent level metadata, 1926-1975 (OCRed from USPTO Image PDF files), 1976-2017 (parsed from USPTO HTML files), patent meta data, CPC, geography, agencies, entity size of the patent owner etc, government support categories at patent level and finally, aggregate yearly statistics. (2019-06-02) CC0 - "Public Domain Dedication"1926-1975 and 1975-2017 Lee Fleming; Hillary Green; Guan-Cheng Li; Matt Marx; Dennis Yao, 2019, "Replication Data for: Government-funded research increasingly fuels innovation", https://doi.org/10.7910/DVN/DKESRC, Harvard Dataverse, V4, UNF:6:kMIqsh3DCvKiKYgMT6/H8A== [fileUNF] https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRC
27
d24e8a7e-7d27-4280-9d85-c6598a1b9b8eGoogle Patents Public Datasets12/6/2020 17:20:46google_patents_publichttps://console.cloud.google.com/marketplace/details/google_patents_public_datasets/google-patents-public-dataNoneWorldwide (100+ countries) bibliographic and USPTO full-text, available via BigQuery. Provided by IFI CLAIMS Patent Services, a worldwide bibliographic and US full-text dataset of patent publications. Updated quarterly.CC BY 4.0, requires subscription to query API1834-present (quarterly)https://cloud.google.com/blog/topics/public-datasets/google-patents-public-datasets-connecting-public-paid-and-private-patent-data“Google Patents Public Data” by IFI CLAIMS Patent Services and Google, used under CC BY 4.0patent analysis sample code: https://github.com/google/patents-public-data, source code not accessibleYes, quarterlyAPI, Bulk exportGoogle Patents
28
ff4ffcf9-5721-4148-ac59-140b9ed4dab5Semantic Scholar Open Research Corpus12/7/2020 17:20:46sem_scholar_open_researchhttp://s2-public-api-prod.us-west-2.elasticbeanstalk.com/corpus/Semantic Scholar's records for research papers published in all fields provided as an easy-to-use JSON archive. ODC-BYWaleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL https://www.semanticscholar.org/paper/09e3cf5704bcb16e6657f6ceed70e93373a54618 Citation affect
29
e80542a8-a9bb-4205-8364-c0e9f3a2b683UVA Darden Global Corporate Patent Dataset (disambiguated assignees)11/13/2020 17:47:00uva_global_corporate_patentshttps://patents.darden.virginia.edu/The dataset has information on about 3 million USPTO patents, which were granted between 1980 and 2017, assigned to publicly listed companies worldwide, and linked to those assignee companies using the following identifiers: Unique Patent Number, as given by the USPTO, GVKEY, as the firm identifier, from the S&P Compustat Global database. CC BY-NC 4.0 Attribution-NonCommercial 4.0 International1980-2017https://patents.darden.virginia.edu/documents/DataConstructionDetails_v01.pdfJan Bena, Miguel A. Ferreira, Pedro Matos, and Pedro Pires. "Are foreign investors locusts? The long-term effects of foreign institutional ownership." Journal of Financial Economics 126, no. 1 (2017): 122-146
30
f2fcc603-7883-4e18-a82a-6275ffd82e98DISCERN patent/compustat crosswalk11/13/2020 17:47:00discernhttps://zenodo.org/record/4320782#.YONFTugzY2wPatents (as well as scientific articles, and NPL citations at the aggregate firm-level) matched to U.S. Compustat firms over the period 1980-2015. In extending the match to Compustat up to 2015, we address two major challenges: name changes and ownership changes. Our UO and subsidiary historical standardized firm name lists, including the dynamic reassignment, are publicly available for researchers to match to their database of interest.1980-2015Provided at linkArora Ashish, Belenzon Sharon, and Sheer Lia, 2021. "Knowledge spillovers and corporate investment in scientific research". American Economic Review, 111(3), pp.871-98.

Arora Ashish, Belenzon Sharon, and Sheer Lia, 2021. "Matching patents to Compustat firms, 1980–2015: Dynamic reassignment, name changes, and ownership structures". Research Policy, 50(5), p.104217.
YesBulkCompustat, Patents, Publications, NPL, Name changes, Dynamic reassignment, GVKEYhttps://doi.org/10.5281/zenodo.4320782
31
f1a7dfa7-c1f0-4414-a6b9-5a0f0d0e37f1Patent Citation Similarity11/14/2020 17:47:00patent_citation_similarityhttps://storage.googleapis.com/jmk_public/Kuhn-Younge-Marco_Patent_Citation_Similarity_2017-10-23.csvMany studies of innovation rely on patent citations to measure intellectual lineage and impact. To create this dataset, we use a vector space model of patent similarity to compute the technological similarity between each pair of citing-cited patents. The VSM model analyzes the full text of each document to position it as a vector in a vector space that includes more than 700,000 dimensions and then calculates the angular distance between the two vectors. The dataset includes similarity values for all citations made by patents issued between 1976 and 2017 to issued patents or published patent applications.These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article.1976-2017Paper: https://ssrn.com/abstract=2714954https://ssrn.com/abstract=2714954
32
b547441d-efdd-4b30-8c78-852d68c9c2acPatent Scope and Examiner Toughness11/15/2020 17:47:00patent_scope_toughnesshttps://storage.googleapis.com/jmk_public/Kuhn-Thompson_Patent_Scope_2017-10-23.csvThis dataset includes an easy-to-use measure of patent scope that is grounded both in patent law and in the practices of patent attorneys. Our measure counts the number of words in the patents’ first claim. The longer the first claim, the less scope a patent has. This is because a longer claim has more details – and all those details must be met for another invention to be infringing. Hence, the more details there are in the patent, the greater are the opportunities for others to invent around it. We validate our measure by showing both that patent attorneys’ subjective assessments of scope agree with our estimates, and that the behavior of patenters is consistent with it. To facilitate drawing causal inferences with our measure, we show how it can be used to create an instrumental variable, patent examiner Scope Toughness, which we also validate.These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article.Need to check paper https://ssrn.com/abstract=2977273Not unless it’s in the paperExaminershttps://ssrn.com/abstract=2977273
33
2d88904f-056b-4230-96b4-f70c178d9f88Patent Citation Timing and Source11/16/2020 17:47:00patent_citation_timinghttps://storage.googleapis.com/jmk_public/Kuhn-Younge-Marco_Patent_Citation_Source_and_Timing_2017-09-25.csvInnovation studies frequently distinguish between patent citations submitted by the patent examiner and those submitted by the patent applicant. However, publicly available citations data is often misleading, for instance by attributing a patent citation to the patent examiner when it was in fact first submitted by the patent applicant. This dataset uses internal USPTO data to identify the date on which each citation was first submitted as well as the party (examiner or applicant) who first submitted it. The dataset includes observations for citations made by patents issued 2001-2014, although some level of leftward truncation is evident due to limitations in internal data availability at the USPTO.These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article.2001-2014Not unless it’s in the paper here https://ssrn.com/abstract=2714954https://ssrn.com/abstract=2714954
34
eaee5eaa-985b-4ba5-a13a-797d3cfeef1fPatent Families Datasetpatent_familieshttps://storage.googleapis.com/jmk_public/Younge-Kuhn_Patent_Families_2017-09-25.csvPatent applicants frequently file groups of patent applications linked together by priority claims. These priority claims create families of patent applications that share features such as inventors, priority dates, and technical descriptions. By analyzing these linkages, each patent can be assigned a family identifier that it shares with other patents in the same family. This data set includes two levels of family identifiers (clone for near copies, and extended for more attenuated linkages) for each patent issued 2005-2014These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article.2005-2014Not unless it’s in the paper (https://ssrn.com/abstract=2709238)https://ssrn.com/abstract=2709238
35
f9127a91-85f3-483d-a817-437671875d56Geography of patentspatent_geographyhttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BPC15W1836-1975https://www.nature.com/articles/sdata201674#MOESM51Petralia, S., Balland, PA. & Rigby, D. Unveiling the geography of historical patents in the United States from 1836 to 1975. Sci Data 3, 160074 (2016). https://doi.org/10.1038/sdata.2016.74doi:10.7910/DVN/BPC15W
36
e77ef2c0-6a35-437a-8893-83eb88ad7bc9Inventor disambiguationinventor_disambiguationhttps://dataverse.harvard.edu/dataverse/patent1975-2010Ronald Lai; Alexander D'Amour; Amy Yu; Ye Sun; Lee Fleming, 2011, "Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database (1975 - 2010)", https://doi.org/10.7910/DVN/5F1RRI, Harvard Dataverse, V5, UNF:5:RqsI3LsQEYLHkkg5jG/jRg== [fileUNF]Disambiguation
37
f1561d9b-8512-470f-abed-557d6e3e19adPatent-to-article intext citations for 244 journalspatent_to_article_intexthttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZEZWBX197?-2015?Bryan, Kevin, 2019, "In-Text Patent Citation Database Bryan/Ozcan/Sampat Beta version .9", https://doi.org/10.7910/DVN/ZEZWBX, Harvard Dataverse, V2, UNF:6:+28YcwvDoaxFl/9hPXQaSA== [fileUNF]
38
798f092c-3597-41bb-be5d-e5eb15c2b5d3Patent valuepatent_valuehttps://iu.box.com/patentsUpdated Mar 19, 2014 by Noah Stoffman1926-2010
39
fa668908-1b25-4582-92aa-3d8bf4d3085aGovernment-funded US patentsgov_funded_us_patentshttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRCThis includes patent level metadata, 1926-1975 (OCRed from USPTO Image PDF files), 1976-2017 (parsed from USPTO HTML files), patent meta data, CPC, geography, agencies, entity size of the patent owner etc, government support categories at patent level and finally, aggregate yearly statistics. (2019-06-02) CC0 - "Public Domain Dedication" 1926-2017https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRCYesLee Fleming; Hillary Green; Guan-Cheng Li; Matt Marx; Dennis Yao, 2019, "Replication Data for: Government-funded research increasingly fuels innovation", https://doi.org/10.7910/DVN/DKESRChttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRCNBulk
40
131e13f8-342c-4dd7-a3e6-fbf5a5ba6a5cPatentCitypatentcityhttps://mailchi.mp/e0495246a573/patentcityData coming soon; accessible via Google BigQueryhttps://github.com/Antoberge/patent_citycoming soon
41
44f33a6f-5099-4481-abed-af9aadf0bd4fPatent text: code, data, and new measurespatent_text_new_measureshttps://zenodo.org/record/3515985Different open access data files related to the text of USPTO patent documents, including 1) for each US patent a list of processed, cleaned and stemmed keywords, 2) for each patent a list of the 1,000 most similar patents (based on cosine similarity) from the entire population of US patents, 3) for each US patent the average cosine similarity with all prior patents from the previous 5 years, and the average cosine similarity with all later patents in the following 5 years, 4) each new keyword (unigram), bigram (sequence of two adjacent keywords), trigram, and pairwise keyword combination introduced for the first time in history by a US patent, the number of the patent introducing it for the first time, and the total number of patents from the entire population using these new keywords, bigrams, trigrams, and new keyword combinations.Open Data Commons Attribution License v1.01969-2018https://zenodo.org/record/3515985YesArts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144)https://github.com/sam-arts/respol_patents_codeYesBulkpatent measures, text, natural language processing, novelty, impact, USPTO, technological progresshttps://doi.org/10.5281/zenodo.3515985Arts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144)
42
e22dcf03-9504-48c7-9cb4-468d98ec2bb2Matched inventor ages from patents, based on web scraped sources08/12/2021, 15:17:03matched_inventor_ageshttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YRLSKU
We use name and location information about U.S.-residing inventors from patents to search publicly available online web directories for age and date-of-death information, and we build a scoring system to indicate the quality of the information we collect. After applying a variety of heuristics and robustness checks, we find 1,508,676 inventor ages. We also find the death dates of 206,589 inventors, though we are less confident in their accuracy.

@article{kaltenberg_matched_2021,
title = {Matched inventor ages from patents, based on web scraped sources},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YRLSKU},
doi = {10.7910/DVN/YRLSKU},
abstract = {We use information about U.S. residing inventors from patents which include name and location and search for age and date of death information from...},
language = {en},
urldate = {2021-08-12},
author = {Kaltenberg, Mary and Jaffe, Adam and Lachman, Margie E.},
month = may,
year = {2021},
note = {type: dataset},
}
Inventors, Ages, Gender, Death Dates, Patents
43
fddedcfc-9f4e-47c6-bc82-3e04bb3c4262Newspaper.com Index08/12/2021, 15:45:21newspaper_comhttps://elisabethperlman.net/code.html
44
6ba552a7-ec31-4710-9d8b-d8177b293a90Tools for Harmonizing County Boundaries08/13/2021, 08:55:41harmonising_county_boundarieshttps://elisabethperlman.net/code.htmlThis tool creates the csv tables that allow county boundaries to be synchronized to a base year, exported to the directory you run this from. While this code takes shape files of any type and performs an intersect, it was written to follow the method used in Hornbeck (2010) (see https://www.dropbox.com/s/1cygkeoo4p89vrw/BWreplication_BorderFixes.rar for those replication files), that is to say, I wrote it to take shapefiles of US counties from NHGIS from a selection of years and then to reapportion them by area to the boundaries as they were in a base year. The Stata code that uses these csvs was written to be used with Haines' census data (ICPSR 02896).
45
1f556a96-61fc-4d4c-a046-ed711d9807f9Long-Term Productivity database08/16/2021, 13:46:40long_term_productivityhttp://longtermproductivity.com/download.htmlThe Long-Term Productivity database was created as a project at the Bank of France in 2013 by Antonin Bergeaud, Gilbert Cette and Remy Lecat. Following the work of Cette, Mairesse and Kocoglu (2009), we extended the database to include 17 countries in the latest version (2016). The latest version of the database includes the following countries -- Australia, Belgium, Canada, Denmark, Germany, Finland, France, Italy, Japan, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom, United States. We offer data on Total Factor Productivity per hour worked, Labor productivity per hour worked, capital intensity and GDP per capita. These series cover at least the period 1890 to present annually. In addition, other data corresponding to each of the papers linked to this project are available. This includes age of capital stock, education attainment, electricity production per capita. 1890-2016
46
410dd9de-2520-4f57-a409-0ade7ec11b65Collection of Historical Data on the Uses of Petroleum International Network08/16/2021, 14:36:05uses_of_petroleumhttp://www.longtermproductivity.com/chdupin/
47
bf073285-5243-4dc6-a990-c8a8c3f79898Classification Data for "Classifying Patents Based on their Semantic Content"08/17/2021, 08:40:25classifying_patents_semantic_contenthttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZULMOYClassification Data for Bergeaud, Potiron and Raimbault, 2017, Classifying Patents Based on their Semantic Content.
@article{bergeaud_classification_2017,
title = {Classification {Data} for "{Classifying} {Patents} {Based} on their {Semantic} {Content}"},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZULMOY},
abstract = {Classification Data for Bergeaud, Potiron and Raimbault, 2017, Classifying Patents Based on their Semantic Content.},
language = {en},
urldate = {2021-08-17},
author = {Bergeaud, Antonin and Potiron, Yoann and Raimbault, Juste},
month = apr,
year = {2017},
note = {type: dataset},
}
48
f61ebc77-4082-43c5-ae60-383a756ce308List of USPTO patents from US universities08/17/2021, 09:11:41us_university_patentshttps://sites.google.com/site/abergeaudeco/data?authuser=0From the paper "Innovation and Top Income Inequality" (Aghion, Akcigit, Bergeaud, Blundell, Hémous). This dataset lists all USPTO patents from 1969 to 2016 whose assignee is a university and gives the name and state of that university (originally taken from USPTO and improved). 1969-2016
49
fb81106d-3933-488b-acd9-aff177f82423HistPat International Dataset08/17/2021, 09:21:25histpat_internationalhttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QT4OJSHistPat International provides the geography of historical patents granted to foreigns by the United States Patent and Trademark Office (USPTO) fro...
@article{petralia_histpat_2019,
title = {{HistPat} {International} {Dataset}},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QT4OJS},
doi = {10.7910/DVN/QT4OJS},
abstract = {HistPat International provides the geography of historical patents granted to foreigns by the United States Patent and Trademark Office (USPTO) fro...},
language = {en},
urldate = {2021-08-17},
author = {Petralia, Sergio},
month = mar,
year = {2019},
note = {type: dataset},
}
50
40f30ff4-d152-4aa8-89a9-e31dddc812dcHistPat Dataset08/24/2021, 15:31:52histpathttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BPC15WHistPat provides the geography of historical patents granted by the United States Patent and Trademark Office (USPTO) from 1790 to 1975. This histo...
@article{petralia_histpat_2019,
title = {{HistPat} {Dataset}},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BPC15W},
doi = {10.7910/DVN/BPC15W},
abstract = {HistPat provides the geography of historical patents granted by the United States Patent and Trademark Office (USPTO) from 1790 to 1975. This histo...},
language = {en},
urldate = {2021-08-24},
author = {Petralia, Sergio and Balland, Pierre-Alexandre and Rigby, David},
month = jan,
year = {2019},
note = {type: dataset},
}
10.7910/DVN/BPC15W
51
9651d1f2-3c24-46ef-9ade-e2e31f4ffe12BACI08/24/2021, 15:32:40bacihttp://www.cepii.fr/CEPII/en/bdd_modele/presentation.asp?id=37BACI provides disaggregated data on bilateral trade flows for more than 5000 products and 200 countries.
52
1b372a68-18ae-45e3-9a28-a6feecc3e7b8Matching SIPO patents to Chinese listed firms ("Main Board")08/17/2021, 11:16:07sipo_matchinghttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/CF1IXOMatching SIPO patents to Chinese listed firms ("Main Board"). Please refer to the user documentation "Chinese Patent Database User Documentation: M... through 2016? Last updated 2017. See also this 2019 update with 3 varieties (now #66): https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QUH8KT
@article{he_matching_2019,
title = {Matching {SIPO} patents to {Chinese} listed firms ("{Main} {Board}")},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/CF1IXO},
doi = {10.7910/DVN/CF1IXO},
abstract = {Matching SIPO patents to Chinese listed firms ("Main Board"). Please refer to the user documentation "Chinese Patent Database User Documentation: M...},
language = {en},
urldate = {2021-08-17},
author = {He, Zi-Lin and Tong, Tony and Zhang, Yuchen and He, Wenlong},
month = dec,
year = {2019},
note = {type: dataset},
}
53
5ab54caa-f53c-4537-8dac-8bf20cab594eGPT Indicators08/17/2021, 11:25:28gpt_indicatorshttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PQGHKAThis database contains yearly technology-level measures of Growth, Use Complementarity (UC) and Innovation Complementarity (IC) since 1920 for all ...
@article{petralia_gpt_2020,
title = {{GPT} {Indicators}},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PQGHKA},
doi = {10.7910/DVN/PQGHKA},
abstract = {This database contains yearly technology-level measures of Growth, Use Complementarity (UC) and Innovation Complementarity (IC) since 1920 for all ...},
language = {en},
urldate = {2021-08-17},
author = {Petralia, Sergio},
month = mar,
year = {2020},
note = {type: dataset},
}
54
fb46d05b-2bd9-41fc-a739-91b77a2e85d6Imputation of missing applicant country codes in worldwide patent data08/17/2021, 11:51:42missing_applicant_codeshttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XNTL0WWe present a general method for imputing missing information in the Worldwide Patent Statistical Database (PATSTAT) and make the resulting datasets publicly available. The PATSTAT database is the de facto standard for academic research using patent data. Complete information on patents is essential to obtain an accurate picture of technological activities across countries and over time. However, the coverage of the database is far from complete. Our data imputation method exploits detailed institutional knowledge about the international patent system, and we codify it in a SQL algorithm. We provide two datasets related to the imputation of missing country codes and missing technology classification. We also release the algorithm that can be easily adapted to impute other pieces of information that are missing in PATSTAT. CC0 - "Public Domain Dedication" https://www.sciencedirect.com/science/article/pii/S2352340920314955
@article{seliger_imputation_2020,
title = {Imputation of missing applicant country codes in worldwide patent data},
url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XNTL0W},
doi = {10.7910/DVN/XNTL0W},
abstract = {The file ctry\_app\_person.txt contains identifiers for patent first filings and the applicant (corresponding to appln\_id and person\_id in PATSTAT) a...},
language = {en},
urldate = {2021-08-17},
author = {Seliger, Florian},
month = oct,
year = {2020},
note = {type: dataset},
}
https://github.com/seligerf/Imputation-of-missing-location-information-for-worldwide-patent-data Patents, Location of applicants, PATSTAT, Imputationhttps://doi.org/10.7910/DVN/XNTL0W https://doi.org/10.1016/j.dib.2020.106615
55
46a031fd-8827-4bab-91b3-b41ca447f152Patent Examination Data System08/28/2021, 16:51:00pedshttps://ped.uspto.gov/peds/#!/PEDS contains the bibliographic, published document and patent term extension data tabs in Public PAIR from 1981 to present. There is also some data dating back to 1935. The data can be accessed by anyone using the web interface or the provided Application Programming Interface (API). PEDS is updated daily and mirrors the data available in the Patent Application Location and Monitoring system (PALM). PEDS provides access to public applications including: published patent applications and patents. PCT applications that have not been published by WIPO. Any applications that have not been released by the USPTO will not be available in PEDS.terms given here: https://www.uspto.gov/sites/default/files/documents/Patent%20Electronic%20System%20Access%20Document_0.pdf1981-2021https://ped.uspto.gov/peds/#!/#%2FuserManualAPI and Bulk
56
fc72efb0-8b24-4415-9b50-b0b7f33dc8b4Indian Patent Advanced Search System08/31/2021, 08:28:19india_patent_databasehttps://ipindiaservices.gov.in/publicsearchPlatform for accessing Indian public patent data NoneNoneNoneNoneNoneNoneNoneinnovation, platform
57
5d387b72-6d6c-4479-8626-e9a1a9b693f7UK IPO09/02/2021, 09:58:24uk_ipohttps://www.gov.uk/government/publications/ipo-patent-dataSnapshots of British patent/SPC applications received and subsequently published by the Intellectual Property Office.Open Government License 3.0 https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
58
a16242e8-fe81-49eb-bf1d-4df0a1927738Monthly statistics -- Patents, trade marks, and designs09/02/2021, 10:13:39uk_ipo_monthlyhttps://www.gov.uk/government/statistics/monthly-statistics-patents-trade-marks-and-designs-july-2021These statistics include monthly data for designs, patents, trade marks.
59
29154d41-30ef-4539-b428-819ca4c66965Open Sourced Database for CEO Dismissal 1992-201809/02/2021, 11:24:03ceo_dismissalhttps://zenodo.org/record/4618103CEO Dismissal data for S&P 1500 Companies 1992-2018
@misc{richard_j._gentry_open_2021,
title = {Open {Sourced} {Database} for {CEO} {Dismissal} 1992-2018},
url = {https://zenodo.org/record/4618103},
abstract = {There is a newer version of this database - please check the right-hand navigation for the latest version. We update the change log, versioning and other information on a Google Doc that is updated and continuous between posted versions of this database. We have included a snapshot of the documentation file here to help with future use along with an Excel version of the file for non-STATA users. This document also includes information on submitting edits and corrections to the open source data, which we welcome and encourage. We will acknowledge the participation of editors in the versioning changes at the bottom of the Google Doc.  This revision includes potentially relevant 8k filings from 270 days before and after the CEO's departure date. These filings were not all useful for understanding the departure, but might be useful in general.   If you would like to get an email notification when we update the database, sign-up here. We're happy to let you know when it is updated.},
urldate = {2021-09-02},
publisher = {Zenodo},
author = {{Richard J. Gentry} and {Joseph Harrison} and {Timothy Quigley} and {Steven Boivie}},
month = feb,
year = {2021},
doi = {10.5281/zenodo.4618103},
note = {type: dataset},
keywords = {CEO Dismissal, Management, Strategic Management},
}
CEO, Dismissal Management, Strategic ManagementDOI: 10.5281/zenodo.4618103
type: dataset
60
1a7fc85d-38af-4fe6-83b8-0d629e85d418A large-scale COVID-19 Twitter chatter dataset for open scientific research09/07/2021, 16:35:04covid_twitter_chatterhttps://zenodo.org/record/5458943Dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts as we filtered other data we were collecting for other research purposes, however, one can see the dramatic increase as the awareness for the virus spread. Dedicated data gathering started from March 11th yielding over 4 million tweets a day.

The data collected from the stream captures all languages, but the most prevalent are English, Spanish, and French. We release all tweets and retweets in the full dataset, and a cleaned version with no retweets. There are several practical reasons for us to keep the retweets; tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms, the top 1000 bigrams, and the top 1000 trigrams. Some general statistics per day are included for both datasets.
Jan 2020-presenthttp://www.panacealab.org/covid19/
@misc{banda_large-scale_2021,
title = {A large-scale {COVID}-19 {Twitter} chatter dataset for open scientific research - an international collaboration},
url = {https://zenodo.org/record/5458943},
abstract = {Version 78 of the dataset. The peer-reviewed publication for this dataset has now been published  in Epidemiologia an MDPI journal, and can be accessed here: https://doi.org/10.3390/epidemiologia2030024. Please cite this when using the dataset. Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. Since our first release we have received additional data from our new collaborators, allowing this resource to grow to its current size. Dedicated data gathering started from March 11th yielding over 4 million tweets a day. We have added additional data provided by our new collaborators from January 27th to March 27th, to provide extra longitudinal coverage. Version 10 added {\textasciitilde}1.5 million tweets in the Russian language collected between January 1st and May 8th, gracefully provided to us by: Katya Artemova (NRU HSE) and Elena Tutubalina (KFU). From version 12 we have included daily hashtags, mentions and emoijis and their frequencies the respective zip files. From version 14 we have included the tweet identifiers and their respective language for the clean version of the dataset. Since version 20 we have included language and place location for all tweets. The data collected from the stream captures all languages, but the higher prevalence are:  English, Spanish, and French. We release all tweets and retweets on the full\_dataset.tsv file (1,198,902,806 unique tweets), and a cleaned version with no retweets on the full\_dataset-clean.tsv file (306,791,449 unique tweets). There are several practical reasons for us to leave the retweets, tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent\_terms.csv, the top 1000 bigrams in frequent\_bigrams.csv, and the top 1000 trigrams in frequent\_trigrams.csv. 
Some general statistics per day are included for both datasets in the full\_dataset-statistics.tsv and full\_dataset-clean-statistics.tsv files. For more statistics and some visualizations visit: http://www.panacealab.org/covid19/  More details can be found (and will be updated faster at: https://github.com/thepanacealab/covid19\_twitter) and our pre-print about the dataset (https://arxiv.org/abs/2004.03688)  As always, the tweets distributed here are only tweet identifiers (with date and time added) due to the terms and conditions of Twitter to re-distribute Twitter data ONLY for research purposes. They need to be hydrated to be used.},
urldate = {2021-09-07},
publisher = {Zenodo},
author = {Banda, Juan M. and Tekumalla, Ramya and Wang, Guanyu and Yu, Jingyuan and Liu, Tuo and Ding, Yuning and Artemova, Katya and Tutubalina, Elena and Chowell, Gerardo},
month = sep,
year = {2021},
doi = {10.5281/zenodo.5458943},
note = {type: dataset},
keywords = {social media, twitter, nlp, covid-19, covid19},
}
https://github.com/thepanacealab/covid19_twitterYesBulksocial media, twitter, nlp, covid-19, covid19, twitter, covid, open-sourceDOI: 10.5281/zenodo.5458943
type: dataset
61
fcf09f34-d5a8-483d-94a3-09a03c167100Biospolar Antarctic Literature and Patents09/10/2021, 08:10:36biospolarhttps://osf.io/py6ve/Mapping the scientific and patent landscapes for biodiversity based research and innovation from Antarctica and the Southern Ocean. Created under the Biospolar Project, Research Council of Norway
@article{oldham_biospolar_2019,
title = {Biospolar {Antarctic} {Literature} and {Patents}},
url = {https://osf.io/py6ve/},
doi = {10.17605/OSF.IO/PY6VE},
abstract = {Mapping the scientific and patent landscapes for biodiversity based research and innovation from Antarctica and the Southern Ocean. Created under the Biospolar Project, Research Council of Norway (RCN project number 257631/E10)
Hosted on the Open Science Framework},
language = {en},
urldate = {2021-09-10},
author = {Oldham, Paul},
month = may,
year = {2019},
}
CC-By Attribution 4.0 Internationalantarctic, krill10.17605/OSF.IO/PY6VE
62
868eaad1-3c6a-4730-a70f-853996962d39US Patent Similarity Data09/15/2021, 05:50:18us_patent_similarityhttps://zenodo.org/record/3552078Pairwise semantic similarity measures for US utility patents. Includes measures for citing/cited patent pairs, 100 most-similar patents for each patent, and doc2vec vectors for each patent. Creative Commons Attribution 4.0 International
@misc{whalen_us_2019,
title = {{US} {Patent} {Similarity} {Data}},
url = {https://zenodo.org/record/3552078},
abstract = {Pairwise semantic similarity measures for US utility patents. Includes measures for citing/cited patent pairs, 100 most-similar patents for each patent, and doc2vec vectors for each patent.},
urldate = {2021-09-15},
publisher = {Zenodo},
author = {Whalen, Ryan and Lungeanu, Alina and DeChurch, Leslie and Contractor, Noshir},
month = nov,
year = {2019},
doi = {10.5281/zenodo.3552078},
note = {type: dataset},
keywords = {patents, intellectual property, innovation, semantic similarity, empirical legal studies},
}
Yes patents, intellectual property, innovation, semantic similarity, empirical legal studies, patents, similarity10.5281/zenodo.3552078
63
eb43fc38-8786-4b0f-b3b8-b9d610f456edPatstat Register10/04/2021patstat_registerhttps://www.epo.org/searching-for-patents/business/patstat.html€ 1.420,00 - € 1.460,00This database contains bibliographic and legal event data on published European and Euro-PCT patent applications.

Like the core PATSTAT database, it is maintained by the EPO; however, PATSTAT Register contains only information about patent applications at the European Patent Office (EPO). That information is considerably deeper and more detailed than in the core database.
Requires a subscription to accesshttps://www.epo.org/searching-for-patents/business/patstat.htmlhttps://github.com/gderasse/patstat_register
64
3360e0a5-ee9b-47d3-91df-9348b86af0cfPATENTSCOPE10/13/2021patentscopehttps://www.wipo.int/patentscope/en/The PATENTSCOPE database provides access to international Patent Cooperation Treaty (PCT) applications in full text format on the day of publication, as well as to patent documents of participating national and regional patent offices.1978-2021https://patentscope.wipo.int/search/en/help/help.jsfpatents, legal
65
fc08c62e-5eae-4831-9eae-4a59276e29fcWIPO PATENT REGISTER PORTAL10/13/2021patent_registerhttps://www.wipo.int/patent_register_portal/en/index.htmlThe WIPO's Patent Register Portal gives details of the availability of online patent registers by country / jurisdiction, as well as their search functionalities and the type of information they provide.geography, index, patents
66
66a84027-0208-4096-a96e-cfff30942626Matching SIPO patents update8/2019sipo_updatehttps://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QUH8KTNoneUpdate to #53, part of #5through 2018?
67
7da1dc8e-9e6c-4a53-9571-1b2f527a5dcdEPO worldwide bibliographic data (DOCDB)docdbhttps://www.epo.org/searching-for-patents/data/bulk-data-sets/docdb.html#tab-1€ 2.700,00 (main dataset), € 9.100,00 (backfile)DOCDB is the EPO's master documentation database with worldwide coverage. It contains bibliographic data, abstracts, citations and the DOCDB simple patent family, but no full text or images. available through paid subscription, https://www.epo.org/service-support/ordering/raw-data-terms-and-conditions.html