A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | AB | AC | AD | AE | AF | AG | AH | AI | AJ | AK | AL | AM | AN | AO | AP | AQ | AR | AS | AT | AU | AV | AW | AX | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | uuid | title | shortname | location | description | contributors | citation | terms_of_use | cost | open_access | maintained_by | tags | timeframe | documentation | error_metrics | size | code | versioning | bigquery | doi | related_publications | thumbnail_url | related_projects | relationship_description | record_superceded_by | schema_fields | salient_fields | last_edit | |||||||||||||||||||||||
2 | bd8a562a-ce58-4a61-925d-88f0d0695974 | PatCit | patcit | https://doi.org/10.5281/zenodo.3710993 | In-text and front page citations to non-patent literature and in-text patent citations, extracted and parsed. patCit builds on DOCDB, the largest database of Non Patent Literature (NPL) citations. First, we deduplicate this corpus and organize it into 10 categories. Then, we design and apply category specific information extraction models using spaCy. Eventually, when possible, we enrich the data using external domain specific high quality databases. Managed as an open-source, collaboratively maintained project. | Cyril Verluise, Gabriele Cristelli, Kyle Higham, Lucas Violon, Gaétan de Rassenfosse | Cyril Verluise, Gabriele Cristelli, Kyle Higham, Lucas Violon, & Gaétan de Rassenfosse. (2020). PatCit: A Comprehensive Dataset of Patent Citations (Version 0.3.1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4391095 | CC-BY 4.0 International | None | Cyril Verluise | citation, scholarly literature, in-text, front-page, patent, science, database, Wikipedia, validation | 1836-2018 | https://cverluise.github.io/PatCit/ | yes | https://cverluise.github.io/notebook | https://console.cloud.google.com/bigquery?project=patcit-public-data&p=patcit-public-data&page=project | https://doi.org/10.5281/zenodo.3710993 | https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3754772 | [{'uuid': 'e390a212-3a92-4d8f-ac4d-ca2c960a36d3', 'shortname': 'patstat', 'relationship_type': 'similar'}, {'uuid': 'c39f4844-5ae2-4dcb-bf2c-d6b957125704', 'shortname': 'lens', 'relationship_type': 'similar'}, {'uuid': '53f2e34b-8088-42a3-a763-f471c26b5ac6', 'shortname': 'rons', 'relationship_type': 'similar'}] | DOI, npl_cat_language_flag, wg, tsg, meeting, hostname, publication_date, reference_count, patcit_id, date, acc_num, body, name, ISSN, journal_title_abbrev, hash_id, language_code, source, ref, inpadoc_family_id, PMCID, tech, docdb_family_id, PMID, language_is_reliable, URL, npl_cat, author, issue, page, volume, is_referenced_by_count, funder, ISBN, subject, cited_by, npl_cat_score, event, institution, item, version, url, type, bibref_score, pat_publn_id, npl_publn_id, md5, title, journal_title, tdoc_num, appln_id, is_cited_by_count, publication_number, abstract, reference_doi, citation | DOI, PMID, ISSN, ISBN | Mon, 25 Sep 2023 19:06:41 GMT | |||||||||||||||||||||||||||||
3 | e65da1db-6608-4246-98a7-c260dfc28e45 | Chilean IP and firm data | chilean_ip | https://eml.berkeley.edu//~bhhall/Chile_ipdata.html | This study describes patterns and trends of intellectual property (IP) use in Chile, drawing on a new database containing all patent, trademark, utility model, and design filings received by the Chilean IP office over the period 1991-2010. The database provides harmonized applicant names, enabling the unique identification of applicants across all four forms of IP. Among other things, the study offers insights into the drivers of filing growth, the origin of filings, the distribution of applicants, the importance of different applicant types, the share of filings by different economic sectors, the relevance of IP bundles, and the patenting behavior of Chilean applicants overseas | Bronwyn H. Hall | Abud, M.J., Fink, C., Hall, B. and Helmers, C., 2013. The use of intellectual property in Chile (Vol. 11). WIPO. | not specified | None | Bronwyn Hall | Chile, trademark squatting, pharmaceuticals, disambiguation | 1995-2005 | https://eml.berkeley.edu//~bhhall/Chile_ipdata/chile_inno_ip.txt | Mon, 19 Jun 2023 16:35:20 GMT | |||||||||||||||||||||||||||||||||||||
4 | 50fbdb5a-1288-46e9-b93d-27ac99cd4eb2 | The scientific knowledge base of low carbon energy technologies (updated and extended version) | low_carbon_knowledge | https://doi.org/10.4119/unibi/2950291 | This data publication offers updated data about low-carbon energy technology (LCET) patents and citation links to the scientific literature. Compared to a previous version, it also contains data on biofuels and fuels from waste technologies. The updated version also contains the code (R scripts) used to (1) compile the data and (2) reproduce the statistical analysis, including figures and tables, presented in the final paper Hötte, Pichler, Lafond (2021): "The rise of science in low-carbon energy technologies", RSER. DOI: 10.1016/j.rser.2020.110654. | Hötte K, Lafond F, Pichler A | Hötte, Pichler, Lafond (2021): "The rise of science in low-carbon energy technologies", RSER. DOI: 10.1016/j.rser.2020.110654 | CC BY 4.0 license. See: https://creativecommons.org/licenses/by/4.0/legalcode | None | citation, scholarly literature, low-carbon energy technologies | 1836-2019 | https://doi.org/10.4119/unibi/2950291 | No | Included in the bulk download | https://doi.org/10.4119/unibi/2950291 | https://www.sciencedirect.com/science/article/abs/pii/S0172219011001979?via%3Dihub | Thu, 27 Jul 2023 09:13:49 GMT | ||||||||||||||||||||||||||||||||||
5 | 2a0949bb-2f36-45a7-b4cf-109456cec21d | Chinese Patent Data Project | chinese_patent_data | Chinese Patent Data Project | In this project, patents from China's State Intellectual Property Office (SIPO) are matched to various types of companies. Matching SIPO patents to firms in the Annual Survey of Industrial Enterprises (ASIE) of China's National Bureau of Statistics. | Wenlong He, Zi-lin He, Tony W. Tong, Yuchen Zhang | None | Zi-lin He, Z.L.He@uvt.nl; Tony W. Tong, tony.tong@colorado.edu; Yuchen Zhang, yzhang54@tulane.edu | disambiguation, China, corporate structure | https://doi.org/10.7910/DVN/CF1IXO | https://www.nature.com/articles/sdata201842 | sipo_matching | Sun, 24 Sep 2023 10:07:26 GMT | ||||||||||||||||||||||||||||||||||||||
6 | 53f2e34b-8088-42a3-a763-f471c26b5ac6 | Reliance on Science in Patenting | rons | https://zenodo.org/record/3575146#.XfQZMWRKiUk | We introduce an open-access dataset of references from the front pages of patents granted worldwide to scientific papers published since 1800. Each patent-paper linkage is assigned a confidence score, which is characterized in a random sample by false negatives versus false positives. All matches are available for download at http://relianceonscience.org. We outline several avenues for strategy research enabled by these new data. This contains citations from the front pages of worldwide patents to articles in the Microsoft Academic Graph (MAG) from 1800-2020. | Matt Marx, Aaron Fuegi | Marx, Matt and Aaron Fuegi, "Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles" | Open Data Commons Attribution License v1.0 | None | Matt Marx, mmarx@cornell.edu | citation, scholarly literature, front-page, error metrics | 1834-2019 | https://zenodo.org/record/4235193#.X6Fgb5CSm38 | Yes | https://github.com/mattmarx/reliance_on_science | https://doi.org/10.5281/zenodo.3575146 | [{'uuid': 'bd8a562a-ce58-4a61-925d-88f0d0695974', 'shortname': 'patcit', 'relationship_type': 'similar'}] | Mon, 19 Jun 2023 16:35:24 GMT | |||||||||||||||||||||||||||||||||
7 | 07ec4549-2429-4e8e-9ee3-6deefca0b075 | Japanese Patent Office | japanese_patent_office | https://www.iip.or.jp/e/patentdb/index.html | IIP Patent Database (IIP Patent DB) is a database developed for statistical analysis of patents based on the Japan Patent Office (JPO) “Standardized Data.” Intellectual Property Institute (IIP) provides the IIP patent DB to further promote patent statistical research. | JPO | State that you used: IIP Patent DB | Only for use by academic research institutions and other institutions for academic research purposes, cannot be used for commercial purposes. | None | Foundation for Intellectual Property, iip-patentdb@fdn-ip.or.jp | Japan, patents, patent office | 1964-9/2019 | Mon, 19 Jun 2023 16:34:49 GMT | ||||||||||||||||||||||||||||||||||||||
8 | bfc3892d-2170-47ed-b056-a573c845efa5 | MIT Scholarly Works Over Time | mit_scholarly | https://lens-public.s3-us-west-2.amazonaws.com/sloan/scholarly/201932/mit_scholarly.zip | Scholarly works produced by MIT 1950-2018 | The Lens | Cambia grants you a non-exclusive, non-transferable, revocable, limited license to access and personally use the features of the Service. The conditions by which The Lens data may be used are intended to resonate with the principles of Creative Commons Attribution licenses with a public benefit element. | None | The Lens | scholarly literature | 1950-2021 | https://www.lens.org/lens/search/scholar/analysis?q=&st=true&regex=false&institution.must=Massachusetts%20Institute%20of%20Technology&p=0&n=10&s=score&d=%2B&dateFilterField=publishedYear&dashboardId=189&preview=false | Sat, 21 Oct 2023 00:24:11 GMT | ||||||||||||||||||||||||||||||||||||||
9 | 265a814e-a4a5-4302-9cc0-0f78cf1c70fc | MIT Scholarly Works Cited by Patents | mit_scholarly_citations | https://lens-public.s3-us-west-2.amazonaws.com/sloan/scholarly/201932/mit_scholarly_cited_by_patents.zip | MIT Scholarly Works Cited by Patents 1950-2018 | The Lens | Cambia grants you a non-exclusive, non-transferable, revocable, limited license to access and personally use the features of the Service. The conditions by which The Lens data may be used are intended to resonate with the principles of Creative Commons Attribution licenses with a public benefit element. | None | The Lens | citation, scholarly literature | 1950-2021 | https://www.lens.org/lens/labs/dashboards | Mon, 19 Jun 2023 16:34:51 GMT | ||||||||||||||||||||||||||||||||||||||
10 | 6476ac03-71ee-4480-b2aa-e25871179689 | Patents Citing MIT Publications | patents_citing_mit | https://www.lens.org/lens/search/patent/list?collectionId=22790&p=0&n=10 | This collection encompasses patents that cite the scholarly works of Massachusetts Institute of Technology. | The Lens | Cambia grants you a non-exclusive, non-transferable, revocable, limited license to access and personally use the features of the Service. The conditions by which The Lens data may be used are intended to resonate with the principles of Creative Commons Attribution licenses with a public benefit element. | None | The Lens | citation, scholarly literature | 1950-2021 | https://www.lens.org/lens/labs/dashboards | Mon, 19 Jun 2023 16:34:52 GMT | ||||||||||||||||||||||||||||||||||||||
11 | a238826e-8135-4b6d-8b59-615fc9769f03 | Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database | co_authorship_disambiguation | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/5F1RRI | Name disambiguation of US inventors, 1975-2010. Using a Bayesian supervised learning approach, we identify individual inventors from the U.S. utility patent database, from 1975 to the present. An interface to calculate and illustrate patent co-authorship networks and social network measures is also provided. The network representation does not require bounding the social network beforehand. We provide descriptive statistics of individual and collaborative variables and illustrate examples of networks for an individual, an organization, a technology, and a region. The paper provides an overview of the technical algorithms and pointers to the data, code, and documentation, with the hope of further open development by the research community. | Ronald Lai, Alexander D'Amour, Amy Yu, Ye Sun, Lee Fleming | Ronald Lai; Alexander D'Amour; Amy Yu; Ye Sun; Lee Fleming, 2011, "Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database (1975 - 2010)", https://doi.org/10.7910/DVN/5F1RRI, Harvard Dataverse, V5, UNF:5:RqsI3LsQEYLHkkg5jG/jRg== [fileUNF] | CC0 - "Public Domain Dedication" | None | Contact maintainer through Dataverse | coauthor network, disambiguation, United States | 1970-2010 | https://github.com/funginstitute/downloads | https://doi.org/10.7910/DVN/5F1RRI | https://doi.org/10.1016/j.respol.2014.01.012 | Thu, 27 Jul 2023 09:19:39 GMT | |||||||||||||||||||||||||||||||||||
12 | 3e2ed123-d6c0-46af-8683-e23d64b04efc | The careers and co-authorship networks of U.S. patent-holders, since 1975 | co_authorship_careers | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YJUNUN | The identification enables construction of social networks based on patent co-authorship. We will eventually provide descriptive statistics of individual and collaborative variables and illustrated examples of networks for an individual, an organization, a technology, and a region. The data and code will be publicly available for community use and improvement and will enable updating as frequently as new patents are issued. | Ronald Lai, Alexander D'Amour, Lee Fleming | Ronald Lai; Alexander D'Amour; Lee Fleming, 2010, "The careers and co-authorship networks of U.S. patent-holders, since 1975", https://doi.org/10.7910/DVN/YJUNUN, Harvard Dataverse, V3, UNF:5:daJuoNgCZlcYY8RqU+/j2Q== [fileUNF] | CC0 - "Public Domain Dedication" | None | Contact maintainer through Dataverse | coauthor network, United States, social networks | https://doi.org/10.7910/DVN/YJUNUN | Mon, 19 Jun 2023 16:36:25 GMT | ||||||||||||||||||||||||||||||||||||||
13 | 00c6f78f-f689-4d50-a965-812bfd528477 | Penn World Tables | pwt | https://www.rug.nl/ggdc/productivity/pwt/?lang=en | PWT version 10.0 is a database with information on relative levels of income, output, input and productivity, covering 183 countries between 1950 and 2019. Access to the data is provided in Excel, Stata and online formats. | Robert C. Feenstra, Robert Inklaar, Marcel P. Timmer | Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), "The Next Generation of the Penn World Table" American Economic Review, 105(10), 3150-3182, available for download at www.ggdc.net/pwt | CC 4.0 | None | Contact pwt@rug.nl | geography, GDP, productivity | 1950-2017 | https://www.rug.nl/ggdc/docs/pwt100-user-guide-to-data-files.pdf | https://doi.org/10.15141/S50T0R | Fri, 13 Oct 2023 11:13:33 GMT | ||||||||||||||||||||||||||||||||||||
14 | 068fb03e-642a-4896-b61c-ff6a16251e08 | Worldwide Count of Priority Patents | priority_patents | http://www.gder.info/download_wwc_excel.html | The goal of the project was to produce a dataset of priority patent applications filed across the globe, allocated by inventor and applicant location. | Gaétan de Rassenfosse, Hélène Dernis, Dominique Guellec, Lucio Picci, Bruno van Pottelsberghe de la Potterie | De Rassenfosse, G., Dernis, H., Guellec, D., Picci, L., & van Pottelsberghe de la Potterie, B. (2013). The worldwide count of priority patents: A new indicator of inventive activity. Research Policy, 42(3), 720–737. doi:10.1016/j.respol.2012.11.002 | None | Gaétan de Rassenfosse | priority patents, location of inventors | http://www.gder.info/download_wwc_mysql.html | Mon, 19 Jun 2023 16:36:38 GMT | |||||||||||||||||||||||||||||||||||||||
15 | 6fe3b5e5-93a8-4f07-9331-d9998b9000b8 | Geocoding of worldwide patent data | geocoding_patents | Geocoding of worldwide patent data - Harvard Dataverse | The dataset provides geographic coordinates for inventor and applicant locations in 18.8 million patent documents spanning more than 30 years. The geocoded data are further allocated to the corresponding countries, regions and cities. When the address information was missing in the original patent document, we imputed it by using information from subsequent filings in the patent family. The resulting database can be used to study patenting activity at a fine-grained geographic level without creating bias towards the traditional, established patent offices. | Florian Seliger, Jan Kozak, Gaétan de Rassenfosse | Seliger, Florian; Kozak, Jan; de Rassenfosse, Gaétan, 2019, "Geocoding of worldwide patent data", https://doi.org/10.7910/DVN/OTTBDX, Harvard Dataverse, V5 | CC0 - "Public Domain Dedication" | None | Contact maintainer through Dataverse | geography, location of inventors, PATSTAT | 30 years | https://doi.org/10.1038/s41597-019-0264-6 | https://github.com/seligerf/Imputation-of-missing-location-information-for-worldwide-patent-data | https://doi.org/10.7910/DVN/OTTBDX | https://doi.org/10.1038/s41597-019-0264-6 | Sat, 29 Jul 2023 06:33:13 GMT | ||||||||||||||||||||||||||||||||||
16 | d76b71a1-2f43-447d-b296-a1b52db6e3d7 | On the price elasticity of demand for patents | patent_price_elasticity | http://www.gder.info/download_OBES_data.html | Fees since 1980 at the European (EPO), the US and the Japanese patent offices. | Gaétan de Rassenfosse, Bruno van Pottelsberghe de la Potterie | Rassenfosse, G. de, & Potterie, B. van P. de la. | None | Gaétan de Rassenfosse | patent demand, United States, Europe, Japan | Mon, 19 Jun 2023 16:36:44 GMT | ||||||||||||||||||||||||||||||||||||||||
17 | c66bdabd-a80c-4a7e-b9b9-f706e4ed7395 | Patents arising from U.S. government funding | us_gov_patents | https://zenodo.org/record/3369582 | The 3PFL database links information on patented inventions and scientific publications related to a public procurement contract or a research grant awarded by the U.S. Federal Government to detailed contract-level/grant-level information (e.g., awarding agency, recipient organization, award size). We have combined data from multiple sources, including (but not limited to) the United States Patent and Trademark Office bulk database, the Federal Procurement Database System, the Award Submission Portal (ASP), and the European Patent Office's PATSTAT database. We also provide a link to the scientific publications associated with these patents. The 3PFL database provides rich and original information that opens the door to novel empirical research in the economics of innovation and science. | Gaétan de Rassenfosse, Emilio Raiteri | de Rassenfosse Gaétan, & Emilio Raiteri. (2019). 3PFL: Database of Patents and Publications with a Public-Funding Linkage (Version 1.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3369582 | CC-BY 4.0 International | None | Gaétan de Rassenfosse | research funding, United States | 2000-2019 | https://doi.org/10.5281/zenodo.3369582 | Mon, 19 Jun 2023 16:36:51 GMT | |||||||||||||||||||||||||||||||||||||
18 | e390a212-3a92-4d8f-ac4d-ca2c960a36d3 | PATSTAT | patstat | https://www.epo.org/searching-for-patents/business/patstat.html#tab3 | PATSTAT contains bibliographical and legal event patent data from leading industrialised and developing countries. This is extracted from the EPO’s databases and is either provided as bulk data or can be consulted online. | EPO | PATSTAT | Requires a subscription to access | €975.00 - €1460.00 | European Patent Office | Europe, patents | 'patstat cookbook' by Gaétan de Rassenfosse https://onlinelibrary.wiley.com/doi/full/10.1111/1467-8462.12073 | Mon, 19 Jun 2023 16:35:00 GMT | ||||||||||||||||||||||||||||||||||||||
19 | c39f4844-5ae2-4dcb-bf2c-d6b957125704 | Lens.org | lens | https://lens.org/ | Lens serves nearly all of the patent documents in the world as open, annotatable digital public goods that are integrated with scholarly and technical literature along with regulatory and business data. The Lens will allow documents and analyses to be shared and embedded to support open mapping of knowledge-directed innovation. | Cambia | Please use the expression 'Enabled by The Lens' or 'Data Sourced from The Lens' and the Lens.org URL. | Cambia grants you a non-exclusive, non-transferable, revocable, limited license to access and personally use the features of the Service. The conditions by which The Lens data may be used are intended to resonate with the principles of Creative Commons Attribution licenses with a public benefit element. | None | Cambia Foundation, https://about.lens.org/contact-us/ | citation, scholarly literature | Mon, 19 Jun 2023 16:35:01 GMT | |||||||||||||||||||||||||||||||||||||||
20 | 9c4124ed-5337-4b36-a1c9-7cf256a3384b | Microsoft Academic Graph | mag | https://academic.microsoft.com/home | The Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals, conferences, and fields of study. | Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. | Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MA) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246. DOI=http://dx.doi.org/10.1145/2740908.2742839 K. Wang et al., “A Review of Microsoft Academic Services for Science of Science Studies”, Frontiers in Big Data, 2019, doi: 10.3389/fdata.2019.00045 | ODC-BY | None | Currently in transition | citation, scholarly literature | Mon, 02 Oct 2023 00:48:49 GMT | |||||||||||||||||||||||||||||||||||||||
21 | 233d7290-f32f-46bb-8a6d-8837e59d9ffb | Crios‐Patstat Database | crios_patstat | https://www.icrios.unibocconi.eu/wps/wcm/connect/Cdr/Icrios/Home/Resources/Databases/PATENTS-ICRIOS+database/ | Disambiguated inventor and applicant names for EPO records. A major problem with PATSTAT was that data are provided in a raw format. Data have been therefore thoroughly elaborated by ICRIOS to produce a cleaned and harmonized database: PATENTS-ICRIOS. Data processing consisted mainly in a thorough work of cleaning and standardization of rough information provided by the EPO. Such work of name standardization has been carried out at the level of individual inventors and applicants. In addition to this, each patent document also reports further information not included in Patstat, (FI concordance tables to convert IPC codes into more aggregated and manageable technological classes). Data included in these reports are for EPO patent office only; last update has been released on 10/2016; starting date for EPO applications is 1978, in many reports by priority date you might see earlier dates. | Monica Coffano, Gianluca Tarasconi | Coffano, M., & Tarasconi, G. (2014). CRIOS - Patstat Database: Sources, Contents and Access Rules. SSRN Electronic Journal. doi:10.2139/ssrn.2404344 | EPO License | None | crios@unibocconi.it | disambiguation, Europe | http://ssrn.com/abstract=2404344 | Fri, 01 Dec 2023 18:12:01 GMT | ||||||||||||||||||||||||||||||||||||||
22 | d9cf4e57-a90e-4d18-8a3b-08fea43a2f49 | NBER US Patent Data Project | nber_citation | https://sites.google.com/site/patentdataproject/Home/downloads?authuser=0 | The main dataset extends from Jan 1, 1963, through December 30, 2006, and includes all the utility patents granted during that period. The citations file includes all citations made by patents granted in 1975-1999. | Bronwyn H. Hall, Jim Bessen, Grid Thoma | HJT refers to Hall, Bronwyn, Adam Jaffe and Manuel Trajtenberg, "The NBER Patent Citation Data File: Lessons, Insights and Methodological Tools," NBER Working Paper 8498. | The data in these files are freely available to members of this community. We expect members to inform the community of errors in the data or documentation and to provide fixes/improvements. | None | Adam Jaffe | United States | 1976-2006 | https://docs.google.com/document/d/1FyDsjZHhq7okHWMBOc_E7EquLUoAwwEZYtxw5M3UDTY/edit | Wed, 11 Oct 2023 03:01:57 GMT | |||||||||||||||||||||||||||||||||||||
23 | cf1780b1-e265-4e49-8d1d-83b9cfe0fd9a | USPTO PatentsView | patentsview | https://patentsview.org/ | PatentsView includes US patent data including raw data (summaries, applications, pregrant applications), disambiguations of inventors and assignees, and inventor gender estimates. Also foreign priority data, # of figures and sheets, and government interest statements. | USPTO | Attribution should be given to PatentsView for use, distribution, or derivative works. | Creative Commons Attribution 4.0 International License. | None | USPTO | disambiguation, United States, gender | 1963-1999 | https://patentsview.org/query/builder-faqs | https://github.com/CSSIP-AIR/PatentsView-Code-Snippets/ | https://console.cloud.google.com/bigquery?p=patents-public-data&d=patentsview&page=dataset | http://dx.doi.org/10.2139/ssrn.3868599 | citation_id, state_fips, f102_date, classification_status, name_last, lapse_of_patent, disamb_assignee_id_20181127, assignee_id, mainclass_id, rawlocation_id, attribution_status, designation, male_flag, subclass, disamb_inventor_id_20171003, disamb_assignee_id_20190312, section, disamb_inventor_id_20180528, subcategory_id, deceased, length, latitude, uuid, disclaimer_date, city, type, category, sector_title, rule_47, patent_id, disamb_inventor_id_20200331, disamb_inventor_id_20170307, action_date, disamb_assignee_id_20190820, fname, disamb_inventor_id_20181127, filename, id, num_claims, f371_date, subsection_id, reldocno, num_figures, disamb_assignee_id_20191008, disamb_assignee_id_20200331, rawinventor_id, term_grant, disamb_assignee_id_20191231, publication_number, abstract, level_one, _371_date, main_group, disamb_assignee_id_20200929, gi_statement, longitude, subgroup, disamb_inventor_id_20190820, classification_value, subclass_id, sequence, date, organization_id, doc_type, disamb_inventor_id_20170808, contract_award_number, num_sheets, disamb_assignee_id_20200630, county, ipc_version_indicator, disamb_inventor_id_20171226, symbol_position, inventor_id, number, field_title, state, level_two, series_code, subgroup_id, field_id, location_id, lname, disamb_inventor_id_20190312, county_fips, group_id, disamb_inventor_id_20200630, section_id, term_extension, num, relkind, organization, latlong, exemplary, rawassignee_id, country_transformed, role, country, disamb_inventor_id_20201229, male, name, classification_data_source, disamb_inventor_id_20191231, applicant_type, dependent, group, application_id, status, withdrawn, name_first, rel_id, text, kind, classification_level, term_disclaimer, disamb_inventor_id_20191008, category_id, latin_name, disamb_inventor_id_20200929, variety, lawyer_id, _102_date, ipc_class, doctype, title, level_three | Wed, 11 Oct 2023 03:01:59 GMT | |||||||||||||||||||||||||||||||||
24 | 6f3605ad-5edb-4a73-8b3b-6d6d35064d4c | Microsoft Academic Knowledge Graph | makg | http://ma-graph.org/ | A large RDF data set with over eight billion triples with information about scientific publications and related entities, such as authors, institutions, journals, and fields of study. The data set is based on the Microsoft Academic Graph and licensed under the Open Data Attributions license. Furthermore, we provide entity embeddings for all 210M represented scientific papers. | Michael Färber | @inproceedings{DBLP:conf/semweb/Farber19, author = {Michael F{\"{a}}rber}, title = "{The Microsoft Academic Knowledge Graph: {A} Linked Data Source with 8 Billion Triples of Scholarly Data}", booktitle = "{Proceedings of the 18th International Semantic Web Conference}", series = "{ISWC'19}", location = "{Auckland, New Zealand}", pages = {113--129}, year = {2019}, url = {https://doi.org/10.1007/978-3-030-30796-7\_8}, doi = {10.1007/978-3-030-30796-7\_8} } | Open Data Commons Attribution License (ODC-By) v1.0 | None | citation, scholarly literature | https://github.com/michaelfaerber/makg-linking | Mon, 19 Jun 2023 16:38:15 GMT | |||||||||||||||||||||||||||||||||||||||
25 | 303ce18b-f411-4752-9fe6-d4fcc369f43c | IPRoduct | iproduct | https://iproduct.io/app | The IPRoduct project seeks to link innovative goods to the patents upon which they are based. By directly linking products to patents, this project tracks innovation to the point where it meets consumers, the true commercial end point of investments in Science & Technology. The output of the project is a database of linked product-patent pairs that is made publicly available. The data is sourced from virtual patent marking web pages. Everyone has seen the ‘patent pending’ notice on some products. Sometimes, manufacturers print the actual patent numbers on products -- ‘physical patent marking’. The complete database is composed of 800 companies, 1447 web pages, 24463 products, 19815 U.S. patents and 151176 relationships. | Gaétan de Rassenfosse | These data are currently not available for sale. They are available in exchange for credits, which you earn by contributing to the project. | None | Gaétan de Rassenfosse, Samuel Arnod-Prin | Products, disambiguation, trademarks, physical patent marking | https://iproduct.io/app/#/public/page/about | Mon, 19 Jun 2023 16:35:05 GMT | |||||||||||||||||||||||||||||||||||||||
26 | 50c1e32c-d2f5-4328-be8e-b7f172772a26 | Replication Data for: Government-funded research increasingly fuels innovation | gov_research_fuels_innovation | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRC | This includes patent level metadata, 1926-1975 (OCRed from USPTO Image PDF files), 1976-2017 (parsed from USPTO HTML files), patent meta data, CPC, geography, agencies, entity size of the patent owner etc, government support categories at patent level and finally, aggregate yearly statistics. (2019-06-02) | Lee Fleming, Hillary Green, Guan-Cheng Li, Matt Marx, Dennis Yao | Lee Fleming; Hillary Green; Guan-Cheng Li; Matt Marx; Dennis Yao, 2019, "Replication Data for: Government-funded research increasingly fuels innovation", https://doi.org/10.7910/DVN/DKESRC, Harvard Dataverse, V4, UNF:6:kMIqsh3DCvKiKYgMT6/H8A== [fileUNF] | CC0 - "Public Domain Dedication" | None | Contact maintainer through Dataverse | research funding, United States | 1926-1975 and 1975-2017 | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRC | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DKESRC | Mon, 19 Jun 2023 16:38:41 GMT | ||||||||||||||||||||||||||||||||||||
27 | d24e8a7e-7d27-4280-9d85-c6598a1b9b8e | Google Patents Public Datasets | google_patents_public | https://console.cloud.google.com/marketplace/details/google_patents_public_datasets/google-patents-public-data | Worldwide (100+ countries) bibliographic and USPTO full-text, available via BigQuery. Provided by IFI CLAIMS Patent Services, a worldwide bibliographic and US full-text dataset of patent publications. Updated quarterly. | Google Patents | “Google Patents Public Data” by IFI CLAIMS Patent Services and Google, used under CC BY 4.0 | CC BY 4.0, requires subscription to query API | None | Google Patents https://patents.google.com/ | Google Patents | 1834-present (quarterly) | https://cloud.google.com/blog/topics/public-datasets/google-patents-public-datasets-connecting-public-paid-and-private-patent-data | patent analysis sample code: https://github.com/google/patents-public-data, source code not accessible | https://console.cloud.google.com/bigquery?p=patents-public-data&d=patents&page=dataset | application_kind, family_id, fi, child, uspc, kind_code, spif_application_number, priority_claim, description_localized_html, citation, publication_date, claims_localized_html, ipc, assignee, assignee_harmonized, locarno, parent, abstract_localized, spif_publication_number, pct_number, title_localized, entity_status, art_unit, inventor_harmonized, application_number, application_number_formatted, cpc, claims_localized, country_code, grant_date, examiner, filing_date, fterm, publication_number, priority_date, description_localized, inventor | Sun, 22 Oct 2023 03:10:23 GMT | ||||||||||||||||||||||||||||||||||
28 | ff4ffcf9-5721-4148-ac59-140b9ed4dab5 | Semantic Scholar Open Research Corpus | sem_scholar_open_research | https://api.semanticscholar.org/corpus | Semantic Scholar's records for research papers published in all fields provided as an easy-to-use JSON archive. | Waleed Ammar, Dirk Groneveld | Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL https://www.semanticscholar.org/paper/09e3cf5704bcb16e6657f6ceed70e93373a54618 | ODC-BY | None | Semantic Scholar, feedback@semanticscholar.org | citation, scholarly literature | Fri, 01 Dec 2023 18:12:25 GMT | |||||||||||||||||||||||||||||||||||||||
29 | e80542a8-a9bb-4205-8364-c0e9f3a2b683 | UVA Darden Global Corporate Patent Dataset (disambiguated assignees) | uva_global_corporate_patents | https://patents.darden.virginia.edu/ | The dataset has information on about 3 million USPTO patents, which were granted between 1980 and 2017, assigned to publicly listed companies worldwide, and linked to those assignee companies using the following identifiers: Unique Patent Number, as given by the USPTO, GVKEY, as the firm identifier, from the S&P Compustat Global database. | Jan Bena, Miguel A. Ferreira, Pedro Matos, Pedro Pires | Jan Bena, Miguel A. Ferreira, Pedro Matos, and Pedro Pires. "Are foreign investors locusts? The long-term effects of foreign institutional ownership." Journal of Financial Economics 126, no. 1 (2017): 122-146 | CC BY-NC 4.0 Attribution-NonCommercial 4.0 International | None | GCPD@darden.virginia.edu | United States, disambiguation | 1980-2017 | https://patents.darden.virginia.edu/documents/DataConstructionDetails_v01.pdf | Mon, 20 Nov 2023 06:09:17 GMT | |||||||||||||||||||||||||||||||||||||
30 | f2fcc603-7883-4e18-a82a-6275ffd82e98 | DISCERN: Duke Innovation & SCientific Enterprises Research Network | discern | https://doi.org/10.5281/zenodo.3594642 | Patents (as well as scientific articles, and NPL citations at the aggregate firm-level) matched to U.S. Compustat firms over the period 1980-2015. In extending the match to Compustat up to 2015, we address two major challenges: name changes and ownership changes. Our UO and subsidiary historical standardized firm name lists, including the dynamic reassignment, are publicly available for researchers to match to their database of interest. As of 2023, DISCERN has been extended to cover additional years, to broaden coverage of subsidiary data, and to add open-access matches to scientific publications. More information about the updates may be found in the following NBER whitepaper: https://conference.nber.org/conf_papers/f193007.pdf | Ashish Arora, Sharon Belenzon, Lia Sheer | Arora Ashish, Belenzon Sharon, and Sheer Lia, 2021. "Knowledge spillovers and corporate investment in scientific research". American Economic Review, 111(3), pp.871-98. Arora Ashish, Belenzon Sharon, and Sheer Lia, 2021. "Matching patents to Compustat firms, 1980–2015: Dynamic reassignment, name changes, and ownership structures". Research Policy, 50(5), p.104217. | None | Lia Sheer | Compustat, Patents, Publications, NPL, Name changes, Dynamic reassignment, GVKEY, Disambiguation | 1980-2015 | https://doi.org/10.5281/zenodo.3594642 | https://doi.org/10.5281/zenodo.4320782 | http://dx.doi.org/10.1257/aer.20171742, http://dx.doi.org/10.1016/j.respol.2022.104550, http://dx.doi.org/10.1287/mnsc.2023.4682, http://dx.doi.org/10.1287/mnsc.2023.00282, http://dx.doi.org/10.1287/mnsc.2023.4830 | Fri, 01 Dec 2023 12:36:44 GMT | ||||||||||||||||||||||||||||||||||||
31 | f1a7dfa7-c1f0-4414-a6b9-5a0f0d0e37f1 | Patent Citation Similarity | patent_citation_similarity | https://storage.googleapis.com/jmk_public/Kuhn-Younge-Marco_Patent_Citation_Similarity_2017-10-23.csv | Many studies of innovation rely on patent citations to measure intellectual lineage and impact. To create this dataset, we use a vector space model of patent similarity to compute the technological similarity between each pair of citing-cited patents. The VSM model analyzes the full text of each document to position it as a vector in a vector space that includes more than 700,000 dimensions and then calculates the angular distance between the two vectors. The dataset includes similarity values for all citations made by patents issued between 1976 and 2017 to issued patents or published patent applications. | Jeffrey Kuhn, Kenneth Younge, Alan Marco | Kuhn, Jeffrey M. and Younge, Kenneth A. and Marco, Alan C., Patent Citations Reexamined (June 24, 2019). RAND Journal of Economics, Forthcoming, Available at SSRN: https://ssrn.com/abstract=2714954 or http://dx.doi.org/10.2139/ssrn.2714954 | These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article. | None | Jeff Kuhn | similarity, citation | 1976-2017 | https://ssrn.com/abstract=2714954 | https://ssrn.com/abstract=2714954 | Mon, 19 Jun 2023 16:38:34 GMT | ||||||||||||||||||||||||||||||||||||
32 | b547441d-efdd-4b30-8c78-852d68c9c2ac | Patent Scope and Examiner Toughness | patent_scope_toughness | https://storage.googleapis.com/jmk_public/Kuhn-Thompson_Patent_Scope_2017-10-23.csv | This dataset includes an easy-to-use measure of patent scope that is grounded both in patent law and in the practices of patent attorneys. Our measure counts the number of words in the patents’ first claim. The longer the first claim, the less scope a patent has. This is because a longer claim has more details – and all those details must be met for another invention to be infringing. Hence, the more details there are in the patent, the greater are the opportunities for others to invent around it. We validate our measure by showing both that patent attorneys’ subjective assessments of scope agree with our estimates, and that the behavior of patenters is consistent with it. To facilitate drawing causal inferences with our measure, we show how it can be used to create an instrumental variable, patent examiner Scope Toughness, which we also validate. | Jeffrey Kuhn, Neil Thompson | Kuhn, Jeffrey M. and Thompson, Neil, How to Measure and Draw Causal Inferences with Patent Scope (October 9, 2017). International Journal of the Economics of Business, 26(1) 5-38 (2019), Kenan Institute of Private Enterprise Research Paper No. 19-29, Available at SSRN: https://ssrn.com/abstract=2977273 or http://dx.doi.org/10.2139/ssrn.2977273 | These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article. | None | Jeff Kuhn | Examiners, patent scope, legal, assessment | Need to check paper https://ssrn.com/abstract=2977273 | https://ssrn.com/abstract=2977273 | USPTO patent claims dataset | Mon, 19 Jun 2023 16:38:37 GMT | ||||||||||||||||||||||||||||||||||||
33 | 2d88904f-056b-4230-96b4-f70c178d9f88 | Patent Citation Timing and Source | patent_citation_timing | https://storage.googleapis.com/jmk_public/Kuhn-Younge-Marco_Patent_Citation_Source_and_Timing_2017-09-25.csv | Innovation studies frequently distinguish between patent citations submitted by the patent examiner and those submitted by the patent applicant. However, publicly available citations data is often misleading, for instance by attributing a patent citation to the patent examiner when it was in fact first submitted by the patent applicant. This dataset uses internal USPTO data to identify the date on which each citation was first submitted as well as the party (examiner or applicant) who first submitted it. The dataset includes observations for citations made by patents issued 2001-2014, although some level of leftward truncation is evident due to limitations in internal data availability at the USPTO. | Jeffrey Kuhn, Kenneth Younge, Alan Marco | Kuhn, Jeffrey M. and Younge, Kenneth A. and Marco, Alan C., Patent Citations Reexamined (June 24, 2019). RAND Journal of Economics, Forthcoming, Available at SSRN: https://ssrn.com/abstract=2714954 or http://dx.doi.org/10.2139/ssrn.2714954 | These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article. | None | Jeff Kuhn | timing, citation, United States | 2001-2014 | https://ssrn.com/abstract=2714954 | https://ssrn.com/abstract=2714954 | Mon, 19 Jun 2023 16:38:40 GMT | ||||||||||||||||||||||||||||||||||||
34 | eaee5eaa-985b-4ba5-a13a-797d3cfeef1f | Patent Families Dataset | patent_families | https://storage.googleapis.com/jmk_public/Younge-Kuhn_Patent_Families_2017-09-25.csv | Patent applicants frequently file groups of patent applications linked together by priority claims. These priority claims create families of patent applications that share features such as inventors, priority dates, and technical descriptions. By analyzing these linkages, each patent can be assigned a family identifier that it shares with other patents in the same family. This data set includes two levels of family identifiers (clone for near copies, and extended for more attenuated linkages) for each patent issued 2005-2014 | Kenneth Younge, Jeffrey Kuhn | Younge, Kenneth A. and Kuhn, Jeffrey M., Patent-to-Patent Similarity: A Vector Space Model (July 30, 2016). Available at SSRN: https://ssrn.com/abstract=2709238 or http://dx.doi.org/10.2139/ssrn.2709238 | These datasets are provided to the public subject to the Creative Commons Attribution-NonCommercial-NoDerivatives license. No co‑authorship is required to use the data in academic research — please just cite the supporting article. | None | Jeff Kuhn | patent family, similarity | 2005-2014 | https://ssrn.com/abstract=2709238 | https://ssrn.com/abstract=2709238 | Mon, 19 Jun 2023 16:38:43 GMT | ||||||||||||||||||||||||||||||||||||
35 | f1561d9b-8512-470f-abed-557d6e3e19ad | Patent-to-article intext citations for 244 journals | patent_to_article_intext | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZEZWBX | The data contains all articles in 244 journals as described in "In-Text Patent Citations: A User's Guide", and all front-page and in-text citations as found by the algorithm described in this paper. | Bryan, Kevin, 2019, "In-Text Patent Citation Database Bryan/Ozcan/Sampat Beta version .9", https://doi.org/10.7910/DVN/ZEZWBX, Harvard Dataverse, V2, UNF:6:+28YcwvDoaxFl/9hPXQaSA== [fileUNF] | CC0 - "Public Domain Dedication" | None | Kevin Bryan, http://www.kevinbryanecon.com/ | in-text, scholarly literature, citation, academic science, diffusion | 197?-2015? | http://www.kevinbryanecon.com/UsersGuidetoIntextCitations.pdf | Mon, 19 Jun 2023 16:38:44 GMT | ||||||||||||||||||||||||||||||||||||||
36 | 798f092c-3597-41bb-be5d-e5eb15c2b5d3 | Patent value | patent_value | https://iu.box.com/patents | The data contains all articles in 244 journals as described in "In-Text Patent Citations: A User's Guide", and all front-page and in-text citations as found by the algorithm described in this paper. | Noah Stoffman | None | Noah Stoffman, nstoffma@iu.edu | scientific value, economic growth, United States | 1926-2010 | Mon, 19 Jun 2023 16:35:13 GMT | ||||||||||||||||||||||||||||||||||||||||
37 | 131e13f8-342c-4dd7-a3e6-fbf5a5ba6a5c | PatentCity | patentcity | https://mailchi.mp/e0495246a573/patentcity | PatentCity is a dataset on the location of patentees since the 19th century in Germany, France, Great Britain and the United States of America. Beta available for test! Drop us a mail if you are interested in becoming a beta tester. | Antonin Bergeaud, Cyril Verluise | None | Antonin Bergeaud | location of inventors, geography, Europe, United States | https://github.com/Antoberge/patent_city | Mon, 19 Jun 2023 16:35:14 GMT | ||||||||||||||||||||||||||||||||||||||||
38 | 44f33a6f-5099-4481-abed-af9aadf0bd4f | Patent text: code, data, and new measures | patent_text_new_measures | https://zenodo.org/record/3515985 | Different open access data files related to the text of USPTO patent documents, including 1) for each US patent a list of processed, cleaned and stemmed keywords, 2) for each patent a list of the 1,000 most similar patents (based on cosine similarity) from the entire population of US patents, 3) for each US patent the average cosine similarity with all prior patents from the previous 5 years, and the average cosine similarity with all later patents in the following 5 years, 4) each new keyword (unigram), bigram (sequence of two adjacent keywords), trigram, and pairwise keyword combination introduced for the first time in history by a US patent, the number of the patent introducing it for the first time, and the total number of patents from the entire population using these new keywords, bigrams, trigrams, and new keyword combinations. | Sam Arts, Jianan Hou, Juan Carlos Gomez | Arts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144) | Open Data Commons Attribution License v1.0 | None | Sam Arts | patent measures, text, natural language processing, novelty, impact, USPTO, technological progress | 1969-2018 | https://zenodo.org/record/3515985 | Yes | https://github.com/sam-arts/respol_patents_code | https://doi.org/10.5281/zenodo.3515985 | Arts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144) | [{'uuid': '30103a08-e0fb-4a5f-9fc3-25bf48ca2f72', 'shortname': 'patenttext', 'relationship_type': 'supercedes'}] | Fri, 01 Dec 2023 17:56:16 GMT | ||||||||||||||||||||||||||||||||
39 | e22dcf03-9504-48c7-9cb4-468d98ec2bb2 | Matched inventor ages from patents, based on web scraped sources | matched_inventor_ages | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YRLSKU | We take information about U.S.-residing inventors from patents, including name and location, and search publicly available online web directories for age and date-of-death information. We build a scoring system to indicate the quality of the information that we collect. After applying a variety of heuristics and robustness checks, we find 1,508,676 inventor ages. We also find the death dates of 206,589 inventors, though we are not as confident in their accuracy. | Mary Kaltenberg, Adam Jaffe | @article{kaltenberg_matched_2021, title = {Matched inventor ages from patents, based on web scraped sources}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YRLSKU}, doi = {10.7910/DVN/YRLSKU}, abstract = {We use information about U.S. residing inventors from patents which include name and location and search for age and date of death information from...}, language = {en}, urldate = {2021-08-12}, author = {Kaltenberg, Mary and Jaffe, Adam and Lachman, Margie E.}, month = may, year = {2021}, note = {type: dataset}, } | CC0 - "Public Domain Dedication" | Mary Kaltenberg | Inventors, Ages, Gender, Death Dates, Patents, United States | Wed, 12 Jul 2023 03:01:46 GMT | ||||||||||||||||||||||||||||||||||||||||
40 | fddedcfc-9f4e-47c6-bc82-3e04bb3c4262 | Newspaper.com Index | newspaper_com | https://elisabethperlman.net/code.html | Index of newspaper.com articles | Bitsy Perlman | None | Bitsy Perlman | Mon, 19 Jun 2023 16:39:34 GMT | ||||||||||||||||||||||||||||||||||||||||||
41 | 1f556a96-61fc-4d4c-a046-ed711d9807f9 | Long-Term Productivity database | long_term_productivity | http://longtermproductivity.com/download.html | The Long-Term Productivity database was created as a project at the Bank of France in 2013 by Antonin Bergeaud, Gilbert Cette and Remy Lecat. Following the work of Cette, Mairesse and Kocoglu (2009), we extended the database to include 17 countries in the latest version (2016). The latest version of the database includes the following countries -- Australia, Belgium, Canada, Denmark, Germany, Finland, France, Italy, Japan, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom, United States. We offer data on Total Factor Productivity per hour worked, Labor productivity per hour worked, capital intensity and GDP per capita. These series cover at least the period 1890 to present annually. In addition, other data corresponding to each of the papers linked to this project are available. This includes age of capital stock, education attainment, electricity production per capita. | You are free to use the data for non-commercial use. | None | Antonin Bergeaud | productivity, Europe, United States, GDP | 1890-2020 | Mon, 19 Jun 2023 16:39:35 GMT | ||||||||||||||||||||||||||||||||||||||||
42 | 410dd9de-2520-4f57-a409-0ade7ec11b65 | Collection of Historical Data on the Uses of Petroleum International Network | uses_of_petroleum | http://www.longtermproductivity.com/chdupin/ | The research project CH.DUPIN (Collection of Historical Data on the Uses of Petroleum International Network) aims at gathering historical data on oil consumption for many countries. The current dataset contains yearly information on oil consumption, oil consumption per capita and oil consumption per unit of GDP for 16 OECD countries from 1890. | You are free to use the data for non-commercial use. We only ask you to cite the associated articles: Oil data: Bergeaud and Lepetit (2020): Research program CH.DUPIN, a short note (link) GDP data: Bergeaud, A., Cette, G. and Lecat, R. (2016): "Productivity Trends in Advanced Countries between 1890 and 2012," Review of Income and Wealth, vol. 62(3), pages 420–444. | None | Antonin Bergeaud | petroleum, oil consumption | 1890-2012 | [{'uuid': 'e232a192-965c-4ec9-904c-155b6dfe56c5', 'shortname': 'chembl', 'relationship_type': 'similar'}] | Mon, 19 Jun 2023 16:39:37 GMT | |||||||||||||||||||||||||||||||||||||||
43 | bf073285-5243-4dc6-a990-c8a8c3f79898 | Classification Data for "Classifying Patents Based on their Semantic Content" | classifying_patents_semantic_content | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZULMOY | An open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine their full abstract and extract the relevant keywords accordingly. We refer to this classification as semantic approach in contrast with the more common technological approach which consists in taking the topology when considering US Patent office technological classes. | @article{bergeaud_classification_2017, title = {Classification {Data} for "{Classifying} {Patents} {Based} on their {Semantic} {Content}"}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZULMOY}, abstract = {Classification Data for Bergeaud, Potiron and Raimbault, 2017, Classifying Patents Based on their Semantic Content.}, language = {en}, urldate = {2021-08-17}, author = {Bergeaud, Antonin and Yoann, Potiron and Raimbault, Juste}, month = apr, year = {2017}, note = {type: dataset}, } | CC0 1.0 | None | Contact maintainer through Dataverse | United States, patents, similarity | Mon, 19 Jun 2023 16:40:13 GMT | ||||||||||||||||||||||||||||||||||||||||
44 | f61ebc77-4082-43c5-ae60-383a756ce308 | List of USPTO patents from US universities | us_university_patents | https://sites.google.com/site/abergeaudeco/data?authuser=0 | Uses cross-state panel and cross-U.S. commuting-zone data to look at the relationship between innovation, top income inequality and social mobility. From the paper "Innovation and Top Income Inequality" (Aghion, Akcigit, Bergeaud, Blundell, Hémous). This dataset lists all USPTO patents from 1969 to 2016 whose assignee is a university and gives the name and state of this university (originally taken from USPTO and improved). | CC0 1.0 | None | Contact maintainer through Dataverse | inequality, geography, social mobility, patents | 1969-2016 | https://doi.org/10.1093/restud/rdy027 | https://academic.oup.com/restud/article/86/1/1/5026613 | Mon, 19 Jun 2023 16:40:16 GMT | ||||||||||||||||||||||||||||||||||||||
45 | fb81106d-3933-488b-acd9-aff177f82423 | HistPat International Dataset | histpat_international | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QT4OJS | HistPat International provides the geography of historical patents granted to foreign nationals by the United States Patent and Trademark Office (USPTO) from 1836 to 1975. This historical dataset is constructed using digitalized records of original patent documents that are publicly available. HistPat can be used in different disciplines ranging from geography, economics, history, network science, and science and technology studies. Additionally, it can easily be merged with post-1975 USPTO digital patent data to extend it until today. | @article{petralia_histpat_2019, title = {{HistPat} {International} {Dataset}}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QT4OJS}, doi = {10.7910/DVN/QT4OJS}, abstract = {HistPat International provides the geography of historical patents granted to foreigns by the United States Patent and Trademark Office (USPTO) fro...}, language = {en}, urldate = {2021-08-17}, author = {Petralia, Sergio}, month = mar, year = {2019}, note = {type: dataset}, } | CC0 1.0 | None | Contact maintainer through Dataverse | Historical Patents, Technological Change, Inventions, Geography, Economics | Mon, 19 Jun 2023 16:40:22 GMT | ||||||||||||||||||||||||||||||||||||||||
46 | 40f30ff4-d152-4aa8-89a9-e31dddc812dc | HistPat Dataset | histpat | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BPC15W | HistPat provides the geography of historical patents granted by the United States Patent and Trademark Office (USPTO) from 1790 to 1975. This historical dataset is constructed using digitalized records of original patent documents that are publicly available. HistPat can be used in different disciplines ranging from geography, economics, history, network science, and science and technology studies. Additionally, it can easily be merged with post-1975 USPTO digital patent data to extend it until today. (2016-05-23) | @article{petralia_histpat_2019, title = {{HistPat} {Dataset}}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BPC15W}, doi = {10.7910/DVN/BPC15W}, abstract = {HistPat provides the geography of historical patents granted by the United States Patent and Trademark Office (USPTO) from 1790 to 1975. This histo...}, language = {en}, urldate = {2021-08-24}, author = {Petralia, Sergio and Balland, Pierre-Alexandre and Rigby, David}, month = jan, year = {2019}, note = {type: dataset}, } | CC0 1.0 | None | Contact maintainer through Dataverse | Historical Patents, Technological Change, Inventions, Geography, Economics | 10.7910/DVN/BPC15W | Mon, 19 Jun 2023 16:40:28 GMT | |||||||||||||||||||||||||||||||||||||||
47 | 9651d1f2-3c24-46ef-9ade-e2e31f4ffe12 | BACI | baci | http://www.cepii.fr/CEPII/en/bdd_modele/presentation.asp?id=37 | BACI provides disaggregated data on bilateral trade flows for more than 5000 products and 200 countries. | BACI is freely available to anyone, after a quick registration. | None | Pierre Cotterlaz, baci@cepii.fr | trade, global | http://www.cepii.fr/DATA_DOWNLOAD/baci/doc/DescriptionBACI.html | https://www.whoswho.fr/usr/y/R/X/cepii.png | Mon, 19 Jun 2023 16:40:35 GMT | |||||||||||||||||||||||||||||||||||||||
48 | 1b372a68-18ae-45e3-9a28-a6feecc3e7b8 | Chinese Patent Data Project Dataverse | sipo_matching | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/CF1IXO | Matching SIPO patents to Chinese listed firms ("Main Board"). Please refer to the user documentation "Chinese Patent Database User Documentation: Matching SIPO Patents to Chinese Publicly-Listed Companies and Subsidiaries" for more details about this dataset. | @article{he_matching_2019, title = {Matching {SIPO} patents to {Chinese} listed firms ("{Main} {Board}")}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/CF1IXO}, doi = {10.7910/DVN/CF1IXO}, abstract = {Matching SIPO patents to Chinese listed firms ("Main Board"). Please refer to the user documentation "Chinese Patent Database User Documentation: M...}, language = {en}, urldate = {2021-08-17}, author = {He, Zi-Lin and Tong, Tony and Zhang, Yuchen and He, Wenlong}, month = dec, year = {2019}, note = {type: dataset}, } | CC0 1.0 | None | Contact maintainer through Dataverse | China, SIPO, disambiguation, patents, firms | through 2016? | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QUH8KT | Mon, 19 Jun 2023 16:40:56 GMT | ||||||||||||||||||||||||||||||||||||||
49 | 5ab54caa-f53c-4537-8dac-8bf20cab594e | GPT Indicators | gpt_indicators | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PQGHKA | This database contains yearly technology-level measures of Growth, Use Complementarity (UC) and Innovation Complementarity (IC) since 1920 for all technological classes in the United States Patent and Trademark Office (USPTO) classification system, as described in the article entitled "Mapping General Purpose Technologies with Patent Data". (2020-03-06) | Sergio Petralia | @article{petralia_gpt_2020, title = {{GPT} {Indicators}}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PQGHKA}, doi = {10.7910/DVN/PQGHKA}, abstract = {This database contains yearly technology-level measures of Growth, Use Complementarity (UC) and Innovation Complementarity (IC) since 1920 for all ...}, language = {en}, urldate = {2021-08-17}, author = {Petralia, Sergio}, month = mar, year = {2020}, note = {type: dataset}, } | CC0 - "Public Domain Dedication" | None | Sergio Petralia (contact maintainer through Dataverse) | growth, Use Complementarity, Innovation Complementarity, technology, patents, metrics | 1920-2020 | https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/PQGHKA/KZDEBE&version=1.0 | https://ideas.repec.org/p/egu/wpaper/2027.html | Mon, 19 Jun 2023 16:40:57 GMT | ||||||||||||||||||||||||||||||||||||
50 | fb46d05b-2bd9-41fc-a739-91b77a2e85d6 | Imputation of missing applicant country codes in worldwide patent data | missing_applicant_codes | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XNTL0W | We present a general method for imputing missing information in the Worldwide Patent Statistical Database (PATSTAT) and make the resulting datasets publicly available. The PATSTAT database is the de facto standard for academic research using patent data. Complete information on patents is essential to obtain an accurate picture of technological activities across countries and over time. However, the coverage of the database is far from complete. Our data imputation method exploits detailed institutional knowledge about the international patent system, and we codify it in a SQL algorithm. We provide two datasets related to the imputation of missing country codes and missing technology classification. We also release the algorithm that can be easily adapted to impute other pieces of information that are missing in PATSTAT. | @article{seliger_imputation_2020, title = {Imputation of missing applicant country codes in worldwide patent data}, url = {https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XNTL0W}, doi = {10.7910/DVN/XNTL0W}, abstract = {The file ctry\_app\_person.txt contains identifiers for patent first filings and the applicant (corresponding to appln\_id and person\_id in PATSTAT) a...}, language = {en}, urldate = {2021-08-17}, author = {Seliger, Florian}, month = oct, year = {2020}, note = {type: dataset}, } | CC0 - "Public Domain Dedication" | None | Contact maintainer through Dataverse | Patents, Location of applicants, PATSTAT, Imputation | https://www.sciencedirect.com/science/article/pii/S2352340920314955 | https://github.com/seligerf/Imputation-of-missing-location-information-for-worldwide-patent-data | https://doi.org/10.7910/DVN/XNTL0W | https://doi.org/10.1016/j.dib.2020.106615 | Mon, 19 Jun 2023 16:40:59 GMT | ||||||||||||||||||||||||||||||||||||
51 | 46a031fd-8827-4bab-91b3-b41ca447f152 | Patent Examination Data System | peds | https://ped.uspto.gov/peds/#!/ | PEDS contains the bibliographic, published document and patent term extension data tabs in Public PAIR from 1981 to present. There is also some data dating back to 1935. The data can be accessed by anyone using the web interface or the provided Application Programming Interface (API). PEDS is updated daily and mirrors the data available in the Patent Application Location and Monitoring system (PALM). PEDS provides access to public applications including: published patent applications and patents. PCT applications that have not been published by WIPO. Any applications that have not been released by the USPTO will not be available in PEDS. | terms given here: https://www.uspto.gov/sites/default/files/documents/Patent%20Electronic%20System%20Access%20Document_0.pdf | None | USPTO | patents | 1981-2021 | https://ped.uspto.gov/peds/#!/#%2FuserManual | Mon, 19 Jun 2023 16:35:25 GMT | |||||||||||||||||||||||||||||||||||||||
52 | fc72efb0-8b24-4415-9b50-b0b7f33dc8b4 | Indian Patent Advanced Search System | india_patent_database | https://ipindiaservices.gov.in/publicsearch | Platform for accessing Indian public patent data | None | Intellectual Property India | India, patents | https://ipindiaservices.gov.in/PublicSearch/PublicationSearch/Help | None | Mon, 19 Jun 2023 16:35:26 GMT | ||||||||||||||||||||||||||||||||||||||||
53 | 5d387b72-6d6c-4479-8626-e9a1a9b693f7 | UK IPO | uk_ipo | https://www.gov.uk/government/publications/ipo-patent-data | Snapshots of British patent/SPC applications received and subsequently published by the Intellectual Property Office. | Open Government License 3.0 https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ | None | UK Intellectual Property Office, https://www.gov.uk/government/organisations/intellectual-property-office | United Kingdom, patents | Mon, 19 Jun 2023 16:41:15 GMT | |||||||||||||||||||||||||||||||||||||||||
54 | a16242e8-fe81-49eb-bf1d-4df0a1927738 | Monthly statistics -- Patents, trade marks, and designs | uk_ipo_monthly | https://www.gov.uk/government/collections/patents-trade-marks-and-designs-monthly-statistics | These statistics include monthly data for designs, patents, trade marks. | Open Government License 3.0 https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ | None | UK Intellectual Property Office, https://www.gov.uk/government/organisations/intellectual-property-office | Trademarks, United Kingdom, design | Mon, 19 Jun 2023 16:41:17 GMT | |||||||||||||||||||||||||||||||||||||||||
55 | 29154d41-30ef-4539-b428-819ca4c66965 | Open Sourced Database for CEO Dismissal 1992-2018 | ceo_dismissal | https://zenodo.org/record/5348198 | This is a database of qualitatively coded reasons for a CEO’s dismissal, for S&P 1500 Companies. The maintainers of this dataset run a mailing list with a signup [here](https://docs.google.com/forms/d/e/1FAIpQLSfiZZHwyeWYEZ5fOT1_RygH-ComG9ltad5IUUY60Fsw9z3hZg/viewform) | @misc{richard_j._gentry_open_2021, title = {Open {Sourced} {Database} for {CEO} {Dismissal} 1992-2018}, url = {https://zenodo.org/record/4618103}, abstract = {There is a newer version of this database - please check the right-hand navigation for the latest version...}, urldate = {2021-09-02}, publisher = {Zenodo}, author = {{Richard J. Gentry} and {Joseph Harrison} and {Timothy Quigley} and {Steven Boivie}}, month = feb, year = {2021}, doi = {10.5281/zenodo.4618103}, note = {type: dataset}, keywords = {CEO Dismissal, Management, Strategic Management}, } | Open Data Commons Attribution License v1.0 | None | Richard Gentry | CEO, Dismissal Management, Strategic Management | 1992-2018 | Documentation included as a .docx on Zenodo | 10.5281/zenodo.4618103 | https://onlinelibrary.wiley.com/doi/abs/10.1002/smj.3278 | Execucomp, https://libguides.uml.edu/wrds/ExecuComp | Mon, 19 Jun 2023 16:41:18 GMT | |||||||||||||||||||||||||||||||||||
56 | 1a7fc85d-38af-4fe6-83b8-0d629e85d418 | A large-scale COVID-19 Twitter chatter dataset for open scientific research | covid_twitter_chatter | https://zenodo.org/record/5595136 | Dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. The first 9 weeks of data (from January 1st, 2020 to March 11th, 2020) contain very low tweet counts as we filtered other data we were collecting for other research purposes, however, one can see the dramatic increase as the awareness for the virus spread. Dedicated data gathering started from March 11th yielding over 4 million tweets a day. The data collected from the stream captures all languages, but the higher prevalence are: English, Spanish, and French. We release all tweets and retweets on the full dataset, and a cleaned version with no retweets. There are several practical reasons for us to leave the retweets, tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms, the top 1000 bigrams, and the top 1000 trigrams. Some general statistics per day are included for both datasets. | @misc{banda_large-scale_2021, title = {A large-scale {COVID}-19 {Twitter} chatter dataset for open scientific research - an international collaboration}, url = {https://zenodo.org/record/5458943}, abstract = {Version 78 of the dataset...}, urldate = {2021-09-07}, publisher = {Zenodo}, author = {Banda, Juan M. 
and Tekumalla, Ramya and Wang, Guanyu and Yu, Jingyuan and Liu, Tuo and Ding, Yuning and Artemova, Katya and Tutubalina, Elena and Chowell, Gerardo}, month = sep, year = {2021}, doi = {10.5281/zenodo.5458943}, note = {type: dataset}, keywords = {social media, twitter, nlp, covid-19, covid19}, } | None | Panacea Labs, http://www.panacealab.org/covid19/ | social media, twitter, nlp, covid-19, covid19, twitter, covid, open-source | 2000-2018 | http://www.panacealab.org/covid19/ | https://github.com/thepanacealab/covid19_twitter | 10.5281/zenodo.5458943 | https://doi.org/10.3390/epidemiologia2030024, http://doi.org/10.2196/25108, http://doi.org/10.1002/isaf.1482 | Mon, 19 Jun 2023 16:41:37 GMT | ||||||||||||||||||||||||||||||||||||
57 | fcf09f34-d5a8-483d-94a3-09a03c167100 | Biospolar Antarctic Literature and Patents | biospolar | https://osf.io/py6ve/ | Mapping the scientific and patent landscapes for biodiversity based research and innovation from Antarctica and the Southern Ocean. Created under the Biospolar Project, Research Council of Norway | @article{oldham_biospolar_2019, title = {Biospolar {Antarctic} {Literature} and {Patents}}, url = {https://osf.io/py6ve/}, doi = {10.17605/OSF.IO/PY6VE}, abstract = {Mapping the scientific and patent landscapes for biodiversity based research and innovation from Antarctica and the Southern Ocean. Created under the Biospolar Project, Research Council of Norway (RCN project number 257631/E10) Hosted on the Open Science Framework}, language = {en}, urldate = {2021-09-10}, author = {Oldham, Paul}, month = may, year = {2019}, } | The datasets are made available under a Creative Commons Attribution 4.0 International Licence. When using datasets from the repository please retain the lens_id to meet with the Lens terms of use. The lens_id along with the paperid (Microsoft Academic Graph) are the main unique identifiers for table joins so you will want to keep them anyway. Data from Microsoft Academic Graph is made available under the Open Data Commons Attribution License (ODC-By) v1.0. If using the data in a product, service or data redistribution please include this url https://aka.ms/msracad as described here. If using in a publication please cite the two articles described here. | None | Paul Oldham | antarctic, krill, biodiversity | https://osf.io/py6ve/wiki/home/ | 10.17605/OSF.IO/PY6VE | lens.org | Mon, 19 Jun 2023 16:42:14 GMT | |||||||||||||||||||||||||||||||||||||
58 | 868eaad1-3c6a-4730-a70f-853996962d39 | US Patent Similarity Data | us_patent_similarity | https://zenodo.org/record/3552078 | Pairwise semantic similarity measures for US utility patents. Includes measures for citing/cited patent pairs, 100 most-similar patents for each patent, and doc2vec vectors for each patent. | @misc{whalen_us_2019, title = {{US} {Patent} {Similarity} {Data}}, url = {https://zenodo.org/record/3552078}, abstract = {Pairwise semantic similarity measures for US utility patents. Includes measures for citing/cited patent pairs, 100 most-similar patents for each patent, and doc2vec vectors for each patent.}, urldate = {2021-09-15}, publisher = {Zenodo}, author = {Whalen, Ryan and Lungeanu, Alina and DeChurch, Leslie and Contractor, Noshir}, month = nov, year = {2019}, doi = {10.5281/zenodo.3552078}, note = {type: dataset}, keywords = {patents, intellectual property, innovation, semantic similarity, empirical legal studies}, } | Creative Commons Attribution 4.0 International | None | Ryan Whalen | patents, intellectual property, innovation, similarity, legal, patents | 10.5281/zenodo.3552078 | Mon, 19 Jun 2023 16:42:15 GMT | |||||||||||||||||||||||||||||||||||||||
59 | eb43fc38-8786-4b0f-b3b8-b9d610f456ed | Patstat Register | patstat_register | https://www.epo.org/searching-for-patents/business/patstat.html | This database contains bibliographic and legal event data on published European and Euro-PCT patent applications. Like the core PATSTAT database, it is maintained by the EPO, however PATSTAT Register only contains information about patent applications at the European Patent Office (EPO). The information in PATSTAT Register is, however, considerably deeper and more detailed. | Requires a subscription to access | € 1.420,00 - € 1.460,00 | EPO | Europe, patents, legal, citation | https://www.epo.org/searching-for-patents/business/patstat.html | https://github.com/gderasse/patstat_register | Mon, 19 Jun 2023 16:35:29 GMT | |||||||||||||||||||||||||||||||||||||||
60 | 3360e0a5-ee9b-47d3-91df-9348b86af0cf | PATENTSCOPE | patentscope | https://www.wipo.int/patentscope/en/ | The PATENTSCOPE database provides access to international Patent Cooperation Treaty (PCT) applications in full text format on the day of publication, as well as to patent documents of participating national and regional patent offices. | None | WIPO | patents, legal | 1978-2021 | https://patentscope.wipo.int/search/en/help/help.jsf | Mon, 19 Jun 2023 16:42:23 GMT | ||||||||||||||||||||||||||||||||||||||||
61 | fc08c62e-5eae-4831-9eae-4a59276e29fc | WIPO PATENT REGISTER PORTAL | patent_register | https://www.wipo.int/patent_register_portal/en/index.html | The WIPO's Patent Register Portal gives details of the availability of online patent registers by country / jurisdiction, as well as their search functionalities and the type of information they provide. | None | WIPO | geography, index, patents | Mon, 19 Jun 2023 16:42:23 GMT | ||||||||||||||||||||||||||||||||||||||||||
62 | 7da1dc8e-9e6c-4a53-9571-1b2f527a5dcd | EPO worldwide bibliographic data (DOCDB) | docdb | https://www.epo.org/searching-for-patents/data/bulk-data-sets/docdb.html#tab-1 | DOCDB is the EPO's master documentation database with worldwide coverage. It contains bibliographic data, abstracts, citations and the DOCDB simple patent family, but no full text or images. | available through paid subscription, https://www.epo.org/service-support/ordering/raw-data-terms-and-conditions.html | € 2.700,00 (main dataset), € 9.100,00 (backfile) | EPO | patents, bibliographic data, abstracts | Mon, 19 Jun 2023 16:35:30 GMT | |||||||||||||||||||||||||||||||||||||||||
63 | 1ba76694-1853-4721-88f9-1079418fc3d6 | European Business Performance Database | european_business_performance | https://www.icrios.unibocconi.eu/wps/wcm/connect/Cdr/Icrios/Home/Resources/Databases/EUROPEAN+BUSINESS+PERFORMANCE+database/ | The European Business Performance database describes the performance of the largest enterprises in the twentieth century. It covers eight countries that together consistently account for above 80 per cent of western European GDP: Great Britain, Germany, France, Belgium, Italy, Spain, Sweden, and Finland. Data have been collected for five benchmark years, namely on the eve of WWI (1913), before the Great Depression (1927), at the extremes of the golden age (1954 and 1972), and in 2000. | None | crios@unibocconi.it | Europe, GDP, productivity | 1910-2000 | https://global.oup.com/academic/product/the-performance-of-european-business-in-the-twentieth-century-9780198749776?cc=it&lang=en& | Mon, 19 Jun 2023 16:42:25 GMT | ||||||||||||||||||||||||||||||||||||||||
64 | 5d36b07b-b6c6-4aac-8181-c540a95dc26f | PatentsView Citation data | patentsview_citations | https://patentsview.org/download/data-download-tables | Citation to foreign patents from US patents (foreigncitation), citation to US patent applications from US patents (usapplicationcitation), citation to US patents from US patents (uspatentcitation), non-patent citations in patents (otherreference) | Creative Commons Attribution 4.0 International License. | None | USPTO | United States, citation | Mon, 24 Jul 2023 10:03:12 GMT | |||||||||||||||||||||||||||||||||||||||||
65 | da0edeb0-caef-474c-a7f0-0910aac9b6ab | PatentsView Classification data | patentsview_classifications | https://patentsview.org/download/data-download-tables | CPC classifications, NBER classifications (to 2015), USPC classifications, WIPO technology fields, Lookup tables (CPC, USPC, WIPO, NBER, US gov. organizations), botanic info for plant patents. | Creative Commons Attribution 4.0 International License. | None | USPTO | United States, classifications, identifiers | Mon, 04 Sep 2023 02:51:35 GMT | |||||||||||||||||||||||||||||||||||||||||
66 | 5e147b1f-3a6c-4859-acc5-781154954941 | Lens Labs | lens_labs | https://www.lens.org/lens/labs/datafacilities | Links to datasets, APIs, and tools | Links to other resources, each with its own license. | None | Lens.org (Cambia) | Global, citation, identifiers, product | Mon, 19 Jun 2023 16:35:32 GMT | |||||||||||||||||||||||||||||||||||||||||
67 | dcff88bd-fe6b-4fdb-8159-809bf9d7bc1c | Dimensions | dimensions | https://www.dimensions.ai/products/free/ | Dimensions contains more than 100 million publications, ranging from articles published in scholarly journals, books and book chapters, to preprints and conference proceedings. All publications are contextualized with linked data sets, funding, publications, patents, clinical trials, and policy documents. You can also view associated categories, funders, institutions, and researcher profiles. | Digital Science | Use of both the Dimensions COVID-19 dataset and full Dimensions dataset are subject to the Dimensions Terms of use: https://www.dimensions.ai/policies-terms-legal | Free for personal, non-commercial use. | Digital Science, https://www.digital-science.com/ | scholarly literature, patents, funding, clinical trials, academic profiles, medical | https://docs.dimensions.ai/bigquery/index.html | https://console.cloud.google.com/bigquery?p=covid-19-dimensions-ai&page=table&d=data&t=publications | gender, current_assignee, research_orgs, funding_eur, research_org_state_names, overall_status, publication_ids, external_ids, funding_currency, end_date, funder_orgs, clinical_trial_ids, title, funding_amount, book_title, editors, publisher, repository_name, funding_chf, active_years, research_org_country_names, embargo_date, labels, funder_countries, research_org_countries, date_imported_gbq, conference, funding_cad, priority_date, associated_publication_arxiv_id, links, expiration_date, family_members_ids, type, source_id, mesh_terms, kind, associated_grant_ids, closed, study_outcome_measures, open_access_categories_v2, funding_gbp, primary_completion_date, funder_org, granted_date, federal_support, pages, parent_id, legal_events, application_number, category_icrp_ct, citation_string, funding_nzd, current_assignee_countries, acknowledgements, study_arms, supporting_grant_ids, start_date, category_hra, registry, study_designs, legal_status, cpc, 
associated_publication_id, funding_details, original_abstract, granted_year, associated_publication_pmid, cited_by_ids, funding_schemes, repository_id, original_assignee_countries, metrics, altmetrics, relationships, study_participants, associated_publication_doi, date, repository_dois, acronym, license, interventions, original_assignee, journal, date_modified, address, researcher_ids, category_for, patent_ids, category_sdg, aliases, study_maximum_age, primary_completion_year, project_numbers, study_minimum_age, repository_url, brief_title, funding_jpy, date_print, figures_amount, filing_status, category_bra, proceedings_title, original_assignee_orgs, reference_ids, conditions, date_normal, funder_org_countries, research_org_state_codes, secondary_ids, category_hrcs_rac, copyright_statement, created_date, family_count, current_assignee_orgs, subtitles, original_title, funder_org_state_codes, year, description, concepts, foa_number, study_eligibility_criteria, jurisdiction, established, publication_year, publication_date, funder_org_cities, wikipedia_url, mesh_headings, end_year, abstract, isbn, status, research_org_city_names, start_year, ipcr, keywords, filing_date, citations_count, issue, categories, category_uoa, authors, redirect, funding_cny, assignee_countries, arxiv_id, eisbn, pmcid, category_rcdc, assignee_orgs, category_hrcs_hc, open_access_categories, filing_year, linkout, resulting_publication_doi, date_inserted, family_id, claims_amount, expiration_year, funding_section, acronyms, organisation_details, date_online, phase, funding_usd, id, funding_aud, doi, investigators, citations, name, volume, language, priority_year, category_icrp_cso, inventor_names, journal_lists, email_address, funder_org_acronyms, study_type, orange_book, book_series_title, types, pmid, document_type, resulting_publication_ids, grant_number, research_org_cities | Fri, 01 Dec 2023 18:12:49 GMT | |||||||||||||||||||||||||||||||||||||
68 | 8bb14de6-ace9-4acb-a1ca-66b6d088a574 | Google Patents Research Data | google_patents_research | https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/google-patents-research-data | Google Patents Research Data contains the output of much of the data analysis work used in Google Patents (patents.google.com), including machine translations of titles and abstracts from Google Translate, embedding vectors, extracted top terms, similar documents, and forward references. | Google Patents, IFI CLAIMS Patent Services | Google Patents Research Data by Google, based on data provided by IFI CLAIMS Patent Services | Creative Commons Attribution 4.0 International License | None | Google Patents https://patents.google.com/ | terms, citation, forward references, similarity | https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=labeled_patents&page=dataset | invention_type, title_line_1, x_relative_min, class_us, issuer, representative_line_1_eu, inventor_line_1, number, language, application_number, gcs_path, filing_date, y_relative_min, publication_date, priority_date_eu, applicant_line_1, x_relative_max, y_relative_max, class_international | Mon, 19 Jun 2023 16:42:32 GMT | |||||||||||||||||||||||||||||||||||||
69 | 0a69b187-6d79-4ee8-999c-3295571e76db | NBER Economic Indicators and Releases | nber_indicators | https://back.nber.org/releases/ | Regularly-updated and archived index of economic indicators, including interest rates, stock reserves, home sales, labour statistics and productivity. This page is updated Monday-Friday. | NBER | None | NBER | metrics, economy, trade, productivity, growth, indicators | Mon, 19 Jun 2023 16:42:45 GMT | |||||||||||||||||||||||||||||||||||||||||
70 | 297f265e-eb23-48aa-b4df-54333ba779ab | Disclosed Standard Essential Patents Database | dsep_data | http://ssopatents.org/ | The OEIDD database provides a full overview of all disclosed IPR at standard setting organizations (SSOs) world-wide. Based on the archives of thirteen major SSOs as of March 2011, the disclosure data is cleaned, harmonized, and all disclosed USPTO or EPO patents or patent applications are matched against patent identities in the PATSTAT database. Overall, the database contains 46,906 disclosed patents, patent applications or blankets, from 969 different firms, with 14057 USPTO or EPO patents or patent applications identified in PATSTAT, belonging to 4814 different INPADOC patent families and 5337 different DOCDB patent families. | Rudi Bekkers, Christian Catalini, Arianna Martinelli, Timothy Simcoe, Cesare Righi | Bekkers, R., Catalini, C., Martinelli, A., & Simcoe, T. (2012). Intellectual Property Disclosure in Standards Development. Proceedings from NBER conference on Standards, Patents & Innovation, Tucson (AZ), January 20 and 21, 2012. | Anyone is free to use this data, provided that any paper or report published that uses this data includes the following literature citation: "Bekkers, R., Catalini, C., Martinelli, A., & Simcoe, T. (2012). Intellectual Property Disclosure in Standards Development. Proceedings from NBER conference on Standards, Patents & Innovation, Tucson (AZ), January 20 and 21, 2012." | None | disclosure, standards, patents | Included with files | codebook included in excel files | https://console.cloud.google.com/bigquery?p=patents-public-data&d=dsep&page=dataset | family_id, date, committee_project, third_party, blanket_scope, blanket_type, sso, serial_cleaned, tc_name, patent_owner_unharmonized, sc_name, record_id, copyright, wg_name, patent_owner_harmonized, pub_cleaned, reciprocity, licensing_commitment, disclosure_event, standard | Mon, 19 Jun 2023 16:42:41 GMT | ||||||||||||||||||||||||||||||||||||
71 | 4342caa7-23af-420c-b2f6-6088f133df6a | USPTO OCE Patent Examination Research Data (PatEx) | patex | https://www.uspto.gov/ip-policy/economic-research/research-datasets/patent-examination-research-dataset-public-pair | The latest version of PatEx (referred to below as the 2020 release) contains detailed information on nearly 11.9 million publicly-viewable provisional and non-provisional patent applications to the USPTO and over 4.6 million Patent Cooperation Treaty (PCT) applications. It is based on data that OCE downloaded from the Patent Examination Data System (PEDS) in April, 2021. The PEDS data are sourced from Public PAIR. The first time that OCE used PEDS as the basis of PatEx was for the 2019 release. We took the PEDS data and organized it into the familiar PatEx data files, which are based on the organization of the Public PAIR portal. The data files include information on each application’s characteristics, prosecution history, continuation history, claims of foreign priority, patent term adjustment history, publication history, and correspondence address information. | Stuart J.H. Graham, Alan C. Marco, Richard Miller | Graham, S., Marco, A., and Miller, R. (2015). “The USPTO Patent Examination Research Dataset: A Window on the Process of Patent Examination.” | USPTO’s online databases are not designed or intended to be a source for bulk downloads of USPTO data when accessed through the website’s interfaces. Individuals, companies, IP addresses, or blocks of IP addresses who, in effect, deny or decrease service by generating unusually high numbers of database accesses (searches, pages, or hits), whether generated manually or in an automated fashion, may be denied access to USPTO servers without notice. Bulk data products may be separately obtained from the USPTO, either for free or at the cost of dissemination. 
For details, see information on Electronic Bulk Data Products: https://www.uspto.gov/learning-and-resources/electronic-bulk-data-products | None | EconomicsData@uspto.gov | patents, legal, history | For the 2019 and later releases, new technical documentation is available https://www.uspto.gov/sites/default/files/documents/PatEx-2019-Technical-Doc.pdf A document describing the 2014-2017 data sets is available and can be cited as: Graham, Stuart J.H. and Marco, Alan C. and Miller, Richard, The USPTO Patent Examination Research Dataset: A Window on the Process of Patent Examination (November 30, 2015). Available at SSRN: https://ssrn.com/abstract=2702637. | https://console.cloud.google.com/bigquery?p=patents-public-data&d=uspto_oce_pair&page=dataset | https://ssrn.com/abstract=29956744, https://ssrn.com/abstract=2702637 | inventor_name_middle, inventor_address_type, file_location, file_location_date, continuation_type, examiner_id, invention_subject_matter, child_application_number, event_code, examiner_name_first, small_entity_indicator, status_description, inventor_region_code, correspondence_country_name, parent_country, uspc_class, earliest_pgpub_number, examiner_name_last, inventor_country_code, inventor_rank, wipo_pub_date, recorded_date, status_code, appl_status_code, correspondence_country_code, correspondence_street_line_2, examiner_name_middle, child_filing_date, sequence_number, patent_issue_date, correspondence_name_line_2, uspc_subclass, foreign_parent_date, appl_status_date, disposal_type, correspondence_region_name, inventor_name_last, inventor_country_name, application_type, parent_application_number, application_number, atty_docket_number, parent_filing_date, correspondence_region_code, earliest_pgpub_date, patent_number, invention_title, event_description, customer_number, application_number_pair, correspondence_street_line_1, parent_country_code, examiner_art_unit, inventor_name_first, correspondence_city, foreign_parent_id, 
correspondence_postal_code, correspondence_name_line_1, filing_date, confirm_number, aia_first_to_file, abandon_date, wipo_pub_number | Fri, 01 Dec 2023 18:14:00 GMT | |||||||||||||||||||||||||||||||||||
72 | 8c2b2faf-df08-45f9-9ad1-ddf3ca722b12 | SureChEMBL | surechembl | https://www.surechembl.org/search/ | SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus. Currently, the database contains 17 million compounds extracted from 14 million patent documents. | G. Papadatos, M. Davies, N. Dedman, J. Chambers, A. Gaulton, J. Siddle, R. Koks, S. A. Irvine, J. Pettersson, N. Goncharoff, A. Hersey, J. P. Overington | “SureChEMBL” by the European Bioinformatics Institute (EMBL-EBI), used under CC BY-SA 3.0. G. Papadatos, M. Davies, N. Dedman, J. Chambers, A. Gaulton, J. Siddle, R. Koks, S. A. Irvine, J. Pettersson, N. Goncharoff, A. Hersey, J. P. Overington (2016). SureChEMBL: a large-scale, chemically annotated patent document database. Nucleic Acids Research Database Issue, 44, D1220-D1228, DOI:10.1093/nar/gkv1253, PMID:26582922. 
| https://www.surechembl.org/terms/ | None | EMBL-EBI, an outstation of European Molecular Biology Laboratory | biotechnology, health, chemical, bioinformatics, medical | http://chembl.blogspot.com/ | https://console.cloud.google.com/bigquery?p=patents-public-data&d=ebi_surechembl&page=dataset | https://doi.org/10.1093/nar/gkv1253 | [{'uuid': 'e232a192-965c-4ec9-904c-155b6dfe56c5', 'shortname': 'chembl', 'relationship_type': 'similar'}] | publication_number, patent_id, inchi_key, corpus_frequency, smiles, schembl_id, publication_date, field, field_frequency | Mon, 19 Jun 2023 16:35:34 GMT | ||||||||||||||||||||||||||||||||||
73 | 5f17a3b2-ecd2-4c45-8d1a-cebd28f41a64 | MatrixWare Research Collection | marec | http://www.ifs.tuwien.ac.at/imp/marec.shtml | MAREC Data is a static collection of over 19 million patent applications and granted patents in a unified file format normalized from EP, WO, US, and JP sources, spanning a range from 1976 to June 2008. In MAREC, the documents from different countries and sources are normalized to a common XML format with a uniform patent numbering scheme and citation format. The standardized fields include dates, countries, languages, references, person names, and companies as well as rich subject classifications. It is a comparable corpus, where many documents are available in similar versions in other languages. | Creative Commons Attribution NonCommercial ShareAlike 3.0 Unported License | None | marec@fandan.net | global, patents | 1976-2008 | https://console.cloud.google.com/bigquery?p=patents-public-data&d=marec&page=dataset | publication_number, publication_number_original, xml, truncated | Mon, 19 Jun 2023 16:35:35 GMT | ||||||||||||||||||||||||||||||||||||||
74 | 7d8cda0b-9ee1-47b9-9dca-8adb93206024 | USPTO OCE Patent Claims Research Data | uspto_patent_claims | https://www.uspto.gov/ip-policy/economic-research/research-datasets/patent-claims-research-dataset | The Patent Claims Research Dataset contains detailed information on claims from U.S. patents granted between 1976 and 2014 and U.S. patent applications published between 2001 and 2014. The dataset is derived from the Patent Application Publication Full-Text and Patent Grant Full Text files, available at https://bulkdata.uspto.gov/, to which the Office of Chief Economist (OCE) applied a Python algorithm to identify individual claims as well as the dependency relationship between claims. From the parsed claims text, OCE created six data files containing individually-parsed claims, claim-level statistics, and document-level statistics, including newly-developed measures of patent scope. | Marco, Alan C. and Sarnoff, Joshua D. and deGrazia, Charles, Patent Claims and Patent Scope (October 2016). USPTO Economic Working Paper 2016-04. Available at: SSRN: https://ssrn.com/abstract=2844964 | USPTO’s online databases are not designed or intended to be a source for bulk downloads of USPTO data when accessed through the website’s interfaces. Individuals, companies, IP addresses, or blocks of IP addresses who, in effect, deny or decrease service by generating unusually high numbers of database accesses (searches, pages, or hits), whether generated manually or in an automated fashion, may be denied access to USPTO servers without notice. Bulk data products may be separately obtained from the USPTO, either for free or at the cost of dissemination. 
For details, see information on Electronic Bulk Data Products: https://www.uspto.gov/learning-and-resources/electronic-bulk-data-products | None | EconomicsData@uspto.gov | financial services, scope, economics | 1976-2014 | Available at source, including documentation of variables | https://console.cloud.google.com/bigquery?p=patents-public-data&d=uspto_oce_claims&page=dataset | http://dx.doi.org/10.2139/ssrn.2844964 | https://ssrn.com/abstract=2844964 | pat_dep_wrd_ct, appl_id, cns_ct, claim_no, ind_flg, claim_txt, pub_no, pub_dep_wrd_avg, pub_dep_wrd_ct, dependencies, pub_wrd_ct, pat_dep_wrd_min, word_ct, pub_wrd_min, or_ct, pat_wrd_min, pat_wrd_ct, pat_no, char_ct, pat_clm_ct, pat_dep_clm_ct, pat_dep_wrd_avg, sf_ct, pub_wrd_avg, pub_dep_clm_ct, pub_clm_ct, publication_number, pat_wrd_avg, pub_dep_wrd_min | Mon, 19 Jun 2023 16:35:35 GMT | ||||||||||||||||||||||||||||||||||
75 | 7c697eb3-2d99-4b44-87cb-d3c7bb0568e1 | USPTO OCE Patent Assignment Data | uspto_patent_assignment | https://www.uspto.gov/ip-policy/economic-research/research-datasets/patent-assignment-dataset | The USPTO allows parties to record assignments of patents and patent applications to, as much as possible, maintain a complete history of claimed interests in a patent. The USPTO also permits recording of other documents that affect title (such as certificates of name change and mergers of businesses) or are relevant to patent ownership (such as licensing agreements, security interests, mortgages, and liens). The 2020 update to the Patent Assignment Dataset contains detailed information on 8.97 million patent assignments and other transactions recorded at the USPTO since 1970 and involving roughly 15.1 million patents and patent applications. It is derived from the recording of patent transfers by parties with the USPTO. | Alan C. Marco, Stuart J.H. Graham, Amanda F. Myers, Paul A. D'Agostino, Kirsten Apple | "USPTO OCE Patent Assignment Data" by the USPTO, for public use. Marco, Alan C., Graham, Stuart J.H., Myers, Amanda F., D'Agostino, Paul A and Apple, Kirsten, "The USPTO Patent Assignment Dataset: Descriptions and Analysis" (July 27, 2015). | USPTO’s online databases are not designed or intended to be a source for bulk downloads of USPTO data when accessed through the website’s interfaces. Individuals, companies, IP addresses, or blocks of IP addresses who, in effect, deny or decrease service by generating unusually high numbers of database accesses (searches, pages, or hits), whether generated manually or in an automated fashion, may be denied access to USPTO servers without notice. Bulk data products may be separately obtained from the USPTO, either for free or at the cost of dissemination. 
For details, see information on Electronic Bulk Data Products: https://www.uspto.gov/learning-and-resources/electronic-bulk-data-products | None | EconomicsData@uspto.gov | patents, claims, assignment | 1970-2020 | https://console.cloud.google.com/bigquery?p=patents-public-data&d=uspto_oce_assignment&page=dataset | http://ssrn.com/abstract=2636461 | reel_no, pgpub_country, admin_pat_no_for_appno, caddress_2, grant_doc_num, ee_address_1, ee_city, ee_name, appno_date, caddress_4, exec_dt, caddress_3, error, file_id, record_dt, last_update_dt, ee_postcode, grant_country, employer_assign, caddress_1, purge_in, page_count, lang, admin_appl_id_for_grant, ack_dt, grant_date, frame_no, convey_ty, rf_id, ee_address_2, cname, ee_state, appno_doc_num, pgpub_date, title, or_name, pgpub_doc_num, appno_country, publication_number, ee_country, convey_text | Fri, 01 Dec 2023 18:14:47 GMT | |||||||||||||||||||||||||||||||||||
76 | 76d0ee06-c78e-4a5a-ba1a-f0b41378b3cd | USPTO Patent Trial and Appeal Board (PTAB) API Data | ptab | https://developer.uspto.gov/ptab-web/#/search/decisions | USPTO Patent Trial and Appeal Board (PTAB) API Data contains data from the PTAB E2E (end-to-end) system, making America Invents Act (AIA) trials information and documents publicly available. This dataset is hosted as a RESTful API with an easy-to-use search interface. You can easily browse USPTO PTAB public documents, search for specific content, and request a bulk download of PTAB content. The PTAB API synchronizes close to real time with the PTAB E2E (end-to-end) system. | “USPTO PTAB API” by the USPTO, for public use. | None | USPTO | legal, trials, appeals | 1997-2020 | https://developer.uspto.gov/ptab-api/swagger-ui.html | https://console.cloud.google.com/bigquery?p=patents-public-data&d=uspto_ptab&page=dataset | InstitutionDecisionDate, AccordedFilingDate, PatentOwnerName, LastModifiedDatetime, PatentNumber, ApplicationNumber, TrialNumber, application_number, Documents, PetitionerPartyName, FilingDate, ProsecutionStatus, publication_number, InventorName | Mon, 19 Jun 2023 16:43:15 GMT | |||||||||||||||||||||||||||||||||||||
77 | 984374a7-16e9-4b35-9445-458daceb01bf | Cooperative Patent Classification Data | cooperative_patent_classification | https://www.cooperativepatentclassification.org/index | Cooperative Patent Classification Data contains the scheme and definitions of the Cooperative Patent Classification system for classifying patent documents. The CPC is the result of a partnership between the EPO and the USPTO in their joint effort to develop a common, internationally compatible classification system for technical documents, in particular patent publications, which will be used by both offices in the patent granting process | EPO, USPTO | “Cooperative Patent Classification” by the EPO and USPTO, for public use. | None | USPTO, EPO | patents, science | https://www.cooperativepatentclassification.org/cpcSchemeAndDefinitions | https://console.cloud.google.com/bigquery?p=patents-public-data&d=cpc&page=dataset | level, sizeCache, glossary, date_revised, breakdownCode, titleFull, title_full, parents, title_part, definition, children, informativeReferences, breakdown_code, application_references, residualReferences, symbol, scopeLimitingReferences, residual_references, titlePart, applicationReferences, notAllocatable, not_allocatable, synonyms, rules, limitingReferences, ipcConcordant, childGroups, additional_only, child_groups, dateRevised, status, ipc_concordant, limiting_references, precedenceLimitingReferences, informative_references | Mon, 19 Jun 2023 16:43:10 GMT | |||||||||||||||||||||||||||||||||||||
78 | e232a192-965c-4ec9-904c-155b6dfe56c5 | ChEMBL | chembl | https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/chembl | ChEMBL Data is a manually curated database of small molecules used in drug discovery, including information about existing patented drugs. | European Bioinformatics Institute | "The ChEMBL database in 2017." Anna Gaulton, Anne Hersey, Michał Nowotka, A Patrícia Bento, Jon Chambers, David Mendez, Prudence Mutowo, Francis Atkinson, Louisa J Bellis, Elena Cibrián-Uhalte, Mark Davies, Nathan Dedman, Anneli Karlsson, María Paula Magariños, John P Overington, George Papadatos, Ines Smit, Andrew R Leach Nucleic acids Research (2017) 45 (Database Issue), D945-D954 | CC BY-SA 3.0 | None | EMBL-EBI, an outstation of European Molecular Biology Laboratory | biotechnology, health, chemical, bioinformatics, medical | schema: https://www.ebi.ac.uk/chembl/db_schema | https://console.cloud.google.com/bigquery?p=patents-public-data&d=ebi_chembl&page=dataset | ChEMBL: towards direct deposition of bioassay data. Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Félix E, Magariños MP, Mosquera JF, Mutowo P, Nowotka M, Gordillo-Marañón M, Hunter F, Junco L, Mugumbate G, Rodriguez-Lopez M, Atkinson F, Bosc N, Radoux CJ, Segura-Cabrera A, Hersey A, Leach AR. — Nucleic Acids Res. 2019; 47(D1):D930-D940. 
doi: 10.1093/nar/gky1075 | [{'uuid': '640ed301-691a-45c6-aa9d-5f8364424044', 'shortname': 'unichem', 'relationship_type': 'similar'}, {'uuid': '8c2b2faf-df08-45f9-9ad1-ddf3ca722b12', 'shortname': 'surechembl', 'relationship_type': 'similar'}, {'uuid': 'b9602dde-b508-4e6a-9620-0e20e95104ff', 'shortname': 'chembl_ntd', 'relationship_type': 'similar'}] | rgid, activity_count, le, ddd_comment, target_type, downgraded, actsm_id, ddd_value, drug_product_flag, standard_flag, black_box_warning, max_phase_for_ind, mol_hrac_id, cx_most_apka, src_id, src_short_name, oral, component_synonym, std_act_id, patent_no, molfile, bao_endpoint, warning_country, ref_id, full_mwt, label, l5, target_desc, protclasssyn_id, res_stem_id, data_validity_comment, homologue, warning_id, level2, mecref_id, ddd_admr, record_id, level2_description, qed_weighted, publication_number, abstract, withdrawn_flag, stem_class, compound_key, sequence, db_version, warning_class, as_id, selectivity_comment, accession, withdrawn_year, indication_class, targrel_id, domain_description, molecular_mechanism, prediction_method, src_description, assay_param_id, submission_date, withdrawn_class, usan_stem_id, l1, standard_units, end_position, binding_site_comment, assay_test_type, orig_description, atc_code, activity_comment, comments, component_type, first_approval, description, assay_type, substrate_record_id, cidx, l2, published_value, warning_type, hba, cx_logd, acd_most_bpka, caloha_id, component_id, standard_type, tid_fixed, updated_on, mechanism_comment, domain_type, therapeutic_flag, cx_logp, cell_source_tissue, aidx, aromatic_rings, level3_description, synonyms, research_stem, withdrawn_reason, clo_id, ridx, direct_interaction, acd_logd, type, chebi_par_id, pubmed_id, indref_id, ad_type, cx_most_bpka, isoform, irac_class_id, parent_id, approval_date, ro3_pass, tid, year, last_active, site_id, assay_category, standard_text_value, assay_tax_id, nda_type, start_position, assay_id, cell_description, 
major_class, subgroup, rtb, prod_pat_id, assay_cell_type, warning_description, domain_name, sitecomp_id, who_extra, product_id, set_name, mol_atc_id, metref_id, assay_strain, l7, efo_term, withdrawn_country, mc_tax_id, bao_format, warning_year, aspect, protein_class_synonym, targcomp_id, result_flag, cell_source_organism, cell_ontology_id, go_id, upper_value, prodrug, text_value, mc_target_type, smarts, cellosaurus_id, value, ap_id, assay_source, lle, alert_name, species_group_flag, qudt_units, curated_by, definition, chembl_id, path, hrac_class_id, active_molregno, published_units, who_name, title, variant_id, relationship_desc, mechanism_of_action, first_page, parameter_type, short_name, entity_type, assay_desc, authors, curation_comment, ref_url, alogp, chirality, natural_product, published_type, source_domain_id, ddd_units, site_residues, issue, cell_name, cell_source_tax_id, mc_target_accession, mec_id, usan_substem, mutation, level1, doi, doc_id, co_stem_id, comp_go_id, compd_id, uberon_id, updated_by, last_page, disease_efficacy, warnref_id, sei, potential_duplicate, standard_inchi, normal_range_min, mesh_id, acd_most_apka, syn_type, level1_description, assay_tissue, predbind_id, mc_organism, confidence, toid, parent_type, max_phase, parameter_value, volume, irac_code, assay_class_id, activity_id, met_id, molecular_species, delist_flag, mw_freebase, domain_id, applicant_full_name, normal_range_max, level4_description, molregno, creation_date, class_level, oc_id, dosage_form, assay_subcellular_fraction, assay_organism, mesh_heading, level4, protein_class_id, doc_type, hba_lipinski, annotation, drug_record_id, trade_name, ass_cls_map_id, site_name, metabolite_record_id, priority, log_id, mol_frac_id, met_conversion, usan_stem_definition, parent_molregno, enzyme_tid, hbd_lipinski, topical, patent_expire_date, published_relation, usan_year, met_comment, biocomp_id, availability_type, parenteral, country, standard_inchi_key, formulation_id, compsyn_id, 
drugind_id, name, level5, alert_id, status, full_molformula, src_compound_id, l8, strength, l3, heavy_atoms, level3, enzyme_name, helm_notation, stem, tax_id, hrac_code, patent_use_code, ref_type, num_alerts, standard_upper_value, comp_class_id, route, polymer_flag, target_mapping, frac_code, bto_id, bei, version, pref_name, patent_id, stat, class_type, psa, pathway_id, canonical_smiles, db_source, mw_monoisotopic, confidence_score, dosed_ingredient, sequence_md5sum, src_assay_id, first_in_class, frac_class_id, job_id, structure_type, journal, tbl, cl_lincs_id, parent_go_id, company, mol_irac_id, relationship, acd_logp, hbd, drug_substance_flag, relation, source, pathway_key, pchembl_value, tissue_id, l4, idx, efo_id, uo_units, previous_company, compound_name, alert_set_id, smid, usan_stem, protein_class_desc, inorganic_flag, cell_id, cpd_str_alert_id, standard_value, entity_id, num_lipinski_ro5_violations, action_type, related_tid, organism, molecule_type, num_ro5_violations, units, standard_relation, active_ingredient, bao_id, innovator_company, mc_target_name, ingredient, relationship_type, molsyn_id, l6, ddd_id | Mon, 19 Jun 2023 16:42:59 GMT | ||||||||||||||||||||||||||||||||||
79 | 640ed301-691a-45c6-aa9d-5f8364424044 | UniChem | unichem | https://www.ebi.ac.uk/unichem/beta/ | UniChem is a large-scale non-redundant database of pointers between chemical structures and EMBL-EBI chemistry resources. Its purpose is to optimise the efficiency with which structure-based hyperlinks may be built and maintained between chemistry-based resources, and it is particularly suitable for creating such links 'on the fly' (by use of REST web services). Primarily, this service has been designed to maintain cross references between EBI chemistry resources. These include primary chemistry resources (ChEMBL, ChEBI and SureChEMBL), and other resources where the main focus is not small molecules, but which may nevertheless contain some small molecule information (e.g., Gene Expression Atlas, PDBe). | European Bioinformatics Institute | biotechnology, health, chemical, bioinformatics, medical | https://chembl.gitbook.io/unichem/unichem-2.0/unichem-2.0-beta | Mon, 19 Jun 2023 16:35:37 GMT | ||||||||||||||||||||||||||||||||||||||||||
80 | b9602dde-b508-4e6a-9620-0e20e95104ff | ChEMBL-NTD | chembl_ntd | https://chembl.gitbook.io/chembl-ntd/ | ChEMBL-NTD is a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases - endemic tropical diseases of the developing regions of Africa, Asia, and the Americas. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data. ChEMBL-NTD is a subset of the data in the free medicinal chemistry and drug discovery database ChEMBL. | EMBL-EBI at Hinxton in the United Kingdom | use the citation associated with the deposited dataset | We encourage all users to download, copy and redistribute these data as needed. However, in the spirit of open collaboration and to enable rapid development of new therapeutics for neglected disease, we encourage following these basic principles: Users who annotate, add to, or modify these data in a way that adds significant value are encouraged to release their work to the public domain, ideally by re-contributing their findings to ChEMBL-NTD. When these data are used or cited in a paper or other scholarly work please reference the citation provided in each deposition set. Access to the ChEMBL-NTD data is under the EMBL-EBI's standard terms: http://www.ebi.ac.uk/Information/termsofuse.html | biotechnology, health, chemical, bioinformatics, medical, neglected diseases | [{'uuid': '8c2b2faf-df08-45f9-9ad1-ddf3ca722b12', 'shortname': 'surechembl', 'relationship_type': 'similar'}, {'uuid': 'e232a192-965c-4ec9-904c-155b6dfe56c5', 'shortname': 'chembl', 'relationship_type': 'similar'}] | Mon, 19 Jun 2023 16:43:23 GMT | ||||||||||||||||||||||||||||||||||||||||
81 | b8b008d6-43ba-49e1-92cb-59a9dcffaf87 | World Bank Development Indicators | world_bank_development_indicators | https://datacatalog.worldbank.org/search/dataset/0037712 | World Development Indicators Data is the primary World Bank collection of development indicators, compiled from officially-recognized international sources. It presents the most current and accurate global development data available, and includes national, regional and global estimates. | World Bank | “World Development Indicators” by the World Bank | Creative Commons Attribution 4.0 | None | data@worldbank.org | development, growth, global | https://datahelpdesk.worldbank.org/knowledgebase/topics/125589 | https://console.cloud.google.com/bigquery?p=patents-public-data&d=worldbank_wdi&page=dataset | country_name, indicator_code, year, indicator_value, country_code, indicator_name | Mon, 19 Jun 2023 16:43:26 GMT | ||||||||||||||||||||||||||||||||||||
82 | 289055b8-4e07-4d52-9f5a-7d35fa0d942b | CPA Global Technical Standards ETSI Data | technical_standards_etsi | https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/cpa-global-technical-standards-etsi | European Telecommunications Standards Institute (ETSI) IPR dataset for technical standards. These are the US assets disclosed by companies as related to technical standards in ETSI. The two major ones included are 3GPP and LTE. | CPA Global | “CPA Global Technical Standards ETSI Data” by CPA Global (through ETSI IPR) is licensed under a Creative Commons Attribution 4.0 International License. | Creative Commons Attribution 4.0 | None | Google Patents Public Data | standards, technology | https://github.com/google/patents-public-data/blob/master/tables/dataset_CPA%20Global.md | https://console.cloud.google.com/bigquery?p=innography-174118&d=technical_standards&page=dataset&project=sheets-management-319211&ws=!1m4!1m3!3m2!1sinnography-174118!2stechnical_standards | PublicationNumber, StandardBody, TechnicalStandard | Fri, 01 Dec 2023 18:14:56 GMT | ||||||||||||||||||||||||||||||||||||
83 | 2721f5ec-e599-4890-9265-9706719fc71e | 337Info - Unfair Import Investigations Information System | unfair_import_investigations | https://pubapps2.usitc.gov/337external/ | US International Trade Commission 337Info Unfair Import Investigations Information System contains data on investigations done under Section 337. Section 337 declares the infringement of certain statutory intellectual property rights and other forms of unfair competition in import trade to be unlawful practices. Most Section 337 investigations involve allegations of patent or registered trademark infringement. | US International Trade Commission | US International Trade Commission 337Info Unfair Import Investigations Information System | None | US International Trade Commission | import, legal, trade | 2008-2021 (prior to 2008 downloadable as a JSON file) | FAQ and tutorial available on the site | https://console.cloud.google.com/bigquery?p=patents-public-data&d=usitc_investigations&page=dataset&project=sheets-management-319211 | teoProceedingInvolved, lastUpdated, finalIdOnViolationDue, gcAttorney, aljAssigned, patentNumber, teoIdIssueDate, scheduledStartDateEvidHear, issueDateOtherNonFinal, docketNo, patentNumbers, internalRemand, finalDetNoViolation, id, copyrightNumbers, finalDetViolation, dateComplaintFiled, teoIdDueDate, currentStatus, currentActiveALJ, invUnfairAct, endDateMarkmanHearing, scheduledEndDateEvidHear, trademarkNumbers, respondent, actualStartDateEvidHear, htsNumbers, investigationType, investigationTermDate, dateCreated, actualEndDateEvidHear, investigationNo, ouiiAttorney, cafcAppeals, finalIdOnViolationIssue, startDateMarkmanHearing, markmanHearing, ouiiParticipation, title, dateOfPublicationFrNotice, publication_number, teoReliefGranted, complainant, targetDate | Mon, 19 Jun 2023 16:35:39 GMT | ||||||||||||||||||||||||||||||||||||
84 | 10fc1bad-8a80-4c3c-8803-8d33246fc659 | IFI Claims Patent Data Enrichments | ifi_claims_enrichments | https://www.ificlaims.com/product/product-data-enrichments.htm | IFI CLAIMS Patent Data Enrichments includes standardized assignee/applicant names and integrated legal status information. | IFI CLAIMS | variable | Costs to access via IFI, Google Patents Public Datasets hosts a core public version on BigQuery | IFI CLAIMS | analytics, patents | https://www.ificlaims.com/news/view/blog-posts/public-patent-data-now.htm | https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/ifi-claims-patent-data-enrichments | Mon, 19 Jun 2023 16:35:39 GMT | ||||||||||||||||||||||||||||||||||||||
85 | 3f98a0ed-4f5d-43d9-9bdb-4cef4e1ae46f | USPTO OCE Cancer Moonshot Patent Data | uspto_cancer | https://www.uspto.gov/ip-policy/economic-research/research-datasets/cancer-moonshot-patent-data | The USPTO Cancer Moonshot Patent Data contains detailed information on published patent applications and granted patents relevant to cancer research and development (R&D). We generate the dataset using USPTO examiner tools to execute a series of queries designed to identify cancer-specific patents and patent applications. We apply several approaches to ensure coverage of the various fields and subject matter that cancer-related innovations encompass. These include drugs, diagnostics, surgical devices, data analytics, and genomic-based inventions. The final dataset consists of roughly 270,000 patent documents spanning the 1976 to 2016 period. | Jesse Frumkin, Amanda F. Myers | Frumkin, Jesse and Myers, Amanda F., Cancer Moonshot Patent Data (August, 2016). | The OCE developed these data files for public use and encourages users to identify fixes and improvements. | None | economicsData@uspto.gov | health, cancer, drug discovery, biotechnology | 1976-2016 | https://bulkdata.uspto.gov/data/patent/cancer/moonshot/2016/cancer_patent_data_doc_v15.docx | https://console.cloud.google.com/bigquery?p=patents-public-data&d=uspto_cancer&page=dataset&project=sheets-management-319211 | Mon, 19 Jun 2023 16:46:04 GMT | ||||||||||||||||||||||||||||||||||||
86 | e3d20ecd-fa26-4572-9c1f-2b26aa47e15d | UCB Fung Institute Patent Data | ucb_fung | https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/ucb-fung-patent | Drawing upon recent advances in machine learning and natural language processing, we introduce new tools that automatically ingest, parse, disambiguate and build an updated database using United States patent data. The tools identify unique inventor, assignee, and location entities mentioned on each granted US patent from 1976 to 2016. We describe data flow, algorithms, user interfaces, descriptive statistics, a novelty measure based on the first appearance of a word in the patent corpus, and an automated co-inventor network mapping tool. | B. Balsmeier, M. Assaf, T. Chesebro, G. Fierro, K. Johnson, S. Johnson, G. Li, W.S. Lueck, D. O’Reagan, W. Yeh, G. Zang, L. Fleming | Balsmeier, B., Assaf, M., Chesebro, T., Fierro, G., Johnson, K., Johnson, S., Li, G., W.S. Lueck, O’Reagan, D., Yeh, W., Zang, G., Fleming, L. “Machine learning and natural language processing applied to the patent corpus.” Forthcoming at Journal of Economics and Management Strategy. 
| Creative Commons Attribution 4.0 International license | None | patents, machine learning, disambiguation, metrics, novelty | 1976-2016 | https://funginstitute.berkeley.edu/wp-content/uploads/2016/11/Machine_learning_and_natural_language_processing_on_the_patent_corpus.pdf | https://console.cloud.google.com/bigquery?p=erudite-marker-539&d=JEMS16&page=dataset | https://doi.org/10.1111/jems.12259 | Country, PatentNo, string_field_1, Geography, AssistExaminer, Type, GovernmentInterests, CPC_Layer_2, LastName, Self_Citation_Flag, sequence, id, int64_field_0, Title, ApplDate, Company, City, InventorID, Abstract, assignee_disambiguated, FirstMiddleName, CPC_Full, PrimaryExaminer, FamilyID, IssueDate, CurrentUse, string_field_2, CPC_Layer_1, ApplNo, PatentNoOrNPL_cited, LawFirm, pdpass, FullName, InventorFullname, PatentNo_citing, FutureUse, Word, Sequence, State, CountryCodeOrNPL_cited | Fri, 01 Dec 2023 18:16:37 GMT | |||||||||||||||||||||||||||||||||||
87 | 8b8da8ff-2b09-4e1f-9523-c0c549c5cfa1 | Patent PDF Samples with Extracted Structured Data | patent_pdf_samples | https://console.cloud.google.com/marketplace/product/global-patents/labeled-patents | The dataset consists of PDFs in Google Cloud Storage from the first page of select US and EU patents, and BigQuery tables with extracted entities, labels, and other properties, including a link to each file in GCS. The structured data contains labels for eleven patent entities (patent inventor, publication date, classification number, patent title, etc.), global properties (US/EU issued, language, invention type), and the location of any figures or schematics on the patent's first page. The structured data is the result of a data entry operation collecting information from PDF documents, making the dataset a useful testing ground for benchmarking and developing AI/ML systems intended to perform broad document understanding tasks like extraction of structured data from unstructured documents. This dataset can be used to develop and benchmark natural language tasks such as named entity recognition and text classification, AI/ML vision tasks such as image classification and object detection, as well as more general AI/ML tasks such as automated data entry and document understanding. Google is sharing this dataset to support the AI/ML community because there is a shortage of document extraction/understanding datasets shared under an open license. 
| Google Patents | CC BY 4.0 | None | Google Cloud Public Datasets Program | machine learning, OCR, document recognition, benchmarking, validation | At site | https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=labeled_patents&page=dataset | invention_type, title_line_1, x_relative_min, class_us, issuer, representative_line_1_eu, inventor_line_1, number, language, application_number, gcs_path, filing_date, y_relative_min, publication_date, priority_date_eu, applicant_line_1, x_relative_max, y_relative_max, class_international | Thu, 27 Jul 2023 09:52:46 GMT | |||||||||||||||||||||||||||||||||||||
88 | fbd6c408-e2b1-4581-8cdb-e1bca46146f7 | GRID: Global Database of Research Institutes | grid | https://www.grid.ac/ | GRID is a free and openly available global database of over 100,000 research-related organisations, including healthcare organizations, companies, governments, and non-profits, each provided with a unique and persistent identifier. In addition to IDs and names, the data is augmented with locations, addresses, hierarchical structures and much more. Open IDs such as GeoNames IDs, NUTS3 regions, WikiData IDs, CrossRef Open Funder Registry IDs, ISNI, and links to country-specific IDs like UCAS codes, UKPRN numbers, and HESA codes are used. | Digital Science & Research Solutions Ltd | CC0 Creative Commons license | None | contact@grid.ac, Digital Science | disambiguation, geography, institutions | https://www.grid.ac/pages/policies | https://console.cloud.google.com/bigquery?p=grid-ac&page=table&d=data&t=research_orgs&project=sheets-management-319211 | Mon, 19 Jun 2023 16:35:41 GMT | ||||||||||||||||||||||||||||||||||||||
89 | 95ed0b8b-1d47-4386-9ff1-6b09028323ef | Transportation Economics in the 21st Century | transportation_economics | https://www.nber.org/research/data/transportation-economics-21st-century-data-resources | Improving access to data sets related to transportation economics and facilitating research with these datasets are central objectives of this project. Post-doctoral researcher Caitlin Gorback, with advice from a steering committee including Nathaniel Baum-Snow of the University of Toronto, Leah Brooks of George Washington University, Edward Glaeser, Harvard University and NBER, Stephen Redding, Princeton University and NBER, and Matthew Turner of Brown University and NBER, has collected information on a number of data sets that are available from the Department of Transportation (DOT) or that have been created by researchers who have made them available for follow-on study. These data have been organized into several major categories below. The DOT data span a wide range of transportation modes and include information about the transportation infrastructure, the delivery of transportation services, and the demand for these services. | This project is supported by the U.S. Department of Transportation through an inter-agency agreement with the National Science Foundation, which has extended a grant to the NBER. | None | Caitlin Gorback, gorback@nber.org | geography, transportation, trade, logistics, infrastructure | Mon, 19 Jun 2023 16:46:15 GMT | |||||||||||||||||||||||||||||||||||||||||
90 | 9b37a63b-4bfd-43e9-815e-3fd84cd29301 | NBER Macrohistory Database | nber_macrohistory | https://www.nber.org/research/data/nber-macrohistory-database | During the first several decades of its existence, the National Bureau of Economic Research (NBER) assembled an extensive data set that covers all aspects of the pre-WWI and interwar economies, including production, construction, employment, money, prices, asset market transactions, foreign trade, and government activity. Many series are highly disaggregated, and many exist at the monthly or quarterly frequency. The data set has some coverage of the United Kingdom, France and Germany, although it predominantly covers the United States. | Daniel Feenberg, Jeff Miron, NBER | None | Daniel Feenberg (feenberg at nber dot org) Jeff Miron (jmiron@bu.edu) data@nber.org | Improving the Accessibility of the NBER's Historical Data, by Daniel Feenberg and Jeff Miron (NBER Working Paper 5186). Published in the Journal of Business and Economic Statistics, Volume 15 Number 3 (July 1997) pages 293-299. | Mon, 19 Jun 2023 16:46:18 GMT | |||||||||||||||||||||||||||||||||||||||||
91 | 6520861b-6600-4dcc-9ef2-2f0984283d7c | American Business Cycle | american_business_cycle | https://www.nber.org/research/data/tables-american-business-cycle | Presented here are the tables of quarterly data from Appendix B of "The American Business Cycle: Continuity and Change" Edited by Robert J. Gordon. National Bureau of Economic Research Studies in Business Cycles Volume 25, University of Chicago Press 1986. For information about sources and methods please see that volume. A feature of that volume is an extensive data appendix, compiled as a project independent of the conference in collaboration with Nathan S. Balke. The unique value of this data set is the fact that it is the only existing source for the pre-1947 quarterly data, as NIPA quarterly data series do not otherwise exist before 1947. These files include the components of GDP back from 1941 to 1919 and the quarterly real GDP back to 1875. | Robert J. Gordon, Nathan S. Balke | None | Daniel Feenberg (feenberg at nber dot org) | "The American Business Cycle: Continuity and Change" Edited by Robert J. Gordon. National Bureau of Economic Research Studies in Business Cycles Volume 25, University of Chicago Press 1986, https://www.nber.org/books-and-chapters/american-business-cycle-continuity-and-change | [] | Mon, 19 Jun 2023 16:46:21 GMT | ||||||||||||||||||||||||||||||||||||||||
92 | 5e6dc621-57a3-4374-b558-8b7c8ca3e252 | Census Block Distance Database | census_block_distance | https://www.nber.org/research/data/block-distance-database | Census Block Distances are great-circle distances calculated using the Haversine formula based on internal points in the geographic area. Census Blocks are from Census 2000 SF1 and Census 2010 SF1 files. Census Blocks "are statistical areas bounded by visible features, such as streets, roads, streams, and railroad tracks, and by nonvisible boundaries, such as selected property lines and city, township, school district, and county limits and short line-of-sight extensions of streets and roads." | NBER | data@nber.org | population, geography | Mon, 19 Jun 2023 16:46:30 GMT | ||||||||||||||||||||||||||||||||||||||||||
93 | 602ecd9b-4b5d-45f6-9ee2-16c6d83aeb9f | Historical Cross-Country Technology Adoption (HCCTA) Dataset | historical_cross_county | https://www.nber.org/research/data/historical-cross-country-technology-adoption-hccta-dataset | The Historical Cross-Country Technology Adoption Dataset was collected to allow for the analysis of the adoption patterns of some of the major technologies introduced in the past 250 years across the world's leading industrialized economies. | NBER | Comin, D. and Hobijn B., "Cross-Country Technological Adoption: Making the Theories Face the Facts". Journal of Monetary Economics, January 2004, pp. 39-83. | None | Diego A. Comin, diego.comin@nyu.edu, Bart Hobijn, bart.hobijn@ny.frb.org | geography, technology, adoption, metrics | https://www.nber.org/hccta/hcctadhelp.pdf | Comin, D. and Hobijn B., "Cross-Country Technological Adoption: Making the Theories Face the Facts". Journal of Monetary Economics, January 2004, pp. 39-83. | Mon, 19 Jun 2023 16:46:41 GMT | ||||||||||||||||||||||||||||||||||||||
94 | 0ab62e80-2e3a-4289-8abf-0995489f5f0c | Computer Retrieval of Information on Scientific Projects | crisp | https://www.nber.org/research/data/computer-retrieval-information-scientific-projects | The NIH CRISP (Computer Retrieval of Information on Scientific Projects) is a searchable database of federally funded biomedical research projects conducted at universities, hospitals, and other research institutions. This dataset has not been updated since 2007, but is relevant to historic research | NBER | data@nber.org | 1972-1995 | Mon, 19 Jun 2023 16:46:40 GMT | ||||||||||||||||||||||||||||||||||||||||||
95 | f5c60657-0ea0-4954-8794-ea7ebadca57c | CMS's SSA to FIPS CBSA and MSA County Crosswalk | cms_ssa_fips_county_crosswalk | https://data.nber.org/data/cbsa-msa-fips-ssa-county-crosswalk.html | CMS periodically produces SSA to FIPS CBSA to county crosswalk files. They released a CBSA to MSA to FIPS county crosswalk as well. Some CMS data files have SSA state and county codes or county name rather than FIPS state and county codes. Jean Roth processed the data files below for greater ease of use. | NBER, CMS | geography, crosswalk, united states | 2005-2017 | Mon, 19 Jun 2023 16:46:38 GMT | ||||||||||||||||||||||||||||||||||||||||||
96 | ade8e030-cc95-4ea8-a52b-4063688bd02e | OpenAlex | openalex | https://docs.openalex.org/download-snapshot | OpenAlex is a free and open catalog of the world's scholarly papers, researchers, journals, and institutions — along with all the ways they're connected to one another. It is maintained by the non-profit OurResearch. | MAG, Crossref, OurResearch, Heather Piwowar, Jason Priem | CC0 | None | info@ourresearch.org | citation, scholarly literature | through 2021 | https://docs.openalex.org/ | 200Gb | Mon, 19 Jun 2023 16:45:47 GMT | |||||||||||||||||||||||||||||||||||||
97 | 08826b49-31e3-4487-a2c7-302b71f23a88 | Breadth and RETech (Text-based) | breadth | https://bowen.finance/bfh_data/ | Patent-level variables that provide researchers a new way to characterize innovation within public firms, startups, places and more. Importantly, they are distinct from existing measures and do not have look-ahead bias: they only use information available in the patent itself. RETech is higher for patents related to waves of innovation. | Donald Bowen, Laurent Fresard, Gerard Hoberg | Donald E. Bowen, III, Laurent Frésard, Gerard Hoberg (2022) Rapidly Evolving Technologies and Startup Exits. Management Science 0(0). https://doi.org/10.1287/mnsc.2022.4362 | None | Donald Bowen | patents, breakthrough, evolution, waves, productivity | 1930-2018 | https://bowen.finance/bfh_data/ | 210MB | https://github.com/donbowen/Patent-Text-Variables | https://doi.org/10.1287/mnsc.2022.4362 | Donald E. Bowen, III, Laurent Frésard, Gerard Hoberg (2022) Rapidly Evolving Technologies and Startup Exits. Management Science 0(0). https://doi.org/10.1287/mnsc.2022.4362 | https://github.com/donbowen/Patent-Text-Variables/raw/main/code/updated_graphs/RETech-1930.png | Mon, 19 Jun 2023 16:46:46 GMT | |||||||||||||||||||||||||||||||||
98 | 9e127516-e7f7-41c5-b033-eedab5433dba | Indicators on firm level innovation activities from web scraped data | bigprod | https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/CI5XRR | This data sample (in support of the article "Indicators on firm level innovation activities from web scraped data" https://ssrn.com/abstract=3938767) contains data on companies' innovative behavior measured at the firm level, based on web scraped data derived from medium-high and high-technology companies in the European Union and the United Kingdom. The data are retrieved from individual company websites and cover 96,921 companies in total. The data provide information on various aspects of innovation, most significantly the research and development orientation of the company at the company and product level, the company’s collaborative activities, the company’s products, and use of standards. In addition to the web scraped data, the dataset aggregates a variety of firm-level indicators including patenting activities. In total, the dataset includes 28 variables with unique identifiers, which enable connecting to other databases such as financial data. (2021-10-04) | VTT Technical Research Centre of Finland Ltd, Fraunhofer Institute for Systems and Innovation Research ISI, Public Policy and Management Institute, United Nations University Maastricht Economic and Social Research Institute on Innovation and Technology, Delft University of Technology, University of Strathclyde | Ashouri, S., Suominen, A., Hajikhani, A., Pukelis, L., Schubert, T., Türkeli, S., Van Beers, C. and Cunningham, S., 2022. Indicators on firm level innovation activities from web scraped data. Data in brief, 42, p.108246. 
| None | Sajad Ashouri: sajad.ashouri@vtt.fi, Arash Hajikhani: arash.hajikhani@vtt.fi | innovation, digitalization, web mining, big data | 2021 | https://www.sciencedirect.com/science/article/pii/S2352340922004486 | https://doi.org/10.34894/CI5XRR | Ashouri, S., Suominen, A., Hajikhani, A., Pukelis, L., Schubert, T., Türkeli, S., Van Beers, C. and Cunningham, S., 2022. Indicators on firm level innovation activities from web scraped data. Data in brief, 42, p.108246. | Fri, 20 Oct 2023 10:06:43 GMT | ||||||||||||||||||||||||||||||||||||
99 | 30103a08-e0fb-4a5f-9fc3-25bf48ca2f72 | PatentText | patenttext | https://dataverse.harvard.edu/dataverse/patenttext | We provide open access to the code and data to calculate the text-based similarity between any two utility patents granted by the United States Patent and Trademark Office between 1976 and 2013, or between any two patent portfolios | Sam Arts, Bruno Cassiman, Juan Carlos Gomez | None | patent, keywords, matching, text mining, patent classification, technological similarity | https://github.com/sam-arts/smj_code | https://onlinelibrary.wiley.com/doi/epdf/10.1002/smj.2699 | [{'uuid': '44f33a6f-5099-4481-abed-af9aadf0bd4f', 'shortname': 'patent_text_new_measures', 'relationship_type': 'superceded by'}] | Thu, 27 Jul 2023 10:46:42 GMT | |||||||||||||||||||||||||||||||||||||||
100 | 6086fec7-049f-4295-9bc1-5f18cd6a3b29 | NBER Orange Book Dataset | orangebook_nber | https://www.nber.org/research/data/orange-book-patent-and-exclusivity-data-1985-2016 | Each edition of the Orange Book provides a snapshot of unexpired patent protection at a moment in time. As patents on a drug expire and new patents are issued, these changes are reflected in later editions. The Orange Book also provides a snapshot of unexpired regulatory exclusivity granted by the FDA. For example, certain novel drugs receive five years of regulatory exclusivity that blocks the entry of generic competition, even in the absence of any patents. Combining multiple editions reveals a comprehensive picture of patent and regulatory protection as it evolves over a drug’s lifecycle. These data files provide digital versions of the US Food and Drug Administration (FDA)'s Orange Book patent and exclusivity tables for years 1985-2016 (no Orange Book was published in 1986). PDF versions of the Orange Books were obtained via a Freedom of Information Act (FOIA) request, and data from these PDF files was either hand-entered or parsed in order to create the digital files. | Prof. Heidi Williams, Maya Durvasula, C. Scott Hemphill, Lisa Larrimore Ouellette, Bhaven N. Sampat | The NBER Orange Book Dataset: A User’s Guide Maya Durvasula, C. Scott Hemphill, Lisa Larrimore Ouellette, Bhaven N. Sampat, and Heidi L. Williams NBER Working Paper No. 30628 November 2022, Revised April 2023 JEL No. O0,O3 | free | Heidi Williams: hlwill@stanford.edu | drugs, pharmaceuticals, us, exclusivity | 1985-2016 | https://www.nber.org/system/files/working_papers/w30628/w30628.pdf | [{'uuid': 'dc9b201c-5f40-4aca-9f40-b7e1dcee5c67', 'shortname': 'orangebook_fda', 'relationship_type': 'child'}] | Mon, 19 Jun 2023 16:46:55 GMT |