Columns:
Title: title of site / service
URL: primary URL
Data URL
Format: RDF, NT, CSV, JSON, etc. Use all that apply.
Description: short description of what it is
Owner / Provider
Type of Material: Books (B), Articles (A), Authors (Au, authority records). Can have multiple.
Size: rough estimate of data size
Open: is it open data?
License
Cost: does it cost?
Bulk: can you get the data in bulk?
API: does it have an API?
Notes
Notes 2

Title: OCLC WorldCat
URL: https://www.oclc.org/data/data-sets-services.en.html
Data URL: https://www.worldcat.org/partnership/harvestset/default.jsp
Format: The WorldCat Record Harvesting Set contains XML-format metadata.
Description: WorldCat as a whole does not seem to be openly available, or available in bulk at all. A 1 GB dump (2012) of the most widely held records is available at http://purl.oclc.org/dataset/WorldCat/datadumps/WorldCatMostHighlyHeld-2012-05-15.nt.gz
In theory, WorldCat data is available record by record by appending suffixes, as described on the web page, but at the moment the links provided return 404, suggesting this service is no longer functional. See the examples in the second section of https://www.oclc.org/data/data-sets-services.en.html. Example of a sample URL that returns 404: http://www.worldcat.org/entity/work/id/1151002411.ttl
Owner / Provider: OCLC
Type of Material: B, A, Au
Size: 1 GB
Open: Y / N (yes for the linked data dump, but no for WorldCat as a whole)
License: ODC-By
Cost: N
Bulk: Y
API: Y
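
The record-by-record access described in the WorldCat entry can be sketched as simple URL construction. This is a minimal illustration only: the base path and the `ttl` suffix are taken from the sample URL in the entry, while the function name `worldcat_entity_url` is hypothetical, and the entry itself reports that such URLs currently return 404.

```python
# Base path copied from the sample URL in the WorldCat entry.
WORLDCAT_WORK_BASE = "http://www.worldcat.org/entity/work/id/"


def worldcat_entity_url(work_id: str, suffix: str = "ttl") -> str:
    """Build a record URL by appending a format suffix to a work identifier.

    Only the "ttl" suffix appears in the source; any other suffix is an
    assumption to be checked against the OCLC documentation.
    """
    return f"{WORLDCAT_WORK_BASE}{work_id}.{suffix.lstrip('.')}"


print(worldcat_entity_url("1151002411"))
# prints http://www.worldcat.org/entity/work/id/1151002411.ttl
```

The sketch only builds the URL. Any code that fetches it should expect an HTTP 404 until the service is confirmed to be working again.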
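
Title: OCLC VIAF
URL: http://viaf.org/viaf/data/
Data URL: http://viaf.org/viaf/data/viaf-20160215-clusters-rdf.nt.gz
Format: RDF/NT. Dumps are available in different formats. The RDF dump is 8.5 GB.
Owner / Provider: OCLC
Type of Material: Au
Open: Y
License: ODC-By
Cost: N
Bulk: Y
API: Y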
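
Because the VIAF dump is a gzipped N-Triples file of several gigabytes, it is usually easier to stream it line by line than to load it whole. The following is a rough sketch under stated assumptions: the regular expression is a simplified pattern, not a full N-Triples parser; the function names are hypothetical; and the sample triple is invented for illustration rather than taken from the dump.

```python
import gzip
import io
import re
from typing import Iterator, TextIO, Tuple

# Simplified N-Triples pattern: subject and predicate contain no whitespace,
# and the object is everything up to the terminating " .".
TRIPLE_RE = re.compile(r"^(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$")


def iter_triples(lines: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) strings, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            yield match.group(1), match.group(2), match.group(3)


def iter_triples_from_gzip(path: str) -> Iterator[Tuple[str, str, str]]:
    """Stream triples from a local .nt.gz file without unpacking it first."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from iter_triples(handle)


# Invented sample line, used here instead of the real multi-gigabyte dump.
sample = io.StringIO('<http://example.org/s> <http://example.org/p> "J. Smith" .\n')
for subject, predicate, obj in iter_triples(sample):
    print(subject, predicate, obj)
```

For real work, a dedicated RDF library is a safer choice than a regular expression, since N-Triples allows escapes and blank nodes that this pattern does not validate.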
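
Title: OCLC FAST
URL: http://fast.oclc.org/searchfast/
Data URL: http://www.oclc.org/research/themes/data-science/fast/download.html
Format: RDF, MARC XML, ISO MARC
Owner / Provider: OCLC
Type of Material: B, A, Au
Open: Y
License: ODC-By
Cost: N
Bulk: Y
API: Y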

Title: Crossref
URL: http://www.crossref.org/
Data URL: http://crossref.org/04intermediaries/index.html
Format: DOI, Registry of online
Description: Citation linking backbone for all scholarly information in electronic form. Crossref is a collaborative reference linking service that functions as a sort of digital switchboard. It holds no full-text content, but rather effects linkages through Crossref Digital Object Identifiers (Crossref DOIs), which are tagged to article metadata supplied by the participating publishers. The end result is an efficient, scalable linking system through which a researcher can click on a reference citation in a journal and access the cited article. See more at: http://crossref.org/01company/02history.html#sthash.fcRVzr2Y.dpuf
The Crossref Metadata API will also allow publishers to register URIs pointing to licensing information in their Crossref DOI metadata using a new <license_ref> element. This URI can, in turn, be used by researchers to learn the conditions under which they are allowed to perform text and data mining on the content they retrieve. Open access publishers will be able to use this element to record the URIs of well-known open licenses like those of the Creative Commons.
Owner / Provider: Publishers International Linking Association Inc. (PILA)
Type of Material: B, A, Au and much more
Size: ?
Cost: Y, see http://www.crossref.org/04intermediaries/34affiliate_fees.html#CMS_2012_Fees
Bulk: Y
API: Y
Notes: http://tdmsupport.crossref.org/researchers/
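
As an illustration of how a researcher might read the licensing URIs mentioned in the Crossref entry, the sketch below collects the text of every <license_ref> element in an XML record. The sample record, its namespace URI and the function name `license_uris` are all hypothetical; only the element name `license_ref` comes from the entry. The real schema should be checked in the Crossref documentation.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical record: the wrapper element and namespace URI are invented.
SAMPLE_RECORD = """<record xmlns:ai="http://example.org/hypothetical-namespace">
  <ai:license_ref>http://creativecommons.org/licenses/by/3.0/</ai:license_ref>
</record>"""


def license_uris(xml_text: str) -> List[str]:
    """Return the text of every license_ref element, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    uris = []
    for element in root.iter():
        # ElementTree tags look like "{namespace}local"; keep the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "license_ref" and element.text:
            uris.append(element.text.strip())
    return uris


print(license_uris(SAMPLE_RECORD))
# prints ['http://creativecommons.org/licenses/by/3.0/']
```

Matching on the local element name keeps the sketch independent of whichever namespace the real metadata uses.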

Title: Open Library and Internet Archive
URL: https://archive.org/details/ol_data
Data URL: https://openlibrary.org/dev/docs/jsondump
Format: JSON, XML, HTML, CSV
Description: Non-profit building a digital library of Internet sites and other cultural artifacts in digital form.
Type of Material: B, A, Au and much more
Size: Over 1,000,000 ebooks
Open: Y
Cost: N
Bulk: Y
API: Y

Title: BL British National Bibliography
URL: http://bnb.bl.uk/
Data URL: http://www.bl.uk/bibliographic/download.html#lodbnb
Format: RDF, XML, CSV, NT, TTL
Description: British National Bibliography (BNB) published as Linked Data by the British Library, linked to external sources including VIAF, ISNI, LCSH, Lexvo, GeoNames, MARC country and language, Dewey.info and RDF Book Mashup. Published according to separate data models for books and for serials.
License: PDDL / CC0
Cost: N
Bulk: Y
API: Y

Title: British Library Integrated Catalogue - Books
URL: http://www.bl.uk/bibliographic/datafree.html
Data URL: http://www.bl.uk/bibliographic/downloads/BLICBasicBooks_201602_rdf.zip
Format: RDF, XML
Description: Books in our collections not eligible for BNB. Updated monthly. Zipped folders include multiple files and a PDF document.
Please note: these versions of the datasets are structured using a basic form of RDF/XML (e.g. they do not contain URIs) and are not primarily intended for use in a linked data environment.
Type of Material: BLIC Basic Books
Size: 1,329,476 KB
Open: Y
License: PDDL / CC0
Cost: N
Bulk: Y
API: Y

Title: British Library Printed Music
URL: http://www.bl.uk/subjects/music
Data URL: http://www.bl.uk/bibliographic/datafree.html#csv
Format: RDF, XML, CSV
Description: Dataset comprising records for printed music held at the British Library. Zipped folders include multiple files and a PDF document.
Please note: these versions of the datasets are structured using a basic form of RDF/XML (e.g. they do not contain URIs) and are not primarily intended for use in a linked data environment.
Type of Material: BL Printed music
Size: 88,070 KB
Open: Y
Cost: N
Bulk: Y
API: Y

Title: NY Public Library
URL: http://www.nypl.org/help/about-nypl/legal-notices/open-metadata
Data URL: http://api.repo.nypl.org/
Format: not sure
Type of Material: Books, Articles, photos etc.
Open: Y
License: CC0 1.0
Cost: N
Bulk: Y
API: Y
Notes: http://www.nypl.org/blog/2016/01/05/share-public-domain-collections

Title: Open Library Database
URL: https://openlibrary.org/developers/dumps
Data URL: http://openlibrary.org/data/ol_dump_latest.txt.gz
Description: A dump of the latest editions of all the records in Open Library. This is a tab-separated file with the following columns:

type - type of record (/type/edition, /type/work etc.)
key - unique key of the record (/books/OL1M etc.)
revision - revision number of the record
last_modified - last modified timestamp
JSON - the complete record in JSON format

This dump can be downloaded from:

http://openlibrary.org/data/ol_dump_latest.txt.gz

For convenience, this dump is split into multiple files based on type.

editions dump - http://openlibrary.org/data/ol_dump_editions_latest.txt.gz
works dump - http://openlibrary.org/data/ol_dump_works_latest.txt.gz
authors dump - http://openlibrary.org/data/ol_dump_authors_latest.txt.gz

Owner / Provider: Internet Archive
Type of Material: B, Au
Size: 8,000,000 texts, 2,100,000 books
Open: Free to read, download, print, and enjoy. Some have restrictions on bulk re-use and commercial use; please see the collection or the sponsor of a book. By providing near-unrestricted access to these texts, we hope to encourage widespread use of texts in new contexts by people who might not have used them before.
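
The five-column layout of the Open Library dump can be read with a few lines of Python. This is a minimal sketch, assuming each line holds exactly the columns listed in the entry, in that order; the names `DumpRecord` and `parse_dump_line` are hypothetical, and the sample line is invented for illustration.

```python
import json
from typing import NamedTuple


class DumpRecord(NamedTuple):
    record_type: str    # e.g. /type/edition or /type/work
    key: str            # e.g. /books/OL1M
    revision: int       # revision number of the record
    last_modified: str  # timestamp, kept as the raw string
    data: dict          # the complete record decoded from JSON


def parse_dump_line(line: str) -> DumpRecord:
    """Split one dump line into its five tab-separated columns."""
    # The JSON column comes last, so at most four splits are needed.
    record_type, key, revision, last_modified, raw_json = (
        line.rstrip("\n").split("\t", 4)
    )
    return DumpRecord(record_type, key, int(revision), last_modified,
                      json.loads(raw_json))


# Invented sample line; real lines come from ol_dump_latest.txt.gz.
sample = '/type/edition\t/books/OL1M\t3\t2010-04-14T02:53:24\t{"title": "Example"}\n'
record = parse_dump_line(sample)
print(record.key, record.revision, record.data["title"])
# prints /books/OL1M 3 Example
```

To process a full dump, the same function can be applied to each line of a file opened with `gzip.open(path, "rt", encoding="utf-8")`, which avoids decompressing the archive to disk.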

Title: NIH US National Library of Medicine
URL: https://www.nlm.nih.gov/api/
Data URL: https://wwwcf2.nlm.nih.gov/nlm_eresources/eresources/search_database.cfm
Format: HTML, XML, MARC, API, TXT, PDF, FTP, EXE
Description: Publicly available data, USA.gov
Owner / Provider: USA.gov
Open: Y
Cost: N
Bulk: Y
API: Y
Notes: ftp://ftp.nlm.nih.gov/online/mesh/
Notes 2: The National Library of Medicine is pleased to announce a second beta release of MeSH® (Medical Subject Headings) RDF around June 18, 2015. The first beta release of MeSH RDF was published in November 2014 for selected institutions and organizations. Since that time, the NLM Linked Data Infrastructure Working Group has been working with the selected institutions to test and evaluate MeSH RDF.

Title: CiteSeerX
URL: http://citeseer.ist.psu.edu/index
Data URL: http://citeseerx.ist.psu.edu/oai2
Description: CiteSeerX is an evolving scientific literature digital library and search engine that has focused primarily on the literature in computer and information science. CiteSeerX aims to improve the dissemination of scientific literature and to provide improvements in functionality, usability, availability, cost, comprehensiveness, efficiency, and timeliness in the access of scientific and scholarly knowledge.
License: CC BY-NC-SA 3.0
Cost: N
Bulk: Y
API: Y
Notes: http://citeseerx.ist.psu.edu/oai2
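
The `oai2` path in the CiteSeerX entry suggests an OAI-PMH harvesting endpoint, although the entry does not say so explicitly; treat that as an assumption. Under it, request URLs can be built from the standard OAI-PMH `verb` and `metadataPrefix` parameters, as in this sketch. The function name `oai_request_url` is hypothetical, and nothing here checks whether the endpoint is still online.

```python
from urllib.parse import urlencode

# Endpoint copied from the CiteSeerX entry; assumed to speak OAI-PMH.
OAI_ENDPOINT = "http://citeseerx.ist.psu.edu/oai2"


def oai_request_url(verb: str, **params: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{OAI_ENDPOINT}?{query}"


print(oai_request_url("Identify"))
# prints http://citeseerx.ist.psu.edu/oai2?verb=Identify
print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
# prints http://citeseerx.ist.psu.edu/oai2?verb=ListRecords&metadataPrefix=oai_dc
```

`Identify` is a cheap way to confirm that an endpoint responds before starting a bulk `ListRecords` harvest.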