Resource - List of Interesting Very Large Datasets of Images

| Approx. # of images | URL | Remark | golan's notes |
|---|---|---|---|
| 79,302,017 | http://www.historicnewengland.org/collections-archives-exhibitions/collections-access/highlights/wallpaper | | 1 |
| 67,000,000 | http://www.kleegestaltungslehre.zpk.org/ee/ZPK/BF/2012/01/01/001/ | No bulk download | 0 |
| 48,000,000 | http://www.vision.caltech.edu/visipedia/CUB-200.html | 200 species of birds, categorized | 0 |
| 14,000,000 | http://www1.cs.columbia.edu/CAVE/software/softlib/coil-100.php | 128x128px images of 100 objects rotated in 5-degree increments | 1 |
| 10,298,323 | http://aaronsfiles.com/SheepProcessing/SheepView_Processing.zip | Processing sketch with vectors for hand-drawn sheep | |
| 10,000,000 | http://www.robots.ox.ac.uk/~vgg/data/flowers/ | | 1 |
| 7,000,000 | https://web.archive.org/web/20150703060412/http://137.189.35.203/WebUI/CatDatabase/catData.html | Original URL from Google search is down; use the Internet Archive copy | 1 |
| 5,000,000 | http://www.vision.caltech.edu/Image_Datasets/Caltech101/ | This is older and smaller than Caltech 256; http://research.microsoft.com/pubs/80582/ECCV_CAT_PROC.pdf | 1 |
| 4,763,691 | https://github.com/KanjiVG/kanjivg/releases | | 1 |
| 3,000,000 | http://pfid.rit.albany.edu/ | | 0 |
| 2,500,000 | http://places.csail.mit.edu/ | | |
| 1,200,000 | http://vis-www.cs.umass.edu/lfw/ | | 1 |
| 1,100,000 | https://www.flickr.com/photos/projectapolloarchive/albums/with/72157656702724284 | | 1 |
| 1,000,000 | http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/ | | 1 |
| 1,000,000 | http://images.ikea.com/assetbank-ikea/action/viewHome | | 1 |
| 1,000,000 | http://cybertron.cg.tu-berlin.de/eitz/projects/classifysketch/ | | 1 |
| 1,000,000 | http://www1.cs.columbia.edu/CAVE/databases/facetracer/ | Links to online images of faces, with metadata, but some links are dead. | |
| 671,628 | http://biometrics.idealtest.org/dbDetailForUser.do?id=7 | | 1 |
| 600,000 | http://human-pose.mpi-inf.mpg.de/ | | 0 |
| 561,628 | http://www.ifnenit.com/download.htm | | 1 |
| 250,000 | http://ballads.bodleian.ox.ac.uk | See also imagematch.bodleian.ox.ac.uk for CBIR on a subset (docs at http://balladsblog.bodleian.ox.ac.uk/blog/570 and http://balladsblog.bodleian.ox.ac.uk/blog/1069) | |
| 250,000 | https://github.com/WaltersArtMuseum/walters-api | See image API docs: https://github.com/WaltersArtMuseum/walters-api/blob/master/images.md | |
| 223,128 | http://www.vision.caltech.edu/Image_Datasets/Caltech256/ | | |
| 202,599 | http://digital.library.pitt.edu/images/pittsburgh/ | | 1 |
| 201,544 | http://openplaques.org/about/data | Commemorative plaques; a mix of close-up and context photos | |
| 200,000 | http://www.bottlecapclub.org/index.php | Bottle caps; would need to scrape. | 1 |
| 180,000 | https://images.nga.gov/en/page/show_home_page.html | | 0 |
| 112,039 | http://www.libcrowds.com/data/ | Various languages. Example subset: https://www.flickr.com/photos/132066275@N04/sets/72157657517602031 | |
| 102,212 | https://github.com/dimatura/getpubfig | | |
| 93,000 | http://memorability.csail.mit.edu/explore.html | Large-scale Image Memorability: images and memorability metadata | 1 |
| 80,000 | http://www.davidrumsey.com/ | No bulk download | |
| 80,000 | http://www.cs.toronto.edu/~kriz/cifar.html | | 0.5 |
| 78,000 | https://github.com/tategallery/collection | Museum collection; some works are under copyright protection. | 0 |
| 70,000 | http://lipitk.sourceforge.net/datasets/tamilchardata.htm | | 1 |
| 70,000 | http://yann.lecun.com/exdb/mnist/ | Metadata | 0.5 |
| 67,000 | https://github.com/cmoa/collection | In particular, check out the Teenie Harris collection; the image resolution is better for those. | |
| 60,000 | https://staff.fnwi.uva.nl/t.e.j.mensink/rijks/ | Also see https://www.rijksmuseum.nl/en/api | |
| 60,000 | https://www.flickr.com/photos/biodivlibrary/ | For more info see http://www.biodiversitylibrary.org/ | 0 |
| 60,000 | http://www.progettosnaps.net/ | Arcade machine game screenshot 'snaps' and marquees | 0 |
| 50,000 | http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset | German signs | 1 |
| 50,000 | http://www.nypl.org/research/collections/digital-collections/public-domain | Public domain part of NYPL Labs | 1 |
| 45,000 | https://news.artnet.com/art-world/art-uk-public-accessible-online-434489 | No bulk download | 0 |
| 37,882 | https://collection.cooperhewitt.org/api/ | | |
| 36,000 | http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html | Large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations | |
| 32,750 | http://ukiyo-e.org/ | Created by John Resig. Check out his work using this database | |
| 30,607 | http://www.metmuseum.org/collection/the-collection-online | | |
| 30,000 | http://collections.vam.ac.uk/ | | |
| 30,000 | http://herbarium.bgbm.org/object | Caution: URL is a zip file of the catalogue | |
| 26,459 | http://ufldl.stanford.edu/housenumbers/ | | 1 |
| 25,000 | http://digitalcollections.nypl.org/ | Not sure if there is a way to batch download :( Also note that these images may be (will most likely be) copyrighted. | 1 |
| 20,000 | https://www.flickr.com/photos/britishlibrary/ | http://www.bl.uk/collection-guides/datasets-for-image-analysis | 1 |
| 20,000 | https://archive.org/details/audio-covers | | 1 |
| 20,000 | http://vision.cs.stonybrook.edu/~vicente/sbucaptions/ | Script to download from Flickr | 1 |
| 15,000 | http://landsat.gsfc.nasa.gov/?page_id=2 | Bulk download: http://landsat.gsfc.nasa.gov/?p=12750 | |
| 14,000 | http://www.iapr-tc11.org/mediawiki/index.php/Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_(HIT-OR3C) | | 1 |
| 13,000 | https://archive.org/details/geographarchive | Aims to collect geographically representative photographs and information for every square kilometre of Great Britain and Ireland | 1 |
| 11,000 | https://www.flickr.com/photos/internetarchivebookimages | Also see https://blog.archive.org/2015/10/23/zoom-in-to-9-3-million-internet-archive-books-and-images-through-iiif/ for API access notes | |
| 10,000 | http://image-net.org/explore | | 1 |
| 10,000 | http://chroniclingamerica.loc.gov/about/api/ | Historic newspaper pages from the Library of Congress. | |
| 10,000 | http://lsun.cs.princeton.edu/ | | 0.5 |
| 10,000 | http://www.nlpr.ia.ac.cn/databases/handwriting/Home.html | Requires permission via email to get the FTP link. They gave it to me for dubious purposes, so don't think this is hard. The download is very slow, though. | |
| 8,121 | http://pillbox.nlm.nih.gov/developer.html | | 0 |
| 7,200 | http://www.europeana.eu/portal/ | Not sure if there is a good way to bulk download | |
| 6,033 | http://horatio.cs.nyu.edu/mit/tiny/data/index.html | http://groups.csail.mit.edu/vision/TinyImages/ | 1 |
| 6,000 | http://webscope.sandbox.yahoo.com/catalog.php?datatype=i&did=67 | | 1 |
| 5,640 | http://www.robots.ox.ac.uk/~vgg/data/dtd/ | | 1 |
| 5,600 | http://amos.cse.wustl.edu/dataset | Archive of webcam feeds | 1 |
| 5,364 | http://press.liacs.nl/mirflickr/#sec_download | Link page to several face sets | 1 |
| 5,353 | http://erikbern.com/2016/01/21/analyzing-50k-fonts-using-deep-neural-networks/ | | 1 |
| 5,062 | | 50k bitmapped font sets, most complete with 62 characters (upper- and lower-case letters and numbers). Download link to HDF5 at the bottom of the page | |
| 5,000 | https://www.flickr.com/commons/institutions/ | Starting point for multiple public domain collections | 0 |
| 3,915 | https://www.flickr.com/photos/121003427@N03/ | | 0 |
| 3,900 | https://nik.bot.nu/ | | 0 |
| 3,500 | http://deeplearning.net/datasets/ | This is a list of datasets; contains some of the above, and loads more | 1 |
| 1,814 | http://webscope.sandbox.yahoo.com/catalog.php?datatype=i | | 1 |
| 1,620 | https://uwdc.library.wisc.edu/collections/WI/BrittinghamImgs/ | Not VERY large but different: intimate collection of a Wisconsin family | 1 |
| 1,620 | https://github.com/rev3rend/instadownload | | |
| 1,000 | http://openglam.org/open-collections/ | List of open digital collections from GLAM institutions, many European | |
| | http://ddd.unil.ch/ | Drawings of "God"/deities by children -- see also http://arxiv.org/pdf/1511.03466v1.pdf | 1 |
| | https://sites.google.com/site/pornographydatabase/ | Signed agreement required for access | 0 |
| | http://www.vision.caltech.edu/html-files/archive.html | | 1 |
| | http://www.algaterra.org/default.htm | Have not figured out how to scrape, but it's on the to-do list | |
| | http://eol.jsc.nasa.gov/Tools/ | Batch download tools | |
| | https://aws.amazon.com/nasa/nex/ | | 1 |
| | http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html | RGBD | |
| | http://www.rrpicturearchives.net/rsRRList.aspx?id=19 | | |