Last Updated: May 1, 2025. For sales, contact sales@emetresearch.ai
Dataset Description | Dataset Type | Availability | Dataset Tags
180 Million Human-Ranked Photos: Featuring ranked, high-quality photography with comprehensive EXIF metadata and human annotation, this dataset supports generative AI, computer vision, and smart automation. A powerful resource for training AI models and visual recognition systems. | Images/Video | Immediate | User personalization preferences; content metadata (tags, ratings); usage telemetry (download counts, time spent); AI prompt-to-image pairs
Comprehensive vehicle telemetry data from 10,000 vehicles, updated monthly via subscription and scalable to 50,000 vehicles. This dataset provides real-time insights into vehicle performance, maintenance patterns, and operational metrics, ideal for automotive research, insurance fleet management, and machine learning applications. | Vehicle Telemetry | Immediate | Real-time vehicle telemetry (location, diagnostics); trip behavior (speed, braking); charging patterns; driver identity metadata (anonymized)
20,000 weekly survey responses, continuously updated, from a representative U.S. population sample. Covering consumer sentiment, economic outlook, and behavioral trends, it is ideal for AI training, market forecasting, and business intelligence with structured, metadata-rich survey data. | Market Research | Immediate | Structured survey metadata; response timestamps; demographic segmentation; market research
Millions of Hours of Dashcam Footage: A massive, real-world video dataset designed to accelerate the development of autonomous vehicles, crash detection systems, and advanced driver-assistance systems (ADAS). Captured from a wide range of environments (urban, suburban, and highway), this dataset includes synchronized video, GPS, and sensor metadata. | Dashcam | Immediate | Wide-angle dashcam video; GPS metadata; traffic signal tags; road condition labels
16 Million Hours of First-Person Video and Conversational Audio: This unparalleled dataset captures rich, multimodal, first-person interactions across diverse real-world environments. Sourced from wearable devices and mobile-mounted cameras, it includes high-fidelity video paired with synchronized, natural conversational audio. | Video, Audio, GPS | Immediate | Multimodal video/audio streams; speaker segmentation; environment labels; conversation transcripts
Geospatial Data for Storefront Analysis: A location-rich dataset built for analyzing retail presence, competitive positioning, and urban foot traffic dynamics. Featuring millions of storefront coordinates across major metropolitan and suburban areas, it includes metadata such as business category, signage visibility, opening hours, customer density estimates, and historical changes in occupancy. | GeoSpatial | Immediate | Storefront coordinates; POI metadata; business classification; foot traffic estimates
Specialized, high-quality frontier AI data collection and annotation across multiple strategic verticals, including cryptocurrency, healthcare, robotics, and e-commerce. | General | 2-3 weeks | On-chain address annotations (entity labels, AML risk tags); multi-party cross-referenced metadata; zero-knowledge-protected attribute packs (demographics, regional trends)
Real-World Hand, Eye, and Task Videos for Robotics Training | Robotics Imagery | Immediate | Multimodal embodied-AI datasets: video, audio, tactile sensor streams; robotic motion logs; environment scans; interaction transcripts
Social Sentiment Scanning and Monitoring Data | Social Scraping | 2-3 weeks | Anonymized user behavioral events; device fingerprinting data; transaction and session logs; flagged fraud patterns (labels)
Soccer and Sports Videos, Annotated | Video (Sports) | Immediate | Tagged sports media assets; metadata (player, event, timestamps); usage/distribution stats; automated highlights catalogs
Personal web scraping (Venmo, etc.) - Worldcoin user data | Social Scraping | Immediate | Participant assessment data (surveys, reflections); physiological/biometric logs (if used); session transcripts; engagement metrics
Blockchain Analytics, Wallet and Transaction Data (Finance) | Blockchain Data | Immediate | On-chain analytics (smart contract events, token transfers); subgraph query results; community platform metrics (Discord, Telegram); dashboard usage logs
Synthetic Code Datasets | Code | Immediate | Game foundation model outputs; character animation logs; user interaction/playtest data; procedural content generation parameters