All the Places
Gathering scraped places data for OpenStreetMap
Ian Dees
What is All the Places?
Places data is a particularly hard data problem in a universe of hard geo problems.
It changes frequently, can’t easily be surveyed from aerials, and is tedious to capture at ground level.
What is All the Places?
There were no recent, open places data sets
Overture’s places data is great, but has fairly shallow attribute info
What is All the Places?
Organizations want people to know where their places are, though!
Most chains will post locations on their website
What is All the Places?
All the Places scrapes those pages and outputs structured data
Some extra magic to do this once a week and make developers’ lives easier
All the Places History
Started off as a scraper for Culver’s to keep OSM up to date
Scaled up through Mapzen as a source for search
Similar/related to OpenAddresses
Years of contributions from the community
All the Places Now
A busy GitHub repo with almost daily updates
Weekly run of all spiders
Downloadable collection of data
A lightweight API, plus PMTiles output and a map
Data Schema
One GeoJSON FeatureCollection per spider
Each feature is a scraped item
Features have identifier and spider name at minimum
Address information, geometry (usually point), phone, etc. are also included
Recently added OSM tags for categories
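As a rough illustration of the schema described above, here is a minimal sketch that reads one spider's FeatureCollection and pulls out the identifier, spider name, and point location of each feature. The property keys `ref` and `@spider`, the sample values, and the `summarize` helper are assumptions made for this example, not a confirmed description of the published output.

```python
import json

# Hypothetical sample of one spider's output. The property keys "ref" and
# "@spider" are assumed names for the identifier and the spider name.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "ref": "123",
        "@spider": "example_spider",
        "name": "Example Store",
        "addr:city": "Madison",
        "phone": "+1 555 0100",
        "shop": "supermarket"
      },
      "geometry": {"type": "Point", "coordinates": [-89.4, 43.07]}
    },
    {
      "type": "Feature",
      "properties": {"ref": "124", "@spider": "example_spider"},
      "geometry": null
    }
  ]
}
"""

# The two properties every feature is expected to carry (assumed key names).
REQUIRED_KEYS = ("ref", "@spider")


def summarize(collection):
    """Return (identifier, spider name, (lon, lat) or None) for each feature."""
    rows = []
    for feature in collection["features"]:
        props = feature.get("properties") or {}
        missing = [key for key in REQUIRED_KEYS if key not in props]
        if missing:
            raise ValueError(f"feature is missing required keys: {missing}")
        geometry = feature.get("geometry")
        lonlat = None
        # Geometry is usually a point, but it may be absent or another type.
        if geometry and geometry.get("type") == "Point":
            lonlat = tuple(geometry["coordinates"][:2])
        rows.append((props["ref"], props["@spider"], lonlat))
    return rows


rows = summarize(json.loads(SAMPLE))
for row in rows:
    print(row)
```

Treating the geometry as optional keeps the reader tolerant of items that were scraped without coordinates.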
License
Spiders are MIT licensed
You should use the spiders for your own work!
Data produced by the weekly run is CC0 licensed
It’s ok to use with OSM, but…
What Do We Do With It?
Original use case: place search
More recent interest: keep OpenStreetMap up to date
Train models, help validate your place data, and more!
How do we help?
The README has clear instructions for getting started
Check spider report for fluctuating/zero data counts
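One way to think about the spider report check is as a comparison of item counts between two weekly runs. The sketch below flags spiders whose count dropped to zero or changed sharply. The `flag_spiders` function, the 50% threshold, and the sample counts are all hypothetical, chosen only for illustration; they are not taken from the project's actual report.

```python
def flag_spiders(previous, current, max_change=0.5):
    """Flag spiders with zero items or a large relative change in item count.

    previous and current map spider name to item count for two runs.
    max_change is an assumed threshold (0.5 means a 50% swing).
    """
    flagged = {}
    for spider, now in current.items():
        before = previous.get(spider)
        if now == 0:
            flagged[spider] = "zero"
        elif before and abs(now - before) / before > max_change:
            flagged[spider] = "fluctuating"
    return flagged


# Hypothetical counts from two consecutive weekly runs.
previous = {"spider_a": 100, "spider_b": 100, "spider_c": 100}
current = {"spider_a": 98, "spider_b": 0, "spider_c": 30}

flagged = flag_spiders(previous, current)
print(flagged)
```

A flagged spider is a reasonable place to start looking for a site redesign or a broken selector.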