1 of 11

All the Places

Gathering scraped places data for OpenStreetMap

Ian Dees

2 of 11

What is All the Places?

Places data is a particularly hard data problem in a universe of hard geo problems.

It changes frequently, can’t easily be surveyed from aerials, and is tedious to capture at ground level.

3 of 11

What is All the Places?

There were no recent, open places data sets

Overture’s places data is great, but has fairly shallow attribute info

4 of 11

What is All the Places?

Organizations want people to know where their places are, though!

Most chains will post locations on their website

5 of 11

What is All the Places?

All the Places scrapes those pages and outputs structured data

Some extra magic to do this once a week and make developers’ lives easier

6 of 11

All the Places History

Started off as a scraper for Culver’s to keep OSM up to date

Scaled up through Mapzen as a source for search

Similar/related to OpenAddresses

Years of contributions from the community

7 of 11

All the Places Now

A busy GitHub repo with almost daily updates

Weekly run of all spiders

Downloadable collection of data

A lightweight API and PMTiles output/map

8 of 11

Data Schema

One GeoJSON FeatureCollection per spider

Each feature is a scraped item

Features have identifier and spider name at minimum

Address information, geometry (usually point), phone, etc. are also included

Recently added OSM tags for categories
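
A minimal sketch of reading one spider's output, assuming the layout described on this slide. The property names used here ("ref" for the identifier, "@spider" for the spider name, "addr:full", "phone", "amenity") and the sample values are assumptions for illustration, so check them against a real output file before relying on them.

```python
import json

# Hypothetical sample of one spider's output. The property names
# ("ref", "@spider", "addr:full", "phone", "amenity") are assumptions.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-89.4, 43.07]},
      "properties": {
        "ref": "store-001",
        "@spider": "example_chain",
        "addr:full": "123 Example St",
        "phone": "+1 555 0100",
        "amenity": "fast_food"
      }
    },
    {
      "type": "Feature",
      "geometry": null,
      "properties": {"ref": "store-002", "@spider": "example_chain"}
    }
  ]
}
"""


def summarize(collection):
    """Return (identifier, spider name, has_geometry) for each feature."""
    rows = []
    for feature in collection["features"]:
        props = feature["properties"]
        has_geometry = feature.get("geometry") is not None
        rows.append((props["ref"], props["@spider"], has_geometry))
    return rows


collection = json.loads(SAMPLE)
rows = summarize(collection)
for row in rows:
    print(row)
```

Only the identifier and spider name are read unconditionally, since those are the fields every feature is said to carry. Geometry is treated as optional because a scraped item may not have coordinates.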

9 of 11

License

Spiders are MIT licensed

You should use the spiders for your own work!

Data produced by the weekly run is CC0 licensed

It’s ok to use with OSM, but…

10 of 11

What Do We Do With It?

Original use case: place search

More recent interest: keep OpenStreetMap up to date

MapRoulette challenge

Chain Reaction

Train models, help validate your place data, and more!

11 of 11

How do we help?

  1. Add a spider for your local chain

Readme has clear instructions for getting started

  2. Fix an existing spider

Check spider report for fluctuating/zero data counts
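
One way to look for spiders worth fixing, as a rough sketch. The format of the spider report is not described here, so the `history` dictionary (spider name mapped to feature counts from successive weekly runs), the spider names, and the 50% threshold are all hypothetical.

```python
def flag_spiders(history, max_change=0.5):
    """Flag spiders whose latest count is zero or changed sharply.

    history maps a spider name to a list of feature counts, oldest first.
    max_change is the relative change against the previous run that
    counts as fluctuating (an assumed threshold, not a project rule).
    """
    flagged = {}
    for name, counts in history.items():
        if not counts:
            continue
        latest = counts[-1]
        if latest == 0:
            flagged[name] = "zero"
            continue
        if len(counts) >= 2:
            previous = counts[-2]
            if previous > 0 and abs(latest - previous) / previous > max_change:
                flagged[name] = "fluctuating"
    return flagged


# Hypothetical counts from three weekly runs.
history = {
    "steady_chain": [120, 121, 119],
    "broken_chain": [80, 82, 0],
    "jumpy_chain": [200, 205, 60],
}
flagged = flag_spiders(history)
print(flagged)
```

A zero count usually suggests the source website changed and the spider no longer matches it, while a sharp swing may be a partial breakage or a real change in the chain, so flagged spiders still need a manual look.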