Geocoding
without Programming
Workshop Overview
What is Geocoding?
Obtain geographic coordinates for a place name, address or code
city, building, mountain, landmark,
street address, intersection,
hazardous waste sites,
crime or other event location,
zip code, etc.
Geocoding
Get geographic coordinates for a
place name, address or code
Sather Gate, UC, Berkeley
37.87014, -122.2595
GEOCODING SOFTWARE
Sather gate
Try Geocoding
Web maps use geocoding software when you search for a place name, code or address.
Enter a place name in the search bar at www.openstreetmap.org
Then, copy and paste the returned coordinates in maps.google.com, and see how well the geocoder works.
Reverse Geocoding
Get the place name, code or address for geographic coordinates
Reference Database
Geocoding requires a reference database against which place names, codes and addresses are matched.
Input data are compared to the reference database and best matches are retrieved.
Reference database includes:
Geographic Coordinates
Latitude: ±90 degrees
How far N/S of the Equator
Longitude: ±180 degrees
How far E/W of the Prime Meridian
DMS vs Decimal Degrees
37° 52' 12"N, 122° 15' 36" W
37.870145, -122.25952
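The relationship between the two notations can be sketched in a few lines of Python. This is a minimal illustration, not part of the workshop materials; the function name dms_to_decimal is made up for this example. It applies the standard rule: decimal degrees = degrees + minutes/60 + seconds/3600, negated for South and West.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative in decimal degrees.
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return value


# The DMS pair shown above: 37° 52' 12" N, 122° 15' 36" W
lat = dms_to_decimal(37, 52, 12, "N")
lon = dms_to_decimal(122, 15, 36, "W")
print(round(lat, 4), round(lon, 4))
```

The result is roughly 37.87, -122.26, which agrees with the decimal pair above to two decimal places; the DMS values on the slide are rounded to whole seconds.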
Geographic Coordinate Reference Systems
Lines of latitude and longitude are part of a geographic coordinate reference system (GCS).
There are several widely used GCSs including
Map Projections / Projected Coord Systems
Map = flat representation of the non-flat Earth
Map projection = mathematical transformation from 3D surface to 2D plane.
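As one concrete example of such a transformation, here is a sketch of the spherical Mercator formulas used by many web maps. It assumes a spherical Earth with radius 6378137 m; the function name is hypothetical, and real GIS software handles many more projections and datum details.

```python
import math


def spherical_mercator(lat, lon, radius_m=6378137.0):
    """Project latitude/longitude in degrees to planar x/y in meters."""
    # x grows linearly with longitude.
    x = radius_m * math.radians(lon)
    # y stretches toward the poles, which is why Mercator maps enlarge high latitudes.
    y = radius_m * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


x, y = spherical_mercator(37.87014, -122.2595)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```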
Coordinate System & Map Projections
A working knowledge of geographic coordinate systems and map projections is very important.
Good online references
Geographic Coordinates
Try it - enter the decimal degree coordinates in Google Maps search bar.
What order are the coordinates in?
How many decimals can you delete and still be at the same location?
37.870058, -122.257944
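One way to explore the decimals question is to measure how far the point moves as digits are dropped. The sketch below is an illustration only: it rounds rather than truncates, and it uses the haversine formula on a spherical Earth (radius 6371 km), which is an approximation.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two lat/lon points (spherical Earth)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


lat, lon = 37.870058, -122.257944
for decimals in range(5, -1, -1):
    rlat, rlon = round(lat, decimals), round(lon, decimals)
    shift = haversine_m(lat, lon, rlat, rlon)
    print(f"{decimals} decimals: {rlat}, {rlon} -> moved about {shift:.0f} m")
```

As a rule of thumb, one degree of latitude is about 111 km, so each decimal place dropped makes the position roughly ten times coarser.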
Why Geocode?
Web Scraping
http://www.co.contra-costa.ca.us/DocumentCenter/Home/View/2462
http://www.meganslaw.ca.gov/
How to Geocode
The Street Address Geocoding Process
Address Geocoding
Input street addresses
Parse addresses to primary components
Compare to reference database
Output coordinates & metadata for best match
7305 Edgewater Dr, Oakland, CA, 94621
7305| Edgewater Dr|Oakland | CA | 94621
37.7446,-122.2063
7315 Edgewater Dr, Oakland, CA, 94621
Compare to reference database
Parse addresses
to primary components
Output Coordinates
& metadata
7305 Edgewater Dr, Oakland, CA, 94621
7305| Edgewater Dr|Oakland | CA | 94621
Prepare Input Addresses
Review output & Repeat
You
Geocoding Software
37.7446,-122.2063
Geocoding reference data set
7305
real location
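The parsing step can be sketched as follows. This is a deliberately simple illustration that assumes the input is already in the clean "number street, city, state, zip" form used in the example; real geocoders use far more robust parsers, and the function name parse_address is made up here.

```python
def parse_address(address):
    """Split 'number street, city, state, zip' into its primary components."""
    parts = [p.strip() for p in address.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated parts, got {len(parts)}")
    street_line, city, state, zip_code = parts
    # The house number is everything before the first space of the street line.
    number, _, street = street_line.partition(" ")
    return {"number": number, "street": street, "city": city,
            "state": state, "zip": zip_code}


parsed = parse_address("7305 Edgewater Dr, Oakland, CA, 94621")
print("|".join(parsed.values()))
```

Addresses that do not fit the four-part pattern raise an error instead of being guessed at, which makes them easy to set aside for cleaning.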
Try It - Geocode this address
2700 Bancroft Way, Berkeley, CA 94704
Google Maps - maps.google.com
OpenStreetMap - www.openstreetmap.org
Try making spelling mistakes or leaving out parts to see how well the geocoder can deal with less than perfect data.
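To get a feel for how a geocoder can tolerate imperfect input, here is a toy sketch of approximate matching against a reference database using Python's standard difflib module. The two-entry reference table is hypothetical (its coordinates are the ones shown earlier in these slides), and the 0.8 cutoff is an arbitrary assumption; production geocoders use much more sophisticated matching.

```python
import difflib

# Hypothetical miniature reference database: standardized text -> (lat, lon).
REFERENCE = {
    "7305 EDGEWATER DR OAKLAND CA 94621": (37.7446, -122.2063),
    "SATHER GATE UC BERKELEY": (37.87014, -122.2595),
}


def normalize(text):
    """Uppercase, drop commas and collapse repeated whitespace."""
    return " ".join(text.upper().replace(",", " ").split())


def geocode(query, cutoff=0.8):
    """Return the best reference match at or above the cutoff, or None."""
    key = normalize(query)
    matches = difflib.get_close_matches(key, list(REFERENCE), n=1, cutoff=cutoff)
    if not matches:
        return None
    best = matches[0]
    score = difflib.SequenceMatcher(None, key, best).ratio()
    return {"matched": best, "coords": REFERENCE[best], "score": round(score, 2)}


print(geocode("7305 Edgwater Dr, Oakland, CA 94621"))  # misspelled street name
print(geocode("1 Nowhere Ln, Springfield"))            # no plausible match
```

The score returned alongside the coordinates plays the role of the match quality metadata that real geocoders report.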
Geocoding & You
Input street addresses
Compare to reference database
Parse addresses
Output
coordinates & metadata for best match
Clean addresses
7305 Edgewater Drive # D, Oakland, 94621
7305 Edgewater Drive #D, Oakland, 94621
7305 Edgewater Drive, Oakland, CA, 94621
remove unnecessary components
add necessary components
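The two cleaning steps in the example above can be sketched like this. It is a minimal illustration under stated assumptions: unit designators are written with "#", and any address missing a state belongs to one known state (the default_state parameter is hypothetical).

```python
import re

# Unit designators written with a hash sign, such as "# D" or "#D".
UNIT_PATTERN = re.compile(r"\s*#\s*\w+")


def clean_address(address, default_state="CA"):
    """Remove '#' unit designators and add a state if it is missing."""
    parts = [UNIT_PATTERN.sub("", p).strip() for p in address.split(",")]
    parts = [p for p in parts if p]
    # "street, city, zip" has no state: insert the assumed one before the zip.
    if len(parts) == 3 and re.fullmatch(r"\d{5}", parts[2]):
        parts.insert(2, default_state)
    return ", ".join(parts)


print(clean_address("7305 Edgewater Drive # D, Oakland, 94621"))
print(clean_address("7305 Edgewater Drive #D, Oakland, 94621"))
```

Both calls produce "7305 Edgewater Drive, Oakland, CA, 94621". Addresses that already include a state pass through unchanged apart from the unit removal.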
Geocoding & You
Most geocoders can geocode ~80% of your addresses with little cleaning and place them within a block of the actual location.
Comparing output quality
| | TAMU | OSM | Bing | Yahoo | Here | | |
| % Matched | 100% | 80% | 70% | 100% | 100% | 90% | 100% |
| Average ∆ (ft) | 24,265 | 3,439 | 894 | 104 | 166 | 70 | 92 |
Free
Freemium
Source: SmartyStreets.com, 2016
Geocoding & You
If you need high quality output you need to understand
Geocoding & You
WARNING:
If you are working with restricted use data, your options will be much more limited and the costs of getting high quality output much greater.
Cleaning & Standardizing Addresses
Lots you can do.
Brief review - get slides for reference.
Cleaning & Standardizing Addresses
Intersections
“&” and “AND” are most common for US Streets
Directional Prefixes / Suffixes
Should be in form: N, S, E, W, NW, etc.
No periods, dashes, or full words!
preferred
PO Box
Change the following to PO Box
BX, P O Box, POBOX, PO BOX, P OBOX, POB, PMB, PO Drawer, POST OFFICE DRAWER, PBS Box ZIP
PO Box 123 (456 Main St) > 456 Main St
PO Box 123 or 456 Main St > 456 Main St
http://www.albany.edu/faculty/ttalbot/Geocoding_Lecture_2015.pdf
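A few of these standardization rules can be sketched in Python. This is an illustration only: the function names are made up, the directional helper expects a single directional value (for example a separate prefix or suffix field), and the PO Box variants are limited to those listed above.

```python
import re

DIRECTIONS = {
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
    "NORTHEAST": "NE", "NORTHWEST": "NW", "SOUTHEAST": "SE", "SOUTHWEST": "SW",
}

# Variants to rewrite as "PO Box"; longer spellings come first.
PO_VARIANTS = ["POST OFFICE DRAWER", "PO DRAWER", "P O BOX", "P OBOX",
               "PO BOX", "POBOX", "POB", "PMB", "BX"]
PO_PATTERN = re.compile(
    r"^\s*(?:" + "|".join(re.escape(v) for v in PO_VARIANTS) + r")\s*(?=\d)",
    re.IGNORECASE,
)


def standardize_direction(value):
    """Return N, S, E, W, NW, etc. with no periods, dashes or full words."""
    bare = re.sub(r"[.\-\s]", "", value).upper()
    if bare in DIRECTIONS:
        return DIRECTIONS[bare]
    return bare if bare in DIRECTIONS.values() else value


def standardize_po_box(address):
    """Rewrite known PO Box variants at the start of the address as 'PO Box'."""
    return PO_PATTERN.sub("PO Box ", address)


def prefer_street_address(address):
    """When a PO Box and a street address are both given, keep the street address."""
    m = re.match(r"^PO Box \d+\s*(?:\((.+)\)|or\s+(.+))$", address, re.IGNORECASE)
    return (m.group(1) or m.group(2)).strip() if m else address


print(standardize_direction("North-West"))                # NW
print(standardize_po_box("POB 123"))                      # PO Box 123
print(prefer_street_address("PO Box 123 (456 Main St)"))  # 456 Main St
```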
Data entry errors are hard to find & fix!
Preventing Problems
If you are collecting address data via surveys, try to design them to minimize the likelihood of address input errors.
Problems in Reference Data
Errors
Other
Oversimplified Street Features
http://www.albany.edu/faculty/ttalbot/Geocoding_Lecture_2015.pdf
Geocoding match rates can vary by location type
http://www.albany.edu/faculty/ttalbot/Geocoding_Lecture_2015.pdf
Cleaning & Standardizing Addresses
Address Formatting
Prepare data in format required by geocoder
Address Formats
Single field format
Multi field input
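Converting between the two formats is mostly a matter of joining or splitting columns. The sketch below joins a multi-field table into a single address field; the rows are taken from the addresses.csv example that appears later in these slides, and the function name and output column names are assumptions for illustration.

```python
import csv
import io

# Multi-field input: one column per address component.
MULTI_FIELD = """ID,STREET,CITY,STATE
1,18 Grove St,Boston,MA
2,727 5th Ave,New York,NY
3,246 High St,Ipswich,MA
"""


def to_single_field(rows, fields=("STREET", "CITY", "STATE")):
    """Join the component columns into one comma-separated ADDRESS column."""
    for row in rows:
        yield {"ID": row["ID"],
               "ADDRESS": ", ".join(row[f].strip() for f in fields)}


rows = list(to_single_field(csv.DictReader(io.StringIO(MULTI_FIELD))))
for row in rows:
    print(row["ID"], row["ADDRESS"])
```

The same idea works in a spreadsheet with a concatenation formula if you prefer to avoid code.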
Tips for processing lots of addresses
You may still need programming!
Criteria for Selecting a Geocoder
Free/Freemium Online Geocoding Services
That do not require programming!
Free Local (On Premise) Geocoders
You need a local reference database to be completely offline.
Geocoding without Programming
Google Fusion Tables
When you just want to make a map!
Google Fusion Tables
Requires location data in one column
ID,Store,Address
1,Wah Fay Liquors,2101 8th Ave Oakland CA 94606
2,Vision Liquor,1615 Macarthur Blvd Oakland CA 94602
3,Souza's Liquors,394 12th St Oakland CA 94607
4,Tk Liquors,1500 23th Ave Oakland CA 94606
5,Quadriga Wines Inc,6193 Ridgemont Dr Oakland CA 94619
Google Fusion Tables
You can input location as place name or street address.
Google Fusion Tables
What we like:
Not so much:
Try Geocoding with Google Fusion Tables
Use the sample address data.
Try oak_liq_w_ids.csv file.
oak_liq_gfusion_format.csv
US Census Geocoder
http://geocoding.geo.census.gov/
Census Geocoder
Two options - (1) Find Locations & (2) Find Geographies
(2) returns codes that let you link to census data!
Census Geocoder
Let’s try it
DEMO: http://geocoding.geo.census.gov/
https://www.census.gov/geo/maps-data/data/geocoder.html
oak_liq_stores.csv
20 Oakland Liquor Store Addresses (subset - grabbed online)
Census Geocoder Output
Your input address
Census Geocoder Output
Geocoder output - note match quality metadata.
Census Geocoder Output
Need to split coordinates into two columns (lon and lat) before you can map.
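If you do end up scripting this step instead of using a spreadsheet, the split is short. The sketch assumes the coordinates arrive as one "lon,lat" text value, in that order; the sample value reuses the coordinates from the earlier example.

```python
def split_coordinates(coord_text):
    """Split a 'lon,lat' string into two floats, longitude first."""
    lon_text, lat_text = coord_text.split(",")
    return float(lon_text), float(lat_text)


lon, lat = split_coordinates("-122.2063,37.7446")
print("lon:", lon, "lat:", lat)
```

Keeping track of the order matters: many mapping tools expect latitude first, so swapping the two columns is a common source of points landing in the wrong place.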
Census Geocoder Output
Census FIPS codes!
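The FIPS codes are what let you join geocoded points to census tables. As a sketch: a census tract GEOID is the 2-digit state code, the 3-digit county code and the 6-digit tract code concatenated. The tract value below is a made-up placeholder, and the function name is hypothetical.

```python
def tract_geoid(state_fips, county_fips, tract):
    """Build an 11-character tract GEOID: state (2) + county (3) + tract (6)."""
    # Zero-pad each piece, since spreadsheets often strip leading zeros.
    return str(state_fips).zfill(2) + str(county_fips).zfill(3) + str(tract).zfill(6)


# California is state 06 and Alameda County is county 001;
# the tract code here is a hypothetical placeholder.
print(tract_geoid("6", "1", "409000"))
```

Treat these codes as text, not numbers, so that leading zeros survive when you link to census data.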
Post-processing in Google Sheets
View geocoded output in geojson.io
Map & Analyze output in QGIS, ArcGIS
Census Geocoder - Likes
Census Geocoder - Not so much
Try Census Geocoder
Geocode sample data oak_liq_w_ids.csv file.
Google Earth Pro Geocoder
Input format - Requires header row
Save output to KML or KMZ
Open Google Earth KML in QGIS or ArcGIS
You can open in geojson.io & save to csv
Now you can process in QGIS, ArcGIS, R, Python, Stata, etc.
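For those who would rather script the KML-to-table step, here is a sketch using only the Python standard library. It assumes the file uses the standard KML 2.2 namespace and stores each result as a Placemark with a Point; the sample document is a hand-written stand-in, not actual Google Earth Pro output.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hand-written sample; KML stores coordinates as lon,lat,altitude.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>7305 Edgewater Dr, Oakland, CA, 94621</name>
      <Point><coordinates>-122.2063,37.7446,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def kml_points(kml_text):
    """Return one dict per Placemark that contains point coordinates."""
    root = ET.fromstring(kml_text)
    rows = []
    for pm in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = pm.findtext("kml:name", default="", namespaces=KML_NS)
        coords = pm.findtext(".//kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # skip placemarks without coordinates
        lon, lat = coords.strip().split(",")[:2]
        rows.append({"name": name, "lat": float(lat), "lon": float(lon)})
    return rows


for row in kml_points(SAMPLE_KML):
    print(row)
```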
Google Earth Pro - Cities, Global Coverage
cities.csv
ID,CITY,STATE
1,Boston,MA
2,New York,NY
3,Ipswich,MA
4,Paris,France
addresses.csv
ID,STREET,CITY,STATE
1,18 Grove St,Boston,MA
2,727 5th Ave,New York,NY
3,246 High St,Ipswich,MA
4,42 Rue d’Anjou,Paris,France
Geocoding with Google Earth Pro
Like 💝💖💖
Not so much
Try Geocoding with Google Earth Pro
Geocode sample data oak_liq_w_ids.csv file.
Install the software if needed or partner with someone.
ArcGIS esri.com
Why ArcGIS?
BUT ArcGIS….
First 5,000 free - $4,000 for 1,000,000 addresses
ArcGIS Online Geocoder - ESRI credits
https://developers.arcgis.com/en/features/geocoding/
ArcGIS Geocoder with ESRI NA Streets data
ArcGIS Business Analyst 2015
2014 Street Data & Geocoding Software
Requires
Why Use ArcGIS & ESRI Local Streets data
Geocoding in ArcMap
STEP 1: Load your address data in ArcMap
STEP 2: Right-click on the file name in the layer list to access geocoder
NA Streets data - several Address Locators
Browse to the folder streetmap_na/data
and choose Street_Addresses_US
Review Geocoding Output
Red = Census
Blue = Google
Purple = ArcGIS
392K addresses in ~ 15 minutes
What to do about the 29K unmatched?
Post-process results
Review unmatched addresses
Poor match because missing street address!
Iterative Process!
Exploring unmatched records helps you get a sense of the categories of problems in your data, some of which you can go back and fix in batch.
Some errors will remain with large datasets.
Always Review Sample of Output
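Two small calculations support this review step: the overall match rate and a reproducible random sample of records to inspect by hand. The sketch assumes the 29K unmatched records are part of the 392K total mentioned above; the function names and the sample size are arbitrary choices for illustration.

```python
import random


def match_rate(total, unmatched):
    """Fraction of input records that were matched."""
    return (total - unmatched) / total


def review_sample(records, k=25, seed=0):
    """Pick up to k records at random; a fixed seed makes the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


print(f"Match rate: {match_rate(392_000, 29_000):.1%}")

record_ids = list(range(1, 101))  # stand-in for geocoded record IDs
print(review_sample(record_ids, k=5))
```

Under that assumption the match rate comes to about 92.6%.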
Which Geocoder?
Questions?
Links to materials on D-Lab website.
Thank you!
Extras!
Additional info on:
Geocoding with Your Reference Data
Geocoding with Your Reference Data
Assumption
Why?
Example: Buenos Aires, Argentina
...
First Try ArcGIS Online Geocoder
Select the most precise Address Locator Style supported by the data.
In order to create a US Address Dual Ranges style locator, you need to identify columns for:
You must change the Role to Primary Table!
Save your address locator
Custom locator with user supplied reference data
AGOL World Geocoder
Linking Geocoded Addresses to Census Data
A few slides on that.
Be sure to select the correct version of the census TIGER files for your analysis needs! See the census website for details.
Download Census Data
Census Tracts - CA Alameda County
http://www2.census.gov/geo/tiger/TIGER2014/TRACT/tl_2014_06_tract.zip
Block Groups
http://www2.census.gov/geo/tiger/TIGER2014/BG/tl_2014_06_bg.zip
Blocks
http://www2.census.gov/geo/tiger/TIGER2014/TABBLOCK/tl_2014_06_tabblock10.zip
Add Census data to map
Intersect Tool
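Conceptually, intersecting points with census polygons asks, for each geocoded point, which polygon contains it. The sketch below shows that idea with a basic ray-casting test. It is not how ArcGIS implements its tool, and the two rectangular "tracts" are hypothetical shapes invented for the example.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_tract(lon, lat, tracts):
    """Return the ID of the first polygon containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


# Hypothetical rectangular "tracts" for illustration only.
TRACTS = {
    "tract_A": [(-122.3, 37.7), (-122.2, 37.7), (-122.2, 37.8), (-122.3, 37.8)],
    "tract_B": [(-122.2, 37.7), (-122.1, 37.7), (-122.1, 37.8), (-122.2, 37.8)],
}

print(assign_tract(-122.2063, 37.7446, TRACTS))
```

GIS software does the same kind of test against the real TIGER polygons, with spatial indexing to keep it fast for large datasets.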