Introduction to �PostGIS
http://postgis.net/workshops/postgis-intro
PostGIS
Section 1 - Welcome
Section 1 - Welcome
PostGIS
Questions, questions, so many ...
Section 1 - Welcome
PostGIS
Open workshop materials online
Download data bundle
Download PgAdmin
Section 1 - Welcome
PostGIS
Download QGIS?
Section 1 - Welcome
PostGIS
Ready?
Section 1 - Welcome
PostGIS
Section 2 - Introduction
Section 2 - Introduction
PostGIS
Section 2 - Introduction
PostGIS
What is a database?
System for storage and random access of relationally (tables of rows and columns) structured data, providing the following capabilities for that data.
Section 2 - Introduction
PostGIS
What is a spatial database?
System for storage and random access of relationally (tables of rows and columns) structured data, providing the following capabilities for that data.
Section 2 - Introduction
PostGIS
Spatial databases �store and manipulate �spatial objects �like any other object �in the database.
Section 2 - Introduction
PostGIS
Spatial Types
Section 2 - Introduction
PostGIS
Spatial Indexes
Section 2 - Introduction
This R-Tree organizes the spatial objects so that a spatial search is a quick walk through the tree.
To find what object contains ?�
Only 8 boxes have to be tested. A full table scan would require all 13 boxes to be tested. The larger the table, the more powerful the index is.
PostGIS
Spatial Functions
For example:
Section 2 - Introduction
PostGIS
What is PostGIS?
Section 2 - Introduction
CREATE �EXTENSION
postgis;
PostGIS
What is PostgreSQL?
Section 2 - Introduction
PostGIS
Why not files?
Section 2 - Introduction
PostGIS
PostGIS history
BC Watershed Atlas v1
“we need a spatial type!”
BC Corporate Watersheds
“lightweight” geometry (v1.0)
BC Digital Road Atlas
distance()�indexes (v0.1)
&
2000
2004
2001
Section 2 - Introduction
PostGIS
PostGIS reference users - government
Section 2 - Introduction
PostGIS
PostGIS reference users - private sector
Section 2 - Introduction
PostGIS
PostGIS 3rd party integration
Desktop
Middle
Language
and more...
and more...
and more...
Section 2 - Introduction
PostGIS
Section 3 - Installation
Section 3 - Installation
PostGIS
Microsoft Windows - postgresql.org/download/windows/
Section 3 - Installation
PostGIS
Apple MacOS - postgresapp.com
Section 3 - Installation
PostGIS
Linux - postgresql.org/download/
The postgresql.org site will walk you through the many options for selecting a Linux binary package, based on the version you want and the distribution.
Make sure you install the appropriate “postgis” package!
yum search postgis
apt-cache search postgis
Section 3 - Installation
PostGIS
Cloud - DBaaS includes PostGIS by default
Section 3 - Installation
PostGIS
PgAdmin - pgadmin.org
Section 3 - Installation
PostGIS
DBeaver - dbeaver.io
Section 3 - Installation
PostGIS
Section 4 - Creating a �Spatial Database
Section 4 - Creating a Spatial Database
PostGIS
Section 4 - Creating a Spatial Database
Login
PostGIS
Connection parameters
Section 4 - Creating a Spatial Database
PostGIS
Section 4 - Creating a Spatial Database
New Database
PostGIS
Section 4 - Creating a Spatial Database
New Database
PostGIS
Section 4 - Creating a Spatial Database
Open "nyc" database
PostGIS
Section 4 - Creating a Spatial Database
Run SQL Query
CREATE EXTENSION postgis;
Tools > Query Tool
PostGIS
Section 4 - Creating a Spatial Database
Run SQL Query
SELECT postgis_full_version();
PostGIS
Section 5 - Loading �Spatial Data
Section 5 - Loading Spatial Data
PostGIS
postgis-workshop/data/nyc_data.backup
Section 5 - Loading Spatial Data
PostGIS
Section 5 - Loading Spatial Data
PostGIS
Section 5 - Loading Spatial Data
PostGIS
Section 5 - Loading Spatial Data
PostGIS
Loading “GIS data files”
Section 5 - Loading Spatial Data
ogr2ogr
shp2pgsql
PostGIS
ogr2ogr
Section 5 - Loading Spatial Data
� ogr2ogr \
-nln nyc_census_blocks_2000 \
-nlt PROMOTE_TO_MULTI \
-lco GEOMETRY_NAME=geom \
-lco FID=gid \
-lco PRECISION=NO \
Pg:'dbname=nyc host=localhost user=postgres port=5432' \
nyc_census_blocks_2000.shp�
new layer name
new layer type
geometry column name
pk name
full precision?
destination data
source data
PostGIS
What is a “shape file”?
Section 5 - Loading Spatial Data
PostGIS
nyc_streets.prj (EPSG:26918)
PROJCS["NAD_1983_UTM_Zone_18N",
GEOGCS["GCS_North_American_1983",
DATUM["D_North_American_1983",
SPHEROID["GRS_1980",6378137,298.257222101]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-75],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",0],
UNIT["Meter",1]]
Section 5 - Loading Spatial Data
PostGIS
nyc_streets.prj (EPSG:26918)
POINT(586020 4513147)
Universal Transverse Mercator�EPSG:26918
POINT(-73.9808 40.7648)
WGS 1984�EPSG:4326
Section 5 - Loading Spatial Data
PostGIS
shp2pgsql
Section 5 - Loading Spatial Data
� shp2pgsql \
-D \
-I \
-s 26918 \
nyc_census_blocks_2000.shp \
nyc_census_blocks_2000 \
| psql dbname=nyc user=postgres host=localhost
dump format
build spatial index
spatial reference id of data
source file
destination table name
destination database connection
Section 5 - Loading Spatial Data
PostGIS
Section 6 - About �our data
Section 6 - About our data
PostGIS
nyc_census_blocks
blkid
popn_total
popn_white
popn_black
popn_nativ
popn_asian
popn_other
boroname
geom
Section 6 - About our data
PostGIS
nyc_neighborhoods
name
boroname
geom
Section 6 - About our data
PostGIS
nyc_streets
name
oneway
type
geom
Section 6 - About our data
PostGIS
nyc_subway_stations
name
borough
routes
transfers
express
geom
Section 6 - About our data
PostGIS
nyc_census_sociodata
tractid | transit_total | transit_public |
transit_private | transit_other | transit_time_mins |
family_county | family_income_median | family_income_aggregate |
edu_total | edu_no_highschool_dipl | edu_highscool_dipl |
edu_college_dipl | edu_graduate_dipl | |
Section 6 - About our data
PostGIS
Section 7 - Simple SQL
Section 7 - Simple SQL
PostGIS
SELECT name
FROM nyc_neighborhoods;
Section 7 - Simple SQL
PostGIS
PgAdmin and Maps
geographics
SELECT name, � ST_Transform(geom, 4326)
FROM nyc_neighborhoods;
Section 7 - Simple SQL
PgAdmin includes a magic “map button” which is fun to use to visualize geometric outputs. If the data are in “geographics” (EPSG:4326) then it will place a base map underneath the data. This can be pretty.
To see a map with your output, wrap it in �ST_Transform(geom, 4326). This will make more �sense as we work through the materials.
PostGIS
57
Section 7 - Simple SQL
PostGIS
Four Verbs of SQL
Section 7 - Simple SQL
PostGIS
What are ...�the names of all the neighborhoods in Brooklyn?
SELECT name
FROM nyc_neighborhoods
WHERE boroname = 'Brooklyn';
Section 7 - Simple SQL
PostGIS
What is the number of letters in the names of all the neighborhoods in Brooklyn?
SELECT char_length(name)
FROM nyc_neighborhoods
WHERE boroname = 'Brooklyn';
Section 7 - Simple SQL
PostGIS
What is the average number of letters and standard deviation of number of letters in the names of all the neighborhoods in Brooklyn?
SELECT avg(char_length(name)),� stddev(char_length(name))
FROM nyc_neighborhoods
WHERE boroname = 'Brooklyn';
Section 7 - Simple SQL
PostGIS
What is the average number of letters in the names of all the neighborhoods in New York City, reported by borough?
SELECT boroname,� avg(char_length(name))
FROM nyc_neighborhoods
GROUP BY boroname;
Section 7 - Simple SQL
PostGIS
Section 8 - Simple SQL Exercises
Section 8 - Simple SQL Exercises
PostGIS
How many records are in the nyc_streets table?
SELECT Count(*)
FROM nyc_streets;
Section 8 - Simple SQL Exercises
PostGIS
How many streets in NYC start with ‘B’?
SELECT Count(*)
FROM nyc_streets� WHERE name LIKE 'B%';
Section 8 - Simple SQL Exercises
PostGIS
What is the population of NYC?
SELECT Sum(popn_total) AS population
FROM nyc_census_blocks;
Section 8 - Simple SQL Exercises
PostGIS
What is the population of ‘The Bronx’?
SELECT Sum(popn_total)
FROM nyc_census_blocks� WHERE boroname = 'The Bronx';
Section 8 - Simple SQL Exercises
PostGIS
How many neighborhoods are in each borough?
SELECT boroname, Count(*)
FROM nyc_neighborhoods
GROUP BY boroname;
Section 8 - Simple SQL Exercises
PostGIS
For each borough in NYC, what is percentage of the population is “white”?
SELECT � boroname, � 100.0 * Sum(popn_white) /
Sum(popn_total) AS pct
FROM nyc_census_blocks�GROUP BY boroname;
Section 8 - Simple SQL Exercises
PostGIS
Section 9 - Geometries
Section 9 - Geometries
PostGIS
Spatial Types
Section 9 - Geometries
PostGIS
Creating a table with geometry
CREATE TABLE geometries
(
name varchar,
geom geometry
);
Section 9 - Geometries
PostGIS
Creating a table with geometry
INSERT INTO geometries (name, geom) VALUES
('Point', 'POINT(0 0)'),
('Linestring', 'LINESTRING(0 0, 1 1, 2 1, 2 2)'),
('Polygon', 'POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))'),
('PolygonWithHole', 'POLYGON((...))'),
('Collection', 'GEOMETRYCOLLECTION(...)');
Section 9 - Geometries
PostGIS
Creating a table with geometry
SELECT name, ST_AsText(geom)
FROM geometries;
Section 9 - Geometries
PostGIS
Table Relationships
Section 9 - Geometries
Every table with a geometry column will automatically appears in the geometry_columns view.
The SRID numbers on geometries and columns are converted into coordinate transforms using data from the spatial_ref_sys table.
PostGIS
geometry_columns
SELECT * �FROM geometry_columns
Section 9 - Geometries
PostGIS
geometry_columns
Section 9 - Geometries
PostGIS
Metadata functions
SELECT
name,
ST_GeometryType(geom),
ST_NDims(geom),
ST_SRID(geom)
FROM geometries;
Section 9 - Geometries
PostGIS
Metadata functions
Section 9 - Geometries
name | st_geometrytype | st_ndims | st_srid
-----------------+-----------------------+----------+---------
Point | ST_Point | 2 | 0
Linestring | ST_LineString | 2 | 0
Polygon | ST_Polygon | 2 | 0
PolygonWithHole | ST_Polygon | 2 | 0
Collection | ST_GeometryCollection | 2 | 0
PostGIS
Points
Section 9 - Geometries
“Point” or “MultiPoint”, representing one or more 0-dimensional locations.
New York city subway stations, stop signs, man holes, address points, current locations of vehicles, might all use a “Point” geometry type.
PostGIS
Points
SELECT ST_AsText(geom)
FROM geometries
WHERE name = 'Point';���POINT(0 0)
Section 9 - Geometries
PostGIS
Points
SELECT
ST_X(geom),
ST_Y(geom)
FROM geometries
WHERE name = 'Point'��0 0
Section 9 - Geometries
PostGIS
Points
SELECT
name,
ST_AsText(geom)
FROM nyc_subway_stations
LIMIT 1;��Cortlandt St | POINT(583521 4507077)
Section 9 - Geometries
PostGIS
LineStrings
Section 9 - Geometries
“LineString” or “MultiLineString”, representing one or more 1-dimensional objects.
Streets, streams, bus routes, power lines, driven routes, highways, might all use a “LineString” geometry type.
PostGIS
LineStrings
Section 9 - Geometries
SELECT ST_AsText(geom)
FROM geometries
WHERE name = 'Linestring';���LINESTRING(0 0,1 1,2 1,2 2)
PostGIS
LineStrings
Section 9 - Geometries
SELECT ST_Length(geom)
FROM geometries
WHERE name = 'Linestring';
3.41421356237309
PostGIS
LineStrings
Section 9 - Geometries
PostGIS
Polygons
Section 9 - Geometries
“Polygon” or “MultiPolygon”, representing one or more 2-dimensional objects.
Census areas, parcels, counties, countries, neighborhoods, zoning areas, watersheds, and more.
PostGIS
Polygons
Section 9 - Geometries
PostGIS
Polygons
Section 9 - Geometries
SELECT ST_AsText(geom)
FROM geometries
WHERE name LIKE 'Polygon%';
POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))
POLYGON((0 0, 10 0, 10 10, 0 10, 0 0),� (1 1, 1 2, 2 2, 2 1, 1 1))
PostGIS
Polygons
Section 9 - Geometries
PostGIS
Polygons
Section 9 - Geometries
SELECT name, ST_Area(geom)
FROM geometries
WHERE name LIKE 'Polygon%';
Polygon | 1
PolygonWithHole | 99
PostGIS
Geometry Formats
ST_As...
Text, EWKT, GML, KML, SVG, GeoJSON, �Binary, EWKB
�ST_GeomFrom...
Text, EWKT, GML, KML, GeoJSON, �Binary, EWKB
Section 9 - Geometries
PostGIS
Geometry Formats
Section 9 - Geometries
SELECT ST_AsText(
ST_GeometryFromText(
'LINESTRING(0 0 0,1 0 0,1 1 2)'
)
);
LINESTRING Z (0 0 0,1 0 0,1 1 2)
PostGIS
Geometry Formats
Section 9 - Geometries
SELECT ST_AsGeoJSON(
ST_GeomFromGML(
'<gml:Point>
<gml:coordinates>
1,1
</gml:coordinates>
</gml:Point>'
));
{"type":"Point","coordinates":[1,1]}
PostGIS
All roads lead to Rome … (Geometry construction)
Section 9 - Geometries
SELECT ST_AsEWKT(
ST_GeomFromText('POINT(1 1)', 4326)
);
��
SRID=4326;POINT(1 1)
PostGIS
All roads lead to Rome … (Geometry construction)
Section 9 - Geometries
SELECT ST_AsEWKT(
ST_SetSRID(
ST_GeomFromText('POINT(1 1)'),
4326
)
);
SRID=4326;POINT(1 1)
PostGIS
All roads lead to Rome … (Geometry construction)
Section 9 - Geometries
SELECT ST_AsEWKT(
ST_SetSRID(
ST_MakePoint(1, 1),
4326
)
);
SRID=4326;POINT(1 1)
PostGIS
All roads lead to Rome … (Geometry construction)
Section 9 - Geometries
SELECT ST_AsEWKT(
ST_SetSRID(
'POINT(1 1)'::geometry,
4326
)
);
SRID=4326;POINT(1 1)
PostGIS
All roads lead to Rome … (Geometry construction)
Section 9 - Geometries
SELECT ST_AsEWKT(
'SRID=4326;POINT(1 1)'::geometry
);
���
SRID=4326;POINT(1 1)
PostGIS
All roads lead to Rome … (Geometry construction)
Section 9 - Geometries
SELECT ST_AsEWKT(� ST_Point(1, 1, 4326)�);����
SRID=4326;POINT(1 1)
PostGIS
Section 10 - Geometry Exercises
Section 10 - Geometry Exercises
PostGIS
What is the area of the ‘West Village’ neighborhood?
SELECT� ST_Area(geom)
FROM nyc_neighborhoods
WHERE name = 'West Village';
1044614.5296486
Section 10 - Geometry Exercises
PostGIS
What is the geometry type of ‘Pelham St’? The length?
SELECT� ST_GeometryType(geom),� ST_Length(geom)
FROM nyc_streets
WHERE name = 'Pelham St';
ST_MultiLineString | 50.323
Section 10 - Geometry Exercises
PostGIS
What is the GeoJSON representation of ‘Broad St’ subway station?
SELECT� ST_AsGeoJSON(geom, 0)
FROM nyc_subway_stations
WHERE name = 'Broad St';
{"type":"Point",� "crs":{"type":"name",� "properties":{"name":"EPSG:26918"}},� "coordinates":[583571,4506714]}
Section 10 - Geometry Exercises
PostGIS
What is the total length of streets (in kilometers) in New York City?
SELECT Sum(ST_Length(geom)) / 1000
FROM nyc_streets;
�10418.904717199996
Section 10 - Geometry Exercises
PostGIS
What is the area of Manhattan in acres?
SELECT� Sum(ST_Area(geom)) / 4047
FROM nyc_census_blocks
WHERE boroname = 'Manhattan';���14601.3987215548
Section 10 - Geometry Exercises
PostGIS
What is the most westerly subway station?
SELECT� ST_X(geom), name
FROM nyc_subway_stations
ORDER BY ST_X(geom)�LIMIT 1;
��563292.1 | Tottenville
Section 10 - Geometry Exercises
PostGIS
Section 11 - Spatial �Relationships
Section 11 - Spatial Relationships
PostGIS
Spatial Relationship Functions
Common
Uncommon
Section 11 - Spatial Relationships
PostGIS
ST_Equals(A, B)�ST_OrderingEquals(A, B)
Section 11 - Spatial Relationships
Equals tests that A and B cover the same space, regardless of representation differences (extra vertices, order of vertices). OrderingEquals insists on structural identity.
PostGIS
What is geometry of Broad Street subway station?
SELECT name, geom
FROM nyc_subway_stations
WHERE name = 'Broad St';
���0101000020266900000EEBD4CF27CF2141BC17D69516315141
Section 11 - Spatial Relationships
PostGIS
What subway station record matches that geometry?
SELECT name
FROM nyc_subway_stations
WHERE ST_Equals(
geom, � '0101000020266900000EEBD4CF27CF2141BC17D69516315141'�);
�Broad St
Section 11 - Spatial Relationships
PostGIS
ST_Intersects(A, B)�ST_Disjoint(A, B)
Section 11 - Spatial Relationships
Intersects and disjoint are opposites. Any kind of interactions between two shapes is an intersection, and implies the pair are not disjoint, and vice versa.
A intersects B ⇒ A not disjoint B
A disjoint B ⇒ A not intersects B
PostGIS
What is the well-known text (WKT) of Broad Street subway station?
SELECT name, ST_AsText(geom, 0)
FROM nyc_subway_stations
WHERE name = 'Broad St';
���POINT(583571 4506714)
Section 11 - Spatial Relationships
PostGIS
What neighborhood intersects that subway station?
Section 11 - Spatial Relationships
SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(
geom, � ST_GeomFromText(� 'POINT(583571 4506714)',� 26918));
�Financial District | Manhattan
PostGIS
ST_Crosses(A, B)
Section 11 - Spatial Relationships
Mostly used to test linestrings, which can be said to cross when their interiors have interactions.
When linestrings cross polygon boundaries, the crosses condition is also true.
PostGIS
ST_Overlaps(A, B)
Section 11 - Spatial Relationships
Shapes overlap when their interiors interact with each other and also with the exterior of the shape.
So objects that are contained or within do not overlap, overlaps is what normal people might call “partial overlap”.
PostGIS
ST_Touches(A, B)
Section 11 - Spatial Relationships
Shapes touch when their boundaries interact but their interiors do not. End points for lines, exterior rings for polygons. Usually used for testing that polygons have ring-touching only.
PostGIS
ST_Within(A, B)�ST_Contains(B, A)
Section 11 - Spatial Relationships
Within and contains are about objects being fully inside. One important caveat, for both functions an object on the boundary is not considered within. So a point on the outer ring of a polygon is not within the polygon.
PostGIS
ST_Distance(A, B)
Returns the shortest distance between the two geometries, in this case the distance from the point to the line mid-point.
Section 11 - Spatial Relationships
PostGIS
ST_Distance(A, B)
SELECT ST_Distance(
'POINT(0 5)'::geometry,
'LINESTRING(-2 2, 2 2)'::geometry� );
3
Section 11 - Spatial Relationships
PostGIS
ST_DWithin(A, B, R)
Section 11 - Spatial Relationships
Index-enabled radius search function. True when the distance from geometry A to geometry B is less than radius R. False otherwise.
Use instead of ST_Distance(A, B) < R, in order to get benefit of spatial index.
PostGIS
What streets are within 10 meters of Broad Street subway station?
SELECT name
FROM nyc_streets
WHERE ST_DWithin(
geom,
ST_GeomFromText('POINT(583571 4506714)',26918),
10
);
Section 11 - Spatial Relationships
PostGIS
Section 11 - Spatial Relationships
PostGIS
Section 12 - Spatial Relationship Exercises
Section 12 - Spatial Relationship Exercises
PostGIS
What is the well-known text for the street 'Atlantic Commons'?
SELECT ST_AsText(geom, 0)
FROM nyc_streets
WHERE name = 'Atlantic Commons';��MULTILINESTRING((� 586782 4504202,� 586864 4504216))
Section 12 - Spatial Relationship Exercises
PostGIS
What neighborhood and borough is �'POINT(586782 4504202)' in?
SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(
geom,
ST_GeomFromText(� 'POINT(586782 4504202)', 26918)
);
Section 12 - Spatial Relationship Exercises
PostGIS
How many people live within 50 meters of �'POINT(586782 4504202)'?
SELECT Sum(popn_total)
FROM nyc_census_blocks
WHERE ST_DWithin(
geom,
ST_GeomFromText('POINT(586782 4504202)', � 26918),
50 );
Section 12 - Spatial Relationship Exercises
PostGIS
Intersects, Touches, Contains, Disjoint, Overlaps, Crosses, Within
A
B
For ‘LINESTRING(0 0, 2 2)’ and ‘POINT(1 1)’ which of these relationships are true?
Section 12 - Spatial Relationship Exercises
PostGIS
Point and Line Relationships
SELECT
ST_Intersects(l,p), ST_Touches(l,p),
ST_Contains(l,p), ST_Disjoint(l,p),
ST_Overlaps(l,p), ST_Crosses(l,p),
ST_Within(l,p)
FROM (
SELECT
'LINESTRING(0 0, 2 2)'::geometry AS l,
'POINT(1 1)'::geometry AS p
) AS subquery;
Section 12 - Spatial Relationship Exercises
PostGIS
Point and Line Relationships
st_intersects | t
st_touches | f
st_contains | t
st_disjoint | f
st_overlaps | f
st_crosses | f
st_within | f
Section 12 - Spatial Relationship Exercises
PostGIS
How far apart are 'Columbus Cir' and 'Fulton Ave'?
SELECT ST_Distance(a.geom, b.geom)
FROM nyc_streets a,
nyc_streets b
WHERE a.name = 'Fulton Ave' � AND b.name = 'Columbus Cir';
Section 12 - Spatial Relationship Exercises
PostGIS
Section 13 - Spatial Joins
Section 13 - Spatial Joins
PostGIS
What neighborhood is the 'Broad St' station in?
Section 13 - Spatial Joins
PostGIS
Remember...
SELECT name, boroname
FROM nyc_neighborhoods
WHERE
ST_Intersects(
geom,
ST_GeomFromText(
'POINT(583571 4506714)',
26918));
Section 13 - Spatial Joins
PostGIS
Do it in one step, with a spatial join!
SELECT s.name, n.name, n.boroname
FROM nyc_neighborhoods AS n
JOIN nyc_subway_stations AS s
ON ST_Contains(
n.geom,
s.geom
)
WHERE s.name = 'Broad St';
Section 13 - Spatial Joins
PostGIS
What is the population and racial make-up of the neighborhoods of Manhattan?
Section 13 - Spatial Joins
PostGIS
SELECT� n.name AS neighborhood_name,
SUM(c.popn_total) AS population,
100*SUM(c.popn_white)/SUM(c.popn_total) AS white_pct,
100*SUM(c.popn_black)/SUM(c.popn_total) AS black_pct
FROM nyc_neighborhoods AS n
JOIN nyc_census_blocks AS c
ON ST_Intersects(
n.geom,
c.geom
)
WHERE n.boroname = 'Manhattan'
GROUP BY n.name
ORDER BY white_pct DESC;
Section 13 - Spatial Joins
PostGIS
Section 13 - Spatial Joins
neighborhood_name | popn | white % | black %
---------------------+--------+---------+---------
Carnegie Hill | 18763 | 90.1 | 1.4
West Village | 26718 | 87.6 | 2.2
North Sutton Area | 22460 | 87.6 | 1.6
Upper East Side | 203741 | 85.0 | 2.7
Soho | 15436 | 84.6 | 2.2
Greenwich Village | 57224 | 82.0 | 2.4
Central Park | 46600 | 79.5 | 8.0
Tribeca | 20908 | 79.1 | 3.5
Gramercy | 104876 | 75.5 | 4.7
Murray Hill | 29655 | 75.0 | 2.5
Chelsea | 61340 | 74.8 | 6.4
Upper West Side | 214761 | 74.6 | 9.2
Midtown | 76840 | 72.6 | 5.2
Battery Park | 17153 | 71.8 | 3.4
neighborhood_name | popn | white % | black %
---------------------+--------+---------+---------
Financial District | 34807 | 69.9 | 3.8
Clinton | 32201 | 65.3 | 7.9
East Village | 82266 | 63.3 | 8.8
Garment District | 10539 | 55.2 | 7.1
Morningside Heights | 42844 | 52.7 | 19.4
Little Italy | 12568 | 49.0 | 1.8
Yorkville | 58450 | 35.6 | 29.7
Inwood | 50047 | 35.2 | 16.8
Washington Heights | 169013 | 34.9 | 16.8
Lower East Side | 96156 | 33.5 | 9.1
East Harlem | 60576 | 26.4 | 40.4
Hamilton Heights | 67432 | 23.9 | 35.8
Chinatown | 16209 | 15.2 | 3.8
Harlem | 134955 | 15.1 | 67.1
PostGIS
Let's explore the racial geography of New York City...
Section 13 - Spatial Joins
PostGIS
Overall Racial Make-up of NYC
SELECT
100.0*SUM(popn_white)/SUM(popn_total) AS white_pct,
100.0*SUM(popn_black)/SUM(popn_total) AS black_pct,
SUM(popn_total) AS popn_total
FROM nyc_census_blocks;
white_pct | black_pct | popn_total
-------------------+--------------------+------------
44.00395007628105 | 25.546578900241613 | 8175032
Section 13 - Spatial Joins
PostGIS
Section 13 - Spatial Joins
You, must take the train.
To, go to Sugar Hill way up in Harlem.
PostGIS
What is the racial make-up of the areas served by the train?
Section 13 - Spatial Joins
PostGIS
First, we must determine where the train stops.
Section 13 - Spatial Joins
PostGIS
Our routes are comma-separated strings!
SELECT DISTINCT routes
FROM nyc_subway_stations;
Section 13 - Spatial Joins
4,5
N,Q,R,W
J
B,M,Q,R
D,F,N,Q
J,M
E,F
...
PostGIS
Postgres string function: strpos()
strpos(routes,'A') returns a non-zero number if 'A' is in the routes field
Check out https://www.postgresql.org/docs/current/functions-string.html
Section 13 - Spatial Joins
PostGIS
Find all routes with an “A”
SELECT DISTINCT routes
FROM nyc_subway_stations AS subways
WHERE strpos(subways.routes,'A') > 0;
Section 13 - Spatial Joins
A,C
A,B,C,D
A,C,E,L
A,C,F
A,B,C
A,S
A,C,E
...
PostGIS
Section 13 - Spatial Joins
The route of the A train.
What is the racial makeup within 200 meters of each stop? Who is served by the A train?
PostGIS
Summarize population 200m from A train stops
SELECT
100*SUM(c.popn_white)/SUM(c.popn_total) AS white_pct,
100*SUM(c.popn_black)/SUM(c.popn_total) AS black_pct,
SUM(popn_total) AS popn_total
FROM nyc_census_blocks AS c
JOIN nyc_subway_stations AS s
ON ST_DWithin(
c.geom,
s.geom,
200
)
WHERE strpos(s.routes,'A') > 0;
Section 13 - Spatial Joins
PostGIS
Section 13 - Spatial Joins
New York | Train |
44.00% white | 45.59% white |
25.55% black | 22.09% black |
PostGIS
Section 13 - Spatial Joins
You, must take the train.
To, go to Sugar Hill way up in Harlem.
PostGIS
Section 14 - Spatial Joins Exercises
Section 14 - Spatial Joins Exercises
PostGIS
What subway station is in ‘Little Italy’? What subway route is it on?
SELECT s.name, s.routes
FROM nyc_subway_stations AS s
JOIN nyc_neighborhoods AS n
ON ST_Contains(n.geom, s.geom)
WHERE n.name = 'Little Italy';
�Spring St | 6
Section 14 - Spatial Joins Exercises
PostGIS
What are all the neighborhoods served by the 6 train?
SELECT DISTINCT n.name, n.boroname
FROM nyc_subway_stations AS s
JOIN nyc_neighborhoods AS n
ON ST_Contains(n.geom, s.geom)
WHERE strpos(s.routes,'6') > 0;
� Midtown
Hunts Point
…
Section 14 - Spatial Joins Exercises
PostGIS
After 9/11, the ‘Battery Park’ neighborhood was off limits for several days. How many people had to be evacuated?
SELECT Sum(popn_total)
FROM nyc_neighborhoods AS n
JOIN nyc_census_blocks AS c
ON ST_Intersects(n.geom, c.geom)
WHERE n.name = 'Battery Park';
�17153
Section 14 - Spatial Joins Exercises
PostGIS
What neighborhood has the highest population density (persons/km2)?
SELECT n.name,
1000000 * Sum(c.popn_total) / � ST_Area(n.geom) AS popn_per_sqkm
FROM nyc_census_blocks AS c
JOIN nyc_neighborhoods AS n
ON ST_Intersects(c.geom, n.geom)
GROUP BY n.name, n.geom
ORDER BY popn_per_sqkm DESC;��North Sutton Area | 68435.13283772678
Section 14 - Spatial Joins Exercises
PostGIS
Section 15 - Spatial Indexing
Section 15 - Spatial Indexing
PostGIS
A spatial database has...
Section 15 - Spatial Indexing
PostGIS
A spatial index speeds spatial query ...
Section 15 - Spatial Indexing
Without Index | With Index |
10,000 * 10,000 = 100,000,000 comparisons | 10,000 + 10,000 = 20,000 comparisons |
PostGIS
To prove it… remove the index.
Run a spatial join.
DROP INDEX nyc_census_blocks_geom_idx;
Section 15 - Spatial Indexing
SELECT b.blkid
FROM nyc_census_blocks b
JOIN nyc_subway_stations s
ON ST_Contains(b.geom, s.geom)
WHERE s.name LIKE 'B%';
~300 ms
PostGIS
Run the join again.
Create the index again.
CREATE INDEX nyc_census_blocks_geom_idx�ON nyc_census_blocks USING GIST (geom);
Section 15 - Spatial Indexing
SELECT blocks.blkid
FROM nyc_census_blocks b
JOIN nyc_subway_stations s
ON ST_Contains(b.geom, s.geom)
WHERE s.name LIKE 'B%';
~90 ms
PostGIS
Spatial Index Cliff Notes
Section 15 - Spatial Indexing
PostGIS
Spatial Index Internals
Section 15 - Spatial Indexing
Some spatial objects (like the star) are quite large and complex. Comparing complex objects is expensive!
Instead of indexing objects directly, spatial indexes work on the bounding boxes of the objects.
The boxes are of uniform size, and can be compared to determine spatial relationships very quickly.
PostGIS
The boxes can be arranged in a hierarchy, so that a query can quickly discard portions of the search space that will not interact with a query box. Depending on the algorithm, different hierarchies can be build. PostGIS uses an “R*tree” algorithm.
Section 15 - Spatial Indexing
PostGIS
What green objects intersect the yellow query shape?
Section 15 - Spatial Indexing
PostGIS
Use index to quickly finds the objects with bounding box intersection.
Section 15 - Spatial Indexing
PostGIS
Exactly compute relationships in index result to find true intersection.
Section 15 - Spatial Indexing
PostGIS
Index-only queries
Section 15 - Spatial Indexing
PostGIS
Index-enabled Spatial Functions
Section 15 - Spatial Indexing
PostGIS
Index-only queries
geom_a && geom_b
The “&&” operator is the “bounding boxes overlap” operator.
It returns “true” when the bounds of the left and right arguments overlap.
Operators like “=” or “>” are symbols that express relationships between the left- and right-hand side arguments. “&&” is just another operator like any other.
Section 15 - Spatial Indexing
PostGIS
What is the population of the West Village?
SELECT Sum(blk.popn_total)
FROM nyc_neighborhoods nh
JOIN nyc_census_blocks blk
ON nh.geom && blk.geom
WHERE nh.name = 'West Village';
49821
Section 15 - Spatial Indexing
PostGIS
What is the population of the West Village?
SELECT Sum(blk.opn_total)
FROM nyc_neighborhoods nh
JOIN nyc_census_blocks blk
ON ST_Intersects(nh.geom, blk.geom)
WHERE nh.name = 'West Village';
26718
Section 15 - Spatial Indexing
PostGIS
Spatial Indexes
VACUUM ANALYZE nyc_census_blocks;
The database planner can only know when to use indexes if the tables have been “analyzed”.
The database executor will run more efficiently if dead tuple bloat has been removed with “vacuum”. Most important on tables with bulk data changes (insert/update/delete).
Section 15 - Spatial Indexing
PostGIS
Section 16 - Projecting Data
Section 16 - Projecting Data
PostGIS
Section 16 - Projecting Data
The earth is not flat, and there is no simple way of putting it down on a flat paper map (or computer screen), so people have come up with all sorts of ingenious solutions, each with pros and cons.
PostGIS
f(θ,Φ) → (x,y)
f -1(x,y) → (θ,Φ)
Forward projection converts spherical coordinates (longitude, latitude) to cartesian coordinates (x and y)
Inverse projection converts cartesian coordinates (x, y) to spherical coordinates (longitude, latitude)
Section 16 - Projecting Data
PostGIS
Section 16 - Projecting Data
PostGIS
What is the SRID of our subways?
SELECT ST_SRID(geom)
FROM nyc_subway_stations �LIMIT 1;
26918
Section 16 - Projecting Data
PostGIS
What does SRID 26918 mean though?
SRID is a foreign key relating to spatial_ref_sys
Section 16 - Projecting Data
PostGIS
What does SRID 26918 mean though?
Also, see: https://epsg.io/26918
SELECT srtext
FROM spatial_ref_sys�WHERE srid = 26918;
Section 16 - Projecting Data
PostGIS
What does SRID 26918 mean though?
PROJCS["NAD83 / UTM zone 18N",
GEOGCS["NAD83",
DATUM["North_American_Datum_1983",
SPHEROID["GRS 1980",6378137,298.257222101],
TOWGS84[0,0,0,0,0,0,0],
AUTHORITY["EPSG","6269"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4269"]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-75],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",0],
UNIT["metre",1],
AXIS["Easting",EAST],� AXIS["Northing",NORTH]]
Section 16 - Projecting Data
PostGIS
What are coordinates of the “Broad St” subway station in geographic?
SELECT
ST_AsText(ST_Transform(geom,4326))
FROM nyc_subway_stations
WHERE name = 'Broad St';���POINT(-74.0106714 40.7071048)
Section 16 - Projecting Data
PostGIS
Section 17 - Projection Exercises
Section 17 - Projection Exercises
PostGIS
What is the length of all streets in New York, as measured in UTM 18?
SELECT � Sum(ST_Length(geom))
FROM nyc_streets;����10418904.7172
Section 17 - Projection Exercises
PostGIS
What is the WKT definition of SRID 2831?
SELECT srtext
FROM spatial_ref_sys
WHERE SRID = 2831;
���PROJCS["NAD83(HARN) / New York Long Island",�...
Section 17 - Projection Exercises
PostGIS
What is the length of all streets in New York, as measured in SRID 2831 (Stateplane Long Island)?
SELECT Sum(ST_Length(� ST_Transform(geom, 2831)� ))
FROM nyc_streets;
�10421993.706374
Section 17 - Projection Exercises
PostGIS
How many streets cross the 74th meridian?
SELECT Count(*)
FROM nyc_streets
WHERE ST_Intersects(
ST_Transform(geom, 4326),
'SRID=4326;LINESTRING(-74 20, -74 60)'
);
�223
Section 17 - Projection Exercises
PostGIS
Section 18 - Geography
Section 18 - Geography
PostGIS
Geographic Coordinate Systems
Section 18 - Geography
PostGIS
What is the distance between Los Angeles and Paris using ST_Distance(geometry, geometry)?
SELECT ST_Distance(� -- Los Angeles (LAX)
'SRID=4326;POINT(-118.4079 33.9434)'::geometry,� -- Paris (CDG)
'SRID=4326;POINT(2.5559 49.0083)'::geometry
);
��121.898285970107
Section 18 - Geography
PostGIS
Section 18 - Geography
PostGIS
Degrees are not units of distance
Degrees are not units of area
Section 18 - Geography
PostGIS
What is the distance between Los Angeles and Paris using ST_Distance(geography, geography)?
SELECT ST_Distance(� -- Los Angeles (LAX)
'SRID=4326;POINT(-118.4079 33.9434)'::geography,� -- Paris (CDG)
'SRID=4326;POINT(2.5559 49.0083)'::geography
);
��9124665.27317673
Section 18 - Geography
PostGIS
Section 18 - Geography
PostGIS
How close will a flight from Los Angeles to Paris come to Iceland?
SELECT ST_Distance(
-- LAX-CDG
'SRID=4326;LINESTRING(� -118.4079 33.9434,� 2.5559 49.0083)'::geography,
-- Iceland
'SRID=4326;POINT(-21.8628 64.1286)'::geography
);
�531773.75711106
Section 18 - Geography
PostGIS
Section 18 - Geography
PostGIS
What is the shortest great-circle route from Los Angeles to Tokyo?
SELECT ST_Distance(
'SRID=4326;POINT(-118.408 33.943)'::geometry, -- LAX
'SRID=4326;POINT( 139.733 35.567)'::geometry) -- NRT
AS geometry_distance,
ST_Distance(
'POINT(-118.408 33.943)'::geography, -- LAX
'POINT( 139.733 35.567)'::geography) -- NRT
AS geography_distance;
geometry_distance: 258.14610835�geography_distance: 8833973.30246194
Section 18 - Geography
PostGIS
Geometry Distance
Section 18 - Geography
PostGIS
Geographic Distance
Section 18 - Geography
PostGIS
Using Geography - Casting
CREATE TABLE nyc_subway_stations_geog AS
SELECT
ST_Transform(geom, 4326)::geography AS geog,
name,
routes
FROM nyc_subway_stations;
Section 18 - Geography
PostGIS
Using Geography - Indexing
CREATE INDEX nyc_subway_stations_geog_gix
ON nyc_subway_stations_geog
USING GIST (geog);
Section 18 - Geography
PostGIS
Using Geography - Querying
WITH empire_state_building AS (� SELECT 'POINT(-73.98501 40.74812)'::geography AS geog
)�SELECT name,
ST_Distance(esb.geog, ss.geog) AS distance,
degrees(ST_Azimuth(esb.geog, ss.geog)) AS direction
FROM nyc_subway_stations_geog ss,
empire_state_building esb
WHERE ST_DWithin(ss.geog, esb.geog, 500);
Section 18 - Geography
PostGIS
443m / 224°
287m / 299°
328m / 125°
Section 18 - Geography
PostGIS
Using Geography - From Scratch
CREATE TABLE airports (
code VARCHAR(3),
geog GEOGRAPHY(Point)
);
INSERT INTO airports � VALUES ('LAX', 'POINT(-118.4079 33.9434)');
INSERT INTO airports � VALUES ('CDG', 'POINT(2.5559 49.0083)');
INSERT INTO airports � VALUES ('KEF', 'POINT(-22.6056 63.9850)');
Section 18 - Geography
PostGIS
Using Geography - From Scratch
SELECT * FROM geography_columns;
Section 18 - Geography
f_table_name | f_geography_column | srid | type |
nyc_subway_stations_geog | geog | 0 | Geometry |
airports | geog | 4326 | Point |
PostGIS
Casting to Geometry
SELECT
code,
ST_X(geog::geometry) AS longitude
FROM airports;
Section 18 - Geography
The “::” syntax tells PostgreSQL to attempt to coerce the data into the new data type, if there is an available path.
PostGIS
Geography Native Functions
Section 18 - Geography
PostGIS
Geography is the Magic Solution?
The complexity of dealing with planar projections (choosing one, getting used to it) drives some users to fixate on the geography type as a simple cure-all.
However:
Section 18 - Geography
PostGIS
double R = 6371000; /* meters */
double d_lat = lat2-lat1; /* radians */
double d_lon = lon2-lon1; /* radians */
double sin_lat = sin(d_lat/2);
double sin_lon = sin(d_lon/2);
double a = sin_lat * sin_lat +
cos(lat1) * cos(lat2) *
sin_lon * sin_lon;
double c = 2 * atan2(sqrt(a),
sqrt(1-a));
double d = R * c;
double dx = x2 - x1;
double dy = y2 - y1;
double d2 = dx * dx +� dy * dy;
double d = sqrt(d2);
geography�distance
geometry�distance
Section 18 - Geography
PostGIS
Section 19 - Geography Exercises
Section 19 - Geography Exercises
PostGIS
How far is New York from Seattle? What are the units of the answer?
SELECT ST_Distance(
'POINT(-74.0064 40.7142)'::geography,� 'POINT(-122.3331 47.6097)'::geography� );���3875538.57141352
Section 19 - Geography Exercises
PostGIS
What is the total length of all streets in New York, calculated on the spheroid?
SELECT Sum(
ST_Length(ST_Transform(geom, 4326)::geography)� ) �FROM nyc_streets;���10421999.666
Section 19 - Geography Exercises
PostGIS
What is the total length of all streets in New York, calculated on the plane?
SELECT Sum(
geom� ) �FROM nyc_streets;���10418904.717
Section 19 - Geography Exercises
PostGIS
How good are projections?
215
Projection | Total Length | Difference |
Geography | 10421999.666032 | 0 / 0% |
Long Island State Plane | 10421993.706374 | -5.9m / 0.00006% |
UTM Zone 18 | 10418904.717199 | -3095m / 0.03% |
Section 19 - Geography Exercises
PostGIS
Does ‘POINT(1 2.0001)’ intersect with �‘POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))’ in geography? �In geometry? Why the difference?
SELECT ST_Intersects(
'POINT(1 2.0001)'::geography,
'POLYGON((0 0,0 2,2 2,2 0,0 0))'::geography�)�
SELECT ST_Intersects(
'POINT(1 2.0001)'::geometry,
'POLYGON((0 0,0 2,2 2,2 0,0 0))'::geometry�)
Section 19 - Geography Exercises
PostGIS
Does ‘POINT(1 2.0001)’ intersect with �‘POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))’ in geography? �In geometry? Why the difference?
Section 19 - Geography Exercises
straight line
great circle
geometry
geography
PostGIS
Section 20 - Geometry �Constructing Functions
Section 20 - Geometry Constructing Functions
PostGIS
Functions so far...
Section 20 - Geometry Constructing Functions
PostGIS
Geometry constructing functions!
Section 20 - Geometry Constructing Functions
PostGIS
ST_Centroid
ST_PointOnSurface
Section 20 - Geometry Constructing Functions
PostGIS
ST_Buffer
Section 20 - Geometry Constructing Functions
PostGIS
“What would a 500 meter marine traffic zone �around Liberty Island look like?”
Section 20 - Geometry Constructing Functions
PostGIS
“What would a 500 meter marine traffic zone �around Liberty Island look like?”
-- New table with a Liberty Island
-- 500m buffer zone
CREATE TABLE liberty_island_zone AS
SELECT
ST_Buffer(geom, 500)::Geometry(Polygon,26918) AS geom
FROM nyc_census_blocks
WHERE blkid = '360610001001001';
Section 20 - Geometry Constructing Functions
PostGIS
“What would a negative 50 meter marine traffic zone �around Liberty Island look like?”
Section 20 - Geometry Constructing Functions
PostGIS
ST_Intersection(A, B)
Section 20 - Geometry Constructing Functions
PostGIS
“What is the area these two circles have in common?”
SELECT ST_AsText(ST_Intersection(
ST_Buffer('POINT(0 0)', 2),
ST_Buffer('POINT(3 0)', 2)
));
Section 20 - Geometry Constructing Functions
PostGIS
“What is the area these two circles have in common?”
POLYGON((
2 0,
1.96157056080646 -0.390180644032256,
1.84775906502257 -0.765366864730179,
1.66293922460509 -1.1111404660392,
1.5 -1.30968248567708,
1.33706077539491 -1.11114046603921,
1.15224093497743 -0.765366864730185,
1.03842943919354 -0.390180644032262,
1 -6.46217829773035e-15,
1.03842943919354 0.39018064403225,
1.15224093497742 0.765366864730173,
1.33706077539491 1.1111404660392,
1.5 1.30968248567708,
1.66293922460509 1.11114046603921,
1.84775906502257 0.765366864730184,
1.96157056080646 0.390180644032261,
2 0))
Section 20 - Geometry Constructing Functions
PostGIS
ST_Union(A, B)
Section 20 - Geometry Constructing Functions
PostGIS
Terminology
Forms
Section 20 - Geometry Constructing Functions
PostGIS
“How would you make a county map from �census blocks?”
Section 20 - Geometry Constructing Functions
PostGIS
Census Block ID
US Census Block IDs encode the geographic hierarchy used by the census.
360610001001001 = 36 061 000100 1 001
36 = State of New York
061 = New York County (Manhattan)
000100 = Census Tract
1 = Census Block Group
001 = Census Block
Section 20 - Geometry Constructing Functions
PostGIS
“How would you make a county map from
census blocks?”
-- An nyc_census_counties table �-- by merging census blocks
CREATE TABLE nyc_census_counties AS
SELECT
ST_Union(geom) AS geom,
SubStr(blkid,1,5) AS countyid
FROM nyc_census_blocks
GROUP BY countyid;
Section 20 - Geometry Constructing Functions
PostGIS
Section 21 - Geometry Constructing Exercises
Section 21 - Geometry Constructing Exercises
PostGIS
How many census blocks don’t contain their own centroid?
SELECT Count(*)
FROM nyc_census_blocks
WHERE NOT
ST_Contains(
geom,
ST_Centroid(geom)
);��481
Section 21 - Geometry Constructing Exercises
PostGIS
Union all the census blocks into a single output. �What kind of geometry is it? How many parts does it have?
CREATE TABLE nyc_census_blocks_merge AS
SELECT ST_Union(geom) AS geom
FROM nyc_census_blocks;
Section 21 - Geometry Constructing Exercises
SELECT ST_GeometryType(geom), � ST_NumGeometries(geom) �FROM nyc_census_blocks_merge;
PostGIS
“What is the area of a one unit buffer around the origin? How different is it from what you would expect? Why?”
SELECT ST_Area(ST_Buffer('POINT(0 0)', 1));
3.121445152258052
�
SELECT pi();
3.141592653589793
Section 21 - Geometry Constructing Exercises
PostGIS
The Brooklyn neighborhoods of Park Slope and Carroll Gardens are going to war! Construct a polygon delineating a 100 meter wide DMZ on the border between the neighborhoods. What is the area of the DMZ?
Section 21 - Geometry Constructing Exercises
PostGIS
CREATE TABLE brooklyn_dmz AS
SELECT
ST_Intersection(
ST_Buffer(ps.geom, 50),
ST_Buffer(cg.geom, 50))
AS geom
FROM
nyc_neighborhoods ps,
nyc_neighborhoods cg
WHERE ps.name = 'Park Slope'
AND cg.name = 'Carroll Gardens';
Wrap in ST_Area(...) for area
Wrap in ST_Transform(..., 4326) to see base map in PgAdmin map
Section 21 - Geometry Constructing Exercises
PostGIS
240
Section 21 - Geometry Constructing Exercises
PostGIS
Section 22 - More �Spatial Joins!
Section 22 - More Spatial Joins
PostGIS
Load the nyc_census_sociodata.sql table
-- 2167�SELECT
Count(*)
FROM nyc_census_sociodata;
Section 22 - More Spatial Joins
PostGIS
How would you make �a census tract map �from census blocks?
Section 22 - More Spatial Joins
PostGIS
Recall...
Liberty Island blkid
360610001001001 = 36 061 000100 1 001
36 = State of New York
061 = New York County (Manhattan)
000100 = Census Tract
1 = Census Block Group
001 = Census Block
Section 22 - More Spatial Joins
PostGIS
Blocks
Tracts
Section 22 - More Spatial Joins
PostGIS
ST_Union() Blocks into Tracts
-- Make the tracts table
CREATE TABLE nyc_census_tract_geoms AS
SELECT
ST_Union(geom) AS geom,
substr(blkid,1,11) AS tractid
FROM nyc_census_blocks
GROUP BY tractid;
-- Index the tractid
CREATE INDEX nyc_census_tract_geoms_tractid_idx
ON nyc_census_tract_geoms (tractid);
Section 22 - More Spatial Joins
PostGIS
How can you associate census data with your census tract map?
Section 22 - More Spatial Joins
PostGIS
ST_Union() Blocks into Tracts
-- Make the tracts table
CREATE TABLE nyc_census_tracts AS
SELECT g.geom, a.*
FROM nyc_census_tract_geoms g
JOIN nyc_census_sociodata a
ON g.tractid = a.tractid;
-- Index the geometries
CREATE INDEX nyc_census_tract_gidx �ON nyc_census_tracts USING GIST (geom);
Section 22 - More Spatial Joins
PostGIS
“List top 10 New York neighborhoods ordered by the proportion of people who have graduate degrees?”
Section 22 - More Spatial Joins
PostGIS
Graduate Degree Population %
SELECT
100.0 * Sum(t.edu_graduate_dipl) /
Sum(t.edu_total) AS graduate_pct,
n.name, n.boroname
FROM nyc_neighborhoods n
JOIN nyc_census_tracts t
ON ST_Intersects(n.geom, t.geom)
WHERE t.edu_total > 0
GROUP BY n.name, n.boroname
ORDER BY graduate_pct DESC
LIMIT 10;
Section 22 - More Spatial Joins
PostGIS
Graduate Degree Population %
graduate_pct | name | boroname
---------------------+-------------------+-----------
47.6469321851453175 | Carnegie Hill | Manhattan
42.1632365492235696 | Upper West Side | Manhattan
41.0656645950763598 | Battery Park | Manhattan
39.5611557679774060 | Flatbush | Brooklyn
39.3409549428379287 | Tribeca | Manhattan
39.2188240872451399 | North Sutton Area | Manhattan
38.6922550118291620 | Greenwich Village | Manhattan
38.6054942073575506 | Upper East Side | Manhattan
37.8834795573140662 | Murray Hill | Manhattan
37.3714416181491744 | Central Park | Manhattan
Section 22 - More Spatial Joins
PostGIS
252
PostGIS
253
The otherwise-empty “Flatbush” neighborhood polygon (which mostly covers Prospect Park) just grazes one high-education tract polygon, resulting in a spurious high measurement for the neighborhood.
PostGIS
What if a tract falls on the border between two neighborhoods?
Section 22 - More Spatial Joins
PostGIS
Centroid as proxy for polygons
Section 22 - More Spatial Joins
PostGIS
Join on ST_Centroid
SELECT
100.0 * Sum(t.edu_graduate_dipl) / � Sum(t.edu_total) AS graduate_pct,
n.name,
n.boroname
FROM nyc_neighborhoods n
JOIN nyc_census_tracts t
ON ST_Contains(n.geom, ST_Centroid(t.geom))
WHERE t.edu_total > 0
GROUP BY n.name, n.boroname
ORDER BY graduate_pct DESC
LIMIT 10;
Section 22 - More Spatial Joins
PostGIS
Join on intersects vs centroid
ST_Intersects()
ST_Centroid()
graduate_pct | name | boroname
--------------+---------------------+-----------
47.6 | Carnegie Hill | Manhattan
42.2 | Upper West Side | Manhattan
41.1 | Battery Park | Manhattan
39.6 | Flatbush | Brooklyn
39.3 | Tribeca | Manhattan
39.2 | North Sutton Area | Manhattan
38.7 | Greenwich Village | Manhattan
38.6 | Upper East Side | Manhattan
37.9 | Murray Hill | Manhattan
37.4 | Central Park | Manhattan
graduate_pct | name | boroname
--------------+---------------------+-----------
48.0 | Carnegie Hill | Manhattan
44.2 | Morningside Heights | Manhattan
42.1 | Greenwich Village | Manhattan
42.0 | Upper West Side | Manhattan
41.4 | Tribeca | Manhattan
40.7 | Battery Park | Manhattan
39.5 | Upper East Side | Manhattan
39.3 | North Sutton Area | Manhattan
37.4 | Cobble Hill | Brooklyn
37.4 | Murray Hill | Manhattan
Section 22 - More Spatial Joins
PostGIS
How many people live within 500m of a subway station?
Section 22 - More Spatial Joins
PostGIS
ST_Centroid
How many people in New York?
SELECT
Sum(popn_total)
FROM nyc_census_blocks;
Section 22 - More Spatial Joins
PostGIS
ST_Centroid
How many people 500m from a subway station?
SELECT
Sum(popn_total)
FROM nyc_census_blocks census
JOIN nyc_subway_stations subway
ON ST_DWithin(
census.geom,
subway.geom,
500
);
Section 22 - More Spatial Joins
PostGIS
ST_Centroid
How many people in New York?
8,175,032
How many people 500m from a subway station?
10,855,873 ?!?!!?
Section 22 - More Spatial Joins
PostGIS
Overlapping query areas
Section 22 - More Spatial Joins
PostGIS
Overlapping query areas
How many people 500m from a subway station?
WITH distinct_blocks AS (
SELECT DISTINCT ON (blkid) popn_total
FROM nyc_census_blocks census
JOIN nyc_subway_stations subway
ON ST_DWithin(
census.geom,
subway.geom,
500)
)
SELECT Sum(popn_total)
FROM distinct_blocks;
Section 22 - More Spatial Joins
PostGIS
Section 23 - Validity
Section 23 - Validity
PostGIS
POLYGON((0 0, 0 1, 2 1, 2 2, 1 2, 1 0, 0 0))
Section 23 - Validity
PostGIS
Why does validity matter?
SELECT ST_Area(ST_GeomFromText(
'POLYGON((0 0, 0 1, 1 1, � 2 1, 2 2, 1 2, � 1 1, 1 0, 0 0))'
));
� 0
-1
1
Geometry algorithms rely on properties enforced by validity: ring orientation, self-crossing, self-touching. All can confuse different algorithms.
Section 23 - Validity
PostGIS
Can we test validity?
SELECT ST_IsValid(ST_GeomFromText(
'POLYGON((0 0, 0 1, 1 1, � 2 1, 2 2, 1 2, � 1 1, 1 0, 0 0))'
));
� false
ST_IsValid() returns a boolean for validity, and ST_IsValidReason() returns a coordinate of where the error is, and a textual reason.
Self-intersection[1 1]
Section 23 - Validity
PostGIS
Can we test validity in bulk?
SELECT name, boroname,
ST_IsValidReason(geom)
FROM nyc_neighborhoods
WHERE NOT ST_IsValid(geom);��� Howard Beach | Queens | Self-intersection[596394 4500899]
Corona | Queens | Self-intersection[595483 4513817]
Red Hook | Brooklyn | Self-intersection[582655 4500908]
Steinway | Queens | Self-intersection[593198 4515125]
Section 23 - Validity
PostGIS
Can we fix validity in bulk?
The “banana polygon” is a polygon with a hole, formed by a ring that touches itself at a single point.
POLYGON((0 0,0 4,4 4,4 0,2 0,0 0),(3 1,2 2,1 1,2 0,3 1))
ST_MakeValid() juggles the components of an invalid polygon to form a “best guess” valid interpretation of the rings. The “banana polygon” gets turns into a traditional exterior/interior ring polygon.
SELECT ST_AsText(ST_MakeValid(� 'POLYGON((0 0, 2 0, 1 1, 2 2, 3 1, 2 0, 4 0, 4 4, 0 4, 0 0))'))
Section 23 - Validity
PostGIS
Can we fix validity in bulk?
ST_MakeValid(geom, options)
UPDATE nyc_neighborhoods�SET geom = ST_MakeValid(geom)
WHERE NOT ST_IsValid(geom);
'method=linework'
'method=structure keepcollapsed=false'
PostGIS 3.2+ includes text options to change the repair algorithm.
Section 23 - Validity
PostGIS
Section 24 - Equality
Section 24 - Equality
PostGIS
272
Section 24 - Equality
PostGIS
CREATE TABLE polygons (id integer, name varchar, poly geometry);
INSERT INTO polygons VALUES
(1, 'Polygon 1', 'POLYGON((-1 1.732,1 1.732,2 0,1 -1.732,
-1 -1.732,-2 0,-1 1.732))'),
(2, 'Polygon 2', 'POLYGON((-1 1.732,-2 0,-1 -1.732,1 -1.732,
2 0,1 1.732,-1 1.732))'),
(3, 'Polygon 3', 'POLYGON((1 -1.732,2 0,1 1.732,-1 1.732,
-2 0,-1 -1.732,1 -1.732))'),
(4, 'Polygon 4', 'POLYGON((-1 1.732,0 1.732, 1 1.732,1.5 0.866,
2 0,1.5 -0.866,1 -1.732,0 -1.732,-1 -1.732,-1.5 -0.866,
-2 0,-1.5 0.866,-1 1.732))'),
(5, 'Polygon 5', 'POLYGON((-2 -1.732,2 -1.732,2 1.732,
-2 1.732,-2 -1.732))');
Create the test polygons
Section 24 - Equality
PostGIS
Ways of testing equality!
ST_OrderingEquals(A, B)
ST_Equals(A, B)
A = B
A ~= B
Section 24 - Equality
PostGIS
SELECT a.name, b.name, � CASE WHEN � ST_OrderingEquals(a.poly, b.poly) � THEN 'Exactly Equal' � ELSE 'Not Exactly Equal' END
FROM polygons AS a, � polygons AS b;
Polygon 1 | Polygon 1 | Exactly Equal
Polygon 1 | Polygon 2 | Not Exactly Equal
Polygon 1 | Polygon 3 | Not Exactly Equal
Polygon 1 | Polygon 4 | Not Exactly Equal
Polygon 1 | Polygon 5 | Not Exactly Equal
Polygon 2 | Polygon 1 | Not Exactly Equal
Polygon 2 | Polygon 2 | Exactly Equal
Polygon 2 | Polygon 3 | Not Exactly Equal
Polygon 2 | Polygon 4 | Not Exactly Equal
Polygon 2 | Polygon 5 | Not Exactly Equal
Polygon 3 | Polygon 1 | Not Exactly Equal
Polygon 3 | Polygon 2 | Not Exactly Equal
Polygon 3 | Polygon 3 | Exactly Equal
Polygon 3 | Polygon 4 | Not Exactly Equal
Polygon 3 | Polygon 5 | Not Exactly Equal
Polygon 4 | Polygon 1 | Not Exactly Equal
Polygon 4 | Polygon 2 | Not Exactly Equal
Polygon 4 | Polygon 3 | Not Exactly Equal
Polygon 4 | Polygon 4 | Exactly Equal
Polygon 4 | Polygon 5 | Not Exactly Equal
Polygon 5 | Polygon 1 | Not Exactly Equal
Polygon 5 | Polygon 2 | Not Exactly Equal
Polygon 5 | Polygon 3 | Not Exactly Equal
Polygon 5 | Polygon 4 | Not Exactly Equal
Polygon 5 | Polygon 5 | Exactly Equal
Section 24 - Equality
PostGIS
SELECT a.name, b.name, � CASE WHEN ST_Equals(a.poly, b.poly) � THEN 'Spatially Equal' � ELSE 'Not Equal' END
FROM polygons AS a, � polygons AS b;
Polygon 1 | Polygon 1 | Spatially Equal
Polygon 1 | Polygon 2 | Spatially Equal
Polygon 1 | Polygon 3 | Spatially Equal
Polygon 1 | Polygon 4 | Spatially Equal
Polygon 1 | Polygon 5 | Not Equal
Polygon 2 | Polygon 1 | Spatially Equal
Polygon 2 | Polygon 2 | Spatially Equal
Polygon 2 | Polygon 3 | Spatially Equal
Polygon 2 | Polygon 4 | Spatially Equal
Polygon 2 | Polygon 5 | Not Equal
Polygon 3 | Polygon 1 | Spatially Equal
Polygon 3 | Polygon 2 | Spatially Equal
Polygon 3 | Polygon 3 | Spatially Equal
Polygon 3 | Polygon 4 | Spatially Equal
Polygon 3 | Polygon 5 | Not Equal
Polygon 4 | Polygon 1 | Spatially Equal
Polygon 4 | Polygon 2 | Spatially Equal
Polygon 4 | Polygon 3 | Spatially Equal
Polygon 4 | Polygon 4 | Spatially Equal
Polygon 4 | Polygon 5 | Not Equal
Polygon 5 | Polygon 1 | Not Equal
Polygon 5 | Polygon 2 | Not Equal
Polygon 5 | Polygon 3 | Not Equal
Polygon 5 | Polygon 4 | Not Equal
Polygon 5 | Polygon 5 | Spatially Equal
Section 24 - Equality
PostGIS
SELECT a.name, b.name, � CASE WHEN a.poly = b.poly� THEN 'Spatially =' � ELSE 'Not =' END
FROM polygons AS a, � polygons AS b;
Polygon 1 | Polygon 1 | Spatially =
Polygon 1 | Polygon 2 | Not =
Polygon 1 | Polygon 3 | Not =
Polygon 1 | Polygon 4 | Not =
Polygon 1 | Polygon 5 | Not =
Polygon 2 | Polygon 1 | Not =
Polygon 2 | Polygon 2 | Spatially =
Polygon 2 | Polygon 3 | Not =
Polygon 2 | Polygon 4 | Not =
Polygon 2 | Polygon 5 | Not =
Polygon 3 | Polygon 1 | Not =
Polygon 3 | Polygon 2 | Not =
Polygon 3 | Polygon 3 | Spatially =
Polygon 3 | Polygon 4 | Not =
Polygon 3 | Polygon 5 | Not =
Polygon 4 | Polygon 1 | Not =
Polygon 4 | Polygon 2 | Not =
Polygon 4 | Polygon 3 | Not =
Polygon 4 | Polygon 4 | Spatially =
Polygon 4 | Polygon 5 | Not =
Polygon 5 | Polygon 1 | Not =
Polygon 5 | Polygon 2 | Not =
Polygon 5 | Polygon 3 | Not =
Polygon 5 | Polygon 4 | Not =
Polygon 5 | Polygon 5 | Spatially =
Section 24 - Equality
PostGIS
SELECT a.name, b.name, � CASE WHEN a.poly ~= b.poly� THEN 'Bounds Equal' � ELSE 'Bounds Not Equal' END
FROM polygons AS a, � polygons AS b;
Polygon 1 | Polygon 1 | Bounds Equal
Polygon 1 | Polygon 2 | Bounds Equal
Polygon 1 | Polygon 3 | Bounds Equal
Polygon 1 | Polygon 4 | Bounds Equal
Polygon 1 | Polygon 5 | Bounds Equal
Polygon 2 | Polygon 1 | Bounds Equal
Polygon 2 | Polygon 2 | Bounds Equal
Polygon 2 | Polygon 3 | Bounds Equal
Polygon 2 | Polygon 4 | Bounds Equal
Polygon 2 | Polygon 5 | Bounds Equal
Polygon 3 | Polygon 1 | Bounds Equal
Polygon 3 | Polygon 2 | Bounds Equal
Polygon 3 | Polygon 3 | Bounds Equal
Polygon 3 | Polygon 4 | Bounds Equal
Polygon 3 | Polygon 5 | Bounds Equal
Polygon 4 | Polygon 1 | Bounds Equal
Polygon 4 | Polygon 2 | Bounds Equal
Polygon 4 | Polygon 3 | Bounds Equal
Polygon 4 | Polygon 4 | Bounds Equal
Polygon 4 | Polygon 5 | Bounds Equal
Polygon 5 | Polygon 1 | Bounds Equal
Polygon 5 | Polygon 2 | Bounds Equal
Polygon 5 | Polygon 3 | Bounds Equal
Polygon 5 | Polygon 4 | Bounds Equal
Polygon 5 | Polygon 5 | Bounds Equal
Section 24 - Equality
PostGIS
Section 25 - Linear Referencing
Section 25 - Linear Referencing
PostGIS
280
“B is 40% of the way along A”
Section 25 - Linear Referencing
PostGIS
281
“The bridge is at mile 10.5 �on Highway 12”
hyw | brdg | loc |
12 | 101 | 10.5 |
hyw | geom |
12 | |
Section 25 - Linear Referencing
PostGIS
282
“The salmon habitat is �from 3km to 5km above �the confluence”
rvr | fsh | from | to |
9 | 101 | 3 | 5 |
hyw | geom |
9 | |
Section 25 - Linear Referencing
PostGIS
283
SELECT ST_LineLocatePoint(�'LINESTRING(0 0,2 2)',�'POINT(0.9 1.1)'�);
��0.5
Section 25 - Linear Referencing
PostGIS
284
SELECT ST_AsText(� ST_LineInterpolatePoint(� 'LINESTRING(0 0,2 2)',� 0.5� ));
POINT(1, 1)
Section 25 - Linear Referencing
PostGIS
285
Section 25 - Linear Referencing
PostGIS
Find nearest street to each subway station
WITH ordered_nearest AS (
SELECT
ST_GeometryN(str.geom,1) AS streets_geom,
str.gid AS str_gid,
sub.geom AS subways_geom,
sub.gid AS subways_gid,
ST_Distance(str.geom, sub.geom) AS distance
FROM nyc_streets str
JOIN nyc_subway_stations sub
ON ST_DWithin(str.geom, sub.geom, 200)
ORDER BY subways_gid, distance ASC
)
Section 25 - Linear Referencing
PostGIS
Find measure of station on nearest street
SELECT
DISTINCT ON (subways_gid)
subways_gid,
streets_gid,
distance,� ST_LineLocatePoint(
streets_geom,
subways_geom) AS measure
FROM ordered_nearest;
Section 25 - Linear Referencing
PostGIS
Find measure of station on nearest street
subways_gid | streets_gid | measure
-------------+-------------+------------------------
1 | 17404 | 0.0023154983819572554
2 | 17318 | 0.6354078182846773
3 | 19086 | 0.24946227178552738
4 | 1924 | 0.11187222763997673
5 | 2067 | 0.9261874246426975
6 | 1934 | 0.33457647816803476
7 | 2024 | 0.5549461001845787
8 | 2469 | 0.2296616075093935
9 | 2024 | 0.9069811058590412
10 | 2067 | 0.6202998183141508
Section 25 - Linear Referencing
PostGIS
How to visualize events? Turn them back into points.
-- New view that turns events back
-- into spatial objects
CREATE OR REPLACE �VIEW nyc_subway_stations_lrs AS
SELECT
events.subways_gid,
ST_LineInterpolatePoint(� ST_GeometryN(streets.geom, 1),
events.measure) AS geom,
events.streets_gid
FROM nyc_subway_station_events events
JOIN nyc_streets streets
ON (streets.gid = events.streets_gid);
Section 25 - Linear Referencing
PostGIS
Original subway stations (orange stars) on Columbus Circle have been snapped over to the nearby roadways in the LRS view (blue circles)
Shows how LRS functions can be used to snap points to a network, as well as to manage actual LRS data.
Section 25 - Linear Referencing
PostGIS
Section 26 - Dimensionally �Extended 9 Intersection Model (DE9IM)
Section 26 - DE9IM
PostGIS
Interior, Boundary and Exterior
Polygons
Linestrings
Points
Section 26 - DE9IM
PostGIS
DE9IM
The DE9IM relationship between two geometries is calculated by examining the dimensionality of each intersection in the grid. ��The full set of 9 intersections precisely defines the nature of the spatial relationship between the geometries.
Section 26 - DE9IM
PostGIS
DE9IM
dim(I(a) ∩ I(b) = 2
dim(B(a) ∩ B(b) = 0
Section 26 - DE9IM
PostGIS
DI9EM matrix codes
Section 26 - DE9IM
PostGIS
296
Fill in the DE9IM matrix for these two geometries, examining each interaction, and then filling in the cell.
Section 26 - DE9IM
PostGIS
297
In terms of relationships, the more complex picture on the left is the same as the simple one on the right, that we will use for SQL examples.
Section 26 - DE9IM
PostGIS
Calculate the DI9EM matrix for two geometries
SELECT ST_Relate(
'LINESTRING(0 0, 2 0)',
'POLYGON((1 -1, 1 1, 3 1, 3 -1, 1 -1))'
);
1010F0212
101�0F0�212
| I | B | E |
I | 1 | 0 | 1 |
B | 0 | F | 0 |
E | 2 | 1 | 2 |
Section 26 - DE9IM
PostGIS
What is this for?
Data quality control. Some features have very specific allowed relationships to other features. Find the ones that break the rules in order to flag and fix them.
“Legal” docks have a specific pattern:�IFF00F212
Section 26 - DE9IM
PostGIS
What is this for? Find any illegal docks.
SELECT docks.*
FROM docks
JOIN lakes
ON ST_Intersects(docks.geom, lakes.geom)
WHERE NOT
ST_Relate(docks.geom,
lakes.geom,
'1FF00F212');
Section 26 - DE9IM
PostGIS
What is this for? Find overlapping census blocks!
Section 26 - DE9IM
| I | B | E |
I | 2 | * | * |
B | * | * | * |
E | * | * | * |
PostGIS
What is this for? Find overlapping census blocks!
SELECT a.gid, b.gid
FROM nyc_census_blocks a,
nyc_census_blocks b
WHERE ST_Intersects(a.geom, b.geom)
AND ST_Relate(a.geom,
b.geom,
'2********')
AND a.gid != b.gid
LIMIT 10;
| I | B | E |
I | 2 | * | * |
B | * | * | * |
E | * | * | * |
Section 26 - DE9IM
PostGIS
What is this for? Find non-noded streets!
Section 26 - DE9IM
| I | B | E |
I | * | * | * |
B | * | T | * |
E | * | * | * |
Non-noded intersections could be either 0-dimensional (simple crossing) or 1-dimensional (shared interior while). Any intersection of interiors is a non-noded case, so we test for the “T” condition: any intersection at all.
PostGIS
What is this for? Find non-noded streets!
SELECT a.gid, b.gid
FROM nyc_streets a,
nyc_streets b
WHERE ST_Intersects(a.geom,
b.geom)
AND NOT ST_Relate(a.geom,
b.geom,
'****T****')
AND a.gid != b.gid
LIMIT 10;
| I | B | E |
I | * | * | * |
B | * | T | * |
E | * | * | * |
Section 26 - DE9IM
PostGIS
Section 27 - Clustering on Indexes
Section 27 - Clustering on Indexes
PostGIS
Section 27 - Clustering on Indexes
PostGIS
Section 27 - Clustering on Indexes
PostGIS
Cluster using Spatial Index
-- Cluster the blocks table based
-- on their spatial index�
CLUSTER nyc_census_blocks
USING nyc_census_blocks_geom_gist;
Section 27 - Clustering on Indexes
PostGIS
Disk Versus Memory/SSD
Section 27 - Clustering on Indexes
PostGIS
Section 28 - 3D
Section 28 - 3D
PostGIS
Dimensions
Section 28 - 3D
PostGIS
POINT (0 0)�POINT Z (0 0 0)�POINT M (0 0 0)�POINT ZM (0 0 0 0)
Higher Dimension WKT
LINESTRING (0 0, 1 1)�LINESTRING Z (0 0 0, 1 1 1)�LINESTRING M (0 0 0, 1 1 1)�LINESTRING ZM (0 0 0 0, 1 1 1 1)
Etc, etc, etc….
Section 28 - 3D
PostGIS
POLYHEDRALSURFACE Z (
((0 0 0, 0 1 0, 1 1 0, 1 0 0, 0 0 0)),
((0 0 0, 0 1 0, 0 1 1, 0 0 1, 0 0 0)),
((0 0 0, 1 0 0, 1 0 1, 0 0 1, 0 0 0)),
((1 1 1, 1 0 1, 0 0 1, 0 1 1, 1 1 1)),
((1 1 1, 1 0 1, 1 0 0, 1 1 0, 1 1 1)),
((1 1 1, 1 1 0, 0 1 0, 0 1 1, 1 1 1))
)
Higher Dimension Types
Section 28 - 3D
PostGIS
3D Functions
Section 28 - 3D
PostGIS
ST_3DDistance
-- sqrt(x^2 + y^2 + z^2) = sqrt(3)
SELECT ST_3DDistance(
'POINT Z (1 1 1)'::geometry,
'POINT Z (2 2 2)'::geometry
);
1.73205080756888
Section 28 - 3D
PostGIS
N-D Indexes
-- Default index build uses 2d �-- operator class gist_geometry_ops_2d
CREATE INDEX nyc_streets_gix_2d ON nyc_streets
USING GIST (geom);
�-- Specify the N-D operator class
CREATE INDEX nyc_streets_gix_nd ON nyc_streets
USING GIST (geom gist_geometry_ops_nd);
“operator class” gist_geometry_ops_nd
Section 28 - 3D
PostGIS
Index-enabled Spatial Functions
Section 28 - 3D
PostGIS
&&& N-D Index Operator
To search using the N-D index explicitly, use the “N-D bounding boxes intersect” operator.
-- Returns true (both 3-D on the zero plane)
SELECT 'POINT Z (1 1 0)'::geometry &&&
'POLYGON ((0 0 0, 0 2 0, 2 2 0, 2 0 0, 0 0 0))'::geometry;
-- Returns false (one 2-D one 3-D)
SELECT 'POINT Z (1 1 1)'::geometry &&&
'POLYGON ((0 0, 0 2, 2 2, 2 0, 0 0))'::geometry;
Section 28 - 3D
PostGIS
&&& N-D Index Operator
To search using the N-D index explicitly, use the “N-D bounding boxes intersect” operator.
-- Returns true (the volume around the linestring interacts �-- with the point)
SELECT 'LINESTRING Z(0 0 0, 1 1 1)'::geometry &&&
'POINT(0 1 1)'::geometry;
Section 28 - 3D
PostGIS
&&& N-D Index Operator
N-D index search can be used on 2-D data, at a slight performance penalty.
-- N-D index operator
SELECT gid, name
FROM nyc_streets
WHERE geom &&&
ST_SetSRID('LINESTRING(586785 4492901,587561 4493037)',26918);
-- 2-D index operator
SELECT gid, name
FROM nyc_streets
WHERE geom &&
ST_SetSRID('LINESTRING(586785 4492901,587561 4493037)',26918);
N-D search
2-D search
Section 28 - 3D
PostGIS
Section 29 - Nearest �Neighbor Searching
Section 29 - Nearest Neighbor Searching
PostGIS
Nearest Neighbor Search
Nearest Neighbor Join
“What is the nearest fire station to this address?”
“What are the 10 nearest gas stations to the current locations?”
“Add the nearest fire station to every parcel in the table.”
Section 29 - Nearest Neighbor Searching
PostGIS
Nearest Neighbor Search
-- The location of Broad St station
-- SRID=26918;POINT(583571.9 4506714.3)
SELECT streets.gid, streets.name,
ST_Distance(streets.geom, � 'SRID=26918;POINT(583571.9 4506714.3)') AS dist
FROM nyc_streets streets
ORDER BY
streets.geom <-> � 'SRID=26918;POINT(583571.9 4506714.3)'::geometry
LIMIT 3;
no WHERE clause
LIMIT clause
ORDER BY distance
Section 29 - Nearest Neighbor Searching
PostGIS
324
gid | name | dist
-------+-----------+--------------------
17385 | Wall St | 0.749987508809928
17390 | Broad St | 0.8836306235191059
17436 | Nassau St | 1.336828024107041
Section 29 - Nearest Neighbor Searching
PostGIS
Nearest Neighbor Join
SELECT subways.gid AS subway_gid,
subways.name AS subway,
streets.name AS street,
streets.gid AS street_gid
FROM nyc_subway_stations subways
CROSS JOIN LATERAL (
SELECT streets.name, streets.geom, streets.gid
FROM nyc_streets streets
ORDER BY streets.geom <-> subways.geom
LIMIT 1
) streets;
LATERAL join
LIMIT clause
ORDER BY distance
outer parameter
Section 29 - Nearest Neighbor Searching
PostGIS
326
Section 29 - Nearest Neighbor Searching
PostGIS
Section 30 - Tracking �History
Section 30 - Tracking History
PostGIS
History Tracking
GIS database models too frequently only model the current state of the data. Fortunately, it is easy to automatically capture history information right in the database without any special changes to client software.
Section 30 - Tracking History
PostGIS
History Tracking
For every tracked table, add a “history table”:
<tablename>_history
History has the columns of the original, plus these extras:
Section 30 - Tracking History
PostGIS
History Tracking
Section 30 - Tracking History
ON INSERT to working table | ON DELETE from working table | ON UPDATE of working table |
Add record to history table:
| Update live record in working table:
| Run the delete routine on the current live history record. Sets “deleted by” and closes the valid range. Run the insert routine with the new state of the working record. Sets “created by” and starts new open valid range. |
Add triggers to the working table that will keep the state of the history table in sync with the state of the working table.
PostGIS
CREATE TABLE nyc_streets_history (
hid SERIAL PRIMARY KEY,
gid INTEGER,
id FLOAT8,
name VARCHAR(200),
oneway VARCHAR(10),
type VARCHAR(50),
geom GEOMETRY(MultiLinestring, 26918),
valid_range TSTZRANGE,
created_by VARCHAR(32),
deleted_by VARCHAR(32)
);
Section 30 - Tracking History
PostGIS
Add current state to history
INSERT INTO nyc_streets_history
(gid, id, name, oneway, type,
geom, created, created_by)
SELECT
gid, id, name, oneway,
type, geom,
tstzrange(now(), NULL),� current_user
FROM nyc_streets;
Section 30 - Tracking History
PostGIS
CREATE OR REPLACE FUNCTION nyc_streets_update() RETURNS trigger AS
$$
BEGIN
UPDATE nyc_streets_history
SET deleted = current_timestamp, deleted_by = current_user
WHERE valid_range @> current_timestamp AND gid = OLD.gid;
INSERT INTO nyc_streets_history
(gid, id, name, oneway, type, geom, created, created_by)
VALUES
(NEW.gid, NEW.id, NEW.name, NEW.oneway, NEW.type, NEW.geom,
tstzrange(current_timestamp, NULL), current_user);
RETURN NEW;
END;
$$
LANGUAGE plpgsql;
Section 30 - Tracking History
PostGIS
Demonstrate History Tracking
Section 30 - Tracking History
PostGIS
Section 31 - Basic �PostgreSQL Tuning
Section 31 - Basic PostgreSQL Tuning
PostGIS
Dividing up RAM
shared_buffers
maintenance
spare ram
work_mem
These settings actually change how much memory the database uses when running, so be judicious. The work_mem in particular is easy to over-allocate if you under-estimate the number of concurrent connections your database will be handling.
Section 31 - Basic PostgreSQL Tuning
PostGIS
Tuning for the Query Planner
random_page_cost | seq_page_cost | effective_cache_size |
Relative to the sequential page cost and other fixed costs. Random access on spinning media is costly. Random access in RAM is fast. If your disks are fast, or SSD, this value should be lower. | For most media a sequential access is cheap, because the system will “prefetch” data ahead of the current request, so it tends to have the next needed datum in memory already. Leave this value at the default. | Operating systems cache heavily used portions of disk (like, say, indexes) into memory. This value should reflect how much “free” space you expect the OS to have, after all database and other system use is accounted for. |
These settings won’t have any impact on the amount of resources the database uses, they just alter the kinds of query plans that are used.
Section 31 - Basic PostgreSQL Tuning
PostGIS
Tuning on the Fly
SET work_mem TO '1GB';
CREATE INDEX ...;
SET work_mem TO '32MB';
Some settings can be changed at run-time, which is useful if you are testing changes, and also for some workloads.
Building a new index? You can speed up index builds by increasing work_mem before running the build.
Section 31 - Basic PostgreSQL Tuning
PostGIS
Section 32 - PostgreSQL Security
Section 32 - PostgreSQL Security
PostGIS
Read-Only Role
-- A user account for the web app
CREATE USER app1;
-- Web app needs access to specific
-- data tables
GRANT SELECT ON nyc_streets TO app1;
-- A generic role for access to PostGIS
-- functionality
CREATE ROLE postgis_reader INHERIT;
-- Give that role to the web app
GRANT postgis_reader TO app1;
Section 32 - PostgreSQL Security
PostGIS
Read-Only Role
-- This works!
SELECT * FROM nyc_streets LIMIT 1;
-- This doesn’t work!
SELECT ST_AsText(ST_Transform(geom, 4326))
FROM nyc_streets LIMIT 1;
Why??!
Section 32 - PostgreSQL Security
PostGIS
Read-Only Role
GRANT SELECT � ON geometry_columns TO postgis_reader;
GRANT SELECT � ON geography_columns TO postgis_reader;
GRANT SELECT � ON spatial_ref_sys TO postgis_reader;
-- This works now!
SELECT
ST_AsText(ST_Transform(geom, 4326))
FROM nyc_streets LIMIT 1;
Section 32 - PostgreSQL Security
PostGIS
Read-Write Role
-- Make a postgis writer role
CREATE ROLE postgis_writer;
-- Start by giving it the
-- postgis_reader powers
GRANT postgis_reader TO postgis_writer;
-- OPTIONAL insert/update/delete powers for
-- the spatial_ref_sys table
GRANT INSERT, UPDATE, DELETE � ON spatial_ref_sys TO postgis_writer;
Section 32 - PostgreSQL Security
PostGIS
Read-Write Role
-- Add insert/update/delete abilities to
-- our web application
GRANT INSERT, UPDATE, DELETE
ON nyc_streets TO postgis_writer;
-- Make app1 a PostGIS writer to see
-- if it works!
GRANT postgis_writer TO app1;
Section 32 - PostgreSQL Security
PostGIS
Encryption in PostgreSQL
Section 32 - PostgreSQL Security
PostGIS
Enabling SSL Connections
# Create a new certificate, filling out
# the certification info as prompted
openssl req -new -text -out server.req
# Strip the passphrase from the certificate
openssl rsa -in privkey.pem -out server.key
# Convert the certificate into a self-signed cert
openssl req -x509 -in server.req -text \
-key server.key -out server.crt
# Set the permission of the key
# to private read/write
chmod og-rwx server.key
Section 32 - PostgreSQL Security
PostGIS
Enabling SSL Connections
Section 32 - PostgreSQL Security
PostGIS
Enabling Data Encryption
CREATE EXTENSION pgcrypto;
-- encrypt a string using blowfish (bf)
SELECT encrypt('this is a test phrase', -- string
'mykey', -- key
'bf'); -- cipher�
-- round-trip a string using blowfish (bf)
SELECT decrypt(encrypt('this is a test phrase',
'mykey',
'bf'),
'mykey',
'bf');
Section 32 - PostgreSQL Security
PostGIS
Authentication (pg_hba.conf)
Section 32 - PostgreSQL Security
PostGIS
Authentication (pg_hba.conf)
# TYPE DATABASE USER CIDR-ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv4 local network connections:
host all all 192.168.1.0/24 md5
# IPv6 local connections:
host all all ::1/128 trust
# remote connections for nyc database only
host nyc all 10.10.1.0/16 ldap
Section 32 - PostgreSQL Security
PostGIS
Section 33 - Schemas
Section 33 - Schemas
PostGIS
Why Schemas?
Section 33 - Schemas
PostGIS
Why Schemas?
Without schemas, tables default to being put into the “public” catch-all schema. This is fine if your database only has one user, but if you have multiple users or applications, segmenting them into schemas will make access control and general cleanliness much easier to achieve.
Section 33 - Schemas
PostGIS
Common Schema Tasks
-- create a schema
CREATE SCHEMA census;
-- move a table into a schema
ALTER TABLE nyc_census_blocks SET SCHEMA census;
-- reference a table with schema �SELECT * FROM census.nyc_census_blocks LIMIT 1;
-- add schema to search path
SET search_path = census, public;
-- permanently add schema to search path
ALTER USER postgres
SET search_path = census, public;
Section 33 - Schemas
PostGIS
Setting a User Schema
-- create a user
CREATE USER myuser WITH ROLE postgis_writer;
-- create a user schema
CREATE SCHEMA myuser AUTHORIZATION myuser;
If there is a schema with the same name as a user, the default search_path results in all new tables going into that schema and default searches finding those tables.
Section 33 - Schemas
PostGIS
Section 34 - Backup and Restore
Section 34 - Backup and Restore
PostGIS
Backup / Restore Options
Section 34 - Backup and Restore
PostGIS
Generic Backup Theory
Section 34 - Backup and Restore
PostGIS
Two Kinds of Upgrade
Section 34 - Backup and Restore
PostGIS
PostgreSQL Upgrades
Section 34 - Backup and Restore
PostGIS
PostgreSQL Dump/Restore
Section 34 - Backup and Restore
PostGIS
PostgreSQL pg_upgrade
pg_upgrade
--old-datadir "/var/lib/postgres/12/data"
--new-datadir "/var/lib/postgres/13/data"
--old-bindir "/usr/pgsql/12/bin"
--new-bindir "/usr/pgsql/13/bin"
Section 34 - Backup and Restore
PostGIS
PostGIS Upgrades
Section 34 - Backup and Restore
PostGIS