PostGIS �Performance Tips
paul.ramsey@crunchydata.com
MongoDB is WeB ScALe!
Mandatory dig at NoSQL databases and the sad, sad, Sad people who use them
😀�PostGIS is great! So fast!
😡�PostGIS is unacceptably slow!
😐�…
The hero’s journey…
How do I make PostGIS faster?!?
It Depends
The $250,000 feature flag!
Brought to you by…
ALTER SESSION � SET SPATIAL_VECTOR_ACCELERATION = true
The dream…!
The reality…
ALTER SESSION � SET postgis_faster = true
yum update geos
GEOS 3.10
GEOS GOES FASTER
crunchydata.com
GEOS GOES FASTER
crunchydata.com
It Depends
What “It Depends” On
crunchydata.com
Data Size
Foreach Record(A)� Foreach Record(B)
TestJoinCondition()
TestJoinCondition()
runs 10,000 times
crunchydata.com
Data Size
Foreach Record(A)� Foreach Record(B)
TestJoinCondition()
TestJoinCondition()
runs 100,000,000 times
crunchydata.com
Performance Principle #1
Remember Spatial Indexes
Build them! �Use them! �EXPLAIN ANALYZE!
crunchydata.com
What Indexes? (Advanced)
crunchydata.com
What Indexes? (Advanced)
crunchydata.com
What Indexes? (Advanced)
Good spatial autocorrelation
Poor spatial autocorrelation
crunchydata.com
What Indexes? (Advanced)
crunchydata.com
What Indexes? (Advanced)
crunchydata.com
Spatial Indexes (RTree)
gist_geometry_ops_2d builds…
Dun, dun, duuuun!!
crunchydata.com
SELECT pts.geom
FROM pts p
JOIN countries c
ON ST_Intersects(p.geom, c.geom)� WHERE c.name = 'Canada'
SELECT pts.geom
FROM pts p
JOIN countries c
ON ST_Intersects(p.geom, c.geom)� WHERE c.name = 'Canada'
SELECT pts.geom
FROM pts p
JOIN countries c
ON ST_Intersects(p.geom, c.geom)� WHERE c.name = 'Canada'
Performance Principle #2
Strive for Homogeneity
Shapes of similar extent
Filters of similar selectivity�EXPLAIN ANALYZE!
crunchydata.com
WITH subdivided AS (� SELECT ST_Subdivide(geom) AS geom� FROM countries� WHERE c.name = 'Canada'�)
SELECT pts.geom
FROM pts p
JOIN subdivided s� ON ST_Intersects(p.geom, s.geom)
WITH subdivided AS (� SELECT ST_Subdivide(geom) AS geom� FROM countries� WHERE c.name = 'Canada'�)
SELECT pts.geom
FROM pts p
JOIN subdivided s� ON ST_Intersects(p.geom, s.geom)
WITH subdivided AS (� SELECT ST_Subdivide(geom) AS geom� FROM countries� WHERE c.name = 'Canada'�)
SELECT pts.geom
FROM pts p
JOIN subdivided s� ON ST_Intersects(p.geom, s.geom)
WITH subdivided AS (� SELECT ST_Subdivide(geom) AS geom� FROM countries�)
SELECT pts.geom
FROM pts p
JOIN subdivided s� ON ST_Intersects(p.geom, s.geom)
Performance Principle #3
Enough Detail and No More
Use simplification
Use approximations�Beware of the TOAST
crunchydata.com
WITH subdivided AS (� SELECT ST_Subdivide(geom) AS geom� FROM countries� WHERE c.name = 'Canada'�)
SELECT pts.geom
FROM pts p
JOIN subdivided s� ON ST_Intersects(p.geom, s.geom)
TOAST
crunchydata.com
TOAST
crunchydata.com
5128 vertices�915.378 Ha
4211 vertices �915.380 Ha
ST_Simplify()
964 vertices
225 vertices
ST_Simplify()
15 �vertices
ST_Simplify()
915 Ha
927 Ha
951 Ha
ST_ConcaveHull(polygon)
crunchydata.com
Match Data Detail to Resolution
How big is a pixel?
crunchydata.com
Dilate and Erode
Too detailed!
crunchydata.com
Dilate and Erode
Dilate! ST_Buffer(g, 100)
crunchydata.com
Dilate and Erode
Erode! ST_Buffer(g, -100)
crunchydata.com
Performance Principle #4
Tame Query Complexity
Use JOIN syntax
Don’t get CTE happy�EXPLAIN! Test in parts
crunchydata.com
Code smell #1 - “I Hate Joins”
SELECT p.objectid, p.name�FROM parcels p
WHERE p.zone IN (
SELECT zoneid
FROM zoning
WHERE zonename LIKE 'R%'
)
SELECT p.objectid, p.name�FROM parcels p
JOIN zoning z � ON p.zone = z.zoneid
WHERE z.zonename LIKE 'R%'
Not like this!
Like this!
crunchydata.com
Code smell #2 - “I Hate Joins”
SELECT p.objectid AS f_id,
ST_Area(ST_Intersection(p.geom, w.geom)) / � ST_Area(p.geom) � AS wetland_pct
FROM dbo.parcels p, dbo.wetland w
WHERE ST_Intersects(p.geom, w.geom)
ORDER BY p.objectid
Use JOIN syntax
crunchydata.com
Code smell #3 - “CTE all the things”
WITH p AS (
SELECT objectid, geom
FROM dbo.parcels
)
SELECT p.objectid AS f_id,
ST_Area(ST_Intersection(p.geom, w.geom)) AS wetland_area
FROM p, dbo.wetland w
WHERE ST_Intersects(p.geom, w.geom)
ORDER BY b.objectid
Avoid�optimization�boundaries
Because they (can) wreck indexing
crunchydata.com
Code smell #3 - “CTE all the things”
WITH p AS (
SELECT objectid, geom
FROM dbo.parcels
)
SELECT p.objectid AS f_id,
ST_Area(ST_Intersection(p.geom, w.geom)) AS wetland_area
FROM p, dbo.wetland w
WHERE ST_Intersects(p.geom, w.geom)
ORDER BY b.objectid
Avoid�optimization�boundaries
Because they (can) wreck indexing
crunchydata.com
Code smell #4 - “Summarize it all”
WITH bounds AS MATERIALIZED (
SELECT ST_TileEnvelope(0,0,0) AS merc,
ST_Transform(ST_TileEnvelope(0,0,0), 4326) AS ll
),
mvtgeom AS (
SELECT ST_AsMVTGeom(ST_Transform(t.geom, 3857), bounds.merc) AS geom
FROM gps_points t� CROSS JOIN bounds
WHERE ST_Intersects(t.geom, bounds.ll)
)
SELECT ST_AsMVT(mvtgeom.*) FROM mvtgeom
the �whole �world
In addition to “tile the world”, “union the world” is also popular.
crunchydata.com
Debugging Principles
“Why is my five-page SQL query running slowly?”
crunchydata.com
Debugging Principles
“Why is my five-page SQL query running slowly?”
crunchydata.com
Debugging Principles
“Why is my five-page SQL query running slowly?”
crunchydata.com
😀�PostGIS is great! So fast!
😡�PostGIS is unacceptably slow!
😐�…
The hero’s journey…
Questions?
paul.ramsey@crunchydata.com