Building a Scalable Modern GIS Infrastructure
Matt Forrest - I-GUIDE Forum 2023
Matt Forrest (he/him)
I-GUIDE Forum 2023
Overview
About me
Agenda
Exercises
What you’ll need
https://github.com/mbforr/modern-gis-workshop
What is Modern GIS?
Modern GIS is the set of processes, systems, and technologies used to derive insights from geospatial data. It uses open, interoperable, standards-based technology, can run locally or in the cloud, and scales to work with many different types, velocities, and volumes of data.
Comparison
| | Traditional | Modern |
| Standards | Platform and software-based | Open and standards-based |
| Cloud Access | Cloud-hosted or on-premises | Cloud-native |
| Deployment | Local software package up to enterprise software packages | Open-source local use up to full enterprise |
| Collaboration | Siloed | Interoperable |
| Scalability | Single-threaded | Serverless |
| Data | Limited data scale | Scalable, even further in the cloud |
The Modern Data Stack
History and growth
Why?
https://moderndata101.substack.com/p/evolution-of-the-data-stack-the-story
https://motherduck.com/blog/motherduck-open-for-all-with-series-b/
https://tanay.substack.com/p/understanding-the-modern-data-stack
The Modern GIS Stack
Modern Geospatial Data Stack (Early 2024)
[Diagram: Data Sources, Ingestion, Reverse ETL, Storage, Transformation, Analytics, Data Science, Applications, Mapping]
Matt Forrest - forrest.nyc
🍦yogrt
Modern Geospatial Data Stack (Mid 2024)
[Diagram: Data Lake, Transform, Processing, OLTP, Orchestration, Formats, OLAP, Analytics, GIS, Python, Applications]
The Modern GIS Stack
[Diagram: Data Sources, Ingestion, Reverse ETL, Storage, Transformation, GIS, Data Science, Applications]
Data Sources
The Apache Arrow project specifies a standardized, language-independent columnar memory format. It enables shared computational libraries, zero-copy shared memory, streaming messaging, and interprocess communication, and it is supported by many programming languages and data libraries.
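To make "columnar memory format" and "zero-copy" a little more concrete, here is a minimal sketch that uses only the Python standard library. It is not the Arrow API and does not reproduce Arrow's actual memory layout. The sample records and the names `rows`, `columns`, `lat_view`, and `first_two` are made up for illustration.

```python
from array import array

# Row-oriented layout: each record keeps all of its fields together.
# (Hypothetical sample records, for illustration only.)
rows = [
    {"id": 1, "lon": -73.99, "lat": 40.73},
    {"id": 2, "lon": -87.63, "lat": 41.88},
    {"id": 3, "lon": -122.42, "lat": 37.77},
]

# Column-oriented layout: each field lives in one contiguous typed buffer.
columns = {
    "id": array("q", (r["id"] for r in rows)),    # signed 64-bit integers
    "lon": array("d", (r["lon"] for r in rows)),  # 64-bit floats
    "lat": array("d", (r["lat"] for r in rows)),  # 64-bit floats
}

# A memoryview exposes the column's buffer without copying it, and a
# slice of the view still shares the same underlying memory.
lat_view = memoryview(columns["lat"])
first_two = lat_view[:2]

# Scanning one column only touches that column's buffer.
max_lat = max(lat_view)

# A write through the original array is visible through the view,
# which shows that no copy was made.
columns["lat"][0] = 40.75
shared = first_two[0] == 40.75
```

In this toy version the "zero-copy" part is the `memoryview`: several consumers can read the same column buffer without serializing or duplicating it. Arrow applies the same idea across processes and languages by standardizing the buffer layout.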
Zarr
GeoParquet
Exercise 1: Using GeoParquet
Ingestion
GDAL
Airflow
Airbyte
Data Storage
Spatial SQL
ESRI started in 1969 and created commercial GIS
Eras: Traditional database era, The move to open, The modern data stack
1994: Illustra Spatial launches
1995: Oracle Spatial Data Option (SDO)
1995: Oracle Spatial in Oracle 8i
1996: Informix acquires Illustra; Spatial Datablade launched
2001: PostGIS candidate release
2002: IBM releases DB2 Spatial Extender
2003: IBM acquires Informix; Spatial Extender launched
2003: OGC adopts ISO 19125
2005: PostGIS 1.0 released
2008: Spatial support in MS SQL Server
2008: Spatialite for SQLite launched
2009: MySQL launches spatial support
2015: Geospark launched
2018: Spatial support in BigQuery
2019: Spatial support in Redshift
2020: Spatial support in Snowflake
2021: Apache Sedona launched
2021: Spatial support in Apache Pinot
2022: H3 support in Databricks
2023: Spatial support in DuckDB
PostGIS
Cloud data warehouses
BigQuery
Snowflake
Redshift
Analytics Toolbox
DuckDB
MotherDuck
Hybrid Approach
Complete spatial tools
Raster support
Extensions
Large scale data
Parquet!
Fast processing
Sharable databases
Exercise 2: PostGIS with Docker
Exercise 3: Quack with DuckDB
Reverse ETL
What is Reverse ETL?
Transformation
GDAL
H3 Indexing
dbt
Exercise 4: Process data with dbt
Exercise 5: Translating rasters to H3
Thank you!
LinkedIn @mbforr
forrest.nyc
matt@carto.com
spatial-sql.com
mattforrest.substack.com