1 of 72

https://goo.gl/a92MDH

paul.ramsey@crunchydata.ca

2 of 72

  • Enterprise support for PostgreSQL
  • Security experts
    • Row level security
    • Common Criteria Certified / STIG
  • Cloud experts
    • Kubernetes, Postgres Operator

3 of 72

SQL =

4 of 72

5 of 72

6 of 72

7 of 72

Groovy �SQL!

8 of 72

Popular SQL!

9 of 72

Far Out �Spatial SQL!

10 of 72

Groovy SQL!

Motivation: Campaign Data & Analysis

  • Operational data mangling
  • Lots of reporting, summaries

11 of 72

Groovy �SQL!

12 of 72

  • String mangling is a very common use case
  • NULL || ‘string’ IS NULL
  • Arg!!!!

Groovy SQL!

13 of 72

Groovy SQL!

nullif(middle_name, ‘’)� coalesce(middle_name, ‘’)

14 of 72

Groovy SQL!

concat_ws(‘ ’, � first_name, � middle_name, � last_name)

  • How to join name/address parts?�(e.g. building mailing labels)
  • Very common in address registers

15 of 72

Groovy SQL!

concat_ws(‘, ’, � concat_ws(‘ -- ’, � apt_no, � concat_ws(‘ , � street_num, � street_name,� street_dir) � city, � state)

16 of 72

  • For BI queries on large “fact table”
  • Alternative to�Sum(CASE WHEN … THEN … ELSE … END)
  • More terse, slightly faster

Groovy SQL!

Sum() FILTER (WHERE ...)

17 of 72

Groovy SQL!

BI “fact table” - long, denormalized

name

text

age

integer

income

real

gender

text

city

text

state

text

sales

real

18 of 72

Groovy SQL!

SELECTAvg(income) AS income,� Avg(income) AS inc_male� FILTER (WHERE gender = ‘male’),

Avg(income) AS inc_older� FILTER (WHERE age > ‘55’)

FROM demographics�GROUP BY city;

19 of 72

  • Ensure the parts are aggregated in a desired order

Groovy SQL!

string_agg(name, ‘,’ � ORDER BY …)

ST_MakeLine(gps_pt ORDER BY gps_time)

20 of 72

Groovy SQL!

21 of 72

Groovy SQL!

SELECT

gps_track_id,� ST_MakeLine(� gps_pt ORDER BY gps_time� ) AS geom

FROM gps_pts

GROUP BY gps_track_id;

22 of 72

DISTINCT ON

Groovy SQL!

http://files.boundlessgeo.com/workshopmaterials/postgis-workshop-201401.zip

23 of 72

Groovy SQL!

http://files.boundlessgeo.com/workshopmaterials/postgis-workshop-201401.zip

24 of 72

Groovy SQL!

http://files.boundlessgeo.com/workshopmaterials/postgis-workshop-201401.zip

25 of 72

Groovy SQL!

SELECT

DISTINCT ON (sbw.cartodb_id)

str.*

FROM nyc_streets str

JOIN nyc_subway_stations sbw

ON ST_DWithin(

str.the_geom_webmercator,

sbw.the_geom_webmercator,

200)

ORDER BY

sbw.cartodb_id,

ST_Distance(

str.the_geom_webmercator,

sbw.the_geom_webmercator)

26 of 72

Groovy SQL!

subway

street

distance

B

32nd Street

123.5

A

23nd Street

23.4

B

Reed Avenue

12.4

A

24th Street

34.4

B

25th Street

102.2

A

Reed Avenue

45.5

C

State Street

32.1

B

State Street

56.4

27 of 72

Groovy SQL!

SELECT

DISTINCT ON (sbw.cartodb_id)

str.*

FROM nyc_streets str

JOIN nyc_subway_stations sbw

ON ST_DWithin(

str.the_geom_webmercator,

sbw.the_geom_webmercator,

200)

ORDER BY

sbw.cartodb_id,

ST_Distance(

str.the_geom_webmercator,

sbw.the_geom_webmercator)

28 of 72

Groovy SQL!

subway

street

distance

A

23nd Street

23.4

A

24th Street

34.4

A

Reed Avenue

45.5

B

Reed Avenue

12.4

B

State Street

56.4

B

25th Street

102.2

B

32nd Street

123.5

C

State Street

32.1

29 of 72

Groovy SQL!

SELECT

DISTINCT ON (sbw.cartodb_id)

str.*

FROM nyc_streets str

JOIN nyc_subway_stations sbw

ON ST_DWithin(

str.the_geom_webmercator,

sbw.the_geom_webmercator,

200)

ORDER BY

sbw.cartodb_id,

ST_Distance(

str.the_geom_webmercator,

sbw.the_geom_webmercator)

30 of 72

Groovy SQL!

subway

street

distance

A

23nd Street

23.4

B

Reed Avenue

12.4

C

State Street

32.1

31 of 72

Popular SQL!

32 of 72

Popular SQL!

What are the most popular PostGIS questions on Stack Exchange?

33 of 72

Popular SQL!

https://github.com/CartoDB� /carto-workshop/tree/master/04-database

34 of 72

Popular SQL!

35 of 72

Popular SQL!

36 of 72

Popular SQL!

37 of 72

Popular SQL!

Box–Muller transform

38 of 72

Popular SQL!

CREATE TABLE nyc_subway_stations_gaussian AS

WITH u AS (

SELECT random() AS u1,

random() AS u2

FROM generate_series(1,1000)

),

off AS (

SELECT

sqrt(-2*ln(u1))*cos(2*pi()*u2) AS x_off,

sqrt(-2*ln(u1))*sin(2*pi()*u2) AS y_off

FROM u

)

SELECT row_number() OVER () AS cartodb_id, name, routes,

ST_Translate(the_geom_webmercator, 150*x_off, 150*y_off) � AS the_geom_webmercator

FROM nyc_subway_stations sub

CROSS JOIN off

WHERE sub.name IN ('Grand St', 'Bowery', 'Essex St')

UNIFORM

GAUSSIAN

JOIN/OFFSET

39 of 72

Popular SQL!

ST_ClusterWithin(the_geom_webmercator, 25)

40 of 72

WITH c AS (

SELECT

unnest(ST_ClusterWithin(the_geom_webmercator, 23)) � AS the_geom_webmercator

FROM nyc_subway_stations_gaussian

),

d AS (

SELECT row_number() OVER () AS cluster_id, � ST_Dump(the_geom_webmercator) AS dump

FROM c

)

SELECTrow_number() OVER () AS cartodb_id, � cluster_id,� (dump).geom AS the_geom_webmercator

FROM d

Popular SQL!

41 of 72

Popular SQL!

ST_ClusterDBScan(the_geom_webmercator, 25)

42 of 72

SELECT

ST_ClusterDBSCAN(the_geom_webmercator, 23, 100) � OVER (),

the_geom_webmercator,

cartodb_id

FROM nyc_subway_stations_gaussian

Popular SQL!

Window Functions FTW!

43 of 72

Popular SQL!

ST_ClusterKMeans(the_geom_webmercator, 3)

44 of 72

Popular SQL!

Geography Pros

Geography Cons

  • Exact
  • Understood
  • “Simple”
  • Slow
  • Incomplete
  • Confusing

45 of 72

Popular SQL!

Geography Pros

Geography Cons

  • Exact
  • Understood
  • “Simple”
  • Slow
  • Incomplete
  • Confusing

46 of 72

Popular SQL!

ST_Intersects(geom, geom) = TRUE

ST_Intersects(geog, geog) = FALSE

47 of 72

Popular SQL!

ST_Segmentize(geog, 10000)

48 of 72

Popular SQL!

ST_AsMVTGeom(geom, b2dBounds, intRes)

ST_AsMVT(row)

49 of 72

Popular SQL!

50 of 72

Popular SQL!

“when I intersect a parcel dataset with 329,152 multipolygons with a jurisdiction dataset containing 525 multipolygons”

51 of 72

Popular SQL!

52 of 72

Popular SQL!

SELECT

a.id, b.id,

CASE

WHEN ST_Contains(a.geom, b.geom)

THEN b.geom

ELSE ST_Intersection(a.geom, b.geom)

END AS geom

FROM a, b

WHERE ST_Intersects(a.geom, b.geom);

53 of 72

Popular SQL!

54 of 72

Far Out �Spatial SQL!

55 of 72

Far Out Spatial SQL!

ST_Subdivide()

56 of 72

Far Out Spatial SQL!

57 of 72

Far Out Spatial SQL!

58 of 72

Far Out Spatial SQL!

59 of 72

Far Out Spatial SQL!

60 of 72

Far Out Spatial SQL!

61 of 72

Far Out Spatial SQL!

62 of 72

30x

Far Out Spatial SQL!

63 of 72

<->

Far Out Spatial SQL!

64 of 72

SELECT * �FROM nyc_subway_stations

ORDER BY

the_geom <->CDB_LatLng(40.72,-73.94)

LIMIT 2;

Far Out Spatial SQL!

65 of 72

Far Out Spatial SQL!

66 of 72

“Find the nearest that is not yourself.”

Far Out Spatial SQL!

67 of 72

SELECT

row_number() OVER () AS cartodb_id,

ST_Transform(ST_MakeLine(a.the_geom, b.the_geom),3857) � AS the_geom_webmercator,

ST_DistanceSphere(a.the_geom, b.the_geom) � AS distance,

a.name AS name_a,

b.name AS name_b

FROM nyc_subway_stations AS a

CROSS JOIN LATERAL (

SELECT subq.the_geom, subq.name

FROM nyc_subway_stations subq

WHERE a.cartodb_id != subq.cartodb_id

ORDER BY

a.the_geom <-> subq.the_geom

LIMIT 1) AS b

Far Out Spatial SQL!

68 of 72

SELECT

...

FROM table_a AS a

CROSS JOIN LATERAL (

SELECT table_b.*

FROM table_b

ORDER BY

a.the_geom <-> table_b.the_geom

LIMIT 1) AS b

Far Out Spatial SQL!

69 of 72

SELECT

row_number() OVER () AS cartodb_id,

ST_Transform(ST_MakeLine(a.the_geom, b.the_geom),3857) � AS the_geom_webmercator,

ST_DistanceSphere(a.the_geom, b.the_geom) � AS distance,

a.name AS name_a,

b.name AS name_b

FROM nyc_subway_stations AS a

CROSS JOIN LATERAL (

SELECT subq.the_geom, subq.name

FROM nyc_subway_stations subq

WHERE a.cartodb_id != subq.cartodb_id

ORDER BY

a.the_geom <-> subq.the_geom

LIMIT 1) AS b

Far Out Spatial SQL!

70 of 72

Far Out Spatial SQL!

71 of 72

Groovy SQL!

Popular SQL!

Far Out �Spatial SQL!

72 of 72

https://goo.gl/a92MDH

paul.ramsey@crunchydata.ca