1 of 21

Query Optimization Wrap-Up

February 28, 2024

Data 101/Info 258, Spring 2024 @ UC Berkeley

Aditya Parameswaran https://data101.org/sp24

LECTURE 11a

2 of 21

EXPLAIN ANALYZE with JOINs

Rules of Thumb, Part II

Other Knobs beyond Query Optimization

Lecture 10, Data 101/Info 258 Spring 2024

3 of 21

REPEAT: explain analyze select * from actor, cast_info, movie where actor.id = cast_info.person_id and movie.id = cast_info.movie_id limit 10;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=244945.11..244946.60 rows=10 width=318) (actual time=3557.268..3904.228 rows=10 loops=1)

-> Hash Join (cost=244945.11..3855759.20 rows=24248734 width=318) (actual time=3557.267..3904.224 rows=10 loops=1)

Hash Cond: (cast_info.movie_id = movie.id)

-> Hash Join (cost=202289.55..2141712.01 rows=36244344 width=116) (actual time=3152.976..3586.569 rows=9556 loops=1)

Hash Cond: (cast_info.person_id = actor.id)

-> Seq Scan on cast_info (cost=0.00..615130.44 rows=36244344 width=42) (actual time=0.006..100.851 rows=845147 loops=1)

-> Hash (cost=97287.91..97287.91 rows=4167491 width=74) (actual time=3117.718..3117.718 rows=4167491 loops=1)

Buckets: 65536 Batches: 256 Memory Usage: 2273kB

-> Seq Scan on actor (cost=0.00..97287.91 rows=4167491 width=74) (actual time=0.025..488.901 rows=4167491 loops=1)

-> Hash (cost=15598.25..15598.25 rows=662825 width=202) (actual time=243.959..243.959 rows=662825 loops=1)

Buckets: 32768 Batches: 64 Memory Usage: 1350kB

-> Seq Scan on movie (cost=0.00..15598.25 rows=662825 width=202) (actual time=0.023..70.715 rows=662825 loops=1)

Planning Time: 0.664 ms

Execution Time: 4171.240 ms

// note 2 hash joins where actor and cast_info are joined, followed by a join with movie

create index actoridindex on actor(id);

explain analyze select * from actor, cast_info, movie where actor.id = cast_info.person_id and movie.id = cast_info.movie_id limit 10;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=42655.99..42661.32 rows=10 width=318) (actual time=236.010..257.950 rows=10 loops=1)

-> Nested Loop (cost=42655.99..12965281.89 rows=24248734 width=318) (actual time=236.009..257.947 rows=10 loops=1)

-> Hash Join (cost=42655.56..1692069.63 rows=24248734 width=244) (actual time=232.978..244.324 rows=10 loops=1)

Hash Cond: (cast_info.movie_id = movie.id)

-> Seq Scan on cast_info (cost=0.00..615130.44 rows=36244344 width=42) (actual time=0.014..10.920 rows=2401 loops=1)

-> Hash (cost=15598.25..15598.25 rows=662825 width=202) (actual time=227.684..227.685 rows=662825 loops=1)

Buckets: 32768 Batches: 64 Memory Usage: 1350kB

-> Seq Scan on movie (cost=0.00..15598.25 rows=662825 width=202) (actual time=0.005..70.086 rows=662825 loops=1)

-> Index Scan using actoridindex on actor (cost=0.43..0.45 rows=1 width=74) (actual time=1.360..1.360 rows=1 loops=10)

Index Cond: (id = cast_info.person_id)

Planning Time: 3.881 ms

Execution Time: 306.243 ms

create index movieid_castinfoindex on cast_info(movie_id);

explain analyze select * from actor, cast_info, movie where actor.id = cast_info.person_id and movie.id = cast_info.movie_id limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.87..6.40 rows=10 width=318) (actual time=3.695..18.112 rows=10 loops=1)

-> Nested Loop (cost=0.87..13411901.94 rows=24248734 width=318) (actual time=3.694..18.108 rows=10 loops=1)

-> Nested Loop (cost=0.44..2138689.69 rows=24248734 width=244) (actual time=2.218..3.778 rows=10 loops=1)

-> Seq Scan on movie (cost=0.00..15598.25 rows=662825 width=202) (actual time=0.015..0.016 rows=2 loops=1)

-> Index Scan using movieid_castinfoindex on cast_info (cost=0.44..2.83 rows=37 width=42) (actual time=1.854..1.873 rows=5 loops=2)

Index Cond: (movie_id = movie.id)

-> Index Scan using actoridindex on actor (cost=0.43..0.45 rows=1 width=74) (actual time=1.431..1.431 rows=1 loops=10)

Index Cond: (id = cast_info.person_id)

Planning Time: 3.961 ms

Execution Time: 18.660 ms

drop index actoridindex;

drop index movieid_castinfoindex;

4 of 21

Two table and three table join demo!

Demo

5 of 21

Schemas: pg_catalog and information_schema

(Bonus!)

Recall from Project 1:

public, a public schema that users can read/write/insert
pg_catalog, a schema for maintaining system information
information_schema, a schema that maintains metadata about objects currently created in the database

Terminology note [link]:�A table (relation) is within�A schema (namespace) is within�A catalog (database)

information_schema is portable and stable (queries transferable b/t different DBMSes).

pg_catalog contains Postgres internal stats e.g. for query optimization.

Postgres-specific system settings, statistics, values, etc.

SQL ANSI standard; can use these schema’s views in most DBMSes (postgres [docs], MySQL, Snowflake, …)

Demo

6 of 21

Statistics in pg_catalog

(Bonus!)

Documentation 14.2: Statistics used by the Planner [link]

pg_class table [53.11 link] # tuples, # pages
pg_stats view [54.27 link] extended statistics

SELECT relname, relkind, reltuples, relpages

FROM pg_class

WHERE relname = 'actor' OR relname = 'cast_info';

/* WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public'); */

Find total number of pages/tuples used by each relation:

Relation statistics

SELECT tablename, attname,

null_frac, avg_width, n_distinct

FROM pg_stats

WHERE schemaname = 'public';

SELECT tablename, attname,

most_common_vals, most_common_freqs

FROM pg_stats

WHERE schemaname = 'public';

Demo

7 of 21

Statistics in pg_catalog

(Bonus!)

Documentation 14.2: Statistics used by the Planner [link]

pg_class view [53.11 link] # tuples, # pages
pg_stats view [54.27 link] extended statistics

Demo

8 of 21

Query Optimization Settings in pg_catalog

(Bonus!)

All runtime parameters for Postgres are in one view.

pg_settings view [54.24 link]
Planner Method Configuration [20.7.1 link]

SELECT name

FROM pg_settings

WHERE name LIKE 'enable_%';

Demo

9 of 21

Rules of Thumb, Part II

EXPLAIN ANALYZE with JOINs

Rules of Thumb, Part II

Other Knobs beyond Query Optimization

Lecture 10, Data 101/Info 258 Spring 2024

10 of 21

[Recall] Summary: Why do we care?

As data engineers, we need to understand what impacts performance.

For query-centric: we need to understand just “enough” to guide to good plans.

SQL queries rewritten into logical query plans via RA expressions.
Algebraic rules allow us to manipulate these logical query plans.
Optimization allows the system to pick the best logical query plan, and best corresponding physical query plan.
Feed the corresponding physical plan to the query processor.

We have a few knobs under our control, but it is still valuable to understand what is expensive.

For code-centric: even more essential, since we’ll be manually doing the query optimization.

We have to manually reorder our plan.
We also have to pick the right physical operators to make sure we get results in a reasonable time.

11 of 21

Summary: Why do we care? Part 2

As data engineers, we need to understand what impacts performance.

For query-centric: we need to understand just “enough” to guide to good plans.

SQL queries rewritten into logical query plans via RA expressions.
Algebraic rules allow us to manipulate these logical query plans.
Optimization allows the system to pick the best logical query plan, and best corresponding physical query plan.
Feed the corresponding physical plan to the query processor.

We have a few knobs under our control, but it is still valuable to understand what is expensive.

For code-centric: even more essential, since we’ll be manually doing the query optimization.

We have to manually reorder our plan.
We also have to pick the right physical operators to make sure we get results in a reasonable time.

Examples of things we must do now (whereas query-centric systems do automatically):

Reduce intermediate results
Do joins in the right order (ensure no quadratic blowup)
Push predicates/projections down
Keep track of sorted order
Parallelism, pipeline, partitioning (next!)

12 of 21

[Recall] Rules of Thumb

Tell the system to avoid doing extra work.

Add a predicate or projection if possible to reduce intermediate sizes
Add LIMIT k if useful
Use sorting, grouping, and other expensive set operations sparingly

Materialize if expensive.

If you repeatedly issue variants of the same query, consider building a materialized view or table.
Need to consider that materialized views/tables may become stale.

Use indexes to speed up access. Best if indexes help improve queries that are frequent and slow:

Attributes frequently used in the WHERE clause (range: B+ tree; equality: Hash/B+ tree)
Multi-attribute indexes valuable if attributes are queried together.

Order of attributes important, particularly for range searching with B+ Trees!

Need to consider the overhead of potential database updates!

13 of 21

Rules of Thumb: A few more

Some additional items:

1. Pipelining

2. Periodic reorganization

Tell the system to avoid doing extra work.

Add a predicate or projection if possible to reduce intermediate sizes
Add LIMIT k if useful
Use sorting, grouping, and other expensive set operations sparingly

Materialize if expensive.

If you repeatedly issue variants of the same query, consider building a materialized view or table.
Need to consider that materialized views/tables may become stale.

Use indexes to speed up access. Best if indexes help improve queries that are frequent and slow:

Attributes frequently used in the WHERE clause (range: B+ tree; equality: Hash/B+ tree)
Multi-attribute indexes valuable if attributes are queried together.

Order of attributes important, particularly for range searching with B+ Trees!

Need to consider the overhead of potential database updates!

3. Periodic statistics recomputation

4. Additional database system knobs

14 of 21

Other Knobs beyond Query Optimization

EXPLAIN ANALYZE with JOINs

Rules of Thumb, Part II

Other Knobs beyond Query Optimization

Lecture 10, Data 101/Info 258 Spring 2024

15 of 21

(1/4) Additional Knobs: Pipelining

So far we have described operators in a sequential manner:

Each operator independently does their work; one starts work after the other has finished.
Relatively inefficient and requires materialization of the intermediate result,�particularly if the result goes beyond buffer size

Better approach: pipelining

Tuples can “flow up” from the bottom operators to the top as soon as they are ready.
All the operators are continuously doing work in parallel

Blocking operators are exceptions, e.g., aggregation, sorting, etc.

Further benefit: reduced time to the first tuple, or first k tuples

Note: Operators themselves can also be parallelized via a process called partitioned parallelism. Critical in multi-node settings!

Scan

Filter

Agg.

16 of 21

(2/4) Additional Knob: Periodic Reorganization

Many data systems keep old versions of tuples around! Several reasons:

“Lazy” operation: cheaper to do an update without deletion. Simply mark invalid tuples and append new ones
Concurrent updates: sometimes helpful to know what old version is (more on concurrency later)

However, old versions of tuples cause bloat.Impacts:

Scanning the table may be more expensive.
Tables that started as clustered on an index may no longer be clustered.

17 of 21

(2/4) Additional Knob: Periodic Reorganization

Many data systems keep old versions of tuples around! Several reasons:

“Lazy” operation: cheaper to do an update without deletion. Simply mark invalid tuples and append new ones
Concurrent updates: sometimes helpful to know what old version is (more on concurrencyl later)

However, old versions of tuples cause bloat.Impacts:

Scanning the table may be more expensive.
Tables that started as clustered on an index may no longer be clustered.

Solution 1: VACUUM command to re-pack.

VACUUM FULL to aggressively repack, on disk but also prevents queries in the meantime.

Solution 2: For extreme UPDATEs:

May be better to CLUSTER table on an index, or
CREATE TABLE AS (CTAS) a new table, and drop old one.

18 of 21

(3/4) Additional Knob: Statistics Computation

Recall: query optimization relies on statistics!

sizes of relations, distributions of values of attributes, …

Statistics can become stale after many updates!

Data distributions may change (e.g., time has moved forward)
Number of records may change

Incorrect statistics can lead to bad query plans. Therefore VACUUM recomputes statistics, too.

Can recompute statistics as part of VACUUM

VACUUM ANALYZE; – applies to all tables
VACUUM (FULL, ANALYZE) Stops; – full version

These can be set to run periodically via the autovacuum program. [Documentation 25.1]

19 of 21

(4/4) Additional Knob: Database System Knobs

There are many settings that control system behavior!

In postgres:

imdb=# select count(*) from pg_settings;

count

-------

329

Some settings of particular importance:

max_connections: sets the number of parallel connections to the system; impacts resources provided on a per-client basis. (more later)
shared_buffers: sets the amount of memory buffers available [limited by RAM]
max_parallel_workers: sets parallelism; had set to 1 for our demos

Lots more, see Postgres manual!

20 of 21

Note about CTEs and Postgres, as of v12

Documentation Ch 16

🤔

Based on this text, do CTEs usually get pushed down in the most reversion of postgres (v16)?

21 of 21

Note about CTEs and Postgres, as of v12

Documentation Ch 16

Lots of tense discussions of this, which are now outdated with PostgreSQL >v12 in >2019:

StackExchange, forcing hash join vs nested loop, StackExchange 2

For more about subqueries, see Controlling the Planner with Explicit JOINs [14.3 link].