1 of 32

Data Engineering

Performance Tuning: Index Selection


2 of 32

The Declarative Contract

  • So far, we’ve been imagining a declarative contract between the data system and the user, where the user specifies:
    • The relations, along with any constraints (PK, FK)
      • But not how, exactly, they are stored, e.g., on a disk drive
      • But not how these constraints are enforced
    • What they want the output of a query to look like
      • But not how it is computed
    • What they want the modified relations to look like
      • But not how exactly the modification takes place

  • The data system is then expected to hold up its end of the bargain by:
    • Figuring out how to store data, process it, enforce constraints, etc.
    • And to do so in the “best” way it knows how.


3 of 32

Why Cede Control to the System?

Many good reasons to do so:

  • Simpler for end-users
    • Just specify what you want, don’t worry about how to do it
    • Most specifications are therefore quite compact
  • Hard for (most) users to reason about how to best execute a query
    • Exponential # of ways to execute queries!
  • System has more fine-grained awareness of the data and its nuances
    • Includes data sizes, statistics, understanding of distributions, correlations, etc.
    • This can influence the “best” execution strategy


4 of 32

Why Should Users Worry about Performance I

One could argue that we (as users) should leave performance aspects up to the system and not worry about them.

Several reasons to not do so:

Reason 1: revisiting our specification

  • There are cases when even the “best” execution strategy from the data system is poor (slow)
  • We might want to understand how to modify our query into another (possibly non-equivalent) one that is faster to execute
    • E.g., rather than performing an analysis on all the stops in the Bay Area, start by doing it on Berkeley alone
  • An understanding of what leads to slowness is therefore important
    • So that we (users) can rephrase our specification in less expensive ways


5 of 32

Why Should Users Worry about Performance II

Reason 2: contract breakdown

  • Despite many decades of engineering, data systems are not perfect
    • Best to not be at the mercy of an imperfect system
  • There are scenarios where data systems struggle to find good ways to execute queries
    • The space of query execution strategies is sometimes too large, so data systems cut corners by not examining all possible ones
    • The estimates for the costs (latencies) for various strategies may be incorrect, for various reasons: insufficiently fine-grained distributional information, lack of understanding of correlations and data skew, etc.
    • Users also oftentimes have a better idea of some data characteristics and future workload than the system
  • In such cases, users can steer the system towards “better” execution strategies


6 of 32

Why Should Users Worry about Performance III

Reason 3: many heavyweight levers still under the control of users

  • There are still many performance-impacting levers that users control in data systems, incl.:
    • What the relations and materialized views should look like
    • Whether certain efficient data access structures (known as indexes) should be generated
    • The amount of resources employed, incl. memory footprint, disk space, # of machines, …

  • Many of these decisions have repercussions beyond efficiency of execution of the current query, including:
    • Ease of use
    • Monetary considerations
    • Anticipated future needs and workload


7 of 32

OK, Let’s Worry about Performance!

  • Fortunately, there are many things that we can do to improve performance
  • But this requires us to understand a bit more about data system internals and how we can “tune” them as needed to achieve what we want
    • We’re now opening Pandora’s box!

  • Let’s start by talking about a “mental model” for what a data system does

  • Disclaimer: will be massively hand-wavy! If you’d like to learn more about performance, I encourage you to take CS186!


8 of 32

A Mental Model for a Data System

  • Data is laid out on blocks/pages on stable storage (SSD or HDD)
    • Each block may have 100s of tuples
  • Data is retrieved a block at a time for reading/writing by the data system
  • Blocks are stored in frames within a main-memory buffer
  • Written back to stable storage as needed

[Figure: the data system’s controller moves blocks between the data on stable storage and a main-memory buffer.]
9 of 32

A Mental Model for a Data System

  • Usually, computation on data in main-memory is cheap relative to I/O
  • So main cost in processing SQL queries is the cost of I/O
    • i.e., fetching the blocks into the buffer and writing them out
  • Also, reading/writing data blocks sequentially is typically cheaper than random reads and writes
    • sequential access lets the system prefetch and avoid costly seeks


10 of 32

Bottomline: What Impacts Performance?

  • Rule 1: Read less data, in sequence, if possible
    • The more data you have to read/write, the slower it is
      • In some cases, you have to read the same data multiple times, which is slower than reading it just once
    • But: random accesses are slower than sequential ones

  • Rule 2…: Later!

11 of 32

Thought Experiment

  • Say I have 10M stops in my Stops relation across all cities in the US
  • And I want to find a specific stop
    • SELECT * FROM Stops WHERE id = 256
  • The relation is laid out across many pages on disk
  • Q: How would we go about doing this?


12 of 32

Thought Experiment

  • Say I have 10M stops in my Stops relation across all cities in the US
  • And I want to find a specific stop
    • SELECT * FROM Stops WHERE id = 256
  • The relation is laid out across many pages on disk
  • Q: How would we go about doing this?
    • Data is unordered by default, so we don’t know where id = 256 is
    • So, we will need to look through all of the data pages
      • An operation known as a sequential scan
    • But we could be more efficient with a lookup structure that tells us which page the tuple with id = 256 is on (see the sketch below)
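
A minimal PostgreSQL sketch of the difference (the index name is hypothetical, and the EXPLAIN output is paraphrased):

EXPLAIN SELECT * FROM Stops WHERE id = 256;
-- -> Seq Scan on stops ...             (every data page is read)

CREATE INDEX stopsIdIndex ON Stops (id);

EXPLAIN SELECT * FROM Stops WHERE id = 256;
-- -> Index Scan using stopsidindex ... (only a few pages are read)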


14 of 32

Indexes

  • Indexes help you look for data matching certain characteristics
    • Much like an index in a book! Or a library!
  • An index on an attribute, say age, allows you to find tuples with a specific value for age in Stops
  • Indexes are essentially data structures storing “index key-location” pairs
    • Index key = values of age
    • Location = where (which block, start position within the block) tuples with that age value can be found
  • This may be useful, for example, to find tuples with (SQL sketch below):
    • age = 15
    • age <= 30
    • 15 <= age <= 45
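
In SQL, these lookups might look as follows; a sketch assuming a Stops table with an age column (index creation syntax is covered formally later in these slides):

CREATE INDEX ageIndex ON Stops (age);

SELECT * FROM Stops WHERE age = 15;               -- value lookup
SELECT * FROM Stops WHERE age <= 30;              -- one-sided range
SELECT * FROM Stops WHERE age BETWEEN 15 AND 45;  -- two-sided range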


15 of 32

From CS61A or your intro programming class… a Binary Search Tree

[Figure: a binary search tree over the values 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, rooted at 7.]

16 of 32

Binary Search Trees for Data Systems: B+ Trees

  • The most popular type of index in data systems is the B+ tree
  • Essentially applying & extending the binary search tree ideas

[Figure: a B+ tree. The root holds key 80; internal nodes hold (20, 60) and (100, 120, 140); the leaf level holds the sorted keys 10, 15, 18, 20, 30, 40, 50, 60, 65, 80, 85, 90, with pointers to records. Example lookups: Find A = 30 and Find A = [30, 63].]

17 of 32

Generalizing Binary Search Trees for Data Systems: B+ Trees

  • Unlike binary search trees, B+trees have a high fanout
    • Each “node” is a block (remember our mental model!)
    • If the fanout “b” is ~1000, then with four levels:
      • b^4 = 10^12: one trillion tuples!
  • B+Trees are self-balancing
    • Uniform height throughout
    • Maintains a minimum fill factor (usually half full) by splitting nodes and percolating changes upward to maintain balance
    • We won’t worry about those algorithms in this class: see CS186!


18 of 32

Generalizing Binary Search Trees for Data Systems: B+ Trees

  • Logarithmic complexity
    • For searching (as we saw)
    • For insert, delete, update (not covered)
  • In practice, the tree is shallow: the depth (= number of nodes/blocks retrieved per lookup) is small; see the worked example after this list
    • High fanout ensures that
    • Typical trees: short & wide

  • Can handle both value lookups (id = 20) as well as range (20 <= id <= 50) lookups
    • For the latter, can follow “leaf” pointers once you have identified the start of the range
  • Works well with updates
  • In PostgreSQL this is the default index type
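
Plugging in the numbers from earlier slides (fanout b = 1000, N = 10M tuples):

height ≈ ⌈log_b N⌉ = ⌈log_1000 10^7⌉ = ⌈7/3⌉ = 3

So a point lookup reads about 3 index blocks plus 1 data block, versus the 100,000 blocks a sequential scan would read.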


19 of 32

When Do Indexes Apply?

Generally beneficial when your relations are large and when sequential scans are too time-consuming

SELECT * FROM Stops WHERE age = 15 AND location = 'West Oakland';

Suppose the relation has 10M tuples, spread across 100,000 blocks

Let’s say there are about 1000 tuples that satisfy the first predicate and 100 that satisfy both predicates

  • No-index approach: sequential scan of all 100,000 blocks of Stops, checking for each tuple whether the two predicates are satisfied
  • With an index on age, look over 1000 tuples, spread across 1000 blocks in the worst case, checking for each tuple whether the second predicate is satisfied
  • With a composite index on (age, location), look over 100 tuples, spread over 100 blocks in the worst case


20 of 32

When Do Indexes Apply?

SELECT * FROM Stops WHERE age = 15 AND location = 'West Oakland';

Suppose the relation has 10M tuples, spread across 100,000 blocks

Let’s say there are about 1000 tuples that satisfy the first predicate and 100 that satisfy both predicates

Here, the # of blocks examined using either index is relatively small (1000, or 100, vs. 100,000), so an index may well be valuable

    • Recall that random access is more expensive than sequential access
    • And index-based accesses are typically (with some exceptions that we’ll cover later) random
    • But if 1000 grows to (say) 10,000 or greater, it is possible that the benefits of an index may be outweighed by the penalty for random accesses
    • More on this later


21 of 32

Also Useful for Joins!

SELECT * FROM Stops, Zips

WHERE Stops.location = Zips.location

  • Say Stops has 10M tuples and Zips has 1M tuples
  • Q: If you have no indexes, need to consider how many pairs of tuples?
    • 10M x 1M = 10^13
  • Q: How would you use indexes here, at a high level?
    • For each Stops tuple, look up the corresponding location in Zips (or vice versa)
    • For the former, “only” 10M lookups in the index
    • Note: if I had an index on Zips.zipcode and not Zips.location, it wouldn’t help
    • There are other efficient join approaches that don’t use indexes; discussed later (sketch below)
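
A sketch of how to enable this in PostgreSQL (index name hypothetical; EXPLAIN lets you confirm the plan the system picks):

CREATE INDEX zipsLocationIndex ON Zips (location);

-- The system can now probe the index once per Stops tuple
-- (~10M lookups) instead of comparing 10^13 pairs:
EXPLAIN SELECT * FROM Stops, Zips WHERE Stops.location = Zips.location;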


22 of 32

Declaring Indexes

CREATE INDEX ageIndex ON Stops (age);

  • Allows for lookups based on age

CREATE INDEX ageLocationIndex ON Stops (age, location);

  • Imagine this to be an “index key-value” pair data structure based on concatenating attribute values
  • Allows for lookups based on pairs of age and location
  • Or just for age
  • Q: Why not location only? The keys are sorted by age first, so a lookup on location alone cannot navigate the index (see the sketch below)
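
A sketch of which predicates this composite index can serve (PostgreSQL; comments paraphrase typical planner behavior):

SELECT * FROM Stops WHERE age = 15 AND location = 'West Oakland';  -- served
SELECT * FROM Stops WHERE age = 15;                                -- served (leading-column prefix)
SELECT * FROM Stops WHERE location = 'West Oakland';               -- generally not served efficiently (no age prefix)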

DROP INDEX ageLocationIndex;


23 of 32

Demo

-- Point lookups before any index on id: watch for a sequential scan
select * from actor where id = 23456;
explain analyze select * from actor where id = 23456;

-- Add a B+ tree index on id: point and range lookups can now use it
create index idIndex on actor(id);
explain analyze select * from actor where id = 23456;
explain analyze select * from actor where id >= 23456;
explain analyze select * from actor where id >= 4000000;
explain analyze select * from actor where id >= 23456 AND id < 23457;
explain analyze select * from actor where id < 23457;
drop index idIndex;

-- Composite indexes: the order of attributes matters
explain analyze select * from actor where name = 'Hanks, Tom';
create index nameIdIndex on actor(name,id);
explain analyze select * from actor where name = 'Hanks, Tom';
explain analyze select * from actor where id < 23457;  -- id is not the leading attribute: does it help?
create index idNameIndex on actor(id,name);
explain analyze select * from actor where id < 23457;  -- now id is the leading attribute

24 of 32

Indexes Seem Magical…

  • Q: Why not automatically add indexes for all attributes & combinations of attributes? What are the downsides of indexes?
  • A: the maintenance costs during inserts, deletes, and updates

  • If an index is not maintained, it may lead to inaccurate query results
  • e.g., if a new tuple is inserted but not reflected in the index, it won’t be returned by queries that use the index (illustration below)
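
A hypothetical illustration of the maintenance cost (column names from earlier slides; assuming ageIndex and ageLocationIndex both exist):

INSERT INTO Stops (id, age, location) VALUES (10000001, 22, 'Berkeley');
-- One insert writes the table's block AND updates ageIndex AND ageLocationIndex;
-- each additional index adds work to every insert, delete, and update.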


25 of 32

OK, Going Back to Indexes…

  • Which indexes should we create?
  • Answer: it depends!

  • Many databases automatically create indexes for PRIMARY KEY or UNIQUE attributes. Several reasons:
    • It also helps enforce the constraints! Updating the index can serve as a check for the constraint
    • Many queries look up based on these attributes
      • SELECT * FROM Actor, Cast_info WHERE Actor.id = Cast_info.person_id
    • It is a high-cardinality attribute
      • That is, the number of distinct values for this attribute is high
      • In fact, the cardinality here is as high as the number of tuples
      • So an index on this attribute is especially beneficial (sketch below)… more on cardinality next
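
A minimal sketch of the PostgreSQL behavior, assuming a simple Actor table:

CREATE TABLE Actor (
    id   INTEGER PRIMARY KEY,  -- PostgreSQL automatically creates a unique B+ tree index on id
    name TEXT
);
-- Roughly the same lookup support as:
--   CREATE UNIQUE INDEX actor_pkey ON Actor (id);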


26 of 32

Why does cardinality matter?

  • Databases retrieve data as “blocks”
  • Say you are indexing based on attribute A, with 100 distinct values. You have 10,000 tuples. Each block contains 100 tuples. So we have 100 blocks.

  • Assuming that the tuples with a given value of A (A = a) are randomly distributed, every block, on average, has one tuple with A=a.
  • Thus, we can’t avoid looking at all blocks => the index doesn’t help.
  • Q: What if we had 1000 distinct values instead of 100? How many blocks would we have to look at then for A = a?
  • A: 10 blocks, so there may be some savings (arithmetic spelled out below).
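
Spelling out the arithmetic:

100 distinct values: 10,000 / 100 = 100 matching tuples per value; 100 tuples spread randomly over 100 blocks ≈ 1 per block, so essentially all 100 blocks are read

1,000 distinct values: 10,000 / 1,000 = 10 matching tuples per value; at most 10 of the 100 blocks are read, avoiding ~90% of the I/O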


27 of 32

Why does cardinality matter? II

  • As mentioned previously, sequential access is much faster than random access:
    • Depending on the characteristics of the storage media, reading 1 in 10 blocks may not be enough to justify the use of an index
      • Might be as fast to just scan
    • But reading 1 in 100 or 1 in 1000 may be enough [again, it depends]

  • So, indexes on low cardinality attributes are not a great idea
  • Exception: if data is clustered on that attribute
    • i.e., sorted based on that attribute and stored in that order on disk
    • If we knew that all values of A = a are together (as opposed to randomly shuffled), we can simply read one block as opposed to 100 blocks
    • Can force PostgreSQL to cluster your data based on an index (sketch below)
      • CLUSTER Stops USING ageIndex;
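
The full PostgreSQL sequence might look as follows (note that CLUSTER is a one-time physical reordering: PostgreSQL does not keep the table sorted as new tuples arrive, so it must be re-run periodically):

CREATE INDEX ageIndex ON Stops (age);
CLUSTER Stops USING ageIndex;   -- rewrite the table, sorted by age
ANALYZE Stops;                  -- refresh planner statistics after the rewrite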


28 of 32

How Do We Decide?

  • Good on attributes used often in WHERE clauses
  • Good on key attributes (sometimes done by default)
  • Good on attributes that are “clustered”
  • Not very good on attributes where there are many tuples per value of the attribute
  • Not very good on tables that are modified more often than queried


29 of 32

Other Types of Indexes: Hash Indexes

  • Like hash-based dictionaries or maps in programming languages
    • A hash function applied to a value a, i.e., h(a), yields the location where pointers to tuples with A = a are found
  • Main challenge: how to “extend” the hash table as the number of distinct values of A grows
    • See linear or extendible hashing

[Figure: a hash index. Hashing a key value, e.g., h(A1) = 30, h(A2) = 20, identifies the bucket holding pointers into the blocks of the data file.]

30 of 32

Types of Indexes: Hash Indexes

  • Constant expected lookup complexity: O(1)
  • Can only handle value lookups, not range lookups
    • Why? Hashing scatters nearby key values across buckets, so the index preserves no order
  • In PostgreSQL done as follows (concrete sketch below):
    • CREATE INDEX <name> ON <table> USING hash (column);
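
A concrete sketch (index name hypothetical):

CREATE INDEX locationHashIndex ON Stops USING hash (location);

SELECT * FROM Stops WHERE location = 'West Oakland';  -- can use the hash index
SELECT * FROM Stops WHERE location > 'M';             -- cannot: a hash index preserves no order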


31 of 32

Other Types of Indexes

  • Inverted Indexes: valuable for keyword search (more later, hopefully!)
    • In PostgreSQL, known as GIN indexes
  • R-Tree: valuable for range-based lookups in k-dimensions
    • e.g., searching for items in rectangular range in a map
    • In PostgreSQL, a special case of a GiST index
  • Lots of other types of indexes!
    • Partial indexes
    • Indexes on expressions


32 of 32

Indexes Summary

  • Indexes are great!
  • They speed up certain types of queries
  • But we should use them carefully
    • Only on high-cardinality (“selective”) attributes like keys
    • Or on frequently queried but rarely updated attributes
  • Lots of different types of indexes: but B+trees are most popular
