1 of 19

Systems Primer

Bits to disks to clouds

Why? ‘Full stack’ engineer?

2 of 19

Key Questions

Why a rich variety of DBs?
How to eval DBs for my data apps?

Data Size	Hardware/ Storage	Algorithms	Languages + Libraries
‘Tiny’ (< 10 GBs)	Store in RAM	‘Usual’ CS Algorithms (sort, hash, ..)	Pandas (Python) + SQL (Spreadsheets for < 100 MBs)
‘Small’ (10GB - 10TB)
‘Big’ (> 10 TBs)

Goal: Gain intuition – What are Speed and Cost tradeoffs?

3 of 19

4 of 19

Basics

INT32: 4 bytes, INT64: 8 bytes
2^10 = 1024, 2^20 ~= 1Million, 2^30 ~= 1 Billion (10^9), 2^40 ~= 1 Trillion (10^12)
To store int32 records: 1 Million records = 4MB, 1 Billion records = 4 GB
[Often use 1000 vs 1024, for quick approximations]

RAM – SSD/HDD Read/Writes

(aka ‘Paging’)

CPU – RAM Read/Writes

Bytes in RAM, SSDs, HDDs

5 of 19

IO

Systems

D-RAM

S-RAM

Hard Drives (HDD)

Flash/SSD/

Non-volatile mem (nvm)

Tape

L1, L2

Cache

MBs, 20 ns

GBs, 50-100 ns

1-10 TBs,

~10s microsecs

10s of TBs-PBs, 10s of msecs

Cost/bit

Speed

Volatile -- data lost when you turn off power

Non-Volatile

6 of 19

IO Blocks

For Efficiency

E.g., to store 1GB data, we partition the data, and store in IO Blocks

Data

(1 GB)

IO Blocks

‘Partition’ into IO blocks

7 of 19

IO Blocks

For Efficiency

Key Concepts

Data is stored in Blocks (aka “partitions”)
Sequential IO is 10x-100x+ faster than ‘many’ random IO

E.g., 1 MB (located sequentially) versus 1 Million bytes in random locations

HDDs/SSDs copy sequential ‘big’ blocks of bytes to/from RAM

1 Package

1000 Packages

(1 Trailer)

3000 Packages

(3 Trailers)

‘M’ Trucks

1 Byte

64MB-Blocks

(64MBs of sequential, aka contiguous bytes

Nx64MB-Blocks

(‘N’ sequential 64MB-Blocks)

Set of ‘M’

Nx64MB-Blocks

8 of 19

IO Blocks

For Efficiency

E.g., to store 1GB data, we need 15.625 64MB-Blocks (= 1GB/64MB) in

Sixteen 64MB-Blocks, (1x64MB = 64MB) OR
Eight 2x64MB Blocks, (i.e, M = 8, N = 2), OR
Four 4x64MB Blocks, OR Two 8x64MB Blocks, OR One 16x64B Block
(OR some combo)

2x64MB

4x64MB

Data

(1 GB)

IO Blocks

9 of 19

Read,

Write

IO Blocks

Data stored in ‘M’

Nx64MB Blocks

// Read M Nx64MB-Blocks for Data

ReadData(M) For i:1…M Blocks

ReadBlocks(startLocation[ith Block], N)

Read/Write Data into RAM

// Read N contiguous Blocks from startLocation

ReadBlocks(startLocation, N Blocks)

Access Block’s startLocation in IO device
Scan N Blocks in one efficient operation. That is

Read N*64MB (or multiple) contiguous bytes to RAM

WriteBlocks is similar to ReadBlocks, except

2b. Write N*64MB (or multiple) contiguous bytes from RAM

10 of 19

Basic system numbers

	Access Latency (secs)	Scan Throughput (GBs/sec)	Cost/TB (Aug’ 24)
RAM (D-RAM)	~100 nanosec	~100-150 GB/sec	~3125$/TB
High-end SSD	~10 microsec	~5-10 GB/sec	~100$/TB
HDD Seek (Hard Disk)	~10 millisec	~100-200 MB /sec	~25$/TB
Machines M1 to M2 (Network)	~ 1 microsec	~ 5 GB/sec	N/A

Access Latency (secs) = Time to access Block’s start location

Scan Throughput (GB/sec) = Speed to Scan + Copy data to RAM

Note: Why are they different speeds?

Digital (e.g. SSDs and RAM) vs Analog (e.g. HDD seeks)
Distance from CPU (e.g. RAM is on same chip as CPU vs SSD)

11 of 19

IO Cost Model

	Access Latency (secs)	Scan Throughput (GBs/sec)	Total Time to read one 64MB-Block
RAM (D-RAM)	~100 nanosec	~100 GB/sec
High-end SSD	~10 microsec	~5 GB/sec
HDD Seek (Hard Disk)	~10 millisec	~100 MB /sec
Machine 2 Machine (Network)	~ 1 microsec	~ 5 GB/sec

100 + 64MB/100GB/s

= 100 + 640000 nsec

10 + 12800 microsec

10 + 640 millisec

1 + 12800 microsec

Total Time to ReadData =

AccessLatency * M + DataSize/ScanThroughput

Where

DataSize = data size (in bytes)
M = Number of non-contiguous Blocks
AccessLatency = Time to access Block
ScanThroughput = Speed to copy/scan to RAM

12 of 19

Example

Data size

Q1: What’s row size (i.e., size of each record)?

SID: int32 ⇒ 4 bytes

Student: char[100] ⇒ 100 bytes

Address: char[200] ⇒ 200 bytes

Bio: char[720] ⇒ 720 bytes ## Note: Picking so row size=1024

⇒ Each row is 1024 bytes

Q2: What’s table size

with 1000 student rows? 1000 * 1024 bytes = 1 MB
with 1M student rows? 1M * 1024 bytes = 1 GB
With 1B student rows? 1B * 1024 bytes = 1 TB

Students
SID	Name	Address	Bio
40001	Mickey	43 Toontown	Mickey is a Sophomore in CS. He is...
40002	Daffy	147 Main St	Daffy is part of the Orchestra. He was...
50003	Donald	312 Escondido	Donald is a 1st year MS in EE. He was...
50004	Minnie	451 Gates
10008	Pluto	97 Packard

13 of 19

Example

Data speed

	Access Latency for 1 Block (secs)	Scan Time for 1 TB (sec)
RAM (D-RAM)	~100 nanosec	1TB/100 GBps = ~10 sec
High-end SSD	~10 microsec	1TB/5GBps = ~200 sec
HDD	~10 millisec	1TB/100MBps =~10,000sec
Copy to RAM on Machine Z1 from RAM on Machine Z2	(Network) 1 microsec + (RAM on Z1) 100 nsec + (RAM on Z2) 100 nsec ~= 1.2 microsec [10,000x faster vs reading from Z1’s HDD Or 10x faster vs reading from Z1’s SSD]	(Network) 1 TB/5GBps + (RAM on Z1) 1 TB/100GBps + (RAM on Z2) 1 TB/100GBps ~= 220 secs [50x faster vs reading from Z1’s HDD (~10,000 secs) Or 10% slower vs reading from Z1’s SSD]

Q3: For 1 Billion student table of size 1TBs (stored in M=1)

[Recall Total Time to ReadData =

AccessLatency * M + DataSize/ScanThroughput ]

14 of 19

Example

Data speed

	Access Latency for 1 Block (secs) (Same as Q3)	Scan Time for 1 TB (sec) with 100x more capacity? (Same numbers as Q3, divided by 100)
RAM (D-RAM)	~100 nanosec	10 sec/100
High-end SSD	~10 microsec	200 sec/100
HDD	~10 millisec	~10,000/100sec
Copy to RAM on Machines Z_1..100 from RAM on Z_101..200	~= 1.2 microsec	~= 220/100 secs

Q4: With 100 parallel machines? (100x RAM, 100x disks)

15 of 19

Example: Find a student, by scanning data

Find ‘Daffy’ from a DB of billion students (1 TB)

Design Choices

Data in RAM (Scan sequentially & filter)
Data in HDD (no IO Blocks; rows in random spots) (Access each row. Look for Daffy. Repeat for every row)
Data in 64MB-Blocks in HDD (Access each block. Scan rows, look for Daffy. Repeat per 64MB-block)

(@100$/32GB) 3000$
(@25$/TB of disk) 25$
(@25$/TB of disk) 25$

(AccessLatency) 100 nsec + (Scan) 1000 GB / 100 GBps ~= 10 secs

Storage Cost

Time

(AccessLatency) 10 msec * 1 Billion rows + (Scan) 1 TB /100 MBps

= 10^7 secs + 10^4 secs

~= 115 days

Number of 64MB-blocks = 1 TB/64MB = 15625

(AccessLatency) 10 msecs *15625 blocks + (Scan) 1 TB /100 MB-sec

= 10.15^4 secs ~= 3 hrs

In 2 weeks, we’ll see how to do this a lot faster (in msecs) with good Indexes (e.g, hashing)

Design1

Design2

Design3

16 of 19

Example: Find a student, by scanning data

Find ‘Daffy’ from a DB of billion students (1 TB)

Design Choices

Data in SSD (no IO Blocks; rows in random spots) (Access each row, look for Daffy; Repeat for every row)
Data in SSD 64MB-Blocks (Access each block, scan rows, look for Daffy; Repeat per 64MB-block)

(@150$/TB of SSD) 150$
(@150$/TB of SSD) 150$

Storage Cost

Time

(AccessLatency) 10 microsec * 1 Billion rows + (Scan) 1 TB /5GBps

= 10^4 secs + 200 secs

~= 1200 secs

Number of 64MB-blocks = 1 TB/64MB = 15625

(AccessLatency) 10 microsecs *15625 blocks + (Scan) 1 TB /5 GBps

= 10.15^4 secs

~= 200 secs

In 2 weeks, we’ll see how to do this a lot faster (in msecs) with good Indexes (e.g, hashing)

Design4

Design5

17 of 19

Clouds of machines

(Example datacenter

from 2:50 min)

Typical dedicated (non-shared) machine assumptions (unless problem states otherwise): 64 GB RAM Block sizes: 64 MB-DB block (and RAM page size) Example: AWS/GCP offer machine instances (e.g, ec2.r5 offers 1-3GBps network bandwidth, 2CPU/16GB RAM to 96 CPU/768GB RAM for $-$$$ in Nov’21)

18 of 19

Cloud DBs

[Optional]

Data size	Size	Storage Cost	Compute Cost	Cost Range (per yr)
Small	~5 TB	~$1300	~$50,000	~$25k-$75k
Medium	~50 TB	~$14000	~$160,000	~$100k-$200k
Large	~200TB	~$55,000$	~$400,000	~$300k-$500k

Reference Snowflake [Jun’22]

(Google for full pricing calculators for BigQuery, RedShift, etc.)

Rough rule of thumb for data sizes

Ballpark Cloud DB costs

(*For popular Tables data.

Some apps use JSON (hierarchical data) in

nonSQL DBs – see later)

‘10-100x’ Lower costs in past 10 yrs.

Primary Driver? “db admin” team → Cloud management tools

Data Size	Hardware/ Storage	Algorithms	Languages + Libraries
‘Tiny’ (< 10 GBs)	Store in RAM	‘Usual’ CS Algorithms (sort, hash, ..)	Pandas (Python) + SQL (Spreadsheets for < 100 MBs)
‘Small’ (10GB - 10TB)	Store in SSD	CS145 Algorithms	SQL on Cloud
‘Big’ (> 10 TBs)	Store in HDD	CS145 Algorithms	SQL on Cloud

19 of 19

In this Section

Systems – Size, Cost, and Speeds of GB/TBs

On RAM, HDDs/SSDs, clouds

IO Cost model

Simple IO model for Total Time (for RAM, SSD, HDD)
Good estimate – Reality is a bit more complex to fully model (buses, cache hit rates, etc.)

Examples analyzing system design choices