Systems Primer
Bits to disks to clouds
Why? ‘Full stack’ engineer?
Key Questions
Data Size | Hardware/ Storage | Algorithms | Languages + Libraries |
‘Tiny’ (< 10 GBs) | Store in RAM | ‘Usual’ CS Algorithms (sort, hash, ..) | Pandas (Python) + SQL (Spreadsheets for < 100 MBs) |
‘Small’ (10GB - 10TB) | | | |
‘Big’ (> 10 TBs) | | | |
Goal: Gain intuition – What are Speed and Cost tradeoffs?
Basics
Basics
RAM – SSD/HDD Read/Writes
(aka ‘Paging’)
CPU – RAM Read/Writes
Bytes in RAM, SSDs, HDDs
IO
Systems
D-RAM
S-RAM
Hard Drives (HDD)
Flash/SSD/
Non-volatile mem (nvm)
Tape
L1, L2
Cache
MBs, 20 ns
GBs, 50-100 ns
1-10 TBs,
~10s microsecs
10s of TBs-PBs, 10s of msecs
Cost/bit
Speed
Volatile -- data lost when you turn off power
Non-Volatile
IO Blocks
For Efficiency
E.g., to store 1GB data, we partition the data, and store in IO Blocks
Data
(1 GB)
IO Blocks
‘Partition’ into IO blocks
IO Blocks
For Efficiency
Key Concepts
1 Package
1000 Packages
(1 Trailer)
3000 Packages
(3 Trailers)
‘M’ Trucks
1 Byte
64MB-Blocks
(64MBs of sequential, aka contiguous bytes
Nx64MB-Blocks
(‘N’ sequential 64MB-Blocks)
Set of ‘M’
Nx64MB-Blocks
IO Blocks
For Efficiency
E.g., to store 1GB data, we need 15.625 64MB-Blocks (= 1GB/64MB) in
2x64MB
4x64MB
Data
(1 GB)
IO Blocks
Read,
Write
IO Blocks
Data stored in ‘M’
Nx64MB Blocks
// Read M Nx64MB-Blocks for Data
ReadData(M) For i:1…M Blocks
Read/Write Data into RAM
// Read N contiguous Blocks from startLocation
ReadBlocks(startLocation, N Blocks)
WriteBlocks is similar to ReadBlocks, except
2b. Write N*64MB (or multiple) contiguous bytes from RAM
Basic system numbers
| Access Latency (secs) | Scan Throughput (GBs/sec) | Cost/TB (Aug’ 24) |
RAM (D-RAM) | ~100 nanosec | ~100-150 GB/sec | ~3125$/TB |
High-end SSD | ~10 microsec | ~5-10 GB/sec | ~100$/TB |
~10 millisec | ~100-200 MB /sec | ~25$/TB | |
Machines M1 to M2 (Network) | ~ 1 microsec | ~ 5 GB/sec | N/A |
Access Latency (secs) = Time to access Block’s start location
Scan Throughput (GB/sec) = Speed to Scan + Copy data to RAM
Note: Why are they different speeds?
IO Cost Model
| Access Latency (secs) | Scan Throughput (GBs/sec) | Total Time to read one 64MB-Block |
RAM (D-RAM) | ~100 nanosec | ~100 GB/sec | |
High-end SSD | ~10 microsec | ~5 GB/sec | |
~10 millisec | ~100 MB /sec | | |
Machine 2 Machine (Network) | ~ 1 microsec | ~ 5 GB/sec | |
100 + 64MB/100GB/s
= 100 + 640000 nsec
10 + 12800 microsec
10 + 640 millisec
1 + 12800 microsec
Total Time to ReadData =
AccessLatency * M + DataSize/ScanThroughput
Where
Example
Data size
Q1: What’s row size (i.e., size of each record)?
SID: int32 ⇒ 4 bytes
Student: char[100] ⇒ 100 bytes
Address: char[200] ⇒ 200 bytes
Bio: char[720] ⇒ 720 bytes ## Note: Picking so row size=1024
⇒ Each row is 1024 bytes
Q2: What’s table size
Students | | | |
SID | Name | Address | Bio |
40001 | Mickey | 43 Toontown | Mickey is a Sophomore in CS. He is... |
40002 | Daffy | 147 Main St | Daffy is part of the Orchestra. He was... |
50003 | Donald | 312 Escondido | Donald is a 1st year MS in EE. He was... |
50004 | Minnie | 451 Gates | |
10008 | Pluto | 97 Packard | |
Example
Data speed
| Access Latency for 1 Block (secs) | Scan Time for 1 TB (sec) |
RAM (D-RAM) | ~100 nanosec | 1TB/100 GBps = ~10 sec |
High-end SSD | ~10 microsec | 1TB/5GBps = ~200 sec |
~10 millisec | 1TB/100MBps =~10,000sec | |
Copy to RAM on Machine Z1 from RAM on Machine Z2 | (Network) 1 microsec + (RAM on Z1) 100 nsec + (RAM on Z2) 100 nsec ~= 1.2 microsec [10,000x faster vs reading from Z1’s HDD Or 10x faster vs reading from Z1’s SSD] | (Network) 1 TB/5GBps + (RAM on Z1) 1 TB/100GBps + (RAM on Z2) 1 TB/100GBps ~= 220 secs [50x faster vs reading from Z1’s HDD (~10,000 secs) Or 10% slower vs reading from Z1’s SSD] |
Q3: For 1 Billion student table of size 1TBs (stored in M=1)
[Recall Total Time to ReadData =
AccessLatency * M + DataSize/ScanThroughput ]
Example
Data speed
| Access Latency for 1 Block (secs) (Same as Q3) | Scan Time for 1 TB (sec) with 100x more capacity? (Same numbers as Q3, divided by 100) |
RAM (D-RAM) | ~100 nanosec | 10 sec/100 |
High-end SSD | ~10 microsec | 200 sec/100 |
HDD | ~10 millisec | ~10,000/100sec |
Copy to RAM on Machines Z1..100 from RAM on Z101..200 | ~= 1.2 microsec | ~= 220/100 secs |
Q4: With 100 parallel machines? (100x RAM, 100x disks)
Example: Find a student, by scanning data
Find ‘Daffy’ from a DB of billion students (1 TB)
Design Choices
Data in RAM (Scan sequentially & filter) |
Data in HDD (no IO Blocks; rows in random spots) (Access each row. Look for Daffy. Repeat for every row) |
Data in 64MB-Blocks in HDD (Access each block. Scan rows, look for Daffy. Repeat per 64MB-block) |
(@100$/32GB) 3000$ |
(@25$/TB of disk) 25$ |
(@25$/TB of disk) 25$ |
(AccessLatency) 100 nsec + (Scan) 1000 GB / 100 GBps ~= 10 secs |
|
|
Storage Cost
Time
(AccessLatency) 10 msec * 1 Billion rows + (Scan) 1 TB /100 MBps
= 10^7 secs + 10^4 secs
~= 115 days
Number of 64MB-blocks = 1 TB/64MB = 15625
(AccessLatency) 10 msecs *15625 blocks + (Scan) 1 TB /100 MB-sec
= 10.15^4 secs ~= 3 hrs
In 2 weeks, we’ll see how to do this a lot faster (in msecs) with good Indexes (e.g, hashing)
Design1
Design2
Design3
Example: Find a student, by scanning data
Find ‘Daffy’ from a DB of billion students (1 TB)
Design Choices
Data in SSD (no IO Blocks; rows in random spots) (Access each row, look for Daffy; Repeat for every row) |
Data in SSD 64MB-Blocks (Access each block, scan rows, look for Daffy; Repeat per 64MB-block) |
(@150$/TB of SSD) 150$ |
(@150$/TB of SSD) 150$ |
|
|
Storage Cost
Time
(AccessLatency) 10 microsec * 1 Billion rows + (Scan) 1 TB /5GBps
= 10^4 secs + 200 secs
~= 1200 secs
Number of 64MB-blocks = 1 TB/64MB = 15625
(AccessLatency) 10 microsecs *15625 blocks + (Scan) 1 TB /5 GBps
= 10.15^4 secs
~= 200 secs
In 2 weeks, we’ll see how to do this a lot faster (in msecs) with good Indexes (e.g, hashing)
Design4
Design5
Clouds of machines
(Example datacenter
from 2:50 min)
Typical dedicated (non-shared) machine assumptions (unless problem states otherwise):
(e.g, ec2.r5 offers 1-3GBps network bandwidth, 2CPU/16GB RAM to 96 CPU/768GB RAM for $-$$$ in Nov’21) |
Cloud DBs
[Optional]
Data size | Size | Storage Cost | Compute Cost | Cost Range (per yr) |
Small | ~5 TB | ~$1300 | ~$50,000 | ~$25k-$75k |
Medium | ~50 TB | ~$14000 | ~$160,000 | ~$100k-$200k |
~200TB | ~$55,000$ | ~$400,000 | ~$300k-$500k |
Reference Snowflake [Jun’22]
(Google for full pricing calculators for BigQuery, RedShift, etc.)
Rough rule of thumb for data sizes
Ballpark Cloud DB costs
(*For popular Tables data.
Some apps use JSON (hierarchical data) in
nonSQL DBs – see later)
‘10-100x’ Lower costs in past 10 yrs.
Primary Driver? “db admin” team → Cloud management tools
Data Size | Hardware/ Storage | Algorithms | Languages + Libraries |
‘Tiny’ (< 10 GBs) | Store in RAM | ‘Usual’ CS Algorithms (sort, hash, ..) | Pandas (Python) + SQL (Spreadsheets for < 100 MBs) |
‘Small’ (10GB - 10TB) | Store in SSD | CS145 Algorithms | SQL on Cloud |
‘Big’ (> 10 TBs) | Store in HDD | CS145 Algorithms | SQL on Cloud |
In this Section