1 of 36

Midterm Review II

CS-202 Computer Systems

2 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

3 of 36

File system recap: inode


  • A persistent structure the file system assigns to each file
    • Each file has a unique inode number
  • An inode holds a file’s metadata
    • Permissions, length, access times
    • Locations of the file’s data blocks and indirect blocks
  • The inode table indexes the files and directories stored on disk

4 of 36

Pages (page table) vs. blocks in file system

                          Page                        Block
  Purpose                 Fixed-size chunks in DRAM   Fixed-size chunks on disk
  Manager                 OS memory management        File system (inodes, bitmaps, etc.)
  Size                    4 KB (also 2 MB, 1 GB)      512 bytes to 4 KB (depends on FS and storage device)
  Access mechanism        CPU via loads/stores        File IO operations (read/write)
  Permission granularity  Per page (PTE)              File-level and directory-level permissions

5 of 36

Peeking inside a partition (storage block)

  • Persistent storage is modeled as a sequence of N blocks
    • Numbered 0 to N-1 (here, N = 64)
    • Some blocks store data
    • Other blocks store metadata:
      • An array of inodes
      • Bitmaps tracking free inodes and free data blocks (free lists)
      • The boot block and superblock sit at the beginning of the partition

[Figure: 64-block partition — blocks 0–7 hold the boot block (B), superblock (S), inode bitmap (i), data bitmap (d) serving as free lists, and inode blocks (I); blocks 8–63 are data blocks (D)]

6 of 36

Inode table information

  • Stored in a reserved area on disk
    • The superblock records where the inode entries start and end

  • Fixed-size entries of 128 or 256 bytes
    • Efficient, predictable indexing: the location of inode i is directly computable
    • Each entry contains metadata and pointers to the file’s actual data blocks

[Figure: superblock (S) followed by the inode array (inode blocks) and data blocks; each fixed-size inode entry holds the inode number, permissions, file size, timestamps, and block pointers recording where the file’s data blocks live]
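The fields above can be sketched as a fixed-size record. A minimal sketch — field names and sizes are illustrative, not a real on-disk format:

```python
from dataclasses import dataclass, field
from typing import List

NDIRECT = 12  # direct block pointers per inode

@dataclass
class Inode:
    """Toy inode record mirroring the slide; not a real on-disk layout."""
    inumber: int                      # unique inode number
    mode: int                         # permissions
    size: int                         # file length in bytes
    mtime: int                        # one of several timestamps
    direct: List[int] = field(default_factory=lambda: [0] * NDIRECT)
    single_indirect: int = 0          # block number of a block full of pointers
    double_indirect: int = 0
    triple_indirect: int = 0

ino = Inode(inumber=7, mode=0o644, size=4096, mtime=0)
print(len(ino.direct))  # -> 12
```

Because every entry has the same size, the byte offset of inode i within the table is just i times the entry size.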

7 of 36

Data structure for multi-level indexing

  • An inode contains a fixed, asymmetric tree with fixed-size data blocks (e.g., 4 KB) as its leaves
    • The first 12 pointers point directly to data blocks
    • The last three point to intermediate (indirect) blocks:
      • #13: single indirect — points to a block containing pointers to data blocks
      • #14: double indirect pointer
      • #15: triple indirect pointer


inode

├─ Direct Pointer → Block (data)

├─ Direct Pointer → Block (data)

├─ Single-indirect pointer → [Pointer Block] → Blocks (data)

├─ Double-indirect pointer → [Pointer Block] → [Pointer Block] → Blocks (data)

└─ Triple-indirect pointer → [Pointer Block] → [Pointer Block] → [Pointer Block] → Blocks (data)

8 of 36

File allocation approach: Multi-level indexing

[Figure: inode array pointing to data blocks directly and via single-, double-, and triple-indirect blocks]

Coverage per pointer type (4 KB blocks, 1K pointers per indirect block):

  • Direct: 12 × 4 KB = 48 KB
  • Single indirect: 1K × 4 KB = 4 MB
  • Double indirect: 1K × 1K × 4 KB = 4 GB
  • Triple indirect: 1K × 1K × 1K × 4 KB = 4 TB
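These capacities follow directly from the pointer counts; a quick check, assuming 4 KB blocks and 4-byte block pointers (so 1K pointers fit in one pointer block):

```python
BLOCK = 4 * 1024           # 4 KB data blocks
PTRS = BLOCK // 4          # 4-byte pointers -> 1024 per pointer block

direct = 12 * BLOCK                  # 12 direct pointers in the inode
single = PTRS * BLOCK                # one block full of data pointers
double = PTRS * PTRS * BLOCK         # a block of pointer blocks
triple = PTRS ** 3 * BLOCK           # three levels of pointer blocks

KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40
print(direct // KB, "KB |", single // MB, "MB |",
      double // GB, "GB |", triple // TB, "TB")
# -> 48 KB | 4 MB | 4 GB | 4 TB
```

The maximum file size is the sum of all four regions, dominated by the triple-indirect level at roughly 4 TB.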

9 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

10 of 36

File operation: Reading from a file

  • First, must open the file
    • open("/cs202/w07", O_RDONLY)
    • Traverse the directory tree to find the inode for "w07"
    • Read that inode, check permissions, and return a file descriptor

  • Then, for each read() that is issued:
    • Read the inode
    • Read the appropriate data block (depending on the offset)
    • Update the last-access time in the inode
    • Update the file offset in the in-memory open-file table for the fd
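The "read the appropriate data block" step maps the byte offset to a block pointer. A sketch of that lookup, using the constants from the multi-level indexing slide (double/triple indirect levels omitted for brevity):

```python
BLOCK = 4096     # 4 KB blocks
NDIRECT = 12     # direct pointers in the inode
PTRS = 1024      # pointers per indirect block

def locate(offset):
    """Return which pointer a read at this byte offset follows (a sketch)."""
    bn = offset // BLOCK                 # logical block number within the file
    if bn < NDIRECT:
        return ("direct", bn)            # inode's direct pointer #bn
    bn -= NDIRECT
    if bn < PTRS:
        return ("single-indirect", bn)   # entry #bn of the pointer block
    raise ValueError("double/triple indirect not sketched")

print(locate(0))          # -> ('direct', 0)
print(locate(48 * 1024))  # first byte past the direct range -> ('single-indirect', 0)
```

Offsets past 48 KB cost an extra disk read, since the pointer block itself must be fetched before the data block.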


11 of 36

File operation: Writing to a file

  • Open the file: open("/cs202/w07", O_WRONLY) (assuming the file already exists)

  • Might need to allocate new blocks:
    • Each logical write can generate up to five IO operations:
      • Read the data-block bitmap to find a free block
      • Write the bitmap back to mark the block allocated
      • Read the file’s inode
      • Write the file’s inode to add a pointer to the new block
      • Write the new data block itself
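The first two of those IOs — read the bitmap, write it back — amount to a find-first-free scan. A minimal sketch, where an in-memory list of booleans stands in for the on-disk bitmap block:

```python
def alloc_block(bitmap):
    """Scan the data-block bitmap, mark the first free block used, return its number."""
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = True   # the write back to the bitmap block
            return i
    raise OSError("no free data blocks")

# Blocks 0-1 taken, 2-3 free.
bm = [True, True, False, False]
print(alloc_block(bm))  # -> 2
print(alloc_block(bm))  # -> 3
```

Real file systems pack the bitmap into bits of a block and often search from a hint to keep a file's blocks near each other, but the read-modify-write pattern is the same.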


12 of 36

File operation: Writing to a file

  • Creating a file involves even more operations:
    • Read and write the free inode bitmap
    • Write the new inode
    • Read and write the directory’s data to add the new entry

  • If the directory is full, new blocks must be allocated as well

12

13 of 36

Exercise: output of the program


14 of 36

Exercise: which inode blocks read/written

[Figure: inode blocks read/written by the program]

15 of 36

Exercise: which data blocks read/written

[Figure: data blocks read/written by the program]

16 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

17 of 36

Simplified journaling example

Goal: Write 10 to block 0 and 5 to block 1 atomically

This does not work: a crash between time units 1 and 2 leaves block 0 updated but block 1 stale

Time   Block 0   Block 1   Extra   Extra   Extra
  0       12        3        0       0       0
  1       10        3        0       0       0
  2       10        5        0       0       0

18 of 36

Simplified journaling example

Goal: Write 10 to block 0 and 5 to block 1 atomically

  • First, write the new values to the journal as a transaction
  • The valid block indicates the transaction committed successfully

Time   Block 0   Block 1   Block 0'   Block 1'   Valid
  0       12        3         0          0         0
  1       12        3        10          0         0
  2       12        3        10          5         0
  3       12        3        10          5         1
  4       10        3        10          5         1
  5       10        5        10          5         1
  6       10        5        10          5         0

The extra blocks (Block 0', Block 1', Valid) form the journal; Valid is the journal commit block, and the writes at times 1–3 are the journal transaction.
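The timeline above can be sketched as write-ahead logging over a toy "disk" (a dict; block names are illustrative, and each assignment stands for one block write):

```python
# Two data blocks plus three journal ("extra") blocks.
disk = {"b0": 12, "b1": 3, "j0": 0, "j1": 0, "valid": 0}

def atomic_write(v0, v1):
    disk["j0"] = v0          # time 1: journal copy of the new block 0
    disk["j1"] = v1          # time 2: journal copy of the new block 1
    disk["valid"] = 1        # time 3: commit -- transaction is now durable
    disk["b0"] = disk["j0"]  # times 4-5: checkpoint into the real blocks
    disk["b1"] = disk["j1"]
    disk["valid"] = 0        # time 6: done; journal entry reusable

def recover():
    """Run after a crash: replay the journal only if the commit block is set."""
    if disk["valid"]:
        disk["b0"] = disk["j0"]
        disk["b1"] = disk["j1"]
        disk["valid"] = 0

atomic_write(10, 5)
print(disk["b0"], disk["b1"])  # -> 10 5
```

The key property: the single-block write of the commit flag is the atomic point — before it, recovery ignores the journal; after it, recovery replays it, so the pair of data blocks never ends up half-updated.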

19 of 36

Simplified journaling example

Goal: Write 10 to block 0 and 5 to block 1 atomically

  • Crash before time unit 3 (commit not yet written): recovery ignores the transaction; old data remains
  • Crash at or after time unit 3 (Valid set): recovery replays the journal to install the new data
  • Crash after time unit 6: new data already in place; nothing to do

Time   Block 0   Block 1   Block 0'   Block 1'   Valid
  0       12        3         0          0         0
  1       12        3        10          0         0
  2       12        3        10          5         0
  3       12        3        10          5         1
  4       10        3        10          5         1
  5       10        5        10          5         1
  6       10        5        10          5         0

20 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

21 of 36

Canonical device

  • The OS configures and communicates with the device via an agreed-upon protocol
    • A device driver handles this communication
  • The device controller signals the CPU through either polling or interrupts

[Figure: canonical device — the controller exposes device registers (STATUS, CMD, DTA) to the OS; device internals include a microcontroller (CPU + RAM), storage (ROM/flash), and special-purpose chips]

// 1. spin until the device is not busy
while (STATUS == BUSY)
    ;
// 2. write data to the DTA register
*dtaRegister = DATA;
// 3. write command to the CMD register
*cmdRegister = COMMAND;
// 4. spin until the command completes
while (STATUS == BUSY)
    ;

Polling-based CPU-device communication

22 of 36

Three points to minimize CPU spinning

  • Busy waiting (steps 1 and 4) wastes CPU cycles
  • For efficiency, the CPU should avoid waiting for commands to complete
  • Solution: use interrupts
    • Sleep until the device is ready to accept a request (removes spin 1)
    • Sleep until the device signals completion (removes spin 4)

// 1. spin until the device is not busy
while (STATUS == BUSY)
    ;
// 2. write data to the DTA register
*dtaRegister = DATA;
// 3. write command to the CMD register
*cmdRegister = COMMAND;
// 4. spin until the command completes
while (STATUS == BUSY)
    ;

[Figure: timeline — the CPU spins through steps 1 and 4 while the disk services request A, delaying B and C]

23 of 36

Device protocol at a high level

Interrupts to the rescue!

  • The OS waits for an interrupt rather than spinning:
    • Interrupts are handled centrally through a dedicated OS thread
    • On interrupt arrival, wake up the kernel thread waiting on that interrupt
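The sleep/wakeup step can be sketched with an event: the requesting thread blocks instead of spinning, and the interrupt handler wakes it. Python's threading machinery stands in for the kernel's; names (`irq`, `issue_request`) are illustrative:

```python
import threading

irq = threading.Event()          # stands in for "the interrupt line" of one device

def interrupt_handler():
    irq.set()                    # wake whichever thread sleeps on this interrupt

def issue_request():
    irq.clear()
    # ... here the driver would write the device's DTA and CMD registers ...
    irq.wait()                   # sleep instead of spinning until completion
    return "done"

# The "device" completes the request 10 ms later and raises the interrupt.
threading.Timer(0.01, interrupt_handler).start()
print(issue_request())  # -> done
```

While the requester sleeps in `wait()`, the CPU is free to run other threads — exactly the win over polling shown in the timeline.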

[Figure: two timelines — with polling, the CPU spins while the disk services A; with interrupts, the CPU runs B and C and returns to A once the disk interrupts]

24 of 36

Transferring data to/from controller

  • PIO (programmed IO): the CPU tells the device what the data is
    • One instruction per byte/word transferred
    • Fine for a few bytes; consumes CPU cycles proportional to the data size
  • DMA (direct memory access): the CPU tells the device where the data is
    • The controller gets access to the memory bus
    • It transfers data to/from memory directly
    • Efficient for large transfers


25 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

26 of 36

Multi-level feedback queue (MLFQ)

Goal: Supporting general purpose scheduling

Challenge: the scheduler must support both long-running background threads (batch processing) and low-latency foreground threads (interactive processes)

  • Batch thread: response time is not important; it wants long stretches of CPU time and few context switches
  • Interactive thread: response time is critical; it runs in short bursts, needing little total CPU but using it frequently

27 of 36

Multi-level feedback queue (MLFQ)

  • MLFQ objectives:
    • Optimize turnaround time
      • For batch jobs
      • Also helps short-running jobs
    • Minimize response time for better interactivity
  • It must do both without knowing jobs’ completion times in advance

28 of 36

Multi-level feedback queue (MLFQ)

  • MLFQ has a distinct set of queues
    • Each queue is assigned a different priority level
  • A job sits on exactly one queue at a time
  • A job in a higher-priority queue always runs first
  • Jobs in the same queue run round-robin

[Figure: queues ordered by decreasing priority, each with its own time slice]

29 of 36

Multi-level feedback queue (MLFQ)

  • MLFQ has a distinct set of queues
    • Each queue is assigned a different priority level
  • A job sits on exactly one queue at a time
  • A job in a higher-priority queue always runs first
  • Jobs in the same queue run round-robin

Rule 1: if priority (A) > priority (B) then A runs

Rule 2: if priority (A) == priority (B) then A, B run in RR


30 of 36

MLFQ changing priority of jobs

Rule 3: Jobs start at the top priority

Rule 4: If a job A uses up its entire time slice, the scheduler lowers A’s priority

  • Lets short jobs run first and finish earlier
  • Longer jobs get demoted as they repeatedly exhaust their time slices

→ MLFQ approximates SJF


31 of 36

Handling starved tasks

Starvation

  • If there are too many interactive jobs (which start at the highest priority level)…
  • …long-running jobs will never receive any CPU time

[Figure: timeline 0–200 across priority levels L1–L3 — jobs stuck at L3 never run]

32 of 36

Handling starved tasks

Starvation

  • If there are too many interactive jobs (which start at the highest priority level)…
  • …long-running jobs will never receive any CPU time

Rule 5: Periodically moves all threads to the topmost queue (priority boosting)

[Figure: the same timeline with periodic boosts — after each boost, demoted jobs briefly run again at L1]

33 of 36

MLFQ: All basic rules

Set of rules adjusts priorities dynamically

Rule 1: if priority (A) > priority (B) then A runs

Rule 2: if priority (A) == priority (B) then A, B run in RR

Rule 3: Threads start at top priority

Rule 4: If a thread A uses up its entire time slice, the scheduler lowers A’s priority

Rule 5: Periodically moves all threads to the topmost queue (priority boosting)
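Rules 1–5 can be sketched as a handful of queue manipulations. A toy model — real schedulers also track per-level time slices and accounting, omitted here:

```python
from collections import deque

class MLFQ:
    def __init__(self, levels=3):
        self.queues = [deque() for _ in range(levels)]

    def add(self, job):                  # Rule 3: new jobs enter at top priority
        self.queues[0].append(job)

    def pick(self):                      # Rules 1-2: front of the highest
        for q in self.queues:            # non-empty queue (round-robin order)
            if q:
                return q[0]
        return None

    def demote(self, job):               # Rule 4: job used up its whole slice
        for lvl, q in enumerate(self.queues):
            if job in q:
                q.remove(job)
                self.queues[min(lvl + 1, len(self.queues) - 1)].append(job)
                return

    def boost(self):                     # Rule 5: everyone back to the top queue
        for q in self.queues[1:]:
            while q:
                self.queues[0].append(q.popleft())

s = MLFQ()
s.add("batch"); s.add("interactive")
s.demote("batch")         # the batch job exhausted its time slice
print(s.pick())           # -> interactive (higher priority runs first)
```

After a `boost()`, the demoted batch job sits in the top queue again and gets its round-robin turn — which is exactly how starvation is avoided.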


34 of 36

MLFQ question

  • Every thread starts at L1 (Rule 3)
  • Every 10 ticks, all threads boosted to L1 (Rule 5)
  • A thread blocked on IO does not consume CPU cycles
    • Resumes execution after IO operation completes

Ticks per time slice (decreasing priority): L1 = 1, L2 = 2, L3 = 4

Thread ID   Arrival time   Length
    0            0         7
    1            0         3
    2            3         6
    3            3         0.99 + IO for 4 + 5

35 of 36

MLFQ question

Ticks per time slice (decreasing priority): L1 = 1, L2 = 2, L3 = 4

Thread ID   Arrival time   Length
    0            0         7
    1            0         3
    2            3         6
    3            3         0.99 + IO for 4 + 5

[Figure: execution timeline over 40 ticks across L1–L3, with priority boosts every 10 ticks]

36 of 36

MLFQ question

Q. When does thread 3 resume execution after being blocked?

  • Tick 12

Q. How does priority boosting every 10 ticks affect the execution order?

  • All threads get a chance to execute (avoids starvation)

Q. How does MLFQ balance short threads against long-running threads?

  • Higher priority levels let shorter jobs finish early

Q. If a thread repeatedly performs IO and avoids long CPU bursts, how does MLFQ treat it compared to CPU-bound jobs?

  • The IO-bound thread stays at a higher priority unless it uses up the time slice for that priority level

Thread ID   Arrival time   Length
    0            0         7
    1            0         3
    2            3         6
    3            3         0.99 + IO for 4 + 5