1 of 36

Midterm Review II

CS-202 Computer Systems

2 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

3 of 36

File system recap: inode


  • A persistent structure the file system assigns to each file
    • Each file has a unique inode number
  • An inode holds a file’s metadata
    • Permissions, length, access times
    • Locations of the file’s data blocks and indirect blocks
  • The inode table indexes the files and directories stored on disk

4 of 36

Pages (page table) vs. blocks in file system

                          Page                        Block
  Purpose                 Fixed-size chunks in DRAM   Fixed-size chunks on disk
  Manager                 OS memory management        File system (inodes, bitmaps, etc.)
  Size                    4 KB (also 2 MB, 1 GB)      512 bytes to 4 KB (depends on FS and storage device)
  Access mechanism        CPU via loads/stores        File IO operations (read/write)
  Permission granularity  Per page (PTE)              File-level and directory-level permissions

5 of 36

Peeking inside a partition (storage block)

  • Persistent storage is modeled as a sequence of N blocks
    • Numbered 0 to N-1 (here, N = 64)
    • Some blocks store data
    • Other blocks store metadata:
      • An array of inodes
      • Bitmaps tracking free inodes and free data blocks (free lists)
      • The boot block and superblock sit at the beginning of the partition

[Figure: 64-block partition — blocks 0–7 hold the boot block (B), superblock (S), inode bitmap (i), data bitmap (d) serving as free lists, and inode blocks (I); blocks 8–63 are data blocks (D)]

6 of 36

Inode table information

  • Stored in a reserved area on disk
    • The superblock records where the inode entries start and end

  • Fixed-size entries of 128 or 256 bytes
    • Efficient, predictable indexing: the location of inode i is directly computable
    • Each entry contains metadata and pointers to the file’s actual data blocks

[Figure: superblock (S) followed by the inode array (inode blocks) and data blocks; each fixed-size inode entry holds the inode number, permissions, file size, timestamps, and block pointers recording where the file’s data blocks live]
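The fields above can be sketched as a fixed-size record. A minimal sketch — field names and sizes are illustrative, not a real on-disk format:

```python
from dataclasses import dataclass, field
from typing import List

NDIRECT = 12  # direct block pointers per inode

@dataclass
class Inode:
    """Toy inode record mirroring the slide; not a real on-disk layout."""
    inumber: int                      # unique inode number
    mode: int                         # permissions
    size: int                         # file length in bytes
    mtime: int                        # one of several timestamps
    direct: List[int] = field(default_factory=lambda: [0] * NDIRECT)
    single_indirect: int = 0          # block number of a block full of pointers
    double_indirect: int = 0
    triple_indirect: int = 0

ino = Inode(inumber=7, mode=0o644, size=4096, mtime=0)
print(len(ino.direct))  # -> 12
```

Because every entry has the same size, the byte offset of inode i within the table is just i times the entry size.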

7 of 36

Data structure for multi-level indexing

  • An inode contains a fixed, asymmetric tree with fixed-size data blocks (e.g., 4 KB) as its leaves
    • The first 12 pointers point directly to data blocks
    • The last three point to intermediate (indirect) blocks:
      • #13: single indirect — points to a block containing pointers to data blocks
      • #14: double indirect pointer
      • #15: triple indirect pointer


inode

├─ Direct Pointer → Block (data)

├─ Direct Pointer → Block (data)

├─ Single-indirect pointer → [Pointer Block] → Blocks (data)

├─ Double-indirect pointer → [Pointer Block] → [Pointer Block] → Blocks (data)

└─ Triple-indirect pointer → [Pointer Block] → [Pointer Block] → [Pointer Block] → Blocks (data)

8 of 36

File allocation approach: Multi-level indexing

[Figure: inode array pointing to data blocks directly and via single-, double-, and triple-indirect blocks]

Coverage per pointer type (4 KB blocks, 1K pointers per indirect block):

  • Direct: 12 × 4 KB = 48 KB
  • Single indirect: 1K × 4 KB = 4 MB
  • Double indirect: 1K × 1K × 4 KB = 4 GB
  • Triple indirect: 1K × 1K × 1K × 4 KB = 4 TB
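These capacities follow directly from the pointer counts; a quick check, assuming 4 KB blocks and 4-byte block pointers (so 1K pointers fit in one pointer block):

```python
BLOCK = 4 * 1024           # 4 KB data blocks
PTRS = BLOCK // 4          # 4-byte pointers -> 1024 per pointer block

direct = 12 * BLOCK                  # 12 direct pointers in the inode
single = PTRS * BLOCK                # one block full of data pointers
double = PTRS * PTRS * BLOCK         # a block of pointer blocks
triple = PTRS ** 3 * BLOCK           # three levels of pointer blocks

KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40
print(direct // KB, "KB |", single // MB, "MB |",
      double // GB, "GB |", triple // TB, "TB")
# -> 48 KB | 4 MB | 4 GB | 4 TB
```

The maximum file size is the sum of all four regions, dominated by the triple-indirect level at roughly 4 TB.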

9 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

10 of 36

File operation: Reading from a file

  • First, must open the file
    • open("/cs202/w07", O_RDONLY)
    • Traverse the directory tree to find the inode for "w07"
    • Read that inode, check permissions, and return a file descriptor

  • Then, for each read() that is issued:
    • Read the inode
    • Read the appropriate data block (depending on the offset)
    • Update the last-access time in the inode
    • Update the file offset in the in-memory open-file table for the fd
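The "read the appropriate data block" step maps the byte offset to a block pointer. A sketch of that lookup, using the constants from the multi-level indexing slide (double/triple indirect levels omitted for brevity):

```python
BLOCK = 4096     # 4 KB blocks
NDIRECT = 12     # direct pointers in the inode
PTRS = 1024      # pointers per indirect block

def locate(offset):
    """Return which pointer a read at this byte offset follows (a sketch)."""
    bn = offset // BLOCK                 # logical block number within the file
    if bn < NDIRECT:
        return ("direct", bn)            # inode's direct pointer #bn
    bn -= NDIRECT
    if bn < PTRS:
        return ("single-indirect", bn)   # entry #bn of the pointer block
    raise ValueError("double/triple indirect not sketched")

print(locate(0))          # -> ('direct', 0)
print(locate(48 * 1024))  # first byte past the direct range -> ('single-indirect', 0)
```

Offsets past 48 KB cost an extra disk read, since the pointer block itself must be fetched before the data block.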


11 of 36

File operation: Writing to a file

  • Open the file: open("/cs202/w07", O_WRONLY) (assuming the file already exists)

  • Might need to allocate new blocks:
    • Each logical write can generate up to five IO operations:
      • Read the data-block bitmap to find a free block
      • Write the bitmap back to mark the block allocated
      • Read the file’s inode
      • Write the file’s inode to add a pointer to the new block
      • Write the new data block itself
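The first two of those IOs — read the bitmap, write it back — amount to a find-first-free scan. A minimal sketch, where an in-memory list of booleans stands in for the on-disk bitmap block:

```python
def alloc_block(bitmap):
    """Scan the data-block bitmap, mark the first free block used, return its number."""
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = True   # the write back to the bitmap block
            return i
    raise OSError("no free data blocks")

# Blocks 0-1 taken, 2-3 free.
bm = [True, True, False, False]
print(alloc_block(bm))  # -> 2
print(alloc_block(bm))  # -> 3
```

Real file systems pack the bitmap into bits of a block and often search from a hint to keep a file's blocks near each other, but the read-modify-write pattern is the same.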


12 of 36

File operation: Writing to a file

  • Creating a file involves even more operations:
    • Read and write the free inode bitmap
    • Write the new inode
    • Read and write the directory’s data to add the new entry

  • If the directory is full, new blocks must be allocated as well

12

13 of 36

Exercise: output of the program


14 of 36

Exercise: which inode blocks read/written

[Figure: inode blocks read/written by the program]

15 of 36

Exercise: which data blocks read/written

[Figure: data blocks read/written by the program]

16 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

17 of 36

Simplified journaling example

Goal: Write 10 to block 0 and 5 to block 1 atomically

This does not work: a crash between time units 1 and 2 leaves block 0 updated but block 1 stale

Time   Block 0   Block 1   Extra   Extra   Extra
  0       12        3        0       0       0
  1       10        3        0       0       0
  2       10        5        0       0       0

18 of 36

Simplified journaling example

Goal: Write 10 to block 0 and 5 to block 1 atomically

  • First, write the new values to the journal as a transaction
  • The valid block indicates the transaction committed successfully

Time   Block 0   Block 1   Block 0'   Block 1'   Valid
  0       12        3         0          0         0
  1       12        3        10          0         0
  2       12        3        10          5         0
  3       12        3        10          5         1
  4       10        3        10          5         1
  5       10        5        10          5         1
  6       10        5        10          5         0

The extra blocks (Block 0', Block 1', Valid) form the journal; Valid is the journal commit block, and the writes at times 1–3 are the journal transaction.
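The timeline above can be sketched as write-ahead logging over a toy "disk" (a dict; block names are illustrative, and each assignment stands for one block write):

```python
# Two data blocks plus three journal ("extra") blocks.
disk = {"b0": 12, "b1": 3, "j0": 0, "j1": 0, "valid": 0}

def atomic_write(v0, v1):
    disk["j0"] = v0          # time 1: journal copy of the new block 0
    disk["j1"] = v1          # time 2: journal copy of the new block 1
    disk["valid"] = 1        # time 3: commit -- transaction is now durable
    disk["b0"] = disk["j0"]  # times 4-5: checkpoint into the real blocks
    disk["b1"] = disk["j1"]
    disk["valid"] = 0        # time 6: done; journal entry reusable

def recover():
    """Run after a crash: replay the journal only if the commit block is set."""
    if disk["valid"]:
        disk["b0"] = disk["j0"]
        disk["b1"] = disk["j1"]
        disk["valid"] = 0

atomic_write(10, 5)
print(disk["b0"], disk["b1"])  # -> 10 5
```

The key property: the single-block write of the commit flag is the atomic point — before it, recovery ignores the journal; after it, recovery replays it, so the pair of data blocks never ends up half-updated.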

19 of 36

Simplified journaling example

Goal: Write 10 to block 0 and 5 to block 1 atomically

  • Crash before time unit 3 (commit not yet written): recovery ignores the transaction; old data remains
  • Crash at or after time unit 3 (Valid set): recovery replays the journal to install the new data
  • Crash after time unit 6: new data already in place; nothing to do

Time   Block 0   Block 1   Block 0'   Block 1'   Valid
  0       12        3         0          0         0
  1       12        3        10          0         0
  2       12        3        10          5         0
  3       12        3        10          5         1
  4       10        3        10          5         1
  5       10        5        10          5         1
  6       10        5        10          5         0

20 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

21 of 36

Canonical device

  • The OS configures and communicates with the device via an agreed-upon protocol
    • A device driver handles this communication
  • The device controller signals the CPU through either polling or interrupts

[Figure: canonical device — the controller exposes device registers (STATUS, CMD, DTA) to the OS; device internals include a microcontroller (CPU + RAM), storage (ROM/flash), and special-purpose chips]

// 1. spin until the device is not busy
while (STATUS == BUSY)
    ;
// 2. write data to the DTA register
*dtaRegister = DATA;
// 3. write command to the CMD register
*cmdRegister = COMMAND;
// 4. spin until the command completes
while (STATUS == BUSY)
    ;

Polling-based CPU-device communication

22 of 36

Three points to minimize CPU spinning

  • Busy waiting (steps 1 and 4) wastes CPU cycles
  • For efficiency, the CPU should avoid waiting for commands to complete
  • Solution: use interrupts
    • Sleep until the device is ready to accept a request (removes spin 1)
    • Sleep until the device signals completion (removes spin 4)

// 1. spin until the device is not busy
while (STATUS == BUSY)
    ;
// 2. write data to the DTA register
*dtaRegister = DATA;
// 3. write command to the CMD register
*cmdRegister = COMMAND;
// 4. spin until the command completes
while (STATUS == BUSY)
    ;

[Figure: timeline — the CPU spins through steps 1 and 4 while the disk services request A, delaying B and C]

23 of 36

Device protocol at a high level

Interrupts to the rescue!

  • The OS waits for an interrupt rather than spinning:
    • Interrupts are handled centrally through a dedicated OS thread
    • On interrupt arrival, wake up the kernel thread waiting on that interrupt
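The sleep/wakeup step can be sketched with an event: the requesting thread blocks instead of spinning, and the interrupt handler wakes it. Python's threading machinery stands in for the kernel's; names (`irq`, `issue_request`) are illustrative:

```python
import threading

irq = threading.Event()          # stands in for "the interrupt line" of one device

def interrupt_handler():
    irq.set()                    # wake whichever thread sleeps on this interrupt

def issue_request():
    irq.clear()
    # ... here the driver would write the device's DTA and CMD registers ...
    irq.wait()                   # sleep instead of spinning until completion
    return "done"

# The "device" completes the request 10 ms later and raises the interrupt.
threading.Timer(0.01, interrupt_handler).start()
print(issue_request())  # -> done
```

While the requester sleeps in `wait()`, the CPU is free to run other threads — exactly the win over polling shown in the timeline.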

[Figure: two timelines — with polling, the CPU spins while the disk services A; with interrupts, the CPU runs B and C and returns to A once the disk interrupts]

24 of 36

Transferring data to/from controller

  • PIO (programmed IO): the CPU tells the device what the data is
    • One instruction per byte/word transferred
    • Fine for a few bytes; consumes CPU cycles proportional to the data size
  • DMA (direct memory access): the CPU tells the device where the data is
    • The controller gets access to the memory bus
    • It transfers data to/from memory directly
    • Efficient for large transfers


25 of 36

Today’s focus

  • Multi-level indexing
  • File operations and their implementation
  • Journaling for crash consistency
  • IO
  • MLFQ

26 of 36

Multi-level feedback queue (MLFQ)

Goal: Supporting general purpose scheduling

Challenge: the scheduler must support both long-running background threads (batch processing) and low-latency foreground threads (interactive processes)

  • Batch thread: response time is not important; it wants long stretches of CPU time and few context switches
  • Interactive thread: response time is critical; it runs in short bursts, needing little total CPU but using it frequently

27 of 36

Multi-level feedback queue (MLFQ)

  • MLFQ objectives:
    • Optimize turnaround time
      • For batch jobs
      • Also helps short-running jobs
    • Minimize response time for better interactivity
  • It must do both without knowing jobs’ completion times in advance

28 of 36

Multi-level feedback queue (MLFQ)

  • MLFQ has a distinct set of queues
    • Each queue is assigned a different priority level
  • A job sits on exactly one queue at a time
  • A job in a higher-priority queue always runs first
  • Jobs in the same queue run round-robin

[Figure: queues ordered by decreasing priority, each with its own time slice]

29 of 36

Multi-level feedback queue (MLFQ)

  • MLFQ has a distinct set of queues
    • Each queue is assigned a different priority level
  • A job sits on exactly one queue at a time
  • A job in a higher-priority queue always runs first
  • Jobs in the same queue run round-robin

Rule 1: if priority (A) > priority (B) then A runs

Rule 2: if priority (A) == priority (B) then A, B run in RR


30 of 36

MLFQ changing priority of jobs

Rule 3: Jobs start at the top priority

Rule 4: If a job A uses up its entire time slice, the scheduler lowers A’s priority

  • Lets short jobs run first and finish earlier
  • Longer jobs get demoted as they repeatedly exhaust their time slices

→ MLFQ approximates SJF


31 of 36

Handling starved tasks

Starvation

  • If there are too many interactive jobs (which start at the highest priority level)…
  • …long-running jobs will never receive any CPU time

[Figure: timeline 0–200 across priority levels L1–L3 — jobs stuck at L3 never run]

32 of 36

Handling starved tasks

Starvation

  • If there are too many interactive jobs (which start at the highest priority level)…
  • …long-running jobs will never receive any CPU time

Rule 5: Periodically moves all threads to the topmost queue (priority boosting)

[Figure: the same timeline with periodic boosts — after each boost, demoted jobs briefly run again at L1]

33 of 36

MLFQ: All basic rules

Set of rules adjusts priorities dynamically

Rule 1: if priority (A) > priority (B) then A runs

Rule 2: if priority (A) == priority (B) then A, B run in RR

Rule 3: Threads start at top priority

Rule 4: If a thread A uses up its entire time slice, the scheduler lowers A’s priority

Rule 5: Periodically moves all threads to the topmost queue (priority boosting)
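Rules 1–5 can be sketched as a handful of queue manipulations. A toy model — real schedulers also track per-level time slices and accounting, omitted here:

```python
from collections import deque

class MLFQ:
    def __init__(self, levels=3):
        self.queues = [deque() for _ in range(levels)]

    def add(self, job):                  # Rule 3: new jobs enter at top priority
        self.queues[0].append(job)

    def pick(self):                      # Rules 1-2: front of the highest
        for q in self.queues:            # non-empty queue (round-robin order)
            if q:
                return q[0]
        return None

    def demote(self, job):               # Rule 4: job used up its whole slice
        for lvl, q in enumerate(self.queues):
            if job in q:
                q.remove(job)
                self.queues[min(lvl + 1, len(self.queues) - 1)].append(job)
                return

    def boost(self):                     # Rule 5: everyone back to the top queue
        for q in self.queues[1:]:
            while q:
                self.queues[0].append(q.popleft())

s = MLFQ()
s.add("batch"); s.add("interactive")
s.demote("batch")         # the batch job exhausted its time slice
print(s.pick())           # -> interactive (higher priority runs first)
```

After a `boost()`, the demoted batch job sits in the top queue again and gets its round-robin turn — which is exactly how starvation is avoided.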


34 of 36

MLFQ question

  • Every thread starts at L1 (Rule 3)
  • Every 10 ticks, all threads boosted to L1 (Rule 5)
  • A thread blocked on IO does not consume CPU cycles
    • Resumes execution after IO operation completes

Ticks per time slice (decreasing priority): L1 = 1, L2 = 2, L3 = 4

Thread ID   Arrival time   Length
    0            0         7
    1            0         3
    2            3         6
    3            3         0.99 + IO for 4 + 5

35 of 36

MLFQ question

Ticks per time slice (decreasing priority): L1 = 1, L2 = 2, L3 = 4

Thread ID   Arrival time   Length
    0            0         7
    1            0         3
    2            3         6
    3            3         0.99 + IO for 4 + 5

[Figure: execution timeline over 40 ticks across L1–L3, with priority boosts every 10 ticks]

36 of 36

MLFQ question

Q. When does thread 3 resume execution after being blocked?

  • Tick 12

Q. How does priority boosting every 10 ticks affect the execution order?

  • All threads get a chance to execute (avoids starvation)

Q. How does MLFQ balance short threads against long-running threads?

  • Higher priority levels let shorter jobs finish early

Q. If a thread repeatedly performs IO and avoids long CPU bursts, how does MLFQ treat it compared to CPU-bound jobs?

  • The IO-bound thread stays at a higher priority unless it uses up the time slice for that priority level

Thread ID   Arrival time   Length
    0            0         7
    1            0         3
    2            3         6
    3            3         0.99 + IO for 4 + 5