1 of 28

CSE 451

Operating Systems

L14 - Scheduling - 2

Slides by: Tom Anderson

Baris Kasikci

2 of 28

How can we improve round-robin? Multi-level Feedback Queue (MFQ)

  • Goals:
    • Responsiveness for latency-sensitive tasks
    • Low overhead
    • Starvation freedom
    • Support for high- and low-priority tasks
    • Fairness (among equal-priority tasks)
  • Not perfect at any of them!

3 of 28

MFQ

  • Set of Round Robin queues
    • Each queue has a separate priority
  • High-priority queues have short time slices
    • Low-priority queues have long time slices
  • Scheduler picks the first thread in the highest-priority queue
  • Tasks start in the highest-priority queue
    • If the time slice expires, the task drops one level
      • Hypothesis: the task is taking more time, so it's not as sensitive to timeliness
  • If low-priority tasks are being starved, bump them up a level
    • i.e., if a task isn't getting resources over a period of time
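The rules above can be sketched as a small simulator. This is an illustrative sketch only, not any real kernel's policy: the class name, the three-level depth, and the 10/20/40 ms quanta are all made up for the example.

```python
from collections import deque

NUM_LEVELS = 3
QUANTA = [10, 20, 40]  # ms per level; illustrative values, not from the lecture

class MFQ:
    def __init__(self):
        # One round-robin queue per priority level; index 0 is highest priority.
        self.queues = [deque() for _ in range(NUM_LEVELS)]

    def add_task(self, task):
        # New tasks start in the highest-priority queue.
        self.queues[0].append(task)

    def pick_next(self):
        # Run the first task in the highest-priority non-empty queue.
        for level, q in enumerate(self.queues):
            if q:
                return q.popleft(), QUANTA[level], level
        return None, None, None

    def time_slice_expired(self, task, level):
        # The task used its whole quantum: treat it as less latency-sensitive
        # and drop it one level (the bottom level stays put).
        self.queues[min(level + 1, NUM_LEVELS - 1)].append(task)

    def boost(self):
        # Anti-starvation: periodically move every waiting task back to the top.
        for q in self.queues[1:]:
            while q:
                self.queues[0].append(q.popleft())
```

A task that blocks for I/O before its quantum expires would simply be re-added at its current level (or the top) when it becomes runnable again; real MFQ variants differ on that detail.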

4 of 28

MFQ

5–12 of 28

Example: Workload with I/O and Compute Tasks (Multilevel Feedback Queues)

[Timeline figure, built up across slides 5–12: three tasks (I/O Bound, CPU Bound 1, CPU Bound 2) plotted over time. The I/O-bound task repeatedly starts and completes I/O, so it stays at high priority. When a CPU-bound task's time slice expires, it drops a priority level. When the I/O-bound task's I/O completes, it preempts the running CPU-bound task. CPU Bound 1 then continues where it left off, because its time slice was extended.]

13 of 28

MFQ Downside Example: Multilevel Feedback Queues and Starvation

[Timeline figure: five I/O-bound tasks (I/O Bound 1–5) keep cycling at high priority while a CPU-bound task sits at low priority. When the CPU-bound task is starved, it is bumped up to a higher priority level.]

At the limit, the CPU-bound task should get close to 1/6th of the CPU

14 of 28

How to allocate resources in a principled way? -> Max-Min Fairness

  • Maximize the minimum allocation to any task
      • Recursively schedule the remaining tasks the same way
  • Results:
      • If any task needs less than its fair share, it gets as much as it needs
      • Any leftover resources are divided evenly among the remaining tasks
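The recursion above is the classic "progressive filling" computation. A minimal sketch, with a hypothetical function name and toy demand numbers: tasks that need less than the equal share are satisfied exactly, and the leftover is re-split among the rest.

```python
def max_min_allocate(demands, capacity):
    """Max-min fair allocation: repeatedly give each unsatisfied task an
    equal share; tasks needing less than their share get exactly their
    demand, and the leftover is re-split among the remaining tasks."""
    alloc = {t: 0.0 for t in demands}
    remaining = dict(demands)
    while remaining and capacity > 1e-12:
        share = capacity / len(remaining)
        satisfied = [t for t, d in remaining.items() if d <= share]
        if not satisfied:
            # Everyone wants more than the equal share: split evenly, done.
            for t in remaining:
                alloc[t] += share
            break
        for t in satisfied:
            alloc[t] += remaining[t]
            capacity -= remaining[t]
            del remaining[t]
    return alloc

# Toy example: capacity 1.0, demands 0.2, 0.5, 0.6.
# Task a gets its full 0.2; the remaining 0.8 splits evenly: b and c get 0.4 each.
print(max_min_allocate({"a": 0.2, "b": 0.5, "c": 0.6}, 1.0))
```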

15 of 28

Fair Queuing: Max-Min Fairness

  • Applies to repeating tasks
    • Ex: a web server splits its time between compute and disk/network I/O
    • Ex: end hosts transferring packets over a network
  • Maximize the minimum allocation given to a task
    • Keep track of CPU usage over time
    • Schedule the task with the smallest accumulated CPU time
      • Max-min: maximize the minimum allocation given to any task
  • Result:
    • If any task needs less than its equal share, it gets its entire allocation
    • Any extra time is split evenly among the remaining tasks using max-min (recursion)
    • If all remaining tasks are compute-bound, split evenly
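The "smallest accumulated CPU time" rule can be sketched with a min-heap. This is a toy model with hypothetical names (`FairQueue`, `make_ready`, `run_next`), not a real scheduler; the caller re-adds a task with `make_ready` whenever it becomes runnable again.

```python
import heapq

class FairQueue:
    def __init__(self):
        self.ready = []   # min-heap of (accumulated_cpu_time, task)
        self.usage = {}   # total CPU time charged to each task so far

    def make_ready(self, task):
        # Tasks enter the heap keyed by their accumulated CPU time, so an
        # I/O-bound task (which accumulates little CPU) sorts near the front.
        self.usage.setdefault(task, 0.0)
        heapq.heappush(self.ready, (self.usage[task], task))

    def run_next(self, used):
        # Run the ready task with the least accumulated CPU time and
        # charge it for the CPU it actually consumed.
        _, task = heapq.heappop(self.ready)
        self.usage[task] += used
        return task
```

An I/O-bound task that wakes up after its I/O completes is scheduled almost immediately, because its accumulated CPU time is far below that of the compute-bound tasks.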

16 of 28

Example: Workload with I/O and Compute Tasks

  • Prioritizing I/O

[Timeline figure: one I/O-bound task and two CPU-bound tasks. The I/O-bound task issues an I/O request, the I/O completes, and it issues its next request; with I/O prioritized, it runs as soon as each I/O completes.]

17 of 28

Example: Workload with I/O and Compute Tasks

  • Fair queuing

[Timeline figure: the same workload under fair queuing. Each time its I/O completes, the I/O-bound task is scheduled promptly (it has the least accumulated CPU time) and then issues its next I/O request.]

18 of 28

Weighted Fair Queueing

  • What if some tasks are more important than others?
  • With priority scheduling, the higher-priority task is always scheduled
    • Lower-priority tasks can wait forever (starve) if the high-priority task is large
  • Weighted fair queueing: assign tasks weights that sum to 1
    • Schedule the task with the least accumulated CPU usage/weight
    • Every task gets *some* portion of the CPU
    • Give latency-sensitive tasks extra weight
  • Example: tasks A, B, C with weights A: 0.5, B: 0.25, C: 0.25
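A rough simulation of the usage/weight rule, using the example's weights (A: 0.5, B: 0.25, C: 0.25); the 10 ms quantum and the loop length are arbitrary choices for illustration.

```python
# Weights from the example: A gets half the CPU, B and C a quarter each.
weights = {"A": 0.5, "B": 0.25, "C": 0.25}
vtime = {t: 0.0 for t in weights}      # accumulated CPU usage / weight
cpu_time = {t: 0.0 for t in weights}   # raw CPU time each task received

QUANTUM = 10  # ms, arbitrary
for _ in range(40):
    task = min(vtime, key=vtime.get)        # least usage/weight runs next
    cpu_time[task] += QUANTUM
    vtime[task] += QUANTUM / weights[task]  # bigger weight => charged less per ms

total = sum(cpu_time.values())
print({t: cpu_time[t] / total for t in cpu_time})
# → {'A': 0.5, 'B': 0.25, 'C': 0.25}
```

Over the 40 quanta, A runs twice as often as B or C, matching the weights exactly, and no task starves.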

19 of 28

Uniprocessor Summary (1)

  • FIFO is simple and minimizes overhead.
  • If tasks are variable in size, then FIFO can have very poor average response time.
  • If tasks are equal in size, FIFO is optimal in terms of average response time.
  • SJF is optimal in terms of average response time.
  • SJF is pessimal in terms of variance in response time.

20 of 28

Uniprocessor Summary (2)

  • If tasks are variable in size, Round Robin approximates SJF.
  • If tasks are equal in size, Round Robin will have very poor average response time.
  • Tasks that intermix processor and I/O benefit from SJF and can do poorly under Round Robin, especially with a long quantum.
  • MFQ attempts to prioritize latency-sensitive tasks
    • MFQ approximates SJF => high variance for long jobs
    • Under high load, MFQ approximates Round Robin (everyone ends up at the highest priority)
  • Fair queueing improves response time for I/O-bound tasks by equalizing CPU usage over a long period
    • Strategy-proof: every task gets an equal share of the CPU if it needs it

21 of 28

Multiprocessor Scheduling

  • What would happen if we used multilevel feedback or fair queueing on a multiprocessor?
    • Contention for the scheduler spinlock
    • Cache slowdown:
      • A task's data from the last time it ran is still in its old CPU's cache => will it run there again?
      • The ready list (or ptable) data structure pings from one CPU's cache to another

22 of 28

Per-Processor Affinity Scheduling

  • Each processor has its own ready list
    • Protected by a per-processor spinlock
  • Put a thread back on the ready list of the processor where it most recently ran
    • Ex: when its I/O completes, or on Condition->signal
  • Idle processors can steal work from other processors
    • The spinlock allows synchronizing across CPUs
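The per-processor ready lists plus work stealing can be sketched as follows; this is an illustrative toy (the class and method names are invented, and Python locks stand in for kernel spinlocks), not any real kernel's code.

```python
import threading
from collections import deque

class PerCPUScheduler:
    def __init__(self, num_cpus):
        # One ready list per CPU, each protected by its own lock, so CPUs
        # normally don't contend with each other on the common path.
        self.ready = [deque() for _ in range(num_cpus)]
        self.locks = [threading.Lock() for _ in range(num_cpus)]

    def enqueue(self, cpu, thread):
        # Put the thread back on the ready list of the CPU it last ran on,
        # where its cached data is likely still warm.
        with self.locks[cpu]:
            self.ready[cpu].append(thread)

    def pick_next(self, cpu):
        with self.locks[cpu]:
            if self.ready[cpu]:
                return self.ready[cpu].popleft()
        # Local list empty: an idle CPU steals from another CPU's list.
        for other in range(len(self.ready)):
            if other == cpu:
                continue
            with self.locks[other]:
                if self.ready[other]:
                    return self.ready[other].pop()  # steal from the tail
        return None  # nothing to run: go idle
```

Stealing from the tail (rather than the head) is a common choice: the head task is the one most likely to have warm cache state on its home CPU.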

23 of 28

Per-Processor Multi-level Feedback with Affinity Scheduling

Need rebalancing in case CPU 1 has a lot of high-priority tasks

[Figure: per-CPU multilevel feedback queues, each protected by its own lock.]

24 of 28

Bulk Synchronous Model of Parallelism

  • Loop at each processor:
    • Compute on local data (in parallel)
    • Barrier
    • Send (selected) data to other processors (in parallel)
    • Barrier
  • Examples:
    • Machine learning training and inference
    • MapReduce
    • Fluid flow over a wing
  • Most parallel algorithms can be recast in BSP
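The compute/barrier/communicate loop above can be sketched with Python's `threading.Barrier`; the workload here (doubling a number and mailing it to a neighbor) is a toy stand-in for real computation and data exchange.

```python
import threading

NUM_WORKERS = 4
barrier = threading.Barrier(NUM_WORKERS)
local = [i + 1 for i in range(NUM_WORKERS)]   # each worker's local data
shared = [0] * NUM_WORKERS                    # one "mailbox" per worker

def worker(i):
    for _ in range(2):                        # two BSP supersteps
        value = local[i] * 2                  # compute on local data (in parallel)
        barrier.wait()                        # wait for everyone to finish computing
        shared[(i + 1) % NUM_WORKERS] = value # send data to the next worker
        barrier.wait()                        # wait for everyone to finish sending
        local[i] = shared[i]                  # receive for the next superstep

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(local)  # → [12, 16, 4, 8]
```

The barriers are what make the model "bulk synchronous": no worker reads a mailbox until every worker has finished writing, so the fastest thread always waits for the slowest.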

25 of 28

Bulk Synchronous in Action

CPUs do uneven work -> stragglers; everyone waits at the barrier; the slowest thread determines progress

26 of 28

Scheduling Parallel Programs

  • What happens if one thread gets time-sliced while other threads from the same program are still running?
    • Assuming the program uses locks and condition variables, it will still be correct

27 of 28

Parallel Program Speedup

[Speedup-vs-number-of-processors figure, three curves:]

  • Little to no overhead to running in parallel; all CPUs used
  • Some overhead; some CPUs underused
  • Can make use of additional CPUs, but only up to a point (lock contention, tail latency)

28 of 28

Midterm Feedback

  • Lecture topics seem disjointed and hard to follow; the cognitive load is heavy
    • We will work on improving coherence
    • Topic summaries that connect the presented concepts
  • Lectures move simultaneously fast and slow
    • We will recalibrate to go slower on harder concepts
  • Psets are misaligned with lectures and are too lengthy
    • Psets will be fully aligned with lectures
    • Pset length will be reduced by 30%-40% compared to prior years
  • Slides need improvement
    • We will make them complete
      • Please check the slide notes
    • We will replace hand-written diagrams with clearer ones
  • Responsiveness is good, but can be improved by not breaking the flow
    • We will repeat questions to make them audible
  • The midterm and final create stress
    • Our goal is for everyone to be successful & learn OS principles