1 of 51

The Blink Scheduler

London is minding your tasks

BlinkOn 3, Nov 5th 2014

{alexclarke, petrcermak, picksi, rmcilroy, skyostil}@chromium.org

2 of 51

The main problem: Traffic jams

  • High priority tasks (e.g. user input) can get stuck behind slow tasks.
  • Delays of up to 1 second when scrolling

3 of 51

The main problem: Traffic jams

4 of 51

The main problem: Traffic jams

Compositor task

posted here

...and executed here, after 1 second

5 of 51

Issue: How do we measure traffic jams?

queueing_durations track how long compositor tasks are stuck in the message loop queue.

20 ms median delay

6 of 51

Issue: How do we measure traffic jams?

mean_input_event_latency tracks how long user input takes to hit the GPU.

50 ms median delay!

ChromiumPerf/android-nexus4/smoothness.sync_scroll.key_mobile_sites/mean_input_event_latency

7 of 51

Task sources

Renderer main thread tasks come from many sources:

IPC messages

Main thread

MessageLoop

Blink shared timers

Input events from cc

BeginFrames from cc

DOM timers

Input events

v8 GC

callOnMain

HTML parsing

Network loads

8 of 51

Goal: Perfect Scheduling

  • Perfect scheduling makes sure the right things get done at the right time.

  • We’re doing the same stuff, just in a sensible order.

  • Enable well-written content to take advantage of good scheduling.

9 of 51

All about scheduling

10 of 51

Pre-Scheduler world

Task 1

Task 3

New Task

Task 2

11 of 51

Pre-Scheduler world

Task 1

Task 3

Task 4

Task 2

12 of 51

Pre-Scheduler world

Task 1 taken from the front of the queue and executed.

Task 1

Task 3

Task 4

Task 2

13 of 51

Pre-Scheduler world

Task 1

Task 3

◕︵◕

Task 2

A new task will need to wait until all the preceding tasks have been executed.

14 of 51

Scheduler world

Task 1

Task 3

New Task

Task 2

Fast Lane

Ideally we’d like to be able to fast-track high priority tasks.

15 of 51

Scheduler world

Task 1

Task 3

Task 2

Task 4

16 of 51

Scheduler world

Task 1

Task 3

Task 2

Task 1

\(• ◡ •)/

17 of 51

Multiple queues give more flexibility

Task 1

Task 3

Task 1

Task 4

Task 2

18 of 51

Issue: Maintaining execution order

Most tasks need to have their relative execution order maintained.

Only reorder tasks that have opted into prioritization.

  • Can’t composite the results of input processing before the input has been processed

  • Ordering assumptions baked into existing code

19 of 51

Solution: Tasks are grouped by type

  • Maintains relative execution order

  • Provides a clear API for posting tasks

Default

Compositor

Idle

20 of 51

Solution: Tasks are grouped by type

Default

Compositor

Idle

Always prioritise compositor?

Priority

21 of 51

Issue: Static prioritisation doesn’t work

Always prioritising compositor tasks regressed page loading time by 14%.

22 of 51

Solution: Dynamic prioritisation

Use contextual awareness to determine prioritisation policy.

  • User interaction makes:
    • Compositor tasks more important
    • Resource loading less important
  • When finished drawing a frame, allow�background tasks:
    • pro-active GC
    • background HTML parsing

23 of 51

Policy selection

Events will change the active policy:

  • User input events Prioritize Compositor
  • Compositor committed frame Idle
  • Compositor begin frame Normal
  • Page navigation Page Load

24 of 51

Queue priorities set by policy

Default

CC

Idle

Compositor

Normal

Loading

Default

Idle

CC

Default

Idle

CC

Priority

25 of 51

Issue: Task starvation

We need a scheduler which cannot be ‘gamed’ by sociopathic tasks to completely starve out other tasks.

26 of 51

Issue: Task starvation

CC

CC

CC

CC

CC

CC

CC

CC

CC

27 of 51

Solution: Weighted round robin

to avoid starvation

CC

CC

CC

CC

CC

CC

CC

CC

Default

CC

CC

CC

CC

Default

CC

CC

CC

28 of 51

Architecture

29 of 51

Finding a home for the Scheduler

  • Re-implemented much of the message loop
  • Posting tasks required a lot of Blink-Chromium plumbing
  • Task execution required Russian doll closures

Version 1 implementation was in Blink

30 of 51

Finding a home for the Scheduler

  • Required low level changes in base
  • Changing mission critical but crufty code
  • Changes would be inherited by all message loops

Version 2 implementation was in base/message_loop

31 of 51

Finding a home for the Scheduler

  • Sits on top of base::MessageLoop
  • Lets us use more powerful primitives from base
  • Avoids crossing layers unnecessarily

Version 3 (current) lives in content/renderer/scheduler/

32 of 51

Scheduler architecture overview

  • Scheduler provides N task queues
  • Policies determine which task queues are prioritised
  • System behaviour determines policies

33 of 51

Scheduler architecture overview

Three key elements�

  1. API for posting to different queues
  2. Interface that selects the next queue to process
  3. Mechanism for changing policies

34 of 51

Architecture

base::MessageLoop

Task

Task

Task

...

MessageLoop

Proxy

35 of 51

Architecture: Posting API

TaskQueueManager

Incoming

task

queue

Work

queue

Incoming

task

queue

Work

queue

DefaultTaskRunner

Compositor

TaskRunner

base::MessageLoop

Sched.

task

Task

Sched.

task

Scheduling entry point

Other tasks that aren’t scheduled (e.g. IPC)

...

36 of 51

Architecture: Posting API

TaskQueueManager

Incoming

task

queue

Work

queue

Incoming

task

queue

Work

queue

base::MessageLoop

Sched.

task

Task

Sched.

task

Scheduling entry point

Other tasks that aren’t scheduled (e.g. IPC)

...

DefaultTask

Runner

Chromium:

DefaultTaskRunner()->PostTask()

CompositorTaskRunner()->PostTask()

IdleTaskRunner()->PostIdleTask()

Blink:

Scheduler::shared()->scheduler()

Platform::current()->callOnMain()

WebThread::postTask()

Compositor

TaskRunner

37 of 51

Architecture: Queue selector

Scheduling decision: which work queue to service

TaskQueueManager

Incoming

task

queue

Work

queue

Incoming

task

queue

Work

queue

base::MessageLoop

Sched.

task

Task

Sched.

task

Scheduling entry point

...

Work queue state

DefaultTaskRunner

Compositor

TaskRunner

RendererTask QueueSelector

38 of 51

Architecture: Policies

TaskQueueManager

Incoming

task

queue

Work

queue

Incoming

task

queue

Work

queue

RendererTask QueueSelector

base::MessageLoop

Sched.

task

Task

Sched.

task

Scheduling entry point

...

Contextual information

(e.g., user input) drives policy

Renderer

Scheduler

Queue priorities

DefaultTaskRunner

Compositor

TaskRunner

39 of 51

Results

40 of 51

Results from Scheduler V1

Queueing_durations ~12ms => ~7ms

(Nexus 7v2)

41 of 51

Preliminary results for V3

(Nexus 7v2)

42 of 51

Future steps

43 of 51

Future steps: Idle tasks

  • Get non-critical work off the critical path
  • Only executed when frame has been committed

Idle Period

CC

commit

begin frame

44 of 51

Idle task characteristics

  • May be starved for arbitrarily long
  • Take a deadline argument
    • Should finish by the deadline or earlier
  • Posted tasks won’t run until next idle period

45 of 51

First idle task customer: V8 GC

  • Do GC work off critical path
  • Idle task performs GC incremental marking step
  • Estimates how much heap it can mark before deadline
  • Repost task if more to do
  • Early result: 50% reduction in GC on the critical path

46 of 51

Future Idle task customers…?

  • V8 GC sweeping
  • V8 optimising compiler
  • HTML parsing
  • Oilpan incremental marking

47 of 51

Future steps: LoadingTaskQueue

  • Adding a queue for page loading tasks
    • Resource loading
    • HTML parsing
    • Javascript compilation
  • Add a loading priority policy
    • Prioritises loading tasks to optimise first paint ‘above the fold’
  • Deprioritise when in compositor priority policy

48 of 51

Future steps: Half-baked ideas

  • Provide scheduling primitives to JS
    • setTimeout(0) should mean setTimeout(0)
    • Posting idle tasks
  • Power optimisation
    • Synchronize idle/background ticking in renderers with other wakeups
  • Feedback for developers
    • Point out scheduling problems

49 of 51

Future steps: Half-baked ideas

  • Provide scheduling primitives to JS
    • setTimeout(0) should mean setTimeout(0)
    • Posting idle tasks
  • Power optimisation
    • Synchronize idle/background ticking in renderers with other wakeups
  • Feedback for developers
    • Point out scheduling problems

50 of 51

Lessons learned

  • How (not) to write a scheduler
    • Existing code unintentionally�relies on implicit ordering
    • It’s easy to break low level�things in subtle ways
  • Failing fast and iterating reaches good solutions
  • Live and die by metrics
  • Having MTV contacts (eseidel@) was invaluable

51 of 51

Making your tasks scheduler aware

  • Avoid blocking the main thread >8ms
  • Check shouldYieldForHighPriorityWork() from long running tasks
  • Post to the most appropriate queue (or add a new one‽)
  • Do background work using idle tasks
  • Join scheduler-dev@chromium.org to get involved.