1 of 112

to GIL or not to GIL:

the Future of Multi-Core (C)Python

3 of 112

Who Am I?

  • software engineer at Microsoft (Python extension for VS Code)
  • CPython core developer (since 2012)
    • 8 PEPs (5 accepted, 3 open)
    • sys.implementation
    • module.__spec__
    • C OrderedDict
  • tired of hearing about how the GIL makes Python awful
  • in late 2014 decided to do something about it

3

4 of 112

Overview

  1. Context
    • CPython’s Architecture
    • What Happens When Python Runs?
    • Threads and Locks
  2. The GIL
  3. The Future
    • The C-API
    • Subinterpreters!
  4. Q&A

4

5 of 112

Context

5

6 of 112

An Overview of CPython’s Architecture

  • process: the OS process
  • runtime: everything Python-related in a process
  • interpreter: all Python threads and everything they share
  • Python thread: wrapper around OS thread with eval loop inside
  • call stack: stack of eval loop instances (i.e. Python function calls)
  • eval loop: executes the sequence of instructions in a code obj
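
Some of these layers can be observed from Python itself. The following is a minimal sketch, not part of the original talk: the helper names `call_stack_names`, `inner`, and `outer` are made up for illustration, and `sys._getframe()` is a CPython-specific function. It walks the current thread's call stack and looks at the code object behind each frame.

```python
import os
import sys
import threading


def call_stack_names():
    """Return the function names on the current thread's call stack, innermost first."""
    names = []
    frame = sys._getframe()  # CPython-specific: the currently executing frame
    while frame is not None:
        names.append(frame.f_code.co_name)  # every frame runs exactly one code object
        frame = frame.f_back                # step out to the caller's frame
    return names


def inner():
    return call_stack_names()


def outer():
    return inner()


stack = outer()
print("process id:", os.getpid())
print("Python thread:", threading.current_thread().name)
print("innermost frames:", stack[:3])
print("bytecode size of outer():", len(outer.__code__.co_code), "bytes")
```

Each nested call adds one frame to the stack, which is the "stack of eval loop instances" in the list above.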

6

7 of 112

What Happens When Python Runs?

  • process initializes
  • main thread starts
  • Python runtime initializes
    • main interpreter initializes
    • main Py thread initializes
  • Python program loads
  • Python frame initializes
  • eval loop steps through bytecode

[Diagram, built up one step per slide (slides 7-15): process (env vars, signals, “data”, “heap”, ...) → thread (“main”) (TSS (TLS), “stack”) → CPython runtime (config, PyMem, ...) → interpreter (“main”) (sys, sys.modules, ...) → Python thread → frame → eval loop, alongside code object A (“bytecode”: 10 …, 20 …, 30 …, 40 …).]

16 of 112

The Eval Loop

<set up>
for instruction in code object:
    <maybe side-channel stuff>
    <occasionally release & re-acquire the GIL>
    <execute next instruction>
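
The shape of that loop can be imitated with a toy stack machine. This is only a sketch under stated assumptions: the instruction set (`PUSH`, `ADD`, `MUL`), the `CHECK_INTERVAL` constant, and the `threading.Lock` standing in for the GIL are all invented for illustration and do not reflect CPython's real bytecode or locking code.

```python
import threading

gil = threading.Lock()  # hypothetical stand-in for the GIL, not CPython's real lock
CHECK_INTERVAL = 3      # made-up number of instructions between release points


def eval_loop(code):
    """Run a list of (opname, arg) instructions on a tiny stack machine."""
    stack = []                          # <set up>
    gil.acquire()
    try:
        for count, (opname, arg) in enumerate(code, start=1):
            if count % CHECK_INTERVAL == 0:
                gil.release()           # <occasionally release & re-acquire the GIL>
                gil.acquire()
            if opname == "PUSH":        # <execute next instruction>
                stack.append(arg)
            elif opname == "ADD":
                right, left = stack.pop(), stack.pop()
                stack.append(left + right)
            elif opname == "MUL":
                right, left = stack.pop(), stack.pop()
                stack.append(left * right)
            else:
                raise ValueError(f"unknown instruction: {opname}")
    finally:
        gil.release()
    return stack.pop()


# (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = eval_loop(program)
print(result)
```

Only the thread holding the lock executes instructions; the periodic release is what lets another thread's loop make progress.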

16

17 of 112

What Happens When Python Runs?

  • process initializes
  • main thread starts
  • Python runtime initializes
    • main interpreter initializes
    • main Py thread initializes
  • Python program loads
  • Python frame initializes
  • eval loop steps through bytecode

[Diagram, animated over slides 17-24: the same process / CPython runtime / interpreter (“main”) / Python thread layout. Instruction 20 of code object A is a call (“20 … # call”); a second frame and eval loop appear for code object B (“bytecode”: 10 …, 20 …); they then disappear and the first frame continues with code object A (30 …, 40 …).]

“bytecode”

A10 …
A20 …
B10 …
B20 ...
A30 …
A40 …

25 of 112

Multi-threading!

def spam():
    ...

t = threading.Thread(target=spam)
t.start()
t.join()

“bytecode”

  __main__:  A10 …, A20 …, B10 …, B20 ..., A30 …, A40 …
  spam():    C10 …, C20 …

[Diagram, animated over slides 26-34: the process / CPython runtime / interpreter (“main”) layout from the preceding slides, plus a second OS thread (own TSS (TLS) and “stack”) holding a second Python thread with its own frame and eval loop.]

“bytecode”

A10
A20 # t.start()
<one of several possible interleavings>
A40 # t.join()

Possible interleavings:

  1. B10, B20, A30, C10, C20
  2. C10, C20, B10, B20, A30
  3. B10, C10, B20, C20, A30
  4. C10, B10, B20, A30, C20
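
The interleaving can be observed with a small script. This is an illustrative sketch rather than code from the talk: the `record` helper and the `events` log are invented here, and the labels merely borrow the slide's A/B/C names to mark where each thread is. The order of the middle entries can differ from run to run.

```python
import threading

events = []                   # shared log of labels, in the order they were recorded
log_lock = threading.Lock()   # protects the shared log


def record(label):
    with log_lock:
        events.append(label)


def spam():
    record("C10")
    record("C20")


t = threading.Thread(target=spam)
record("A10")
t.start()            # from here on the two threads may interleave
record("B10")
record("B20")
record("A30")
t.join()             # after this, spam() has finished
record("A40")

print(events)
```

Each thread's own labels always appear in program order; only their position relative to the other thread's labels varies.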

36 of 112

“Race Condition”

A.K.A. “Resource Contention”

# thread A
spam = read()
spam.a = 42
write(spam)

# thread B
spam = read()
if spam.a != 42:

race here

With a lock around each thread's section:

# thread A
acquire lock A ->
spam = read()
spam.a = 42
write(spam)
release lock A ->

# thread B
acquire lock A ->
spam = read()
if spam.a != 42:
release lock A ->

The two sections then run one after the other:

spam = read()
spam.a = 42
write(spam)
spam = read()
if spam.a != 42:
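
In Python the acquire/release pair is usually written as a `with` block on a `threading.Lock`. The sketch below is a generic illustration, not the talk's example: the shared `counter`, the `add_many` function, and the thread and iteration counts are assumptions chosen to make the read-modify-write visible.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, one locked step at a time."""
    global counter
    for _ in range(n):
        with counter_lock:   # acquire ... release around the read-modify-write
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

Because every update happens while holding the lock, no increment is lost, whatever order the threads run in.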

40 of 112

The GIL

40

41 of 112

The GIL (“Global Interpreter Lock”)

41

[Diagram: the two-thread layout from the “Multi-threading!” slides: a process (env vars, signals, “data”, “heap”) with two OS threads (each with TSS (TLS) and “stack”), the CPython runtime (config, PyMem, ...), the interpreter (“main”) (sys, sys.modules, ...), and two Python threads, each with a frame and eval loop, running __main__ (A10 …, A20 …, B10 …, B20 ..., A30 …, A40 …) and spam() (C10 …, C20 …).]

def spam():
    ...

t = threading.Thread(target=spam)
t.start()
t.join()

42 of 112

State At Different Layers

42

process ->               global runtime ->        interpreter ->     thread / stack / ceval
env vars                 GIL                      sys module         current frame
sockets                  signal handlers          modules            stack depth
file handles             Py_AtExit() funcs        atexit handlers    “tracing”
signals                  GC                       fork handlers      hook: trace
(thread-local storage)   allocator (mem)          hook: eval_frame   hook: profile
...                      objects (w/ refcounts)   codecs             current exception
                         pending calls                               context
                         “eval breaker”                              ...

44 of 112

The Eval Loop

<set up>
for instruction in code object:
    <maybe side-channel stuff>
    <occasionally release & re-acquire the GIL>
    <execute next instruction>

44

45 of 112

When is the GIL Released?

  • eval loop: every few instructions
  • around C code that does not touch runtime resources
  • around IO operations
  • (by C extensions)
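
Two of these release points can be seen from Python. In CPython 3.2 and later, the eval loop's periodic release is requested after a time interval, readable with `sys.getswitchinterval()`, rather than after a fixed instruction count. The sketch below is an illustration under assumptions (the `wait` function, the four threads, and the 0.2-second sleep are made up): it shows that threads blocked in `time.sleep()` do not hold each other up, because the GIL is released around the blocking call.

```python
import sys
import threading
import time

print("switch interval (seconds):", sys.getswitchinterval())


def wait():
    time.sleep(0.2)   # blocking call; the GIL is released while sleeping


start = time.perf_counter()
threads = [threading.Thread(target=wait) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"4 threads, each sleeping 0.2s, took {elapsed:.2f}s in total")
```

If the sleeps were serialized, the total would be close to 0.8 seconds; in practice it stays close to a single sleep.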

45

46 of 112

Costs and Benefits

of the GIL

  Costs:
  • Multi-core parallelism of Python code
  • ???

  Benefits:
  • Cheaper than fine-grained locks
  • Simpler implementation
    • eval loop
    • object system
    • C-API

47

49 of 112

Effect and Perception

Who does it really affect?

  • Users with threaded, CPU-bound *Python* code (relatively few people)
  • Basically no one else

Why? C implementation releases the GIL around IO and CPU-intensive code.

49

50 of 112

Effect and Perception

Who does it really affect?

  • Users with threaded, CPU-bound *Python* code (relatively few people)
  • Basically no one else

So why does the GIL get such a bad rap?

  • Lack of understanding
  • Experience with other programming languages
  • Haters gonna hate

50

51 of 112

Working Around the GIL

  • C-extension modules
    • rewrite CPU-bound code in C
    • release the GIL around that code
  • multi-processing
  • (async / await)
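
As one hedged illustration of the multi-processing route, the standard library's `concurrent.futures.ProcessPoolExecutor` runs each task in a separate process, each with its own interpreter and its own GIL. The workload (`count_primes`) and the input sizes below are invented for this sketch.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound toy workload: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 20_000, 20_000, 20_000]
    # Each task runs in a worker process, so the tasks can use several cores at once.
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(count_primes, limits))
    print(results)
```

The price is that arguments and results are pickled and copied between processes, which is the overhead the rest of the talk aims to avoid.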

51

52 of 112

Past Efforts to Remove the GIL

  • 1999 Greg Stein
  • Larry Hastings' Gilectomy (on hold)
  • other Python implementations
    • Unladen Swallow

52

53 of 112

Other Python Implementations

53

GIL?

C-API?

latest Py version

CPython

yes

yes

3.7

no

2.7

no

2.7

PyPy (& w/STM)

yes (no)

3.6

no?

~3.4+

54 of 112

The Future

54

55 of 112

A New C-API

  • the history
  • the problem
  • the solutions

55

56 of 112

The C-API

  • historically fundamental to Python’s success
  • organic growth
  • early efforts to simplify
  • core devs: growing concerns
  • core devs: increasing efforts

56

57 of 112

The Problem

  • getting rid of GIL needs low-level changes
  • parts of public C-API expose low-level details (e.g. refcounts)
  • so...
  • getting rid of GIL requires breaking parts of C-API

57

58 of 112

The Causes

  • didn’t think 20+ years into future
  • “consenting adults”
  • accidental leaks

58

59 of 112

The Solutions

  • someone has to care enough to do the work
  • physically separate the categories of C-API
  • more opaque structs
  • Python (C)FFI
  • (maybe) break compatibility in a few places
  • deprecate C-API in favor of something like Cython (official)
  • ...

59

62 of 112

Categorizing the C-API

  • “internal”: “Do not touch!”
  • “private”: “Use at your own risk!”
  • “unstable”: “Go for it (but rebuild your extension each Python release)!”
  • “stable”: “Worry-free!”

62

63 of 112

The Solutions

  • someone has to care enough to do the work
  • physically separate the categories of C-API
  • more opaque structs
  • Python (C)FFI
  • (maybe) break compatibility in a few places
  • move toward something like Cython (official)
  • ...

63

64 of 112

The Projects

64

<capi-sig@python.org>

66 of 112

Beyond the C-API...

66

67 of 112

Subinterpreters!!!

67

68 of 112

Interpreters in a Single Process

  • initial interpreter: “main”
    • has certain responsibilities
  • “subinterpreter”: any other interpreter created within the runtime
  • isolated-ish

68

69 of 112

Interpreters in a Single Process

  • initial interpreter: “main”
    • has certain responsibilities
  • “subinterpreter”: any other interpreter created within the runtime
  • isolated-ish

69

[Diagram: the CPython runtime (config, PyMem, ...) containing the interpreter (“main”) and two subinterpreters, each with its own sys, sys.modules, ...; five Py threads are spread across them, each with one or two frames (f) and eval loops (el).]

71 of 112

Subinterpreters

  • initial interpreter: “main”
    • has certain responsibilities
  • “subinterpreter”: any other interpreter created within the runtime
  • isolated-ish
  • C-API for over 20 years
  • PEP 554: stdlib module

71

72 of 112

PEP 554 - “Multiple Interpreters in the Stdlib”

  • https://www.python.org/dev/peps/pep-0554/
  • new “interpreters” module
    • create(), list_all(), etc.
    • Interpreter class
    • create_channel()
    • RecvChannel, SendChannel

72

74 of 112

PEP 554: Example 1

import interpreters
from textwrap import dedent

interp = interpreters.create()
interp.run(dedent("""
    print('spam')
    """))

74

77 of 112

PEP 554: Example 2

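import threading
from textwrap import dedent
import interpreters

interp = interpreters.create()

def func():
    # run the code in the subinterpreter, on this thread
    interp.run(dedent("""
        print('spam')
        """))

t = threading.Thread(target=func)
t.start()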

77

78 of 112

PEP 554: Example 3

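from textwrap import dedent
import interpreters

interp = interpreters.create()
interp.run(dedent("""
    x = 'spam'
    """))
# state persists between calls to run() on the same interpreter
interp.run(dedent("""
    print(x)
    """))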

78

83 of 112

PEP 554 - “Multiple Interpreters in the Stdlib”

  • https://www.python.org/dev/peps/pep-0554/
  • new “interpreters” module
    • create(), list_all(), etc.
    • Interpreter class
    • create_channel()
    • RecvChannel, SendChannel

83

For now:

  • limited supported types
    • str, int, None, etc.
    • PEP 3118 buffers
  • actual objects not shared
  • no buffering

84 of 112

PEP 554: Example 4

import threading
from textwrap import dedent
import interpreters

rchan, schan = interpreters.create_channel()
interp = interpreters.create()

def func():
    # the send end is exposed inside the subinterpreter under the name "ch"
    interp.run(dedent("""
        import spam
        data = spam.do_something()
        ch.send(data)  # blocks
        """), channels={'ch': schan})

t = threading.Thread(target=func)
t.start()

data = rchan.recv()  # blocks
process_data(data)

84

90 of 112

Who Cares?

  • a human-oriented concurrency model (IMHO)
    • "opt-in sharing"
  • “the isolation of processes, with the efficiency of threads”
  • gateway to multi-core CPython

90

94 of 112

Stop Sharing the GIL!!!

94

95 of 112

State At Different Layers

95

process ->               global runtime ->        interpreter ->     thread / stack / ceval
env vars                 GIL                      sys module         current frame
sockets                  signal handlers          modules            stack depth
file handles             Py_AtExit() funcs        atexit handlers    “tracing”
signals                  GC                       fork handlers      hook: trace
(thread-local storage)   allocator (mem)          hook: eval_frame   hook: profile
...                      objects                  codecs             current exception
                         pending calls                               context
                         “eval breaker”                              ...

96 of 112

Stop Sharing the GIL!

  • allow each interpreter to execute independently
  • threads within an interpreter would still share a “GIL”
  • shouldn’t require widespread changes
  • no change to single-threaded (or single-interpreter) performance

96

98 of 112

Why Hasn’t It Been Done Already?

  • forgotten feature
  • no one interested enough (to do the work)
  • “good enough” alternatives
  • scary! (or not)
  • blockers...

98

99 of 112

the blockers

  • lingering bugs
  • subinterpreters only in C-API
  • how to guard against races between interpreters?
  • enough time to do the work!
  • C globals

(╯°□°)╯︵ ┻━┻

99

103 of 112

C "Globals"

  • “static globals”, “static locals”
  • TSS/TLS (thread-specific / thread-local storage)
  • in the CPython code base
  • in extension modules
    • static types, exceptions, singletons, etc.
    • C globals in included shared libraries (e.g. OpenSSL in cryptography)
    • efforts to fix: PEPs 3121, 384, 489, (573), 575, (579), (580); Cython; Red Hat; Instagram
    • (type “slots”)

103

104 of 112

the project

104

105 of 112

the project

https://github.com/ericsnowcurrently/multi-core-python

  • PEP 554
  • resolve bugs
  • deal with C globals
  • move some runtime state into the interpreter state
    • including the GIL

105

108 of 112

Beneficial Side Effects

  • find bugs and deficiencies in runtime (e.g. init/fini)
  • motivation to fix them
  • clean-up in runtime implementation (incl. globals, C-API, header files)
  • reduce coupling between components in runtime implementation
  • encourage fewer static globals in C extension modules
  • (improve interpreter startup performance)
  • (improve object isolation (e.g. in memory))
  • ...

108

109 of 112

What’s Next?

  1. PEP 554, blockers, and per-interpreter GIL
  2. low-hanging fruit (optimization)
  3. deferred functionality

109

111 of 112

Thanks!

Questions?

111

112 of 112

Resources

112