A Retrospective* on My “Multi-core Python” Project

Eric Snow

Python Language Summit 2020

as presented (April 16): slides


  • collaboration
  • timeline
  • hard problems
  • current status (todo, blockers, estimates, community)


Retrospective Summary (minus action items)

Good:

  • the project remained as tractable as expected
  • most of the tasks are beneficial to CPython on their own (as expected)
  • broad community support and excitement
  • many folks helping out
  • GH repo to coordinate collaboration
  • using many technologies and venues for comms
  • TalkPython podcast brought more attention than expected

Not so Good (Could be Improved):

  • collaboration would have benefited from a dedicated mailing list (or discourse group, etc.)
  • early solo efforts (up to 2016) led to burnout
  • lack of awareness/use of subinterpreters (historically) led to many small defects, problematic corner cases, and contrary design choices that require fixing now

Other Observations:

  • typical open source situation: no one wanted to tackle it so I did it (and then others joined in)
  • many offers of help, but only about half were able to follow through (I'm thankful to both camps)
  • driving the collaborative effort has taken more time than expected
  • my objectives ended up overlapping with those of several other projects (e.g. Facebook)
  • mentoring slowed things down a lot but was more than worth it!


Collaborators

Major:

  • Nick Coghlan
  • Victor Stinner
  • Emily Morehouse *
  • Petr Viktorin
  • Joannah Nanjekye
  • Dino Viehland
  • Eddie Elizondo
  • Pablo Galindo

Minor:

  • Yury Selivanov
  • Steve Dower
  • Kyle Stanley *
  • Maciej Szulik
  • Vinay Sajip
  • Serhiy Storchaka
  • Davin Potts
  • Stefan Behnel
  • Dong-hee Na *
  • Lewis Gaul
  • Phil Connell
  • Ben Edwards
  • Christian Heimes
  • Dimiter Naydenov

Sorry if I've missed anyone!

Community / Feedback:

  • Graham Dumpleton
  • Travis Oliphant
  • Matt Rocklin
  • Sturla Molden
  • Nathaniel Smith
  • Greg Smith
  • Thomas Wouters
  • Marc-André Lemburg
  • Brett Cannon
  • Barry Warsaw
  • Neil Schemenauer
  • Raymond Hettinger
  • Trent Nelson
  • Antoine Pitrou
  • R. David Murray
  • Stefan Krah
  • Larry Hastings
  • Lisa Roach
  • Carl Shapiro
  • Michael Kennedy
  • many others...

Obviously:

  • Guido van Rossum


Collaboration Tools / Venues

  • https://github.com/ericsnowcurrently/multi-core-python
    • issue tracker
    • wiki
  • bugs.python.org
  • https://github.com/python/cpython
  • mailing lists
  • email (no list)
  • twitter
  • video chat
  • (TalkPython podcast)
  • PyCon
  • Core Sprint


Timeline (before my project started)

  • 1989 - Python is born
  • 1997 (1.5) - subinterpreters added to runtime and C-API
  • 2003 (2.3) - PEP 311: Gilstate API (explicitly ignores subinterpreter support)
  • <2007 - mod_wsgi born (Graham Dumpleton)
  • 2007 (3.0) - PEP 3121: Extension Module Initialization and Finalization
  • 2008 - PEP "583": A Concurrency Memory Model for Python
  • 2008 - I start getting involved in core development
  • 2009 (3.2) - PEP 384: stable ABI
  • 2012 - I become a core developer
  • 2013 - async in Python
  • 2013 - Trent Nelson's "pyparallel" project


Timeline (before I burnt out)

  • 2014 (Oct) - start seriously thinking about (C)Python's multi-core story
  • 2015 - start gathering notes and investigating possible solutions
  • 2015 - decision to take the subinterpreters approach
  • 2015 - [Petr Viktorin] PEP 489: Multi-phase extension module initialization
  • 2015 - discussions on import-sig leading to PEP 573
  • 2015 - many email threads about subinterpreters
  • 2015 - hacking/iterating on new ext. module to expose subinterpreters C-API


Timeline (during my hiatus)

  • 2016 - taking a long break (burnt out!)
  • 2016 - working on other stuff (C OrderedDict, PEPs 468, 520)
  • 2016 - [Larry Hastings] "gilectomy" project
  • 2016 - good discussions at inaugural core dev sprint in Sunnyvale


Timeline (back on track)

  • 2017 - revived!
  • 2017 - proof-of-concept extension module to expose subinterpreters C-API
  • 2017 - created PEP 554
  • 2017 - lots of discussion about subinterpreters
  • 2017 - [OpenStack Ceph] starts using subinterpreters
  • 2017 - merged implementation of PEP 432 (only restructuring & private C-API)
  • 2017 - added Include/internal/
  • 2017 - consolidated much global state into a new PyRuntimeState struct
  • 2017 - good discussions at 2nd core dev sprint in Sunnyvale
  • 2017 - new job at Microsoft (VS Code Python extension); 20% time for CPython


Timeline (accelerating)

  • 2018 - merged mostly complete low-level PEP 554 extension for use in Lib/test/
  • 2018 - working through problems in subinterpreters with "the Night's Watch"
  • 2018 - [JEP] starts using subinterpreters
  • 2018 - core devs starting to get serious about solving problems with C-API
  • 2018 - lots of feedback and excitement at PyCon (e.g. Travis Oliphant regarding extensions)
  • 2018 - start mentoring Emily
  • 2018 - starting effort to move stuff from PyRuntimeState to PyInterpreterState
  • 2018 - had to partially revert pending calls change due to daemon threads 😝️
  • 2018 - new github repo (publish notes, track progress, & manage collaboration)
  • 2018 - (Guido steps down) core dev sprint dominated by governance discussion
  • 2018 - my dear wife had a baby
  • 2018 - focusing effort mostly on remaining PEP 554 implementation work


Timeline (cruising)

  • 2019 - lots of effort (mostly Victor) to move things out of the stable/public C-API
  • 2019 - starting collaboration with Facebook/Instagram folks
  • 2019 - lots of other collaborators
  • 2019 - request to steering council for BDFL-delegate for PEP 554 (got Antoine)
  • 2019 - PyCon: gave talk; continued community support and excitement
  • 2019 - (at PyCon) switched personal effort to moving GIL to PyInterpreterState
  • 2019 - missed 3.8 feature freeze 😥️
  • 2019 - [Davin Potts] shared memory support in multiprocessing module
  • 2019 - started work on tool to identify globals in Python C code (and test to fail)
  • 2019 - lots of great discussion at core dev sprint in London


Timeline (wrapping things up?)

  • 2020 - spending too long working on c-analyzer tool
  • 2020 - [Joannah] high-level PEP 554 implementation effectively ready to go
  • 2020 - (COVID-19) 😷️
  • 2020 - (PyCon goes online-only) language summit remotely, no sprints 😥️


Timeline (related PEPs)

  • 2003 (2.3) - PEP 311: Gilstate API
  • 2007 (3.0) - PEP 3121: Extension Module Initialization and Finalization
  • 2009 (3.2) - PEP 384: stable ABI
  • 2012 - PEP 432: Restructuring the CPython startup sequence
  • 2015 - PEP 489: Multi-phase extension module initialization
  • 2017 - PEP 554: Multiple Interpreters in the Stdlib
  • 2018 - PEP 574: Pickle protocol 5 with out-of-band data
  • 2018 - PEP 573: Module State Access from C Extension Methods
  • 2018 - PEP 579: Refactoring C functions and methods
  • 2019 - PEP 590: Vectorcall: a fast calling protocol for CPython (579 / 575 / 576 / 580)
  • 2019 - PEP 587: Python Initialization Configuration


Hard Problems

  • passing data safely between subinterpreters
  • C global variables (statics, module-level)
    • singletons
    • static types
    • etc.
  • so many globals!
  • globals!!!
  • per-interpreter allocators
  • per-interpreter GIL (hard due to exposed concurrency problems)
  • interpreter initialization & finalization
  • runtime finalization
  • daemon threads! ☹️😭️😤️😝️
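
To make the first item concrete, here is a minimal sketch of the share-nothing idea behind passing data between isolated interpreters: only serialized bytes cross the boundary, so neither side ever holds a reference to the other's objects. This is an analogy, not the PEP 554 API. It uses ordinary threads, queue.Queue, and pickle as stand-ins, and the names worker and run_demo are hypothetical.

```python
import pickle
import queue
import threading


def worker(inbox, outbox):
    """Receive pickled payloads, work on a private copy, send pickled results."""
    while True:
        raw = inbox.get()
        if raw is None:  # sentinel: no more work
            break
        numbers = pickle.loads(raw)    # a fresh copy, not the sender's object
        numbers.append(sum(numbers))   # mutating the copy cannot affect the sender
        outbox.put(pickle.dumps(numbers))


def run_demo():
    """Send one list to the worker and return (sender's list, worker's reply)."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()

    original = [1, 2, 3]
    inbox.put(pickle.dumps(original))   # only bytes cross the boundary
    result = pickle.loads(outbox.get())

    inbox.put(None)                     # ask the worker to stop
    thread.join()
    return original, result


if __name__ == "__main__":
    print(run_demo())
```

The sender's list is unchanged after the round trip, which is the property that makes this style of exchange safe without shared locks. Real subinterpreters need a C-level mechanism for the same guarantee, which is part of why the item is listed as hard.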


Current Status

  • todo
  • blockers
  • estimates
  • community


Current Status: todo

  • globals (ultimately, a per-interpreter GIL)
    • for extension modules, move to module state (PEPs 3121, 489, 573, etc.)
    • (mostly) consolidate everything else to PyRuntimeState (or PyInterpreterState/PyThreadState)
    • figure out what to do with static types, singletons (e.g. None), freelists, etc.
    • per-interpreter (move as much as possible from PyRuntimeState to PyInterpreterState)
      • (n-1) allocators
      • (n) GIL
  • PEP 554
    • get it accepted
    • get high-level impl. merged
    • impl: buffers in channels
    • impl: channel_send_wait()
    • impl: exception propagation
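
The "globals" items above can be illustrated with a small Python analogy. The real work happens in C, where process-wide variables are moved into PyRuntimeState or PyInterpreterState. The sketch below only mimics the pattern: state that used to be a single global (here a free list) lives on a per-interpreter object that is passed explicitly. InterpreterState, acquire_buffer, release_buffer, and run_demo are hypothetical names, not CPython APIs.

```python
from dataclasses import dataclass, field


@dataclass
class InterpreterState:
    """Hypothetical per-interpreter container (stand-in for PyInterpreterState)."""
    name: str
    free_list: list = field(default_factory=list)  # formerly one process-wide global
    allocations: int = 0


def acquire_buffer(interp):
    """Reuse a buffer from this interpreter's free list, or allocate a new one."""
    if interp.free_list:
        return interp.free_list.pop()
    interp.allocations += 1
    return bytearray(16)


def release_buffer(interp, buf):
    """Return a buffer to the free list of the interpreter that owns it."""
    interp.free_list.append(buf)


def run_demo():
    """Show that two interpreters never touch each other's free list."""
    main = InterpreterState("main")
    sub = InterpreterState("sub")

    buf = acquire_buffer(main)   # main allocates once
    release_buffer(main, buf)
    acquire_buffer(main)         # reused from main's own free list
    acquire_buffer(sub)          # sub's free list is empty, so it allocates
    return main.allocations, sub.allocations


if __name__ == "__main__":
    print(run_demo())
```

Because each state object owns its free list, no lock is needed to protect it from other interpreters. That is the property a per-interpreter GIL depends on.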


Current Status: blockers

  • PEP 554 acceptance
  • time and effort


Current Status: estimates

  • per-interpreter GIL: 3.10 for sure, but probably not 3.9; still looking tractable
  • PEP 554: 3.9, if accepted in time


Current Status: community

  • broad support for both subinterpreters and not sharing GIL between them
  • impact to extension authors (likely relatively minimal, per Travis Oliphant)
  • PEP 554 without splitting out GIL? (will users expect parallelism?)
  • already using subinterpreters via C-API:
    • mod_wsgi
    • JEP
    • OpenStack Ceph
    • ...


Thank You!
