1 of 11

Understanding Python Memory Management

Chelsea Dole

Data Engineer

2 of 11

What is “memory”, and what manages it in Python?

  • RAM (“Random Access Memory”) is used by applications on your computer for short-term, fast-access data storage

  • In CPython, the “Python memory manager” handles the allocation and deallocation of memory for the objects in your programs, for example when you are:
    • Creating a variable
    • Using a data structure, like a dictionary or list
    • Calling a function, which creates a “call stack frame”
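
To make the idea that every object occupies memory more concrete, here is a minimal sketch using `sys.getsizeof` from the standard library. The variable names are illustrative only, and the exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

number = 1000
empty_list = []
full_list = list(range(1000))

# sys.getsizeof reports the bytes used by the object itself,
# not by the objects it references.
print(sys.getsizeof(number))
print(sys.getsizeof(empty_list), sys.getsizeof(full_list))
```

The list holding 1,000 items reports a larger size than the empty one, because the memory manager had to allocate room for its item references.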

3 of 11

[Diagram: the variable “test” points to a memory address holding an object with TYPE: int, VALUE: 1000, REF COUNT: 1. Addresses shown: 0x1045f8b70, 0x1045adaf0, 0x1022c6934, ...]

  • Variable — a “label” referencing an object in memory

4 of 11

[Diagram: the variables “test” and “another_test” both point to the same object in memory (TYPE: int, VALUE: 1000). Its REF COUNT goes from 1 to 2. Addresses shown: 0x1045f8b70, 0x1045adaf0, 0x1022c6934, ...]

  • Variable — a “label” referencing an object in memory
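
The “label” idea can be checked directly in the interpreter. The sketch below reuses the slide's names `test` and `another_test`; it relies on `id()`, which in CPython happens to return the object's memory address (an implementation detail, not a language guarantee).

```python
test = 1000
another_test = test          # a second label for the same object, not a copy

# id() returns the object's identity; in CPython this is its memory address.
print(hex(id(test)))
same_object = test is another_test
print(same_object)           # True: both names reference one object

# Rebinding one label does not touch the object or the other label.
test = "something else"
print(another_test)          # 1000
```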

5 of 11

So, what was that “ref count” thing?

  • Reference count — Python’s way of tracking the number of “labels” referencing an object in memory.

Increasing Ref Count:

Decreasing Ref Count:

1000
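
As a rough illustration of what raises and lowers a ref count, here is a minimal sketch using `sys.getrefcount`. It uses a fresh `object()` rather than the integer 1000, because CPython may share integer constants internally, which would muddy the numbers. The absolute count also includes temporary references made by the call itself, so only the differences are meaningful.

```python
import sys

obj = object()               # a fresh object with a single label

baseline = sys.getrefcount(obj)

alias = obj                  # assigning another label: +1
container = [obj]            # storing it in a data structure: +1
after_increase = sys.getrefcount(obj)

del alias                    # deleting a label: -1
container.clear()            # removing it from the data structure: -1
after_decrease = sys.getrefcount(obj)

print(after_increase - baseline)   # 2
print(after_decrease - baseline)   # 0
```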

6 of 11

Garbage Collection!

🎉

🚛

7 of 11

  • Garbage collection: an automated memory management technique, built into the source code of many languages (including Python!), that frees memory that is no longer in use
  • Static memory allocation
    • Memory is managed manually by the developer, and allocated at compile time
    • NO garbage collection
    • The language’s source code is faster and uses less memory, but it is bug-prone for developers

  • Dynamic memory allocation
    • Memory is never managed directly by the developer; it is allocated by the Python memory manager at runtime
    • Garbage collection!
    • The language’s source code is more complex and uses more memory, but it is easier for developers
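
A small sketch of automatic deallocation in action: a weak reference lets us observe an object without adding to its ref count. `Payload` is a hypothetical class made up for this example. The immediate cleanup shown here is CPython behavior; other Python implementations may free the object later.

```python
import weakref

class Payload:
    """Hypothetical class; plain object() instances cannot be weakly referenced."""

data = Payload()
watcher = weakref.ref(data)   # observes the object without raising its ref count

alive_before = watcher() is not None
del data                      # last label removed, so the ref count drops to zero
alive_after = watcher() is not None

print(alive_before, alive_after)
```

On CPython, the object is alive before `del` and gone immediately after it, with no explicit free call from the developer.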

8 of 11

However, ref counting alone isn’t reliable enough to be Python’s exclusive method of garbage collection: objects that reference each other in a cycle never reach a ref count of 0
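
The weakness of ref counting can be seen with two objects that point at each other. This is a minimal sketch: `Node` and `partner` are hypothetical names, and automatic collection is paused with `gc.disable()` only so the effect is easy to observe.

```python
import gc
import weakref

class Node:
    """Hypothetical container used only for this illustration."""
    def __init__(self):
        self.partner = None

gc.disable()                  # pause automatic collection so the effect is visible
a, b = Node(), Node()
a.partner, b.partner = b, a   # a and b now reference each other
watcher = weakref.ref(a)

del a, b                      # no labels remain, but each ref count is still 1
survived_ref_counting = watcher() is not None

gc.collect()                  # the cyclic collector finds and frees the pair
freed_by_collector = watcher() is None
gc.enable()

print(survived_ref_counting, freed_by_collector)   # True True
```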

Solution: generational “Mark and Sweep” algorithm

  1. Python keeps track of all objects with ref count > 0 by age in three “generation buckets”
  2. Once a bucket reaches a certain size, a “Mark and Sweep” algorithm is run on it and on every younger generation
  3. ✍️ Check all objects in the generation bucket, and “mark” all reachable objects
  4. 🧹 Delete everything that’s not “marked”
  5. 🎓 Promote all surviving objects to the next generation
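
The mark and sweep steps can be sketched on a toy “heap”. This is a simplified model for illustration, not CPython's actual implementation: the heap is a plain dictionary, the function name `mark_and_sweep` is made up, and generations and promotion are left out.

```python
def mark_and_sweep(heap, roots):
    """Return (survivors, garbage) for a toy heap.

    heap  maps an object name to the names it references.
    roots are the names reachable directly from the program.
    """
    marked = set()
    stack = list(roots)
    while stack:                      # mark: walk everything reachable from the roots
        name = stack.pop()
        if name not in marked:
            marked.add(name)
            stack.extend(heap[name])
    garbage = set(heap) - marked      # sweep: whatever was never marked
    return marked, garbage

# "c" and "d" reference each other, but nothing reachable points at them.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
survivors, garbage = mark_and_sweep(heap, roots=["a"])
print(sorted(survivors), sorted(garbage))   # ['a', 'b'] ['c', 'd']
```

Note that the cycle between "c" and "d" is caught because the algorithm asks “is it reachable?” rather than “is its ref count zero?”.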

9 of 11

  1. Reference counting

  2. Generational “Mark and Sweep” algorithm
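
The generational collector can be inspected from the standard library's `gc` module. A minimal sketch follows; the numbers printed vary by Python version and by what the program has allocated, so no expected values are shown.

```python
import gc

thresholds = gc.get_threshold()   # collection trigger settings, one per generation
counts = gc.get_count()           # current counters, one per generation
print(thresholds, counts)

unreachable = gc.collect(0)       # run a collection on the youngest generation only
print(unreachable)                # number of unreachable objects found
```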

10 of 11

How a language chooses to manage memory is a major functional and philosophical choice

  • Speed, or ease of use for developers?
  • Clarity of source code, or clarity of language syntax?

Python chooses to maintain a complex and relatively slow memory management system internally, because it enables Python to be simple, beautiful, and readable for developers.

11 of 11

Thank you!