Python Concurrency
… and parallelism.
Who am I?
Hi! My name is Santiago Basulto…
What we’ll see during this tutorial
What this tutorial is NOT about
Martelli Model of Scalability
Why do we need to learn concurrent programming?
Computer Architecture
Back to the basics
The von Neumann architecture
Example of code accessing CPU, RAM or I/O
# Data stored in memory
x = 1
# Calculation done in CPU
x += 3
# I/O (write to a file)
with open('res.txt', 'w') as fp:
    fp.write(f"The value is {x}")
# I/O (print to screen)
print(x)
Access time of different resources:

| CPU cycle                       | 1 second   |
| RAM access                      | 4 minutes  |
| Generic SSD access              | 1.5–4 days |
| Hard drive access               | 1–9 months |
| Network request SF -> NYC       | 5 years    |
| Network request SF -> Hong Kong | 11 years   |

(Expressed in human-relative times)
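To get a feel for these gaps on real hardware, here is a small sketch (the file name, repeat counts, and helper are arbitrary choices for illustration) that times a pure CPU/RAM operation against a tiny disk write:

```python
import os
import tempfile
import time

def measure(label, fn, repeat=100_000):
    # Time `repeat` runs of fn and report the average per call.
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed / repeat * 1e9:.0f} ns per operation")

x = 0
def cpu_work():
    # Pure CPU/RAM work: an in-memory addition.
    global x
    x += 3

path = os.path.join(tempfile.gettempdir(), "res.txt")
def disk_work():
    # Open, write and close a tiny file: orders of magnitude slower.
    with open(path, "w") as fp:
        fp.write("The value is 4")

measure("CPU/RAM", cpu_work)
measure("Disk I/O", disk_work, repeat=1_000)
```

The exact numbers depend on your machine, but the disk operation should come out dramatically slower than the in-memory one.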
The Operating System
The guardian of the system
The OS is the one protecting the hardware
x = 1
x += 3
with open('res.txt', 'w') as fp:
    fp.write(f"The value is {x}")
print(x)

(Diagram: every I/O operation in your code goes through the OS before reaching the hardware)
The Process
How computers run programs
What you write:
# Data
x = 1
# Calculation
x += 3
# I/O
with open('res.txt', 'w') as fp:
    fp.write(f"The value is {x}")
# I/O
print(x)
What actually happens:
Your OS wraps your code in a special structure called a Process.
A process holds the state of execution of that particular “instance” of the running program.
That’s why we can have several instances of the same program running at the same time.
x = 1
x += 3
with open('res.txt', 'w') as fp:
    fp.write(f"The value is {x}")
print(x)

(Diagram: your code on one side; on the other, the RAM allocated to the process, holding local variables such as x = 4 and fp = <file open #1893>, and more...)
The Process
We can run several instances of the same program
(at the same time)
Which results in multiple processes created:
Each process has a different Process ID.
Process Concurrency
Running multiple processes, “at the same time”
Let’s start by assuming that we have only 1 CPU…
How many processes can we run
at the same time?
# 1 #
Just 1! A single CPU can take care of only 1 process at a time.
That’s the max.
But even with 1-CPU computers, it felt like there were multiple things happening at the same time.
How was it possible?
Time Slicing
Every OS includes a scheduler, which is in charge of administering CPU time for running processes.
The OS Scheduler

(Diagram: the scheduler giving alternating slices of CPU time to processes P1, P2 and P3)
This is what we call:
Concurrency
Several processes are “running” but they share the same CPU.
So, when can we talk about
Parallelism?
Only in multi-core systems, when there are two “things” actually running at the same time.
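You can check how many cores your own machine has with the standard library:

```python
import os

# Number of CPU cores the OS reports; parallelism is only
# possible when this is greater than 1.
cores = os.cpu_count()
print(f"This machine has {cores} CPU cores")
```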
In a 2-core system

(Diagram: processes P1, P2 and P3 time-sliced across CPU 1 and CPU 2)
In a 2-core system

(Diagram: P1 and P2 running on the two CPUs at the same instant: Parallelism)
In a 2-core system

(Diagram: only one process scheduled at this instant; CPU 1 is idle)
Wait a minute..
How does the OS Scheduler decide which processes get CPU time?
Let’s look again at our first time slicing example.
There’s something strange with P2:
It just gets short bursts of CPU. But why?
(Diagram: the time-slicing chart with processes P1, P2 and P3; P2 receives only short bursts of CPU)
P2 is probably a process that we call:
I/O Bound
Remember? Access time of different resources:

| CPU cycle                       | 1 second   |
| RAM access                      | 4 minutes  |
| Generic SSD access              | 1.5–4 days |
| Hard drive access               | 1–9 months |
| Network request SF -> NYC       | 5 years    |
| Network request SF -> Hong Kong | 11 years   |
I/O tasks
(they’re very slow)
When a process requests access to I/O, the OS will “unschedule” it, giving another process CPU time.
This is because I/O is slow: instead of the CPU sitting idle until the I/O finishes, it can do other work in the meantime.
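The same idea applies inside a single Python program. In this sketch, `time.sleep` stands in for a slow I/O request and the task names are made up for illustration: while one thread is blocked “waiting for I/O”, other work gets to run.

```python
import threading
import time

events = []

def io_task():
    events.append("io: start request")
    time.sleep(0.5)   # simulated slow I/O: the thread is descheduled
    events.append("io: response arrived")

def cpu_task():
    events.append("cpu: doing other work")

t = threading.Thread(target=io_task)
t.start()
time.sleep(0.1)       # give io_task time to block on its "I/O"
cpu_task()            # runs while io_task is still waiting
t.join()
print(events)
```

The printed list shows the CPU work happening in between the start of the “I/O” and its completion.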
Making our programs concurrent.
We need just one more concept in order to jump into Python concurrency.
“Intra-program” concurrency.
How can you make your own program concurrent?
Let’s say we’re dealing with an I/O bound program.
Example:
Our program starts by pulling data from 3 different websites.
Once it has fetched the data, it does some simple computation with all of it.
# program start
d1 = get_website_1() # 2 secs
d2 = get_website_2() # 2 secs
d3 = get_website_3() # 2 secs
combine(d1, d2, d3) # very fast
The problem is that, each network request blocks for 2 seconds.
The total time is >= 6 seconds.
There was a lot of idle time.
Total execution time >= 6 seconds!
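Simulated with `time.sleep` standing in for the network requests (`get_website` here is a hypothetical helper, not a real API), the sequential cost is easy to reproduce:

```python
import time

def get_website(name, delay=2):
    # Hypothetical stand-in for a network request: just sleeps.
    time.sleep(delay)
    return f"data from {name}"

start = time.perf_counter()
d1 = get_website("www1")
d2 = get_website("www2")
d3 = get_website("www3")
elapsed = time.perf_counter() - start
print(f"sequential: {elapsed:.1f} secs")  # roughly 6 seconds
```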
A visual example: sequential execution

(Diagram: timeline starting at Start; www1, www2 and www3 each take +2 secs one after another, followed by a very fast processing step: total >= 6 secs)
Multithreading
A mechanism to make our own programs concurrent (and potentially parallel).
A multithreaded example

(Diagram: timeline starting at Start; www1, www2 and www3 run at the same time for +2 secs, followed by a very fast processing step: total ~ 2 secs)
What would this program look like in code?
Something like this:
We’re now “spawning” the 3 network requests at the same time.
And blocking until all of them are done.
We then proceed with the calculations.
# program start
d1, d2, d3 = run_concurrently(
    get_website_1,
    get_website_2,
    get_website_3)  # 2 secs
combine(d1, d2, d3)  # very fast
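Note that `run_concurrently` is pseudocode, not a standard function. One way to sketch it is with `concurrent.futures.ThreadPoolExecutor` from the standard library; the `get_website_*` functions below are hypothetical stand-ins that just sleep:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_concurrently(*tasks):
    # Spawn one thread per task and block until every result is ready.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [f.result() for f in futures]

# Hypothetical stand-ins for the three network requests.
def get_website_1():
    time.sleep(2)
    return "d1"

def get_website_2():
    time.sleep(2)
    return "d2"

def get_website_3():
    time.sleep(2)
    return "d3"

start = time.perf_counter()
d1, d2, d3 = run_concurrently(get_website_1, get_website_2, get_website_3)
elapsed = time.perf_counter() - start
print(f"took {elapsed:.1f} secs")  # ~2 secs instead of 6
```

Because the three “requests” overlap, the total time is roughly that of the slowest one, not the sum of all three.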
Python Threads
Python includes a built-in threading module:
threading
Example: 1. Thread Basics.ipynb
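A minimal sketch of the thread basics covered in that notebook: create `Thread` objects, `start` them, and `join` to wait for completion. The `greet` function and task names are illustrative:

```python
import threading

def greet(name):
    # current_thread() identifies which thread is running this call.
    print(f"Hello from {name} ({threading.current_thread().name})")

threads = [
    threading.Thread(target=greet, args=(f"task-{i}",))
    for i in range(3)
]
for t in threads:
    t.start()   # begins running concurrently
for t in threads:
    t.join()    # block until each thread has finished
```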