
Lecture 24: Parallelism & Threading

CSE 373: Data Structures and Algorithms


Warm Up

Imagine again that you are a chef in a restaurant. Now, come up with as many ways as possible to speed up how you cook many different dishes all at one time.

Answer the Warm Up:

pollev.com/hschafer


Announcements

  • Please nominate your TAs for an award!
  • Almost done with assignments!
    • EX6 due tonight, EX7 due Tuesday of finals week
    • P4 due Friday 3/14 at 11:59 pm
    • Late days available for all assignments except Exam Resubs
  • No office hours during finals week, so come in this week with any questions!


Concurrency vs Parallelism

  • Parallelism refers to running things simultaneously on separate resources (e.g., separate CPUs)
  • Concurrency refers to running multiple threads on shared resources
  • Concurrency is one person cooking multiple dishes at the same time
  • Parallelism is having multiple people cooking (possibly the same dish)

Why use threads?

  • They allow processes to run “in the background”
  • Responsiveness: allow a GUI to respond while computation happens
  • CPU utilization: allow the CPU to compute while waiting (waiting for data, for input)
  • Isolation: keep threads separate so errors in one don’t affect the others


malloc in C is like “new” in Java

Asks the OS to set aside memory in the process’s address space. We call this region the “heap”

Instructions that act on variables get transformed into machine code and stored at the bottom of memory

Global variables and static variables get stored in a special region of memory (not in any stack frame) since they need to stick around for the entire process

Local variables get added to the “stack” in a new frame when a method is called, then pop off the stack at the end of the method since we don’t need them anymore
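
To make the layout concrete, here is a small illustrative Java sketch (all names are hypothetical) of where each kind of variable lives:

public class MemoryDemo {
    static int counter = 0;              // static: lives for the whole process

    public static void main(String[] args) {
        int local = 5;                   // local: stored in main's stack frame
        int[] onHeap = new int[10];      // "new" allocates the array on the heap
        helper(onHeap);
    }                                    // main's frame (and local) pops off here

    static void helper(int[] arr) {      // arr refers to the same heap array
        arr[0] = 42;                     // heap data outlives individual frames
    }
}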


What is a Thread?

  • You can think of a thread as a single sequence of tasks
    • Like an employee in a company who is focusing on doing one set of instructions
    • Smallest sequence of instructions that can be managed independently by a scheduler

  • A process is the combination of all related threads working together
    • If a thread is an employee, the process is the company
    • Sequential programming uses a single thread to execute all instructions one after the other
    • Multi-threaded processes separate instructions into multiple threads that can be programmed to work in unison

  • Threads each have their own dedicated resources in RAM
    • Multi-threaded processes will also have some resources shared across all the threads working together, such as the files they are reading from


What is a Thread?

In most modern OSes, threads are the unit of scheduling.

  • Separate the concept of a process from the “thread of execution”
  • Threads are contained within a process
  • Usually called a thread, this is a sequential execution stream within a process

Threads cohabit the same address space in RAM

  • Threads within a process see the same heap and globals and can communicate with each other through variables and memory
  • Each thread has its own call/instruction stack
  • But, they can interfere with each other – need synchronization for shared resources

Disadvantages:

  • If threads share data, you need locks or other synchronization
    • Very bug-prone and difficult to debug
  • Threads can introduce overhead
    • Lock contention, context switch overhead, and other issues
  • Need language support for threads

Advantages:

  • They execute concurrently like processes
  • You (mostly) write sequential-looking code
  • Threads can run in parallel if you have multiple CPUs/cores
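
A minimal Java sketch of creating, starting, and joining a thread:

public class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        // The new thread runs its own sequence of instructions on its own stack
        Thread worker = new Thread(() -> System.out.println("worker running"));
        worker.start();  // begin executing concurrently with main
        worker.join();   // main blocks here until worker finishes
        System.out.println("main: worker is done");
    }
}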


What is a Thread?

Chrome treats tabs as different processes


Processes and Threads

[Diagram: a process contains shared memory and a shared heap; each thread within the process has its own registers and its own stack]


Sequential Programming

Only one “task” is being processed at a time

  • All other “tasks” queue up behind the first one
  • Sequential programming demands finishing one “task” before starting the next one

Even while processing one query, the CPU is idle the vast majority of the time

  • It is blocked waiting for I/O to complete
  • Disk I/O can be very, very slow (10 million times slower …)
  • Like the chef waiting for the produce to be retrieved from the store

At most one I/O operation is being processed at a time

  • Missed opportunities to speed I/O up
  • Separate devices in parallel, better scheduling of a single device, etc.
  • Performance improvements can only be made by improving hardware
  • Moore’s Law (1965) - the observation that the number of transistors in an integrated circuit (IC) doubles about every two years
    • In September 2022, Nvidia CEO Jensen Huang considered Moore's law dead, while Intel CEO Pat Gelsinger was of the opposite view.


How does a computer execute a program?

  • Your operating system schedules when each program’s instructions in RAM are sent to your CPU
  • The instructions are sent to your computer’s CPU to execute
    • We used to have single core processors: one lane road
    • Now we often have 4 core processors: 4 lane highway
      • Invented by EEs at Stanford in 1998, commercially available in 2001
    • “2.6 GHz clock speed”: how many instructions each core can execute per second
      • Like an electrical second hand that triggers individual operations
    • Kasey’s computer: 13th Gen Intel(R) Core(TM) i7-1360P 2.20 GHz
      • 4 physical cores, 12 virtual


  • The number of cores on your computer is the number of things your computer can truly do at one time (you can query it from code, as sketched below)
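
For instance, a quick Java check of how many logical cores the JVM sees:

public class CoreCount {
    public static void main(String[] args) {
        // Number of logical processors available to the JVM
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Available cores: " + cores);
    }
}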


How does the CPU handle tasks?

  • Sequential
    • One processor does one task at a time
    • One chef cooking a single dish all by themselves
  • Parallelism
    • One task is split up and the work shared across multiple processors
    • Running things simultaneously on separate resources (e.g., separate CPUs)
      • Multiple things happening simultaneously
    • Parallelism is having multiple people cooking separate dishes in the kitchen at the same time
  • Concurrency
    • Multiple tasks are being executed using one set of resources
    • Refers to running multiple threads on shared resources (and how to synchronize access)
    • Concurrency is multiple people cooking the same dish at the same time



Parallelism

A search engine could run concurrently:

  • Example: Execute queries one at a time, but issue I/O requests against different files/disks simultaneously
    • Could read from several index files at once, processing the I/O results as they arrive
  • Example: Web server could execute multiple queries at the same time
    • While one is waiting for I/O, another can be executing on the CPU

Use multiple “workers” aka threads

  • As a query arrives, create a new thread to handle it
  • The thread reads the query from the network, issues read requests against files, assembles results and writes to the network
  • The thread uses blocking I/O; the thread alternates between consuming CPU cycles and blocking on I/O
  • The OS context switches between threads
  • While one is blocked on I/O, another can use the CPU
  • Multiple threads’ I/O requests can be issued at once (see the sketch below)
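
A minimal sketch of this worker-per-query pattern using Java’s ExecutorService; the queries, the pool size, and the handleQuery body are hypothetical stand-ins:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class QueryServer {
    public static void main(String[] args) {
        // A pool of worker threads: while one blocks on I/O,
        // the OS can run another on the CPU
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String query : List.of("cats", "dogs", "birds")) {
            pool.submit(() -> handleQuery(query));
        }
        pool.shutdown(); // finish queued work, accept no new tasks
    }

    static void handleQuery(String query) {
        // Hypothetical stand-in for: read the query, issue file reads,
        // assemble results, write the response back to the network
        System.out.println("handled: " + query);
    }
}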


Parallelism Example

Goal: find the sum of an array

Sequential approach: start at index 0 and sum one by one until the end of the array

Parallel approach: 4 processors will each find the sum of ¼ of the array then add those 4 results together


Parallelism Example Pseudocode

int sum(int[] arr) throws InterruptedException { // join() can throw
    SumThread[] res = new SumThread[4];
    int len = arr.length;
    for (int i = 0; i < 4; i++) {
        res[i] = new SumThread(arr, i * len / 4, (i + 1) * len / 4);
        res[i].start();
    }
    int ans = 0;
    for (int i = 0; i < 4; i++) {
        res[i].join();
        ans += res[i].ans;
    }
    return ans;
}

class SumThread extends Thread {
    int[] arr;
    int lo, hi;   // the range of indices this thread sums
    int ans = 0;  // the result, read by sum() after join()

    SumThread(int[] arr, int lo, int hi) {
        this.arr = arr;
        this.lo = lo;
        this.hi = hi;
    }

    public void run() { // the work performed when the thread starts
        for (int j = lo; j < hi; j++) {
            ans += arr[j];
        }
    }
}

join() causes the program to pause until the other thread completes its run() method, awaiting its result

  • Without a join, the other thread’s ans field may not have been filled yet

The SumThread object is constructed with a run() method override to indicate the work that happens when the thread is triggered; in this case, summing up the values within its part of the array.

start() creates a new thread and then triggers the run() method
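
A quick usage example, assuming the sum() and SumThread definitions above are in scope (the input array is made up):

// inside a method that declares throws InterruptedException
int[] data = new int[1_000_000];
java.util.Arrays.fill(data, 1);   // hypothetical input: all ones
int total = sum(data);            // 1000000: four threads each summed a quarter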


Possible issues

  • Different machines have different numbers of processors
    • We shouldn’t assume the number of processors like we did in the example, but instead take the thread count as a parameter (see the sketch below)
  • The operating system is constantly doing time slicing
    • The OS is running many things at once, so we aren’t guaranteed that when we ask we will get simultaneous access to the number of processors we need
  • Depending on the problem, not all sub-tasks will take the same amount of time
    • Timing your threads to finish together at the right time can be very tricky!
  • Race condition: when two processes or threads attempt to access the same resource at the same time
    • If two threads try to change the state of data simultaneously, bad things happen
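
A sketch of the parameterized version, assuming the SumThread class from the earlier example:

int sum(int[] arr, int numThreads) throws InterruptedException {
    SumThread[] res = new SumThread[numThreads];
    int len = arr.length;
    for (int i = 0; i < numThreads; i++) {
        // integer division splits the array into numThreads chunks
        res[i] = new SumThread(arr, i * len / numThreads, (i + 1) * len / numThreads);
        res[i].start();
    }
    int ans = 0;
    for (int i = 0; i < numThreads; i++) {
        res[i].join();   // wait for each worker before reading its ans
        ans += res[i].ans;
    }
    return ans;
}

A reasonable default for numThreads is Runtime.getRuntime().availableProcessors().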


Concurrency

Parallelism and concurrency are similar but technically different techniques

  • Parallelism: Using threads to “simultaneously” compute values
    • Example: Have 4 threads sum one quarter of the array
    • Concurrency: Parallelism with the added complexity of the threads reading/writing to the same memory “simultaneously”
    • Example: Having multiple threads in a web application for updating a user’s Bank Account


Concurrency Challenges

  • With threads, you can’t control the order in which lines of code run or when execution switches to another thread
  • Threads share objects on the heap
  • Can this cause a problem?

public class Bank {
    private int amount;
    ...
    public boolean withdraw(int amount) {
        1:  if (this.amount < amount) {
        2:      return false;
            }
        3:  this.amount -= amount;
        4:  return true;
    }
}

Setup:

Main:     Bank acc = new Bank(200);
Thread A: acc.withdraw(150);
Thread B: acc.withdraw(100);

One bad interleaving (the numbers refer to the lines of withdraw above):

A: 1    if (this.amount < amount) {    // 200 >= 150, check passes
B: 1    if (this.amount < amount) {    // 200 >= 100, check passes
A: 3    this.amount -= amount;         // balance: 200 - 150 = 50
A: 4    return true;
B: 3    this.amount -= amount;         // balance: 50 - 100 = -50!
B: 4    return true;

Both withdrawals return true, and the account ends up overdrawn at -50.


Concurrency

  • Concurrency issues arise because you can’t control the ordering in which threads run or access/update shared values.
    • The bug in the last slide is known as a race condition, when the behavior breaks because two threads interleaved in a bad ordering.
  • These bugs are very easy to write and hard to spot! You need to think carefully about all possible orderings!
  • Common solutions involve:
    • Don’t allow mutability! If you can’t mutate the data, then you won’t have a weird race condition like this. Many functional languages take this approach of immutability as a default to make supporting this easier.
    • Introduce the concept of locking


Locks / Mutexes

  • A lock is a way for threads to communicate with each other to ensure at most one thread is accessing some data at a time.
    • Essentially stops parallelism and only allows one at a time.
  • If done correctly, solves race conditions but can slow down your program (removes true parallelism since other threads have to wait).
    • The trick is finding the smallest regions that actually need locking/unlocking

public class Bank {
    private int amount;
    private Lock lock;
    ...
    public boolean withdraw(int amount) {
        lock.lock(); // Blocks until the lock is free
        try {
            if (this.amount < amount) {
                return false;
            }
            this.amount -= amount;
            return true;
        } finally {
            lock.unlock(); // Frees the lock, even on the early return
        }
    }
}
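
For comparison, Java’s built-in synchronized keyword gives the same effect, with the lock acquired on entry and released automatically on exit, even on early returns or exceptions. A minimal sketch of the same method:

public synchronized boolean withdraw(int amount) {
    // The lock on this Bank object is held for the whole method body
    if (this.amount < amount) {
        return false;
    }
    this.amount -= amount;
    return true;
}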


Is everything all good?

Locks are great for ensuring synchronization, but can lead to their own problems…

Suppose I had a program with two resources (e.g., a user account and a post they’ve made), and I want a separate lock for each so that methods needing only one resource can lock just that one.

Are there any potential problems here?

public void editPost() {
    userLock.lock();
    postLock.lock();
    // Edit the post
    userLock.unlock();
    postLock.unlock();
}

public void deletePost() {
    postLock.lock();
    userLock.lock();
    // Delete the post
    postLock.unlock();
    userLock.unlock();
}


Deadlocking

  • In programs with multiple locks, it’s possible to create a deadlock, where no progress can be made.
    • This can be spotted easily in simple programs like our last slide, but in real systems it is often extremely hard to spot, since you might have locks all over the code base for different resources.
    • It can also happen if a thread crashes unexpectedly without releasing its locks! Error handling with locks is very complex!
  • From the user’s perspective, the program will just wait forever, happily pausing and waiting for the resources to become free (they never will)
  • Solutions: attempts to automatically check for deadlocks and ensure they don’t happen
    • See Rust and its borrowing system
    • A common manual fix: always acquire locks in the same global order (see the sketch below)
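
A sketch of that lock-ordering fix, applied to the editPost/deletePost example from the previous slide: both methods now acquire userLock before postLock, so neither thread can end up holding the lock the other one needs.

public void editPost() {
    userLock.lock();
    postLock.lock();
    // Edit the post
    postLock.unlock();
    userLock.unlock();
}

public void deletePost() {
    userLock.lock();   // reordered to match editPost: userLock first
    postLock.lock();
    // Delete the post
    postLock.unlock();
    userLock.unlock();
}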


Helpful Resources


Questions?


That’s all!
