Threads
CS-446/646
C. Papachristos
Robotic Workers (RoboWork) Lab
University of Nevada, Reno
Remember: Processes
A Process is a Heavyweight Abstraction
Includes many things
Creating a new Process is costly
Communicating between Processes is also costly
CS446/646 C. Papachristos
Remember: Processes
Concurrent Programs
Recall simple Web server example
To execute these Programs we need to:
This is inefficient:
CS446/646 C. Papachristos
Really,
Remember: Processes
Rethinking Processes
Core Idea: Think separately about the concept of a Process, and a State of Execution
Address Space, Privileges, Resources, etc.
CS446/646 C. Papachristos
PC, SP, Registers
Threads
Threads & Processes
Modern OSs have implementations of the concepts of Processes and Threads
Thread
Process
A Thread is bound to a single Process
A Thread now becomes the unit of Scheduling
CS446/646 C. Papachristos
Threads
Thread Control Block (TCB)
The required data structure (implementation specific)
CS446/646 C. Papachristos
struct thread {
tid_t tid; /* Thread identifier. */
enum thread_status status; /* Thread state. */
char name[16]; /* Name (for debugging purposes). */
uint8_t *stack; /* Saved stack pointer. */
int priority; /* Priority. */
struct list_elem allelem; /* List element for all threads list. */
struct list_elem elem; /* List element. */
unsigned magic; /* Detects stack overflow. */
};
Threads
Threads in a Process
CS446/646 C. Papachristos
Threads
Threads in a Process
�
CS446/646 C. Papachristos
Threads
Threads & Processes
Why?
Multithreading can be very useful
So, Multithreading is even useful on a single-core Processor
CS446/646 C. Papachristos
Threads
Multithreaded Concurrent Server
Using fork() to create a new Process to handle requests is an overkill:
CS446/646 C. Papachristos
for(;;) {
int client_sock = accept(server_sock, addr, addrlen);
if ((child_pid = fork()) == 0) { // Child Process
close(server_sock);
handle_request( client_sock );
} else { // Parent Process
// back to for(;;)
}
}
void handle_request(int sock) {
// Process client request
close(sock);
}
Threads
Multithreaded Concurrent Server
Instead, create a new Thread for each request:
CS446/646 C. Papachristos
void run_web_server() {
for(;;) {
int client_sock = accept(server_sock, addr, addrlen);
thread_create_interface(handle_request, client_sock);
}
}
void handle_request(int sock) {
// Process request
close(sock);
}
?
Note: This will now execute in a new Thread,�i.e. will have its own State of execution and can be separately Scheduled
Threads
Thread API (Unix)
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
void *(*start_routine) (void *), void *arg);
int pthread_join(pthread_t thread, void **retval);
CS446/646 C. Papachristos
Threads
Thread API (Unix)
void pthread_exit(void *retval);
CS446/646 C. Papachristos
Threads
Thread API (Unix) – Example 1
CS446/646 C. Papachristos
int main() {
pthread_t t1, t2;
pthread_create(&t1, NULL, thread_fn, (void*)1);
pthread_create(&t2, NULL, thread_fn, (void*)2);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
return 0;
}
void* thread_fn(void *arg) {
int tid = (int)arg;
printf("thread %d\n", tid);
printf("is now\n");
usleep(1);
printf("running!\n");
pthread_exit(NULL);
}
$ gcc -o threads threads.c -pthread
$ ./threads
thread 1
is now
thread 2
is now
running!
running!
Note: stdout is by default Line-Buffered
Threads
Thread API (Unix) – Example 1 (Better)
CS446/646 C. Papachristos
int main() {
pthread_t t1, t2;
thread_data t1_d, t2_d;
t1_d.id = 1;
t2_d.id = 2;
pthread_create(&t1, NULL, thread_fn, (void*)&t1_d);
pthread_create(&t2, NULL, thread_fn, (void*)&t2_d);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
return 0;
}
void* thread_fn(void *arg) {
thread_data t_d = *(thread_data *)arg;
printf("thread %d is now running!\n", td.id);
pthread_exit(NULL);
}
$ gcc -o threads threads.c -pthread
$ ./threads
thread 1 is now running!
thread 2 is now running!
Note: stdout is by default Line-Buffered
Threads
Thread Implementation
A general prototype:
… thread_create( … , start_routine, args);
CS446/646 C. Papachristos
Threads
Thread Implementation
User-Level Threads:
Kernel-Level Threads:
CS446/646 C. Papachristos
Threads
Kernel-Level Threads
CS446/646 C. Papachristos
The Thread competes for resources with all other Threads in all Processes on the system that are in the same Scheduling Allocation Domain (a group of one or more Processors).
Threads
Kernel-Level Thread Limitations
CS446/646 C. Papachristos
Threads
User-Level Threads
CS446/646 C. Papachristos
The Thread competes for resources with all other Threads in the same Process that were also created with the PTHREAD_SCOPE_PROCESS contention scope, and are scheduled relative to other�Threads in the Process according to their Scheduling policy and priority.
Threads
User-Level Thread Limitations
How to solve this?
CS446/646 C. Papachristos
Threads
Implementing User-Level Threads (upcoming Homework Assignment)
(All performed/accounted-for at User-Space :)
CS446/646 C. Papachristos
Threads
Implementing User-Level Threads (upcoming Homework Assignment)
(All performed/accounted-for at User-Space :)
CS446/646 C. Papachristos
Threads
Kernel-Level vs User-Level Threads
Kernel-Level Threads
User-level Threads
Understanding their differences is important
CS446/646 C. Papachristos
Threads
Multiplexing User-Level Threads
Use both Kernel-Level and User-Level Threads
Big picture:
Different mappings exist, representing different tradeoffs
CS446/646 C. Papachristos
Threads
Multiplexing User-Level Threads
One-to-One ( 1 : 1 Threading )
Note 1: The PTHREAD_SCOPE_SYSTEM contention scope typically indicates that a User-Space Thread is bound directly to a single Kernel-Scheduling entity. This is the case on Linux for the obsolete LinuxThreads implementation and the modern NPTL implementation, which are both 1:1 Threading implementations.�Linux supports PTHREAD_SCOPE_SYSTEM, but not PTHREAD_SCOPE_PROCESS.�
Note 2: There is no difference between Thread and Process in Linux. There only exists the concept of Thread Group Identifier (TGID) which allows to determine what is shared and what is not between created Threads.
CS446/646 C. Papachristos
Threads
Multiplexing User-Level Threads
Many-to-One ( n : 1 Threading )
Portable
CS446/646 C. Papachristos
Threads
CS446/646 C. Papachristos
Threads
Multiplexing User-Level Threads
Two-Level
CS446/646 C. Papachristos
Threads
CS446/646 C. Papachristos
Threads
Thread Miscellanea
Note: Potentially dangerous to call fork() from a Multi-Threaded Process as any Mutexes (Synchronization Mechanisms, more on these later on…) held in non-calling Threads will be forever locked in the Child Process.
CS446/646 C. Papachristos
fork(2): A Process shall be created with a single Thread. If a Multi-Threaded Process calls fork(), the new Process shall contain a replica of the calling Thread and its entire Address Space, possibly including the states of Mutexes (more on these later…) and other Resources. Consequently, to avoid errors, the Child Process may only execute Async-Signal-Safe operations until such time as one of the exec functions is called.
Threads
Thread Miscellanea
CS446/646 C. Papachristos
POSIX.1 distinguishes the notions of Signals that are directed to the Process as a whole and Signals that are directed to individual Threads. According to POSIX.1, a Process-directed Signal (sent using kill() for example) should be handled by a single, arbitrarily selected Thread within the Process.
signal(7): A Signal may be generated (and thus pending) for a Process as a whole (e.g. when sent using kill()) or for a specific Thread (e.g. certain Signals, such as SIGSEGV and SIGFPE, generated as a consequence of executing a specific machine-language instruction are Thread-directed, as are Signals targeted at a specific Thread using pthread_kill()). A Process-directed Signal may be delivered to any one of the Threads that does not currently have the Signal blocked. If more than one of the Threads has the Signal unblocked, then the Kernel chooses an arbitrary Thread to which to deliver the Signal.
Threads
Non-Preemptive Thread Scheduling
int sched_yield (void);
Causes the calling Thread to relinquish the CPU. The Thread is moved to the end of the Queue for its Static Priority (more on that in later Scheduling Lecture) and a new Thread gets to run.
Note: Strategic calls to sched_yield() (! where applicable !) can improve performance by giving other Threads a chance to run … Avoid calling sched_yield() unnecessarily or inappropriately (e.g., when resources needed by other schedulable Threads are still held by the caller), since doing so will result in unnecessary Context Switches, which will degrade system performance.
CS446/646 C. Papachristos
while (1) {
printf("ping\n");
sched_yield();
}
while (1) {
printf("pong\n");
sched_yield();
}
Ping Thread
Pong Thread
Threads
Non-Preemptive Thread Scheduling
int sched_yield (void);
Causes the calling Thread to relinquish the CPU. The Thread is moved to the end of the Queue for its Static Priority (more on that in later Scheduling Lecture) and a new Thread gets to run.
Ping / Pong execution trace:
printf("ping\n");
sched_yield();
printf("pong\n");
sched_yield();
…
CS446/646 C. Papachristos
Note: sched_yield() is intended for use with Real-Time Scheduling policies (SCHED_FIFO, SCHED_RR, more on these later…). Its use of with non-deterministic Scheduling policies such as SCHED_OTHER is unspecified and very likely means your application design is broken.
Threads
Preemptive Thread Scheduling
Preemptive Scheduling
CS446/646 C. Papachristos
Threads
Thread Context Switching
Context Switching in the same Process, such that Virtual Memory Address Space is not switched
Note: Without additional considerations (e.g. Weak Affinity) Thread Context Switching can lead to Memory Pollution
The Context Switch Routine does all of the magic
All this has to be done in Assembly language
CS446/646 C. Papachristos
Threads
Background: Calling Conventions
A standard on how functions should be implemented and called by the Machine
Why ?
CS446/646 C. Papachristos
Threads
Background: Calling Conventions
x86 Calling Convention – Stack setup
CS446/646 C. Papachristos
SP points to “bottom” of Stack (lower Virtual Addresses). As parameters are pushed on the Stack, the SP advances (downwards).
During a method call, variables are�pushed on the Stack and the SP advances more. Parameters passed to the subroutine constantly change their relative offset w.r.t. the SP.
We also need a FP that points to the start�(Base) of the Stack Frame and does not�move during the subroutine call.�Parameters passed to the subroutine remain at a constant offset w.r.t. the FP.
Threads
Background: Calling Conventions
Registers divided into 2 groups:
CS446/646 C. Papachristos
Threads
Background: Calling Conventions
Registers divided into 2 groups:
CS446/646 C. Papachristos
foo(): Save active Caller Registers
CALL foo_label (pushes PC)
Restore Caller Registers
Save used Callee Registers
… do stuff …
return: Restore Callee Saved Registers
RET (returns to Caller)
Threads
Remember: Pintos Thread Implementation
Pintos Thread Control Block (TCB) Structure:
CS446/646 C. Papachristos
struct thread {
tid_t tid;
enum thread_status status;
char name[16];
uint8_t *stack; /* Saved stack pointer. */
int priority;
struct list_elem allelem;
struct list_elem elem;
unsigned magic; /* Detects stack overflow. */
};
stack field is where the value of %esp (Stack Pointer (SP)) will�be saved when the Thread is Preempted
uint32_t thread_stack_ofs = offsetof(struct thread, stack);
Remember: Threads have no owned Heap
Threads
Example: Pintos Thread Context Switching
Pintos C declaration:
struct thread *switch_threads (struct thread *cur, struct thread *next);
CS446/646 C. Papachristos
(Remember: Calling Convention specifies pushing args (i.e. next, cur) for called function (i.e. switch_threads), then return address before making a call)
Threads
Example: Pintos Thread Context Switching
CS446/646 C. Papachristos
pushl %ebx #base address (in Data Segment - DS)
pushl %ebp #current frame base pointer
pushl %esi #sequential access instr. source index (in DS)
pushl %edi #sequential access instr. destination index (in ES)
Save Callee-Saved Regs of the current Thread
Threads
Example: Pintos Thread Context Switching
CS446/646 C. Papachristos
thread_stack_ofs is the offset of stack field in thread struct
pushl %ebx
pushl %ebp
pushl %esi
pushl %edi
mov thread_stack_ofs, %edx
Threads
Example: Pintos Thread Context Switching
CS446/646 C. Papachristos
movl SWITCH_CUR(%esp), %eax %eax = cur
movl %esp, ( %eax , %edx ,1) cur -> stack = %esp
movl SWITCH_NEXT(%esp), %ecx %ecx = next
movl ( %ecx , %edx ,1), %esp %esp = next -> stack
pushl %ebx
pushl %ebp
pushl %esi
pushl %edi
mov thread_stack_ofs, %edx
SWITCH_CUR(%esp) is equivalent to 20(%esp), or address esp+20, which holds the call arg cur
SWITCH_NEXT(%esp) is equivalent to 24(%esp), or address esp+24 , which holds the call arg next
Save Stack Pointer of the current Thread
Load Stack Pointer of the next Thread
Threads
Example: Pintos Thread Context Switching
switch_threads into the next Thread’s frame (at the time that itself�previously had called switch_threads)
CS446/646 C. Papachristos
pushl %ebx
pushl %ebp
pushl %esi
pushl %edi
mov thread_stack_ofs, %edx
movl SWITCH_CUR(%esp), %eax
movl %esp, (%eax,%edx,1)
movl SWITCH_NEXT(%esp), %ecx
movl (%ecx,%edx,1), %esp
popl %edi
popl %esi
popl %ebp
popl %ebx
ret
Restore Callee-Saved Regs of the next Thread
return from switch_threads() in the next Thread (call happened when it was Context-Switched)
At this point esp has changed to the next Thread’s Stack
Threads
Example: Pintos Thread Context Switching
CS446/646 C. Papachristos
pushl %ebx
pushl %ebp
pushl %esi
pushl %edi
mov thread_stack_ofs, %edx
movl SWITCH_CUR(%esp), %eax
movl %esp, (%eax,%edx,1)
movl SWITCH_NEXT(%esp), %ecx
movl (%ecx,%edx,1), %esp
popl %edi
popl %esi
popl %ebp
popl %ebx
ret
Threads
CS446/646 C. Papachristos
Next Lecture Reading Preparation
Operating Systems – Three Easy Pieces (https://pages.cs.wisc.edu/~remzi/OSTEP/)
Concurrency
CS446/646 C. Papachristos
Time for Questions !
CS-446/646
CS446/646 C. Papachristos