KernelThreadSanitizer (KTSAN)
a data race detector for the Linux kernel
Andrey Konovalov
November 23rd 2015
Userspace tools
Userspace TSan
ThreadSanitizer - a fast data race detector for C/C++ and Go
KernelThreadSanitizer (KTSAN)
Data race
Two parts
Compiler instrumentation
Original:
    void foo(int *p) {
        *p = 42;
    }
Instrumented:
    void foo(int *p) {
        __tsan_func_entry(__builtin_return_address(0));
        __tsan_write4(p);
        *p = 42;
        __tsan_func_exit();
    }
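To make the instrumented form concrete, here is a minimal userspace sketch of what the runtime sees when the instrumented function runs. The `mock_*` hooks and the counters are hypothetical stand-ins for the real `__tsan_*` entry points; they only record that a call happened and do no race detection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping: counts how often each hook fired. */
static int n_entries, n_exits, n_writes;
static void *last_write_addr;

/* Mock of __tsan_func_entry: a real runtime would push caller_pc
 * onto the thread's shadow call stack. */
static void mock_func_entry(void *caller_pc)
{
    (void)caller_pc;
    n_entries++;
}

/* Mock of __tsan_func_exit: a real runtime would pop the shadow stack. */
static void mock_func_exit(void)
{
    n_exits++;
}

/* Mock of __tsan_write4: a real runtime would check the shadow cells
 * for addr and record this 4-byte write. */
static void mock_write4(void *addr)
{
    last_write_addr = addr;
    n_writes++;
}

/* Same shape as the instrumented foo() on the slide. */
static void foo(int *p)
{
    mock_func_entry(__builtin_return_address(0));
    mock_write4(p);
    *p = 42;
    mock_func_exit();
}
```

Calling `foo(&v)` once fires each hook once and reports the address of `v` to the write hook before the store happens. `__builtin_return_address` is a GCC/Clang builtin, so this sketch assumes one of those compilers.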
Runtime part
Race detection algorithm
Thread time
void foo(int *x) {           | time = 1 |
    mutex_lock(&lock);       | time = 2 |
    *x = 1;                  | time = 3 |
    mutex_unlock(&lock);     | time = 4 |
}                            | time = 5 |
Vector clocks
T_i[0] | ... | T_i[i] | ... | T_i[j] | ... | T_i[N] |
T[0] | ... | ... | T[i] | ... | ... | T[N] |
Vector clocks update: memory access
thr->vc[thr->id]++
Vector clocks update: mutex lock
for (i = 0; i <= N; i++)
    thr->vc[i] = max(thr->vc[i], mtx->vc[i])
[Diagram: element-wise max of the Mutex clock and the Thread clock, stored into the Thread clock]
Vector clocks update: mutex unlock
for (i = 0; i <= N; i++)
    mtx->vc[i] = max(mtx->vc[i], thr->vc[i])
[Diagram: element-wise max of the Thread clock and the Mutex clock, stored into the Mutex clock]
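The three update rules (memory access, mutex lock, mutex unlock) can be sketched in plain C as follows. This is an illustration under assumptions, not the KTSAN runtime: the fixed `NTHREADS`, the struct names, and the helper names are all hypothetical, and the loops run over `NTHREADS` entries rather than the slides' `0..N`.

```c
#include <assert.h>
#include <stdint.h>

#define NTHREADS 4 /* assumption: small fixed thread count for illustration */

struct vclock {
    uint64_t c[NTHREADS];
};

struct thread {
    int id;            /* index of this thread's own component */
    struct vclock vc;
};

struct mutex_sync {
    struct vclock vc;  /* clock stored alongside the mutex */
};

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/* Memory access: advance the thread's own time. */
static void vc_tick(struct thread *thr)
{
    thr->vc.c[thr->id]++;
}

/* Mutex lock: the thread learns everything the mutex has seen. */
static void vc_acquire(struct thread *thr, const struct mutex_sync *mtx)
{
    for (int i = 0; i < NTHREADS; i++)
        thr->vc.c[i] = max_u64(thr->vc.c[i], mtx->vc.c[i]);
}

/* Mutex unlock: the mutex learns everything the thread has seen. */
static void vc_release(const struct thread *thr, struct mutex_sync *mtx)
{
    for (int i = 0; i < NTHREADS; i++)
        mtx->vc.c[i] = max_u64(mtx->vc.c[i], thr->vc.c[i]);
}
```

If thread 0 ticks twice and releases a mutex that thread 1 then acquires, thread 1's clock ends up with 2 in component 0. That entry is how a later access by thread 1 can be recognized as ordered after thread 0's earlier accesses.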
Shadow state
[Diagram: one memory cell (8 bytes) maps to four shadow cells; each shadow cell holds Thread ID | Timestamp | Position | IsWrite]
Shadow cell
A shadow cell represents one access to some of the bytes in an 8-byte word.
Fields: Thread ID | Timestamp | Position | IsWrite
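One way to picture a shadow cell is as a single 64-bit word with the four fields packed into it. The bit widths and the `shadow_*` helper names below are assumptions chosen for illustration; the slides do not give the real layout. Position is stored here as a byte offset plus a size, on the assumption that the slides' `0:2` means offset 0, size 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout (widths are assumptions, not KTSAN's):
 *   bit  0      IsWrite
 *   bits 1..3   Position: byte offset inside the 8-byte word (0..7)
 *   bits 4..7   Position: access size in bytes (1..8)
 *   bits 8..23  Thread ID
 *   bits 24..63 Timestamp (40 bits)
 */
#define TS_MASK ((UINT64_C(1) << 40) - 1)

static uint64_t shadow_pack(uint16_t tid, uint64_t ts,
                            unsigned off, unsigned size, bool is_write)
{
    /* The access must stay inside one 8-byte memory cell. */
    assert(size >= 1 && off + size <= 8);
    return (uint64_t)is_write
         | ((uint64_t)off << 1)
         | ((uint64_t)size << 4)
         | ((uint64_t)tid << 8)
         | ((ts & TS_MASK) << 24);
}

static bool     shadow_is_write(uint64_t c) { return c & 1; }
static unsigned shadow_off(uint64_t c)      { return (c >> 1) & 0x7; }
static unsigned shadow_size(uint64_t c)     { return (c >> 4) & 0xf; }
static uint16_t shadow_tid(uint64_t c)      { return (c >> 8) & 0xffff; }
static uint64_t shadow_ts(uint64_t c)       { return c >> 24; }
```

Packing a cell into one machine word means it can be read and replaced with a single 8-byte load or store, which is the usual motivation for a layout like this.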
Example: first access
Write in T1
Memory: one 8-byte word
Shadow: [T1 | TS1 | 0:2 | W] [empty] [empty] [empty]
Example: second access
Read in T2
Memory: one 8-byte word
Shadow: [T1 | TS1 | 0:2 | W] [empty] [empty] [T2 | TS2 | 4:4 | R]
Example: third access
Read in T3
Memory: one 8-byte word
Shadow: [T1 | TS1 | 0:2 | W] [T3 | TS3 | 0:4 | R] [empty] [T2 | TS2 | 4:4 | R]
Example: race?
Shadow: [T1 | TS1 | 0:2 | W] [T3 | TS3 | 0:4 | R] [T2 | TS2 | 4:4 | R]
Example: synchronized?
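The "race?" and "synchronized?" questions can be combined into one check, sketched below. It assumes Position is read as offset:size and that an earlier access is ordered before the current one when the current thread's vector clock has already reached the earlier access's timestamp. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct access {
    int      tid;      /* Thread ID */
    uint64_t ts;       /* Timestamp: the thread's own time at the access */
    unsigned off;      /* Position: byte offset in the 8-byte word */
    unsigned size;     /* Position: number of bytes accessed */
    bool     is_write; /* IsWrite */
};

/* Do the two accesses touch at least one common byte? */
static bool ranges_overlap(const struct access *a, const struct access *b)
{
    return a->off < b->off + b->size && b->off < a->off + a->size;
}

/* prev is synchronized with the current thread if the current thread's
 * vector clock has already seen prev's timestamp. */
static bool happens_before(const struct access *prev, const uint64_t *cur_vc)
{
    return prev->ts <= cur_vc[prev->tid];
}

/* A race needs: different threads, at least one write, overlapping
 * bytes, and no happens-before ordering between the two accesses. */
static bool is_race(const struct access *prev, const struct access *cur,
                    const uint64_t *cur_vc)
{
    if (prev->tid == cur->tid)
        return false;
    if (!prev->is_write && !cur->is_write)
        return false;
    if (!ranges_overlap(prev, cur))
        return false;
    return !happens_before(prev, cur_vc);
}
```

With the cells from the example, the read in T3 (0:4) overlaps the write in T1 (0:2), so it is a race unless T3's clock entry for T1 is at least TS1. The read in T2 (4:4) touches different bytes and conflicts with neither.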
Stack traces
Trace
Shadow memory
Kernel synchronization primitives
Mutex annotations
void __sched mutex_lock(struct mutex *lock)
{
    might_sleep();
    ktsan_mtx_pre_lock(lock, true, false);
    __mutex_fastpath_lock(&lock->count, __mutex_lock_slowpath);
    mutex_set_owner(lock);
    // Current thread acquires mutex: thr->vc[i] = max(thr->vc[i], mtx->vc[i])
    ktsan_mtx_post_lock(lock, true, false, true);
}

void __sched mutex_unlock(struct mutex *lock)
{
    // Current thread releases mutex: mtx->vc[i] = max(mtx->vc[i], thr->vc[i])
    ktsan_mtx_pre_unlock(lock, true);
    __mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
    ktsan_mtx_post_unlock(lock, true);
}
Per-cpu annotations
Kernel atomics API
Atomic annotations
Atomic annotations, continued
Headaches
Scheduler
Interrupts
Memory consumption
Per-CPU mode
Benign data races in the kernel
Stubborn kernel developers
“The Linux kernel largely ignores the C memory model definition, and relies on practical compiler behavior.
So-called 'data races' are common in kernel code.”
Peter Hurley (https://lkml.org/lkml/2015/8/25/707)
“Also if the following is true:
> As the consequence C compilers stopped guarantying that "word accesses are atomic".
a lot of stuff will break in the kernel. Maybe compilers should stop moving towards the lala land?”
Dmitry Torokhov (https://lkml.org/lkml/2015/9/4/547)
KTSAN status
Trophies
Report example
ThreadSanitizer: data-race in ipc_obtain_object_check
Read at 0xffff88047f810f68 of size 8 by thread 2749 on CPU 5:
[<ffffffff8147d84d>] ipc_obtain_object_check+0x7d/0xd0 ipc/util.c:621
[< inline >] msq_obtain_object_check ipc/msg.c:90
[<ffffffff8147e708>] msgctl_nolock.constprop.9+0x208/0x430 ipc/msg.c:480
[< inline >] SYSC_msgctl ipc/msg.c:538
[<ffffffff8147f061>] SyS_msgctl+0xa1/0xb0 ipc/msg.c:522
[<ffffffff81ee3e11>] entry_SYSCALL_64_fastpath+0x31/0x95 arch/x86/entry/entry_64.S:188
Previous write at 0xffff88047f810f68 of size 8 by thread 2755 on CPU 4:
[<ffffffff8147cf97>] ipc_addid+0x217/0x260 ipc/util.c:257
[<ffffffff8147eb4c>] newque+0xac/0x240 ipc/msg.c:141
[< inline >] ipcget_public ipc/util.c:355
[<ffffffff8147daa2>] ipcget+0x202/0x280 ipc/util.c:646
[< inline >] SYSC_msgget ipc/msg.c:255
[<ffffffff8147efaa>] SyS_msgget+0x7a/0x90 ipc/msg.c:241
[<ffffffff81ee3e11>] entry_SYSCALL_64_fastpath+0x31/0x95 arch/x86/entry/entry_64.S:188
Report example, continued
220 int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int size)
...
// Publish new (make it accessible to other threads)
240 id = idr_alloc(&ids->ipcs_idr, new,
241 (next_id < 0) ? 0 : ipcid_to_idx(next_id), 0,
242 GFP_NOWAIT);
...
// Initialize new
252 current_euid_egid(&euid, &egid);
253 new->cuid = new->uid = euid;
254 new->gid = new->cgid = egid;
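The pattern in the report, publishing an object before its fields are initialized, has a simple userspace analogue. The sketch below is not the kernel fix; it only illustrates the safe ordering (initialize first, then publish with release semantics) using C11 atomics. `struct perm`, `published`, and both functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct perm {
    unsigned uid;
    unsigned gid;
};

/* Shared slot through which other threads find the object. */
static _Atomic(struct perm *) published;

/* Initialize every field first, then publish. The release store
 * orders the field writes before the pointer becomes visible. */
static void publish_initialized(struct perm *p, unsigned uid, unsigned gid)
{
    p->uid = uid;
    p->gid = gid;
    atomic_store_explicit(&published, p, memory_order_release);
}

/* The acquire load pairs with the release store, so a non-NULL result
 * points at fully initialized fields. */
static struct perm *lookup(void)
{
    return atomic_load_explicit(&published, memory_order_acquire);
}
```

A reader that gets a non-NULL pointer from `lookup()` is guaranteed to see the `uid` and `gid` values written before publication. With the order reversed, as in the report, the reader can observe the object before those writes.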
Thank you!
Questions?
Andrey Konovalov, andreyknvl@gmail.com
Dmitry Vyukov, dvyukov@google.com