1 of 54

The Store-Order Consistency Testing Problem

for C-like Memory Models

Grace Tan (National University of Singapore)

advised by Umang Mathur (National University of Singapore)

in collaboration with Shankaranarayanan Krishna (IIT Bombay) and Andreas Pavlogiannis (Aarhus University)

1

2 of 54

Why study consistency testing?

2

3 of 54

Concurrency is everywhere

Moore’s law is over
Clock frequencies have stopped increasing
We need concurrency!

Multi-core CPUs
GPUs

... sadly, concurrency is very complicated

3

4 of 54

Simplified Dekker’s Algorithm (no unlock)

t0:� wants_to_enter[0] = true� while wants_to_enter[1] {� if turn != 0 {� wants_to_enter[0] = false� while turn != 0 {� // busy wait� }� wants_to_enter[0] = true� }� }�� // critical section

t1:� wants_to_enter[1] = true� while wants_to_enter[0] {� if turn != 1 {� wants_to_enter[1] = false� while turn != 1 {� // busy wait� }� wants_to_enter[1] = true� }� }�� // critical section

4

5 of 54

Simplified Dekker’s Algorithm (no unlock)

t0:� wants_to_enter[0] = true� while wants_to_enter[1] {� if turn != 0 {� wants_to_enter[0] = false� while turn != 0 {� // busy wait� }� wants_to_enter[0] = true� }� }�� // critical section

t1:� wants_to_enter[1] = true� while wants_to_enter[0] {� if turn != 1 {� wants_to_enter[1] = false� while turn != 1 {� // busy wait� }� wants_to_enter[1] = true� }� }�� // critical section

5

Textbooks: Correct under sequential consistency

Reality: Not sequentially consistent...

Need to consider weak memory!

x86

ARM

C / C++

CUDA

...

compiler passes

caches

reordering

parallelism

...

6 of 54

Q: Can this terminate?

t0:� wants_to_enter[0] = true� while wants_to_enter[1] {� if turn != 0 {� wants_to_enter[0] = false� while turn != 0 {� // busy wait� }� wants_to_enter[0] = true� }� }�� // critical section

t1:� wants_to_enter[1] = true� while wants_to_enter[0] {� if turn != 1 {� wants_to_enter[1] = false� while turn != 1 {� // busy wait� }� wants_to_enter[1] = true� }� }�� // critical section

6

Program

7 of 54

Q: Can this terminate?

t0:� wants_to_enter[0] = true� while wants_to_enter[1] {� if turn != 0 {� wants_to_enter[0] = false� while turn != 0 {� // busy wait� }� wants_to_enter[0] = true� }� }�� // critical section

t1:� wants_to_enter[1] = true� while wants_to_enter[0] {� if turn != 1 {� wants_to_enter[1] = false� while turn != 1 {� // busy wait� }� wants_to_enter[1] = true� }� }�� // critical section

7

Program

Idea: try simulating possible terminating behaviours,

check if any are feasible

on the actual machine

8 of 54

Q: Can this terminate? (Simulate possible behaviour)

t0:� wants_to_enter[0] = true� while wants_to_enter[1] {� // skipped� }�� // critical section

t1:� wants_to_enter[1] = true� while wants_to_enter[0] {� // skipped� }�� // critical section

8

“Run of the program”

9 of 54

Q: Can this terminate? (Simulate possible behaviour)

t0:

w(x0, true)�r(x1, false)

t1:

w(x1, true)�r(x0, false)

9

Abstract execution

10 of 54

Q: Can this terminate? (SC: no)

t0:

w(x0, true)�r(x1, false)

t1:

w(x1, true)�r(x0, false)

10

Under sequential consistency, this is not feasible.

Proof sketch:

Assume WLOG r(x1, false) executes before r(x0, false)

Then x0 must’ve already been set to true.

Sequential consistency

11 of 54

Q: Can this terminate? (TSO: yes)

t0:

w(x0, true)�r(x1, false)

t1:

w(x1, true)�r(x0, false)

11

Under TSO, both writes can remain buffered, causing the both reads to read the initial value of false.

x86 (TSO)

12 of 54

Consistency Testing

Aim is to automate (part of) the previous slides�
Input: Abstract execution

the “observed behaviour” of the program

Output: Can it actually happen on a “real machine”?�
Applications:

Model checking (as we have seen)
Development of memory models themselves
... many others!

12

13 of 54

Consistency Testing

Input: Abstract execution

the “observed behaviour” of the program

Output: Can it actually happen on a “real machine”?

13

We focus on formalisations of the

C / C++ memory model in this work,

in particular RC20 and its variants.

14 of 54

Consistency Testing

Input: Abstract execution G = <E, po>

the “observed behaviour” of the program

Output: Concrete execution G = <E, po, mo, rf>

the “behind-the-scenes” behaviour explaining the observed behaviour

14

15 of 54

Consistency Testing

Input: Abstract execution G = <E, po>

the “observed behaviour” of the program

Output: Concrete execution G = <E, po, mo, rf>

the “behind-the-scenes” behaviour explaining the observed behaviour

t0:

w(x0, false)�w(x1, false)

w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

t0:

w(x0, false)�w(x1, false)

w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

15

Input

Output

16 of 54

Consistency Testing

Note: Sometimes one abstract execution (input)�can have multiple concrete executions (output).

We’re generally concerned with existence, not enumeration.

t0:

w(x0, false)�w(x1, false)

w(x0, false)�w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

t0:

w(x0, false)�w(x1, false)

w(x0, false)�w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

16

Input

Output 1

17 of 54

Consistency Testing

Note: Sometimes one abstract execution (input)�can have multiple concrete executions (output).

We’re generally concerned with existence, not enumeration.

t0:

w(x0, false)�w(x1, false)

w(x0, false)�w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

t0:

w(x0, false)�w(x1, false)

w(x0, false)�w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

17

Input

Output 2

18 of 54

Why study store-order consistency testing?

18

19 of 54

Bad news: Consistency Testing is NP-complete...

Turns out to be “very” hard because it does not even admit parameterisation!

Soham Chakraborty, Shankara Narayanan Krishna, Umang Mathur, and Andreas Pavlogiannis. 2024.�How Hard Is Weak-Memory Testing? Proc. ACM Program. Lang. 8, POPL, Article 66 (January 2024), 32 pages. https://doi.org/10.1145/3632908

But model checkers exist in real life! (e.g. GenMC)

How did we fix this issue? Reads-From Consistency Testing

19

20 of 54

Reads-From Consistency Testing

Input: Abstract execution G = <E, po, rf>

the “observed behaviour” of the program
a guess for the reads-from relation

Output: Concrete execution G = <E, po, mo, rf>

the “behind-the-scenes” behaviour explaining the observed behaviour

Well studied problem
O(# events * # threads) for RC20

Hünkar Can Tunç, Parosh Aziz Abdulla, Soham Chakraborty, Shankaranarayanan Krishna, Umang Mathur, and Andreas Pavlogiannis. 2023. Optimal Reads-From Consistency Checking for C11-Style Memory Models. Proc. ACM Program. Lang. 7, PLDI, Article 137 (June 2023), 25 pages. https://doi.org/10.1145/3591251

20

21 of 54

Store-Order Consistency Testing

Input: Abstract execution G = <E, po, mo>

the “observed behaviour” of the program
a guess for the store-order relation

Output: Concrete execution G = <E, po, mo, rf>

the “behind-the-scenes” behaviour explaining the observed behaviour

Under-studied problem...

21

22 of 54

Store-Order Based Methods Have Potential

Different programs work better under different tools,�so having variety is good!

It’s also possible enumerating store-order is cheaper!

t0:

w(x0, false)�w(x1, false)

w(x0, false)�w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

t0:

w(x0, false)�w(x1, false)

w(x0, false)�w(x0, true)�r(x1, false)

t1:

�

w(x1, true)�r(x0, false)

22

Input

1 mo, 2 rf

23 of 54

The complexity of store-order testing

23

24 of 54

Our results

24

25 of 54

Our results: Store-order testing is intractable...

25

26 of 54

Our results: ... but tractable under single-writer!

26

(Reasonably tight bounds!)

27 of 54

Our results: 5 proofs

27

28 of 54

Thanks for listening!

28

Link to extended abstract:

29 of 54

What is Store-Order Consistency Testing and why do we care?

29

30 of 54

What is Store-Order Consistency Testing and why do we care?

Setting: concurrent programs
Important application: Model checkers

Automatically verify if a concurrent program has no bugs

30

31 of 54

What is Store-Order Consistency Testing and why do we care?

(For our purposes in this work)
Concurrent program = several threads (black-boxes)
Shared memory (another black box)
Threads communicate with each other through shared memory

31

Shared memory

Thread 1

Thread 2

Thread 3

32 of 54

What is Store-Order Consistency Testing and why do we care?

(For our purposes in this work)
Concurrent program = several threads (black-boxes)
Shared memory (another black box)
Threads communicate with each other through shared memory

32

Shared memory

Thread 1

Thread 2

Thread 3

w(x, 1)

33 of 54

What is Store-Order Consistency Testing and why do we care?

(For our purposes in this work)
Concurrent program = several threads (black-boxes)
Shared memory (another black box)
Threads communicate with each other through shared memory

33

Shared memory

Thread 1

Thread 2

Thread 3

r(x, 42)

34 of 54

What is Store-Order Consistency Testing and why do we care?

Input: Abstract execution

Essentially a record of how each thread interacted with shared memory
Set of memory events: reads, writes, etc. performed by each thread
Program order (aka thread order)
The values read or written for each memory event

Output: Consistent concrete execution

A “witness” that the input can happen on a “real machine”
Or, “was the shared memory behaving according to specification?”

34

35 of 54

What is Store-Order Consistency Testing and why do we care?

Input: Abstract execution

Essentially a record of how each thread interacted with shared memory
Set of memory events: reads, writes, etc. performed by each thread
Program order (aka thread order)
The values read or written for each memory event

Output: Consistent concrete execution

A “witness” that the input can happen on a “real machine”
Or, “was the shared memory behaving according to specification?”

35

This is what a “memory model” is

36 of 54

What is Store-Order Consistency Testing and why do we care?

Basic model checker:

Input: Program (code)
Enumerate concrete executions directly
Output: Was any concrete execution buggy?

Problem: there are way too many concrete executions
Idea: “Run” each thread in “isolation”, enumerating abstract executions instead

36

37 of 54

What is Store-Order Consistency Testing and why do we care?

Improved model checker?

Input: Program (code)
Enumerate abstract executions
Output: Was any abstract execution buggy AND feasible?

Much better

There are exponentially fewer abstract executions!
(Many concrete executions correspond to the same abstract execution)

37

38 of 54

What is Store-Order Consistency Testing and why do we care?

Improved model checker?

Input: Program (code)
Enumerate abstract executions
Output: Was any abstract execution buggy AND feasible?

Much better

There are exponentially fewer abstract executions!
(Many concrete executions correspond to the same abstract execution)

38

Bad news:

Consistency Testing is NP-complete!

39 of 54

What is Store-Order Consistency Testing and why do we care?

Improved model checker?

Input: Program (code)
Enumerate abstract executions
Output: Was any abstract execution buggy AND feasible?

Much better

There are exponentially fewer abstract executions!
(Many concrete executions correspond to the same abstract execution)

39

Informally, the input has too little “information”

40 of 54

Brief detour: Memory models

One way to define a memory model is by explaining how shared memory should “work”
This is called “operational semantics”

40

Shared memory

Thread 1

Thread 2

Thread 3

41 of 54

Brief detour: Memory models (examples)

Sequential Consistency

41

main memory

Thread 1

Thread 2

Thread 3

42 of 54

Brief detour: Memory models (examples)

Sequential Consistency
Total Store Order (TSO)

42

main memory

Thread 1

Thread 2

Thread 3

store buffer

43 of 54

Brief detour: Memory models (examples)

Sequential Consistency
Total Store Order (TSO)

In these cases, testing�boils down to finding an�“execution order” for�the events.

43

main memory

Thread 1

Thread 2

Thread 3

store buffer

44 of 54

Brief detour: Memory models (axiomatic)

Abstract execution: <E, po>
Concrete execution: <E, po, mo, rf>

44

45 of 54

Brief detour: Memory models (axiomatic)

Abstract execution: <E, po>
Concrete execution: <E, po, mo, rf>

po: program order, the order each event was executed by each thread
mo: store-order, the order writes to each location are “processed”
rf: reads-from, relates each read with the write that it reads from

Memory model = Set of constraints on concrete executions
Possible constraint: “po U rf should be acyclic”

45

46 of 54

Brief detour: Memory models (axiomatic)

Possible constraint: “po U rf should be acyclic”
Load Buffering example: Can the program below print “11”?
Answer: NO, under the above constraint

(Initial values are 0)

46

Thread 1:

print(x)
y = 1

Thread 2:

print(y)
x = 1

47 of 54

“Vanilla” Consistency Testing

Input: <E, po>
Output: <E, po, mo, rf>

mo: store-order, the order writes to each location are “processed”
rf: reads-from, relates each read with the write that it reads from

Consistency defined by memory model
Does a consistent concretization of the input exist?
“Very” hard
NP-hard for most memory models
Still NP-hard even in bounded settings for many more

Soham Chakraborty, Shankara Narayanan Krishna, Umang Mathur, and Andreas Pavlogiannis. 2024. How Hard Is Weak-Memory Testing? Proc. ACM Program. Lang. 8, POPL, Article 66 (January 2024), 32 pages. https://doi.org/10.1145/3632908

47

48 of 54

Reads-From Consistency Testing

Input: <E, po, rf>
Output: <E, po, mo, rf>

mo: store-order, the order writes to each location are “processed”
rf: reads-from, relates each read with the write that it reads from

Consistency defined by memory model
Does a consistent concretization of the input exist?
Now easy for some memory models!
O(nk) for RC20!

Hünkar Can Tunç, Parosh Aziz Abdulla, Soham Chakraborty, Shankaranarayanan Krishna, Umang Mathur, and Andreas Pavlogiannis. 2023. Optimal Reads-From Consistency Checking for C11-Style Memory Models. Proc. ACM Program. Lang. 7, PLDI, Article 137 (June 2023), 25 pages. https://doi.org/10.1145/3591251

48

Input now provides the rf!

49 of 54

Store-Order Consistency Testing

Input: <E, po, mo>
Output: <E, po, mo, rf>

mo: store-order, the order writes to each location are “processed”
rf: reads-from, relates each read with the write that it reads from

Consistency defined by memory model
Does a consistent concretization of the input exist?
Under-studied!

49

Input now provides the mo!

50 of 54

What is Store-Order Consistency Testing and why do we care?

Improved model checker?

Input: Program (code)
Enumerate abstract executions
Output: Was any abstract execution buggy AND feasible?

Much better

There are exponentially fewer abstract executions!
(Many concrete executions correspond to the same abstract execution)

50

51 of 54

Choices for which executions to enumerate

# of executions Consistency testing

Abstract Fewest # NP-hard
Reads-from Medium # sometimes easy
Store-order Medium # ???
Concrete Most # Not required

51

52 of 54

Choices for which executions to enumerate

Enumerating store-order executions has the potential to save time over enumerating reads-from executions!

e.g. Input is very read-heavy

small mo, large rf

52

53 of 54

Our results

53

54 of 54

Thanks for listening!

54

Link to extended abstract: