Composing and Decomposing Op-Based CRDTs with Semidirect Products
Matthew Weidner, Heather Miller, and Christopher Meiklejohn
Carnegie Mellon University
Pittsburgh, PA, USA
ICFP 2020
Virtual Jersey City, NJ, USA
8/24-26/2020
1
SPLASH 2022
Auckland, NZ
8 Dec 2022
2
Composing and Decomposing Op-Based CRDTs with Semidirect Products
An alternative: CRDTs (Conflict-free Replicated Data Types)
3
// Using Legion CRDT library
var legion = new Legion(...);
legion.joinGroup(...); // sets objectStore
var shopping = objectStore.getOrCreate(
"Shopping", legion.Set
);
shopping.add("milk");
Update local state immediately
A “PL” way of solving a systems problem.
Lazily broadcast the operation to other users/replicas
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Pros
Cons
4
Composing and Decomposing Op-Based CRDTs with Semidirect Products
In this work
Composition
Decomposition
We present a compositional design technique for (op-based) CRDTs, the semidirect product.
5
(same state type)
(union of supported ops)
Composing and Decomposing Op-Based CRDTs with Semidirect Products
6
Make Ops Commute
Reason about Causality
Update local state immediately
Broadcast the op to other users
Alice: value = 3
Bob: value = 5
Conflict resolution techniques
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Causality in detail
The causal order on operations is a natural partial order < on operations:
CRDTs can query the causal order. Ex. Add-Wins Set CRDT: x is in the set if there is an add(x) operation that is not < any remove(x) operation.
7
add(‘milk’)
add(‘eggs’)
add(‘bread’)
add(‘milk’)
Alice
Bob
Charlie
remove(‘milk’)
<
<
<
concurrent
Composing and Decomposing Op-Based CRDTs with Semidirect Products
8
Reason about Causality
Make Ops Commute
Conflict resolution techniques
Our semidirect product provides a uniform, reusable way of doing this, so you don’t have to.
New CRDT design workflow: Design commutative data types for simple subsets of operations → Glue together with semidirect products.
Our experience: works for many existing CRDT types, plus novel ones.
Composing and Decomposing Op-Based CRDTs with Semidirect Products
An example of our semidirect product
(int register w/ add ops) + (int register w/ mult ops) → int register with both
9
Dave
Mary
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Let us choose that in the face of concurrency, “adds go before mults”.
Observation: If we receive concurrent add and mult operations in the “wrong” order (mult before add), we get the right result if we act on the add with the mult.
Goal: Integer register CRDT supporting add and mult operations
10
Dave
Mary
Dave
Mary
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Goal: Integer register CRDT supporting add and mult operations
When you receive an add operation, transform it by all concurrent mult operations you’ve already received.
When you receive an add operation, transform it by all concurrent mult operations you’ve already received.
11
Dave
Mary
Composing and Decomposing Op-Based CRDTs with Semidirect Products
New since ICFP 2020
12
SPLASH 2022
Composing and Decomposing Op-Based CRDTs with Semidirect Products
What’s the meaning of this?
James Riely (DePaul), paraphrased: How does the semidirect product relate to the behavior of some sequential data type spec?
Me in 2020: 🤷
Me in 2022: Maybe it’s not really about synthesizing new CRDTs; it’s a (premature?) optimization of one particular CRDT: the Unique Set with For-Each Ops
13
Jagadeesan & Riely, “Eventual Consistency for CRDTs”, ESOP 2018
Liang & Feng, “Abstraction for conflict-free replicated data types”, PLDI 2021
De Porre et. al, “ECROs: building global scale systems from sequential code”, OOPSLA 2021
SPLASH 2022
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Unique Set with For-Each Ops
“Unique Set” CRDT: ≈ list CRDT w/o order (but similar remarks apply to list CRDTs)
Unique set of CRDTs: values are themselves CRDTs
14
SPLASH 2022
Ex: Rich Text CRDT�= List of Rich Char CRDTs
Lorum ipsem dolor
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Unique Set with For-Each Ops
Sequentially, we can also do: “for each element of a Unique Set, apply op O”.
15
SPLASH 2022
Collaboratively, we can do: “for each element of a Unique Set of CRDTs, including ones added concurrently, apply op O”.
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Unique Set with For-Each Ops
This is an example of a semidirect product: arbitration order is add before forEach.�But now we have an obvious semantics (not just “whatever the semidirect product says”).
Empirical observation: many “useful” semidirect products are semantically equivalent to some Unique Set of CRDTs w/ causal+concurrent for-each ops.
16
SPLASH 2022
Literal semidirect products (e.g. add/mult) optimize state size, but maybe not worth conceptual overhead.
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Implementation: Collabs
17
SPLASH 2022
Huairui Qi
CMU (M.S. 2020)
Ignacio Maronna
CMU (M.S. 2020)
Maxime Kjaer
EPFL/CMU (M.S. 2021)
Ria Pradeep
CMU (B.S. 2021)
Benito Geordie
Rice
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Demo apps: Rich Text Editor
18
SPLASH 2022
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Demo app: Rich Text Editor
19
SPLASH 2022
Benito Geordie
Rice
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Demo app: Shapes POC
20
SPLASH 2022
Ria Pradeep
CMU (B.S. 2021)
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Recap
Thanks for watching :)
21
Questions?
Image credits: zombo.com, openclipart.org
More links: mattweidner.com
Composing and Decomposing Op-Based CRDTs with Semidirect Products
22
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Relation to Operational Transformation
Recall: OT is a competing approach to eventually consistent replicated data types. Instead of making concurrent operations commute, you define a transformation function tf_1: O x O -> O.
The semidirect product is the special case of OT when O = O1 ⊔ O2 and tf1(p, q) = p except when p ∊ O1, q ∊ O2. (O2 ops transform concurrent O1 ops, but that’s it.)
It is interesting that we get common CRDT semantics from this special case of OT.
23
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Generality
Q: When can a CRDT be decomposed as a semidirect product of simpler CRDTs?
Let C be a CRDT in the POLog model of pure op-based CRDTs. I.e., C is defined by a function f mapping the partially ordered log of operations to a visible state.
Suppose the ops of C can be partitioned as O1 U O2 so that for any POLog L, f(L) = f(L’) whenever L = L’ except that we have added some edges o1< o2 with o1 ∊ O1, o2 ∊ O2. Then C is the semidirect product of a CRDT with only O1 ops and a CRDT with only O2 ops.
This expresses the rule that O1 ops “go before” concurrent O2 ops.
24
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Name Origin: Semidirect Product of Groups
In abstract algebra, a group is a set G with a binary operation ∘ : G x G → G satisfying multiplication-like properties.
The semidirect product combines two groups G, H into a group with set G x H. It contains copies of G and H which commute as
h ∘ g = (h ⊳ g) ∘ h
for a suitable action ⊳ of H on G.
The semidirect product of CRDTs uses an analogous rule to reorder concurrent ops from the components.
25
Composing and Decomposing Op-Based CRDTs with Semidirect Products