Multi Paxos
CSE 452 Spring 2026
Announcements
Part-Time Parliament
“My attempt at inserting some humor into the subject was a dismal failure. People who attended my lecture remembered Indiana Jones, but not the algorithm. People reading the paper apparently got so distracted by the Greek parable that they didn’t understand the algorithm. Among the people I sent the paper to, and who claimed to have read it, were Nancy Lynch, Vassos Hadzilacos, and Phil Bernstein. A couple of months later I emailed them the following question:
Can you implement a distributed database that can tolerate the failure of any number of its processes (possibly all of them) without losing consistency, and that will resume normal behavior when more than half the processes are again working properly?
None of them noticed any connection between this question and the Paxos algorithm.” –Leslie Lamport
Basic Paxos -> Multi-Paxos
Leader Election
Leader Election
Leader Election
Slot 1
Slot 2
Slot 3
Simple Paxos
Slot 1
Slot 2
Slot 3
Multi-Paxos
Leader Election
Why? Leader election can cause contention!
Possible order for implementation of Paxos
Executing RPCs: Stable Leader (P2)
Multi-Paxos Edition
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
Acceptor Ballot
Leader Ballot
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
P2A
ballot: 1.3
slot number: 1
command: A
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
P2A
ballot: 1.3
slot number: 2
command: B
B
S3
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
P2B
ballot: 1.3
slot number: 2
B
S3
B
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S2, S3
B
Can server 3 execute B?
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S2, S3
B
P2A
ballot: 1.3
slot number: 1
command: A
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S2, S3
B
P2B
ballot: 1.3
slot number: 1
A
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S2, S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S2, S3
B
A
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S2, S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S2, S3
B
A
Heartbeat
ballot: 1.3
chosen: {1 → A, 2 → B}
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S2, S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S2, S3
B
A
Propose phase implementation notes
Leader Election
Multi-Paxos Edition
Overview
Leader election will begin upon:
P1a
P1b
P2a
1
2
3
4
...
Server 3
Command:
Ballot: 1.3
Active: true
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 0.0
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
A
B
C
A
C
Heartbeat()
Heartbeat()
1
2
3
4
...
Server 3
Command:
Ballot: 1.3
Active: true
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 0.0
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
Server 1 detects server 3 timeout!
A
B
C
A
C
Heartbeat()
Heartbeat()
1
2
3
4
...
Server 3
Command:
Ballot: 2.1
Active: false
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 2.1
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 2.1
Active: false
A
B
C
A
C
P1a(2.1)
P1a(2.1)
P1a(2.1)
1
2
3
4
...
Server 3
Command:
Ballot: 2.1
Active: false
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 2.1
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 2.1
Active: false
A
B
C
A
C
P1b([<1.3, 1, A>, <1.3, 3, C>])
P1b([])
P1b([<1.3, 1, A>, <1.3, 2, B>, <1.3, 3, C>])
1
2
3
4
...
Server 3
Command:
Ballot: 2.1
Active: false
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 2.1
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 2.1
Active: false
A
B
C
A
C
P1b([<1.3, 1, A>, <1.3, 3, C>])
P1b([])
P1b([<1.3, 1, A>, <1.3, 2, B>, <1.3, 3, C>])
What should the status of slot 1 be?
1
2
3
4
...
Server 3
Command:
Ballot: 2.1
Active: false
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 2.1
Active: true
1
2
3
4
...
Server 2
Command:
Ballot: 2.1
Active: false
A
B
C
A
C
P1b([<1.3, 1, A>, <1.3, 3, C>])
P1b([])
P1b([<1.3, 1, A>, <1.3, 2, B>, <1.3, 3, C>])
A
B
C
1
2
3
4
...
Server 3
Command:
Ballot: 2.1
Active: false
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 2.1
Active: true
1
2
3
4
...
Server 2
Command:
Ballot: 2.1
Active: false
A
B
C
A
C
p2a(2.1, 2, B)
A
B
C
p2a(2.1, 2, B)
p2a(2.1, 2, B)
Leader Notes
Let’s take a deeper look
“Log Merging” rules
For each slot in an incoming P1b message:
Warning: Do not try to merge directly into your log!
1
2
3
4
...
P1b
(1.3, 1, A)
(2.1, 2, C)
(2.1, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
P1b
(1.3, 2, B)
1
2
3
4
...
Temp Log
Ballot: 3.2
(1.3, 1, A)
(1.3, 3, E)
Candidate’s temporary log state
1
2
3
4
...
P1b
(1.3, 1, A)
(2.1, 2, C)
(2.1, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
Temp Log
Ballot: 3.2
(1.3, 1, A)
(1.3, 3, E)
Candidate’s temporary log state
(1.3, 2, B)
1
2
3
4
...
P1b
(1.3, 1, A)
(2.1, 2, C)
(2.1, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
Temp Log
Ballot: 3.2
(1.3, 1, A)
(1.3, 3, E)
Candidate’s temporary log state
(1.3, 2, B)
1
2
3
4
...
P1b
(1.3, 1, A)
(2.1, 2, C)
(2.1, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
Temp Log
Ballot: 3.2
(1.3, 1, A)
(1.3, 3, E)
Candidate’s temporary log state
(2.1, 2, C)
1
2
3
4
...
P1b
(1.3, 1, A)
(2.1, 2, C)
(2.1, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
Temp Log
Ballot: 3.2
(1.3, 1, A)
(1.3, 3, E)
Candidate’s temporary log state
(2.1, 2, C)
(2.1, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
Temp Log
Ballot: 3.2
Slot in:
Slot out:
(1.3, 1, A)
(3.2, 3, E)
Merge Temporary log into actual log!
(3.2, 2, C)
(3.2, 4, D)
Not started
Accepted
Chosen
Incoming Messages To Merge
1
2
3
4
...
Temp Log
Ballot: 3.2
Slot in:
Slot out:
(1.3, 1, A)
(3.2, 3, E)
Should we execute slot 1?
(3.2, 2, C)
(3.2, 4, D)
“Log Merging” rules w/ extra note
For each slot in an incoming P1b message:
Additional Notes on “Log Merging”
Several ways to implement:
Log Merging:
Dealing with Log “Holes”/Proposing no-ops
If we have in our log:
put(a, 1), ______, ______, append(a, 2)
Do we know those empty log slots are decided? How can we push these log slots to completion and execute? Why do we need no-ops?
Your implementation needs to be able to handle "holes" in the Paxos log. That is, at certain points, a server might see agreement being reached on a slot but not previous slots. Your implementation must still get to where it can execute the append, even if no more client requests arrive.
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
P2A
ballot: 1.3
slot number: 1
command: A
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
P2A
ballot: 1.3
slot number: 2
command: B
B
S3
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
P2B
ballot: 1.3
slot number: 2
B
S3
B
1
2
3
4
...
Server 3
Command:
Recv’d P2Bs:
Ballot: 1.3
Active: true
A
S3
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.2
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
S3
B
X
Log hole at slot 1
Garbage Collection
Garbage Collection: 3 servers in the system
Server1 (Leader):
Latest executed slot: 4
Server2:
Latest executed slot: 4
Server3:
Latest executed slot: 3
Hasn’t heard about slot 4
Can garbage collect slots 1-3
1
2
3
4
...
Server 3
Command:
Ballot: 1.3
Active: true
A
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.3
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
B
A
A
1
2
3
4
...
Server 3
Command:
Ballot: 1.3
Active: true
A
Not started
Accepted
Chosen
1
2
3
4
...
Server 1
Command:
Ballot: 1.3
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
B
A
A
firstNotExecuted: 3
firstNotExecuted: 2
firstNotExecuted == slot_out
2
3
4
...
Server 3
Command:
Ballot: 1.3
Active: true
Not started
Accepted
Chosen
Server 1
Command:
Ballot: 1.3
Active: false
1
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
B
A
A
firstNonCleared: 2
...
2
3
4
1
2
3
4
...
Server 3
Command:
Ballot: 1.3
Active: true
Not started
Accepted
Chosen
Server 1
Command:
Ballot: 1.3
Active: false
2
3
4
...
Server 2
Command:
Ballot: 1.3
Active: false
B
B
...
2
3
4
In case we have time
Possible order for implementation of Paxos
Recommendation: Ballots
Even though doing something like having ballots being 1.1, 1.2, 1.3, 2.1, ... would work, we recommend making ballot a class like:
E.g. (3, server1)
Then making that comparable and defining a compareTo.
We recommend using a single ballot on each server (not for each role).
If you define the Ballot as an inner class, remember to make it static! If not, it might lead to failed garbage collection or lead to inefficiencies in testing and cause you to fail tests you would otherwise pass. Also make sure it’s serializable and has equals and hashcode!
Recommendation: Roles
Recommendation: Logs
For your logs, store it as
Map<Integer, LogEntry> log; // Where slot number -> log entry
Where LogEntry contains:
If you define the LogEntry as an inner class, remember to make it static! If not, it might lead to failed garbage collection or lead to inefficiencies in testing and cause you to fail tests you would otherwise pass. Also make sure it’s serializable and has equals and hashcode!
Some cases/questions to consider when writing your code