Paxos Made Simple

Reference: Lamport, Leslie. "Paxos made simple." ACM Sigact News 32.4 (2001): 18-25.

0. Abstract & Introduction

The Paxos algorithm, when presented in plain English, is very simple. The Paxos algorithm for implementing a fault-tolerant distributed system has been regarded as difficult to understand, perhaps because the original presentation was Greek to many readers. In fact, it is among the simplest and most obvious of distributed algorithms. At its heart is a consensus algorithm—the “synod” algorithm of. The next section shows that this consensus algorithm follows almost unavoidably from the properties we want it to satisfy. The last section explains the complete Paxos algorithm, which is obtained by the straightforward application of consensus to the state machine approach for building a distributed system—an approach that should be well-known, since it is the subject of what is probably the most often-cited article on the theory of distributed systems.

1. The Consensus Algorithm

2.1 Problem

  • Safety Requirements: (1) only a value that has been proposed may be chosen; (2) only a single value is chosen; (3) a process never learns that a value has been chosen unless it actually has been.
  • Liveness: (1) some proposed value is eventually chosen; (2) if a value has been chosen, then a process can eventually learn the value.
  • Three classes of agents: (1) proposers; (2) acceptors; (3) learners.
  • Asynchronous, non-Byzantine Model: (1) agents operate at arbitrary speed, may fail by stopping, and may restart; (2) messages can take arbitrarily long to be delivered, can be duplicated, and can be lost, but they are not corrupted.

2.2 Choosing a Value

Requirements

  • P1: An acceptor must accept the first proposal that it receives.
  • P2: If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v.
  • (Strengthened from P2): If a proposal with value v is chosen, then every higher-numbered proposal accepted by any acceptor has value v.
  • (Strengthened from ): If a proposal with value v is chosen, then every higher-numbered proposal issued by any proposer has value v.
  • (Strengthened from ): For any v and n, if a proposal with value v and number n is issued, then there is a set S consisting of a majority of acceptors such that either (a) no acceptor in S has accepted any proposal numbered less than n, or (b) v is the value of the highest-numbered proposal among all proposals numbered less than n accepted by the acceptors in S.
  • (Strengthened from P1): An acceptor can accept a proposal numbered n iff it has not responded to a prepare request having a number greater than n.

Phase 1

  • A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors.
  • If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.

Phase 2

  • If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.
  • If an acceptor receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request having a number greater than n.

Optimizations

  • An acceptor needs to remember only the highest-numbered proposal that it has ever accepted and the number of the highest-numbered prepare request to which it has responded.
  • If an acceptor ignores a prepare or accept request because it has already received a prepare request with a higher number, then it should probably inform the proposer, who should then abandon its proposal.

2.3 Learning a Chosen Value

  • More generally, the acceptors could respond with their acceptances to some set of distinguished learners, each of which can then inform all the learners when a value has been chosen.
  • Because of message loss and failure of acceptors, a value could be chosen with no learner ever finding out. In that case, learners will find out what value is chosen only when a new proposal is chosen.

2.4 Progress

  • To guarantee progress, a distinguished proposer must be selected as the only one to try issuing proposals.

2.5 The Implementation

  • An acceptor records its intended response in stable storage before actually sending the response.
  • Different proposers choose their numbers from disjoint sets of numbers, so two different proposers never issue a proposal with the same number.

3. Implementing a State Machine

Written on March 21, 2017