Understanding the Raft Consensus Protocol
The Raft consensus protocol is a distributed consensus algorithm designed to be more understandable than other consensus algorithms like Paxos. It ensures that a cluster of servers can agree on the state of a system even in the presence of failures.
Key Concepts
Raft divides the consensus problem into three relatively independent subproblems:
- Leader Election: Ensures that at most one leader exists in any given term.
- Log Replication: The leader appends client requests to its log and replicates them across the cluster.
- Safety: Keeps logs consistent across servers, even during failures.
Raft Roles
Nodes in a Raft cluster can be in one of three roles:
- Leader: Manages client interactions, log replication, and sends heartbeats to followers.
- Follower: Passive nodes responding to leader and candidate requests.
- Candidate: A follower that starts an election when it doesn’t receive heartbeats.
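The role changes described above can be sketched as a small lookup table. This is only an illustration: the `Role` enum, the event names, and `next_role` are hypothetical names chosen for this example, not identifiers from the Raft paper or from any library.

```python
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# Allowed role changes, keyed by (current role, event).
# The event names are illustrative labels, not terms defined by Raft.
TRANSITIONS = {
    (Role.FOLLOWER, "election_timeout"): Role.CANDIDATE,
    (Role.CANDIDATE, "election_timeout"): Role.CANDIDATE,  # split vote, retry
    (Role.CANDIDATE, "won_majority"): Role.LEADER,
    (Role.CANDIDATE, "saw_current_leader"): Role.FOLLOWER,
    (Role.CANDIDATE, "saw_higher_term"): Role.FOLLOWER,
    (Role.LEADER, "saw_higher_term"): Role.FOLLOWER,
}


def next_role(role: Role, event: str) -> Role:
    """Return the role after `event`; unknown pairs leave the role unchanged."""
    return TRANSITIONS.get((role, event), role)
```

Note that a follower never becomes a leader directly: it always passes through the candidate role and has to win an election first.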
Detailed Algorithm and Processes
Leader Election
- Election Timeout:
  - If a follower doesn’t receive a heartbeat from the leader before its election timeout expires, it becomes a candidate.
- Starting an Election: the new candidate
  - Increments its current term.
  - Votes for itself.
  - Sends RequestVote RPCs to all other nodes.
- Voting:
  - A node grants its vote to the first candidate that asks in a given term, provided the candidate’s log is at least as up-to-date as its own.
  - It denies all other requests in the same term.
- Election Result:
  - A candidate becomes the leader if it receives votes from a majority of nodes.
  - If no candidate wins, each candidate times out and starts a new election with a higher term; randomized election timeouts make repeated split votes unlikely.
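The election steps can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a full implementation: the `Node` class, the dictionary-shaped request, and the `wins` helper are assumptions made for the example, and there is no networking, persistence, or timer. The sketch also applies the Raft paper's rule that a vote is granted only if the candidate's log is at least as up-to-date as the voter's (compare the last entry's term first, then the log length).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None
    # Each log entry is a (term, command) pair; list position + 1 is its index.
    log: List[Tuple[int, str]] = field(default_factory=list)

    def last_log(self) -> Tuple[int, int]:
        """Return (last_term, last_index), or (0, 0) for an empty log."""
        if not self.log:
            return (0, 0)
        return (self.log[-1][0], len(self.log))

    def start_election(self) -> dict:
        """Become a candidate: bump the term, vote for self, build the request."""
        self.current_term += 1
        self.voted_for = self.node_id
        last_term, last_index = self.last_log()
        return {
            "term": self.current_term,
            "candidate_id": self.node_id,
            "last_log_term": last_term,
            "last_log_index": last_index,
        }

    def handle_request_vote(self, req: dict) -> bool:
        """Decide whether to grant a vote for a request built above."""
        if req["term"] < self.current_term:
            return False  # stale candidate
        if req["term"] > self.current_term:
            # Seeing a newer term resets any vote cast in an older one.
            self.current_term = req["term"]
            self.voted_for = None
        already_voted = self.voted_for not in (None, req["candidate_id"])
        # Tuple comparison: a later last term wins, then the longer log.
        up_to_date = (req["last_log_term"], req["last_log_index"]) >= self.last_log()
        if already_voted or not up_to_date:
            return False
        self.voted_for = req["candidate_id"]
        return True


def wins(votes_granted: int, cluster_size: int) -> bool:
    """A candidate wins with a strict majority, counting its own vote."""
    return votes_granted > cluster_size // 2
```

In a three-node cluster, a candidate that collects one vote besides its own satisfies `wins(2, 3)` and becomes leader, while a second candidate asking the same voter in the same term is refused.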
Log Replication
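- Client Requests:
  - The leader receives client requests and appends each one to its log as a new entry.
  - Each log entry contains a command for the state machine, the term in which the leader received it, and its index in the log.
- Append Entries:
  - The leader sends AppendEntries RPCs carrying the new entries to followers, together with the index and term of the entry that precedes them.
  - A follower appends the entries and acknowledges only if its log contains that preceding entry; otherwise it rejects the request and the leader retries from an earlier index.
- Commitment:
  - Once an entry from the leader’s current term is replicated on a majority of servers, it’s considered committed, along with all entries before it.
  - The leader advances its commitIndex and includes it in subsequent AppendEntries RPCs so that followers learn which entries they can apply.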
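The replication and commitment rules can be illustrated with two small functions, one for the follower side and one for the leader side. This is a sketch under stated assumptions: logs are plain Python lists of `(term, command)` pairs, the function names and parameters are hypothetical, and the follower applies the Raft paper's consistency check on the entry preceding the new ones. Likewise, the leader only advances the commit index to an entry from its current term.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command); list position + 1 is the log index


def append_entries(log: List[Entry], prev_index: int, prev_term: int,
                   entries: List[Entry]) -> bool:
    """Follower side: run the consistency check, then append.

    Returns False when the follower's log has no entry at `prev_index`
    with term `prev_term`; the leader would then retry from an earlier index.
    """
    if prev_index > 0:
        if len(log) < prev_index or log[prev_index - 1][0] != prev_term:
            return False
    for offset, entry in enumerate(entries):
        pos = prev_index + offset  # 0-based list position for this entry
        if pos < len(log) and log[pos][0] != entry[0]:
            del log[pos:]  # discard the conflicting entry and everything after it
        if pos >= len(log):
            log.append(entry)
    return True


def commit_index(match_index: List[int], leader_log: List[Entry],
                 current_term: int, old_commit: int) -> int:
    """Leader side: highest index stored on a majority, from the current term.

    `match_index` holds, for every server including the leader, the highest
    log index known to be replicated on that server.
    """
    majority_pos = len(match_index) // 2
    candidate = sorted(match_index, reverse=True)[majority_pos]
    if candidate > old_commit and leader_log[candidate - 1][0] == current_term:
        return candidate
    return old_commit
```

Sorting `match_index` in descending order and taking the element at position `len // 2` yields the largest index that a strict majority of servers has reached, which is why no explicit counting loop is needed.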
Safety Features
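- Term Numbers: Terms are numbered with consecutive integers and act as a logical clock. A node that sees a higher term adopts it and, if it is a leader or candidate, reverts to follower; requests that carry a stale term are rejected.
- Log Matching: If two logs contain an entry with the same index and term, the logs are identical in all entries up through that index.
- Leader Completeness: A newly elected leader must have all committed entries from previous terms. Raft enforces this at election time: a node only votes for a candidate whose log is at least as up-to-date as its own.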
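As an illustration of the Log Matching property, the following sketch checks it for a pair of logs, which can be handy as a test-time invariant. The function name and the `(term, command)` list representation are assumptions made for this example. The check uses the standard statement of the property: if two logs agree on the term at some index, they must be identical up through that index.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command); list position + 1 is the log index


def log_matching_holds(log_a: List[Entry], log_b: List[Entry]) -> bool:
    """Return True if the two logs satisfy the Log Matching property.

    It is enough to inspect the highest position where the terms agree:
    if the prefixes are identical there, they are identical at every
    lower position as well.
    """
    shared = min(len(log_a), len(log_b))
    for i in range(shared - 1, -1, -1):
        if log_a[i][0] == log_b[i][0]:
            return log_a[: i + 1] == log_b[: i + 1]
    return True  # no position with equal terms, so nothing to violate
```

Two logs that diverge only after their last shared `(index, term)` pair pass the check; two logs that agree on a term at some index but differ in an earlier command do not.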
Handling Failures
- Follower Failure: The leader continues operating and keeps retrying AppendEntries RPCs to the failed follower, which catches up once it recovers.
- Leader Failure: A new leader is elected if the current leader fails. The system remains available if a majority of nodes are operational.
- Network Partitions: Raft ensures only nodes in the partition with a majority can elect a leader, maintaining consistency.
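The majority rule behind these failure cases comes down to simple arithmetic, sketched below. The function names are hypothetical and chosen for this example only.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can fail while a majority is still operational."""
    return cluster_size - majority(cluster_size)


def can_elect_leader(partition_size: int, cluster_size: int) -> bool:
    """A partition can elect a leader only if it holds a majority of the cluster."""
    return partition_size >= majority(cluster_size)
```

A five-node cluster needs three nodes for a majority and so tolerates two failures. A four-node cluster also needs three, so it tolerates only one, and an even two-two split leaves neither side able to elect a leader. This is one reason clusters are usually sized with an odd number of nodes.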
Conclusion
Raft provides a clear and robust framework for distributed consensus. By separating the problem into leader election, log replication, and safety, it is easier to understand and implement than alternatives such as Paxos, while still tolerating the failure of a minority of nodes.
For more detailed information, refer to the original Raft paper, "In Search of an Understandable Consensus Algorithm" by Diego Ongaro and John Ousterhout, which offers in-depth explanations and formal definitions.