"Raft - The Understandable Distributed Protocol" by Ben Johnson (2013)
Science & Technology
For the last decade, Paxos has been the de facto standard in distributed protocols. Unfortunately, Paxos is difficult to understand and even harder to implement. The implementors of Google's distributed lock system, "Chubby", even stated that there were many gaps in Paxos when it came to real-world implementation.
Recently, a new distributed protocol called Raft has come out of research at Stanford. Raft is built for real-world applications, and a primary concern in the development of the protocol was understandability. This talk will walk you through the Raft protocol and how it works.
Ben Johnson
Skyland Labs, LLC
@benbjohnson
Ten years of software development experience working in database architecture, distributed systems and data visualization. Lead developer of the Sky behavioral database project (skydb.io/) and lead developer of the Go implementation of the Raft protocol (github.com/benbjohnson/go-raft).
Recorded at Strange Loop conference (thestrangeloop.com) in St. Louis, MO, Oct 2013.
Comments: 8
Having the questions included really helps with understanding the concept.
Excellent presentation, the explanation of Raft was clear and not hard to follow. Also good questions!
26:59 was that Martin Kleppmann speaking?
@arno.claude
1 year ago
Judging from his voice, yes.
What happens if a network partition occurs after the leader has received acknowledgements from the followers and committed the entry in its own store, but before it can send the confirmation telling the followers to commit?
@WillCodeForViews
2 years ago
I have the same question! If you find out, please let me know.
@ssujeen3034
2 years ago
For simplicity, let's assume the leader is unable to communicate with any other node because of the network partition. The election timeout then fires on one of the followers, because the leader can no longer send heartbeat messages. The constraints on elections guarantee that if a new leader is elected, it must be one of the nodes that hold the entries the old leader committed in the previous term. Those entries are not yet marked as committed on the new leader, because the RPC message the old leader sent with the updated commit index was lost in the partition. Another constraint in Raft is that a leader can't commit entries from older terms until it commits an entry from its current term. Once it commits an entry from the current term, the previous entries can also be safely considered committed because of the log matching property. Hope this helps.
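To make the two constraints above concrete, here is a minimal Python sketch. It is not taken from go-raft or any other implementation; the names `Entry`, `candidate_is_up_to_date`, and `advance_commit_index` are hypothetical, log indices are assumed to be 1-based, and the cluster is assumed to have three nodes. The first function is the voting check that keeps a node with a stale log from winning an election. The second is the leader's commit rule: it only counts replicas for entries from its own term, and committing such an entry implicitly commits everything before it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    term: int      # term in which the entry was created
    command: str   # opaque client command (illustrative only)


def candidate_is_up_to_date(cand_last_term: int, cand_last_index: int,
                            my_last_term: int, my_last_index: int) -> bool:
    """Election restriction: a voter grants its vote only if the
    candidate's log is at least as up-to-date as its own."""
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term
    return cand_last_index >= my_last_index


def advance_commit_index(log: List[Entry], match_index: List[int],
                         commit_index: int, current_term: int,
                         cluster_size: int) -> int:
    """Leader's commit rule. match_index holds, for each follower, the
    highest log index known to be replicated there. Returns the new
    commit index (unchanged if nothing can be committed yet)."""
    for n in range(len(log), commit_index, -1):
        if log[n - 1].term != current_term:
            # Entries from older terms are never committed by counting
            # replicas; terms only grow along the log, so stop here.
            break
        replicas = 1 + sum(1 for m in match_index if m >= n)  # 1 = leader
        if replicas * 2 > cluster_size:
            return n  # everything up to n is now committed
    return commit_index


# Scenario from the question: the old leader committed two term-1 entries,
# then was cut off. The new leader (term 2) holds both entries but never
# learned that they were committed.
log = [Entry(1, "x=1"), Entry(1, "y=2")]
before = advance_commit_index(log, match_index=[2, 0], commit_index=0,
                              current_term=2, cluster_size=3)

# The new leader appends an entry in its own term and replicates it to
# one follower, which gives it a majority of 2 out of 3.
log.append(Entry(2, "no-op"))
after = advance_commit_index(log, match_index=[3, 0], commit_index=0,
                             current_term=2, cluster_size=3)
print(before, after)
```

In this sketch the commit index stays at 0 while the new leader only holds term-1 entries, and jumps to 3 once its own term-2 entry reaches a majority, which is the behaviour described above.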
How can gossip work in integration with it?