distributed system - How do raft nodes learn about peers?

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I just finished the Raft paper and am beginning work on an implementation. However, I realized I am a bit confused about one crucial detail: how do Raft nodes “know” about their peers? I didn’t see any mention of this in the paper, so I assume it is implementation-specific, but it leads to a number of questions:

Is the size of a Raft cluster static? Since each node must know about every other peer (in order to send RPCs), how would a new node join an existing cluster? How would the existing nodes learn about this new node?
Must each node’s network location be hardcoded into every other node at initialization? How does a node know where to send its RPCs?

I would greatly appreciate some help with this. I am really interested in understanding Raft completely and am excited to implement it, but I am a bit lost on this part of the system architecture. It doesn’t seem right to me that the nodes should be statically configured with hardcoded network locations, since in the real world I could definitely envision needing to add a new node to an existing cluster. Thanks!

question from:https://stackoverflow.com/questions/65914261/how-do-raft-nodes-learn-about-peers

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:10:27+0000

Membership change is a core component of the Raft protocol (specified in Section 6 of the extended paper and discussed at length in Diego Ongaro’s dissertation). But you bring up some good questions. In practice, there are some requirements for safe configuration, and a few different approaches are common in real-world Raft implementations.

Generally, there are a couple of ways to bootstrap a Raft cluster: initialize the nodes with a configuration identifying each member of the cluster, or start the cluster with a single node and add nodes to the configuration (using the membership change protocol) to scale the cluster up to its intended size. Both give you the same end result; it’s just a matter of preference.
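To make the two bootstrap styles concrete, here is a minimal sketch. The `Node` class, the node IDs, and the addresses are all hypothetical, and the incremental path applies each change directly instead of replicating it through the log as a real membership change would.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for one server's view of the cluster."""
    node_id: str
    # Maps node ID -> network address; this is the "cluster configuration".
    peers: dict = field(default_factory=dict)


def bootstrap_static(config):
    """Option 1: every node starts with the same full member list."""
    return [Node(node_id, dict(config)) for node_id in config]


def bootstrap_incremental(config):
    """Option 2: start with one node, then add the others one at a time.

    A real implementation would commit each new configuration through the
    Raft log; here it is simply copied to every node (an assumption made
    to keep the sketch short).
    """
    ids = list(config)
    first = ids[0]
    nodes = [Node(first, {first: config[first]})]
    for new_id in ids[1:]:
        new_config = dict(nodes[0].peers)
        new_config[new_id] = config[new_id]
        nodes.append(Node(new_id))
        for node in nodes:
            node.peers = dict(new_config)
    return nodes


# Hypothetical addresses, for illustration only.
CONFIG = {
    "n1": "10.0.0.1:7000",
    "n2": "10.0.0.2:7000",
    "n3": "10.0.0.3:7000",
}

static_nodes = bootstrap_static(CONFIG)
incremental_nodes = bootstrap_incremental(CONFIG)
```

Either way, every node ends up holding the same ID-to-address map, which is what it consults when deciding where to send RPCs.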

One requirement for the cluster configuration is that each member has a fixed identity. If a follower acknowledges that it persisted entries up to some index i and the leader marks that index committed, the leader should be able to assume entries 1 through i will exist on that follower in perpetuity, even if the follower restarts. So, the replica with that identity must always have that log.
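One way to tie an identity to a log, sketched under the assumption that each replica keeps its log in a dedicated data directory: store the ID in that same directory, so the two can only be lost or kept together. The function name and the `node_id` file name are hypothetical.

```python
import os
import uuid


def load_or_create_node_id(data_dir):
    """Return this replica's permanent ID, creating it on first start.

    The ID lives next to the (hypothetical) log files in data_dir, so a
    restart with the same directory yields the same identity, and a wiped
    directory yields a brand-new one.
    """
    os.makedirs(data_dir, exist_ok=True)
    path = os.path.join(data_dir, "node_id")
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return f.read().strip()
    node_id = uuid.uuid4().hex
    # Write to a temp file and rename so a crash cannot leave a partial ID.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(node_id)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)
    return node_id
```

A replica that loses its data directory comes back with a different ID, which signals that it should join as a new member instead of reusing the old identity with an empty log.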

But this requirement brings us to another use case for membership changes: replacing failed members. If that follower’s log gets corrupted or the host crashes and never returns, it should only be replaced by executing the membership change protocol: adding a new replica and removing the old one. Again, it’s important that one of the membership change protocols discussed in the Raft literature be used.
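The add-then-remove replacement can be sketched as two single-member changes. This is only an illustration: `replace_member` is a hypothetical helper, and it ignores log replication and catching the new replica up.

```python
def majority(n):
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1


def replace_member(members, failed, replacement):
    """Return the configurations the cluster passes through, in order.

    Each step changes membership by exactly one node: first the
    replacement is added, then the failed member is removed. Each entry
    is (configuration, quorum size for that configuration).
    """
    steps = []
    current = set(members) | {replacement}
    steps.append((frozenset(current), majority(len(current))))
    current = current - {failed}
    steps.append((frozenset(current), majority(len(current))))
    return steps


steps = replace_member({"n1", "n2", "n3"}, failed="n3", replacement="n4")
```

Note that in the intermediate four-node configuration the quorum is three, so with `n3` down, all three live nodes (including the new one) must acknowledge entries until the removal is committed.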

Keep in mind that changing the number of nodes in the cluster can change the quorum size as well, and this is what makes membership changes difficult to handle. While the quorum size is changing, the protocol needs to ensure that committed entries are still stored on a majority of nodes. Resizing the quorum safely, without disruptions, requires implementing the membership change protocol precisely.
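As a rough illustration of the quorum arithmetic, the sketch below checks commits the way the joint consensus approach from the Raft paper does: while a change is in flight, an entry needs a majority of both the old and the new configuration. The node names and function names are hypothetical.

```python
def majority(n):
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1


def has_majority(acks, members):
    """True if the acknowledging nodes include a majority of members."""
    return len(set(acks) & set(members)) >= majority(len(members))


def committed_during_joint_consensus(acks, old_members, new_members):
    """During the transition, require a majority of BOTH configurations."""
    return has_majority(acks, old_members) and has_majority(acks, new_members)


OLD = {"n1", "n2", "n3"}                 # quorum of 2
NEW = {"n1", "n2", "n3", "n4", "n5"}     # quorum of 3

# A majority of OLD alone, or of NEW alone, is not enough mid-change.
only_old = committed_during_joint_consensus({"n1", "n2"}, OLD, NEW)
only_new = committed_during_joint_consensus({"n3", "n4", "n5"}, OLD, NEW)
both = committed_during_joint_consensus({"n1", "n2", "n4"}, OLD, NEW)
```

Requiring both majorities prevents two disjoint groups, one counting under the old size and one under the new, from each believing they can commit or elect a leader on their own.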
