[Spread-users] Membership algorithm--continued
Ryan Caudy
caudy at jhu.edu
Fri Apr 16 00:11:09 EDT 2004
Hi,
My responses are inlined below. Hope this helps.
Cheers,
Ryan
kevin Tian wrote:
> 1.
> In function Form_or_fail, which case can lead to
> Token_alive =1 and F_reps.num_reps= 1? I guess there
> is only one daemon?
I believe that you mean this code:
    if( Smallest_rep( &F_reps, &dummy ) == My.id )
    {
        if( Token_alive && F_reps.num_reps == 1 )
        {
            /* clear everything and go back to op */
This corresponds to the case where a ring representative called
Lookup_new_members(), found that there were members not in its ring,
shifted to GATHER to try to find them, and then timed out without
finding any other representatives to try to merge with. Thus, this case
is simply a failed attempt to increase the size of the ring from a
currently healthy membership.
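
To make the branch concrete, here is a minimal sketch of the decision the
smallest representative faces when its gather timeout fires. It is not the
Spread source: the enum, the function name, and the parameters are
hypothetical stand-ins for Token_alive, F_reps.num_reps, and the
Smallest_rep() comparison, and the "send form1" outcome is taken from the
reply to question 2 below rather than from the quoted snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome labels; these names are not from the Spread source. */
typedef enum {
    OUTCOME_BACK_TO_OP,   /* healthy ring, nobody new found: clear state, resume OP */
    OUTCOME_SEND_FORM1,   /* smallest rep starts the form rounds (assumed from the thread) */
    OUTCOME_NOT_SMALLEST  /* another rep is smaller; this branch does not apply */
} gather_outcome;

/* Sketch of the smallest-rep branch only. */
gather_outcome smallest_rep_outcome(int32_t my_id, int32_t smallest_rep_id,
                                    bool token_alive, int num_reps)
{
    assert(num_reps >= 1);  /* a daemon always knows at least its own rep */

    if (smallest_rep_id != my_id)
        return OUTCOME_NOT_SMALLEST;

    /* Token still circulating and no other representative was found:
       the attempt to grow the ring failed, but the old ring is fine. */
    if (token_alive && num_reps == 1)
        return OUTCOME_BACK_TO_OP;

    return OUTCOME_SEND_FORM1;
}
```

For example, smallest_rep_outcome(5, 5, true, 1) yields OUTCOME_BACK_TO_OP,
which is the "failed attempt to grow a healthy ring" case described above.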
> 2.
> If daemon is not the smallest rep and token is not
> alive, why we conclude the gather process fails and
> then restart to gather?
Ideally, the smallest rep will time out first and send a form1 token
(because everyone else pushes their timeout back when they notice a new
smallest rep). If this isn't the case, then the other reps can only
conclude that gather failed, and try again. Of course, if there's just
a time delay here, this doesn't necessarily hurt the membership...
they'll handle the form1 if it comes.
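
The timeout ordering can be sketched roughly as follows. This is only an
illustration under stated assumptions: gather_state, EXTEND_MS, and the
function names are hypothetical, the amount by which a deadline is pushed
back is made up, and the sketch covers only the token-not-alive case from
the question.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed push-back amount; not a real Spread setting. */
enum { EXTEND_MS = 500 };

typedef enum { TIMEOUT_SEND_FORM1, TIMEOUT_RETRY_GATHER } timeout_action;

/* Hypothetical per-daemon gather state. */
typedef struct {
    int32_t my_id;
    int32_t smallest_rep;  /* smallest representative id seen so far */
    long    deadline_ms;   /* when this daemon's gather timeout fires */
} gather_state;

void gather_init(gather_state *g, int32_t my_id, long now_ms, long timeout_ms)
{
    assert(timeout_ms > 0);
    g->my_id = my_id;
    g->smallest_rep = my_id;  /* until told otherwise, I am the smallest */
    g->deadline_ms = now_ms + timeout_ms;
}

/* Called when a representative is heard from during gather. A daemon
   that notices a new smallest rep pushes its own timeout back. */
void gather_saw_rep(gather_state *g, int32_t rep_id)
{
    if (rep_id < g->smallest_rep) {
        g->smallest_rep = rep_id;
        g->deadline_ms += EXTEND_MS;
    }
}

/* Called when the gather timeout fires and the token is not alive. */
timeout_action on_gather_timeout(const gather_state *g)
{
    return (g->smallest_rep == g->my_id) ? TIMEOUT_SEND_FORM1
                                         : TIMEOUT_RETRY_GATHER;
}
```

With two daemons, ids 1 and 2, started at the same time, daemon 2 pushes
its deadline back once it hears of daemon 1, so daemon 1 normally fires
first and sends form1. Daemon 2 only reaches TIMEOUT_RETRY_GATHER if its
own, later deadline expires before that form1 arrives.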
> 3.
> In addition, I think the implementation do not check
> whether agreement is reached before creating and
> circulating the form token, thus violating the
> algorithm description.
I'm not sure exactly what you mean here. Deterministic distributed
agreement is impossible in a fully asynchronous environment where daemons
can fail. Could you explain what you
mean by violating the algorithm description? Basically, whenever a
daemon gets the form token, it makes sure it can go forward with the
membership (doing basic validity checking), and appends information to
it (if it was a representative). The end result of the two form rounds
is simply that the daemons have a circulating token, and know what the
membership will be if all goes well, as well as the information needed
to start the EVS stage (exchanging messages to be able to meet the needs
of the EVS specification). If something fails, a new attempt will be
made. At the end of EVS, daemons know that all of the other daemons
have the appropriate messages, and will deliver them or crash.
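
The per-daemon handling of a form token described above might be sketched
like this. Everything here is hypothetical: the form_token layout, the
"am I listed in the proposed membership" validity check, and the function
names are illustrative assumptions, not Spread's actual structures or
checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { MAX_MEMBERS = 8 };

/* Hypothetical form token: proposed membership plus info appended by reps. */
typedef struct {
    int32_t members[MAX_MEMBERS];   /* proposed membership */
    int     num_members;
    int32_t rep_info[MAX_MEMBERS];  /* ids of reps that appended their info */
    int     num_rep_info;
} form_token;

static bool token_lists(const form_token *t, int32_t id)
{
    for (int i = 0; i < t->num_members; i++)
        if (t->members[i] == id)
            return true;
    return false;
}

/* Returns true if the token should be passed on, false if this daemon
   cannot go forward with the membership (a new attempt would follow). */
bool handle_form_token(form_token *t, int32_t my_id, bool i_am_rep)
{
    assert(t->num_members >= 0 && t->num_members <= MAX_MEMBERS);

    /* Basic validity check (assumed): I must be part of the proposal. */
    if (!token_lists(t, my_id))
        return false;

    /* Only representatives append their information. */
    if (i_am_rep) {
        if (t->num_rep_info >= MAX_MEMBERS)
            return false;
        t->rep_info[t->num_rep_info++] = my_id;
    }
    return true;
}
```

Note that nothing in this sketch establishes agreement. A daemon only
decides locally whether it can go forward, and a failure anywhere simply
leads to another attempt.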
> 4.
> And it seems that there is no business of regular
> group members---the actual clients do NOT participate
> in the membership algorithm. How do they get
> membership information then? E.g., Who are the clients
> of my group? Is it handled in group.c?
> I look through the code, but it is too much for me.
Correct, clients don't participate in the membership algorithm at all.
The group membership algorithm, which includes state exchange (when
necessary), building and maintaining the list of groups, and client
notifications (i.e. transitional and regular membership messages), is
all implemented in groups.c. If you have more questions about
this code, I can certainly answer them.
--
Ryan W. Caudy
Center for Networking and Distributed Systems
Department of Computer Science
Johns Hopkins University