[Spread-users] Adding info to Transitional EVS messages

Yair Amir yairamir at cnds.jhu.edu
Sun Aug 28 21:06:25 EDT 2005


Hi,

Good discussion.

To add a bit - even algorithm we developed outside of the open source
version will not achieve this property. The reason is we always
maintain the FIFO property of not acking a message higher than a message
sent by the same guy that we did not receive. The advanced protocol
described in the DSN 2000 paper is such a protocol.

Theoretically you could write a protocol that achieves this, which can
be done in two ways:
1. Separating the control of different groups - this is similar to just
    instantiating a new Spread network.
2. Acknowledging messages explicitly, which is generally very wasteful
    as it does not use aggregation and implicit acks.

I am of the opinion that if you really care about it, separate the
networks. Otherwise, perhaps you need a protocol tailored to a specific
need that can allow an inefficiency for the sake of a more accurate
signaling in edge cases.

Cheers,

	:) Yair.

John Lane Schultz wrote:
> Nilo Rivera wrote:
> 
>> Consider the following scenario:
>>
>> 1. Clients X and Y (Cx and Cy) are members of groups A and B, each 
>> running on its own daemon.
>> 2. Cx sends a safe message Ma to group A, followed by another safe 
>> message Mb to group B.
>> 3. Cy receives Mb, followed by a partition which leaves him alone. 4. 
>> Cy established "I know that everybody knows" for Mb before the 
>> partition. (may not be the case in Spread)
>> 5. Cy delivers Mb in a transitional membership because it does not 
>> have Ma..
>>
> 
> Within a heavy-weight daemon membership/view, a control token is 
> circulated to assign sequence numbers to messages sent within that view. 
>  This token/sequence assignment establishes a total order on the 
> messages sent within that view.  Similiarly, safe messages are also 
> noted on this token through a single ARU (Acknowledge Receipt & 
> Understanding) counter.  This ARU basically notes the minimum message 
> sequence number up through which all daemons in the view have received ...
> 
>> First, can (4) above happen in Spread?....If not, is it implementation 
>> dependent (a vector or fined-grained-token could do it)?.
> 
> 
> As such, I don't believe that (4) can be achieved in Spread as the ARU 
> will still be stuck on trying to make message Ma safe and Mb is later in 
> the total order (needs a higher ARU than Ma).  This mechanism is 
> implementation dependent as an all-ACK (inefficient) or similiar 
> algorithm could locally achieve such knowledge for Mb but not Ma, which 
> I don't believe can occur in Spread.
> 
>> Here is the reason: If group A and B are independent state machines, 
>> then group B gets penalized for a message unrelated to his state 
>> (cannot apply Mb). However, Mb could have been safely applied if we 
>> know that is in order and everyone in the group received it.  We have 
>> such a case where an application is a member of hundreds of groups 
>> with a few independent state machines.   I guess each state machine 
>> could use a different spread network to solve the above, but is not 
>> possible as some events are sent to multiple state machines.
> 
> 
> The different Spread process groups are "simulated" through the use of a 
> single daemon "group."  Because we do not run independent state machines 
> per group but rather 1 per daemon membership, Spread can efficiently run 
> with thousands of process groups running inside it.  It also allows 
> Spread to easily make guarantees across groups and to give well-defined 
> meaning to open group semantics.
> 
> It is true that in edge cases like the one you describe a different 
> algorithm could be slightly more optimal in some senses, but that 
> optimization would come at a too heavy performance and capability cost.
> 
> Cheers!
> John
> 
> -- 
> John Schultz
> Spread Concepts LLC
> 
> 
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
> 
> 





More information about the Spread-users mailing list