[Spread-users] Re: Understanding SAFE messages (Dave Viner)

Fri Apr 4 02:26:19 EST 2003

On Thu, 03 Apr 2003 14:51:02 -0500, <spread-users-request at lists.spread.org> 
wrote:

Hi Dave,

as far as I can tell, your understanding of SAFE messages is not correct.

The idea is, very informally, this: if a group member p receives a SAFE
message m, then p can conclude that each group member (more precisely,
each member of the view in which p receives m) will receive m unless it
crashes first. So, if p receives m when the membership of its view v is
p,q,r,s,t, then p can conclude, "ok, m is safe, so each of q,r,s,t will
also receive m in v, unless it crashes before receiving m".
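
To make the SAFE guarantee concrete, here is a toy model in Python. It is
not Spread code and does not use the Spread API: the function name, the
set-based representation of a view, and the explicit "crashed" set are
all assumptions made purely for illustration.

```python
def safe_receivers(view, receiver, crashed_before_delivery):
    """Toy model (hypothetical, not part of Spread).

    Given that `receiver` got a SAFE message m in regular view `view`,
    return the members that p may conclude will receive m: every member
    of the view except those that crash before receiving m.
    """
    if receiver not in view:
        raise ValueError("receiver must be a member of its own view")
    return {member for member in view if member not in crashed_before_delivery}


# p receives m in view v = {p, q, r, s, t}; suppose s crashes first.
view_v = {"p", "q", "r", "s", "t"}
print(sorted(safe_receivers(view_v, "p", {"s"})))
```

In a real run p does not know in advance who will crash; the point is
only that the set of possible non-receivers is limited to crashed members.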

With a RELIABLE message, p does not have this guarantee. All that p can
tell is this: upon receiving the next view w following v, p can conclude
"ok, all members of w that were also members of v have received in v the
same messages that I have received, e.g., m" (actually things are subtler
than this, but this is the basic idea). Note that when p receives m, p
CANNOT tell anything about who else will receive m. In the extreme case,
p might even be the ONLY group member that receives m.
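
The same kind of toy model can show how little p knows about a RELIABLE
message. Again, this is a sketch under stated assumptions and not Spread
code: the function name and the use of Python sets for views v and w are
hypothetical, and it ignores the subtleties mentioned above.

```python
def reliable_known_receivers(receiver, view_v, next_view_w=None):
    """Toy model (hypothetical, not part of Spread).

    Return the members that p can be sure received RELIABLE message m.
    Before the next view w arrives, p can only vouch for itself. Once w
    is installed, p can also vouch for members present in both v and w.
    """
    if next_view_w is None:
        return {receiver}
    return (view_v & next_view_w) | {receiver}


view_v = {"p", "q", "r", "s", "t"}
# At the moment m is received, p knows nothing about the others.
print(sorted(reliable_known_receivers("p", view_v)))
# After the next view w = {p, q, u} is installed, p also knows about q.
print(sorted(reliable_known_receivers("p", view_v, {"p", "q", "u"})))
```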

While the guarantees of SAFE messages are indeed strong, useful and
relatively easy to understand, enforcing them in practice is not always
possible. To make a long story short, the problem is that the
communication infrastructure might find itself in a situation in which it
cannot deliver a message as SAFE. This can happen only as a result of
certain failures during the protocol for delivering a message. So, what
might happen in practice is this: p receives a transitional view and then
receives m; in this case p can conclude something much weaker than before:
"ok, m is safe, so each of the members of my transitional view will also
receive m, unless it crashes first; but I CANNOT tell anything regarding
members of the regular view that are not in my transitional view; they MAY
OR MAY NOT have delivered the message". Reasoning about messages received
in a transitional view is quite complex.
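
As a rough illustration of the transitional case, the sketch below splits
the regular view into members p can still reason about and members whose
state is unknown. It is a toy model, not Spread code: the function name is
hypothetical, and applying it to the four programs from the question below
(with Prog1 and Prog2 in the transitional view after SpD2 dies) is an
assumption made for the example.

```python
def classify_after_transition(regular_view, transitional_view):
    """Toy model (hypothetical, not part of Spread).

    For a SAFE message received after a transitional view, return a pair:
    - members that will receive m unless they crash first
    - members about which p cannot conclude anything
    """
    if not transitional_view <= regular_view:
        raise ValueError("transitional view must be a subset of the regular view")
    will_receive_unless_crashed = set(transitional_view)
    unknown = regular_view - transitional_view
    return will_receive_unless_crashed, unknown


regular = {"Prog1", "Prog2", "Prog3", "Prog4"}
transitional = {"Prog1", "Prog2"}  # assumed: what Prog1 sees once SpD2 is gone
known, unknown = classify_after_transition(regular, transitional)
print(sorted(known))    # members covered by the weaker guarantee
print(sorted(unknown))  # members that may or may not have delivered m
```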

Hope this helps.

>
> From: "Dave Viner" <dviner at yahoo-inc.com>
> To: <spread-users at lists.spread.org>
> Date: Thu, 3 Apr 2003 11:02:33 -0800
> Subject: [Spread-users] Understanding SAFE messages
>
> Hi,
> I'm having some difficulty understanding SAFE messages and how they
> differ from RELIABLE messages.  Let's assume that I have 2 spread daemons
> on different machines (so, 2 ips listed in a single Spread_Segment), and
> each daemon has 2 client programs (or subscribers, or recipients) all of
> whom are in the same "group".  So the picture looks something like:
>
> Prog1 --                   -- Prog3
>         |-SpD1 --- SpD2 -|
> Prog2 --                   -- Prog4
>
> Then Prog1 sends message (M1) to SpD1.  Then SpD2 dies for some reason.
> My understanding of SAFE versus RELIABLE is that if the M1 message is
> SAFE, then SpD1 will broadcast a transitional membership
> (TRANSITION_MESS) to Prog1 and Prog2, and then send M1 message to Prog2.
> On the other hand, if the M1 message is RELIABLE, SpD1 will simply
> broadcast the M1 message to Prog2.  Is that accurate?  If so, that
> implies that Prog3 and Prog4 would never see the M1 message, even if it
> were SAFE, and SpD2 had, say, some sort of hardware hiccup (possibly
> non-fatal, short-lived).  Are there any facilities which allow the system
> to keep track of what SAFE messages SpD2 missed while it was away?

-- 

     Alberto

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Alberto Bartoli
Associate Professor / Profesor Titular
Faculty of Engineering, University of Trieste
http://www.univ.trieste.it/bartolia
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%