[Spread-users] Confirmation for Reliable Delivery

Fri Aug 8 19:21:29 EDT 2008

Hi John,

I see. Thanks a lot for the reply.  So

> PS - When SP_multicast returns success, that only means that your message
> was successfully queued (and possibly sent) to go to your Spread daemon and
> nothing more.

Does this mean that even if a client is partitioned from the Spread daemon
that it is connected to, the multicast might still return a success?!

Thanks a lot;
Vina

On Fri, Aug 8, 2008 at 3:53 PM, John Lane Schultz <
jschultz at spreadconcepts.com> wrote:

> Your conception of causality is, I think, a bit different than what Spread
> actually provides.  In particular, there is no strong linkage between a
> client dying and its daemon eventually detecting that fact.  It is entirely
> possible that a client sends a message to its daemon, then one of the
> intended recipients crashes while Spread is oblivious to that fact for some
> period of time and continues trying to deliver messages to the recipient
> believing it to be alive.  There is no cheap way around this problem as
> there is no common knowledge in distributed systems.
>
> This points to one issue with your design: one of the "servers" can crash
> while the others still assume it is handling its modulo spot in the group,
> when in fact it is not.  Eventually, the "server's" failure will be detected
> by Spread and a membership change will be issued, but in the meantime those
> requests assigned to it by your load balancing algorithm would be lost.
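
A minimal sketch of that failure mode in plain C, with no Spread calls. The
names owner_index and routed_to_dead_member are hypothetical, and the "alive"
array stands in for ground truth that no real process can observe; the point
is only that the modulo function runs over the last delivered view, which can
still list a crashed member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the server responsible for a request,
 * computed over the member count of the most recently delivered view. */
size_t owner_index(unsigned long request_id, size_t num_members)
{
    assert(num_members > 0);
    return (size_t)(request_id % num_members);
}

/* Returns 1 if the request maps to a member that has crashed but is still
 * listed in the view (alive[i] == 0), i.e. the request would be lost until
 * the membership change arrives. Returns 0 otherwise. */
int routed_to_dead_member(unsigned long request_id,
                          const int *alive, size_t num_members)
{
    return !alive[owner_index(request_id, num_members)];
}
```

With a three-member view in which the member at index 1 has crashed, request
4 maps to index 1 and is dropped, while request 3 maps to index 0 and is
handled.
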
>
> Another way requests could be lost in your system is through network
> partitions.  If your client's daemon was partitioned away from all of the
> alive "servers'" daemons, then its requests would fall on deaf ears (i.e. -
> an empty group).
>
> The RELIABLE message service essentially means that your daemon will ensure
> that any other daemons (and their alive clients) to which it remains
> connected will get your message (i.e. - intermittent network losses will be
> overcome).  If your daemon disconnects from other daemons (i.e. -
> CAUSED_BY_NETWORK membership change), then your message may or may not get
> to those daemons (and their clients) depending on the exact chain of events.
>
> The only 100% certain way for your client to know its message was received
> and processed is for it to receive an explicit client level ACK from a
> recipient indicating exactly that.
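
A minimal sketch of such a client-level ACK loop, in plain C with the
transport abstracted away. Everything here is an assumption for illustration:
send_and_wait_fn, send_until_acked, and lossy_transport are hypothetical
names, not part of the Spread API. In a real client the callback would
presumably wrap SP_multicast plus a timed wait for the recipient's reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport callback: sends the request and waits (with some
 * timeout) for an application-level ACK. Returns true only if an ACK came. */
typedef bool (*send_and_wait_fn)(unsigned long request_id, void *ctx);

/* Resend until a recipient explicitly acknowledges the request.
 * Returns the number of attempts used, or 0 if no ACK was ever received. */
int send_until_acked(send_and_wait_fn send_and_wait, void *ctx,
                     unsigned long request_id, int max_attempts)
{
    assert(send_and_wait != NULL && max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (send_and_wait(request_id, ctx))
            return attempt;
    }
    return 0;
}

/* Stand-in transport for demonstration: ctx points to an int holding how
 * many attempts to drop before acknowledging. */
bool lossy_transport(unsigned long request_id, void *ctx)
{
    int *drops_left = ctx;
    (void)request_id;
    if (*drops_left > 0) {
        (*drops_left)--;
        return false;
    }
    return true;
}
```

Since a request may be processed and only its ACK lost, a retry can deliver
the same request twice, so the servers would likely need to de-duplicate on
request_id.
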
>
> PS - When SP_multicast returns success, that only means that your message
> was successfully queued (and possibly sent) to go to your Spread daemon and
> nothing more.
>
> Cheers!
> John
>
> ---
> John Lane Schultz
> Spread Concepts LLC
> Phn: 443 838 2200
> Fax: 301 560 8875
>
> Friday, August 8, 2008, 6:01:57 PM, you wrote:
>
> > Hi,
>
> > I am new to spread, and I am trying to understand how the reliable
> > delivery works. Here is what I want to do: A client sends a message
> > to a group of servers (the client is not a member of the Servers
> > group) and one member in the Servers group will process the request
> > based on an internal load balancing function (mod # servers in the
> > group). If the spread client multicasts an AGREED message to the
> > Servers group, the multicast will return with a success as soon as
> > the message is received by one spread daemon.  Then the spread
> > daemon will multicast the message to all the members of the group.
> > Since AGREED service type follows causality, if any member in the
> > Servers group dies after client sends the message, and before Spread
> > delivers the message to all group members, the alive servers will
> > receive the membership change before receiving the client's message,
> > and can rearrange the load balancing function to handle all the
> > requests. The only time that a client's request might be lost is
> > when all members of the Servers group die, in which case client's
> > multicast will still return with success, while no server has
> > received it. So can I assume that as long as at least one server is
> > alive, the client can be sure 100%  that all alive servers will
> > always receive a client's request? or am I missing some edge cases
> > where the message might still get lost somewhere in the queues?
>
> > The same question for multicasting to private groups. If a client
> > wants to talk to another client, using a central Spread daemon, as
> > soon as the daemon receives the message, multicast will return
> > success for the sender client. the daemon then will send the message
> > to the receiver client reliably. Therefore the only case that the
> > receiver might not receive the message is when it dies before the
> > message is delivered. So can I assume that as long as the receiver
> > client is alive reliable delivery is 100% ? or am I missing some edge
> > cases here?
>
> > I also saw some threads about Spread recognizing the membership
> > change (leave) with delay. does this mean that the alive members of
> > a group might receive the membership message after a regular message
> > that was sent after the actual death of a certain member? doesn't this
> > contradict the causality?
>
> > I would be grateful if someone can explain these cases.
>
> > Thank you;
> > Vina
>
>

-- 
Vina Ermagan

https://sosa.ucsd.edu/people/vermagan/
CSE Department
University of California San Diego