[Spread-users] lost flooder messages

Kelvin Fedrick Kelvin.Fedrick at noaa.gov
Fri May 9 15:36:31 EDT 2003



John Schultz wrote:

> Could it be that Spread sometimes detects that you have left (broken
> pipe) before it actually sends your msgs out to the other daemons and
> therefore discards your msgs? This would explain all of the symptoms. On
> what kind of system are you running? On what kind of system is the
> daemon running? What kind of communication are you using to the daemon
> (TCP/IP remote or UNIX domain socket to local)?

We've experienced this in several different configurations, including:

   - single daemon on Linux w/clients presumably using domain sockets
     (whatever the standard test clients use)
   - single daemon on Linux w/remote client from either the same Linux box,
     a different Linux box, or Windows XP
   - two daemons on different Linux boxes, each with a local client

>
>
> BTW, Spread discarding your msgs like this, though not desirable, is
> allowed by the safety and liveness properties of the system. To guarantee
> that all of your messages are actually sent in the system, the sender
> must stay in the group until it receives back its own messages. You can
> do this by making your flooder both a sender and a receiver.

I'm not sure I understand what you're saying. In Spread, a sender need not even
be a group member to multicast to a group, so why would a sender that happens
to be a member have to stay in the group for delivery to be guaranteed? Also,
the default flooder is both a sender and a receiver, and that is exactly when
the problem occurs. It has not occurred when I use the -wo write-only flag. I
would assume (naively perhaps) that if SP_multicast returns with no error, the
daemon has the message and will deliver it no matter what then happens to the
sender.
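
For illustration only, the suggestion quoted above (keep the sender connected
until it has received back its own messages) could be sketched in C roughly as
follows. The names recv_fn, drain_own_messages, and replay_next are
hypothetical and are not part of Spread; in a real flooder the callback would
presumably wrap SP_receive and compare the returned sender with the private
group name obtained from SP_connect. The replay receiver is only a stand-in so
the loop can be exercised without a daemon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: fetch the next delivered message and copy the
 * sender's name into `sender`. Returns 0 on success, negative on error.
 * Assumption: a real flooder would implement this on top of SP_receive(). */
typedef int (*recv_fn)(void *ctx, char *sender, size_t sender_len);

/* Keep receiving until `sent` of our own messages have come back, or
 * until the receive callback fails. Returns how many of our own
 * messages were seen. */
static int drain_own_messages(recv_fn next, void *ctx,
                              const char *self, int sent)
{
    char sender[64];
    int seen = 0;

    while (seen < sent) {
        if (next(ctx, sender, sizeof sender) < 0)
            break;                      /* connection lost or other error */
        if (strcmp(sender, self) == 0)
            seen++;                     /* one of our messages was delivered */
    }
    return seen;
}

/* Stand-in receiver (not Spread): replays a fixed list of sender names. */
struct replay {
    const char **senders;
    size_t count;
    size_t next;
};

static int replay_next(void *ctx, char *sender, size_t sender_len)
{
    struct replay *r = ctx;

    assert(sender_len > 0);
    if (r->next >= r->count)
        return -1;                      /* nothing left to deliver */
    strncpy(sender, r->senders[r->next++], sender_len - 1);
    sender[sender_len - 1] = '\0';
    return 0;
}
```

With something like this, the flooder would call drain_own_messages() after
its last SP_multicast and only then disconnect and exit. A return value
smaller than the number sent would mean the connection ended before every
message came back.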

>
> John
>
> Kelvin Fedrick wrote:
>
> > I just tried it and it doesn't seem to help.
> >
> > ./spflooder -m 500 sends 500 messages, each with a sequence number, and
> > spuser only received messages 1 - 498. The number of messages received
> > varies; occasionally it gets them all, but there are usually a few missing
> > (e.g. 4994 of 5000 were received on a run I just made).
> >
> > Kelvin
> >
> >
> > Joshua Goodall wrote:
> >
> >     On Thu, May 08, 2003 at 02:44:37PM -0600, Kelvin Fedrick wrote:
> >      > I saw a few previous posts on this from July 2002, but I never saw a
> >      > definitive resolution. We've experienced the same problem. We
> >      > modified the spflooder to send a message sequence number and find
> >      > that often a small percentage of messages at the end are never
> >      > delivered (e.g. 400 sent but only 398 delivered).
> >      > Placing a 1 second sleep at the end of the flooder program main(),
> >      > just before exit, seems to fix this. Also, I haven't been able to
> >      > reproduce it so far running spflooder as write-only. Any ideas?
> >
> >     Instead of a sleep, does adding a SP_disconnect(Mbox) also fix it
> >     for you?
> >
> >     J
> >
> >     --
> >     Joshua Goodall                                      "tea makes itself"
> >     joshua at roughtrade.net                                       - Ana Susanj
> >
>
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
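
The sequence-number check described in the quoted messages (the flooder tags
each message with a sequence number and the receiver looks for gaps) could be
sketched on the receiving side along these lines. This is not the actual
spflooder or spuser modification, which is not shown in this thread; the
seq_tracker names are hypothetical, and sequence numbers are assumed to run
from 1 to the number of messages sent.

```c
#include <assert.h>
#include <stdlib.h>

/* Records which sequence numbers in 1..expected have arrived. */
struct seq_tracker {
    unsigned char *seen;   /* seen[i] is nonzero once sequence i+1 arrived */
    int expected;          /* how many messages the sender says it sent */
    int received;          /* distinct in-range sequence numbers seen */
};

/* Returns 0 on success, -1 if memory could not be allocated. */
static int seq_tracker_init(struct seq_tracker *t, int expected)
{
    assert(expected > 0);
    t->seen = calloc((size_t)expected, 1);
    t->expected = expected;
    t->received = 0;
    return t->seen ? 0 : -1;
}

/* Note one arrival; duplicates and out-of-range numbers are ignored. */
static void seq_tracker_mark(struct seq_tracker *t, int seq)
{
    if (seq < 1 || seq > t->expected)
        return;
    if (!t->seen[seq - 1]) {
        t->seen[seq - 1] = 1;
        t->received++;
    }
}

/* Returns the lowest missing sequence number, or 0 if none are missing. */
static int seq_tracker_first_missing(const struct seq_tracker *t)
{
    for (int i = 0; i < t->expected; i++)
        if (!t->seen[i])
            return i + 1;
    return 0;
}

static void seq_tracker_free(struct seq_tracker *t)
{
    free(t->seen);
    t->seen = NULL;
}
```

A receiver would call seq_tracker_mark() for each message and, once the sender
has gone, compare received with expected; for a run like the 500-message one
quoted above, seq_tracker_first_missing() would report where the tail was cut
off.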