[Spread-users] lost flooder messages

Ryan Caudy caudy at jhu.edu
Fri May 9 16:09:11 EDT 2003


I just peeked at flooder's source code.  Immediately after it decides 
that it has finished sending, it exit()s, which should close the mbox 
(read: socket) returned by SP_connect().  Is it possible that the 
operating system chooses not to finish sending data that remains in the 
send buffer for a socket when that socket is closed?  I'm not sure of the 
answer to this question; it depends on the semantics of a blocking 
send() call on a TCP socket, and of closing a socket without calling 
close() explicitly.  Waiting a very short amount of time, or making sure 
to receive back all of the messages sent before exit()ing, should solve 
this problem, if I'm correct.
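
A minimal sketch of the second suggestion (receive back everything you sent 
before closing), offered only as an illustration.  It does not use the 
Spread API: a socketpair() and a forked echo process stand in for the mbox 
and the daemon, and the names send_and_drain, echo_peer and WINDOW are made 
up for this example.  The point is only that the sender does not close its 
socket until every sequence number it wrote has come back.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define WINDOW 64 /* max messages in flight; arbitrary value for this sketch */

/* Read exactly len bytes; return 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; return 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical stand-in for the daemon: echo each sequence number back. */
static void echo_peer(int fd)
{
    uint32_t seq;
    while (read_full(fd, &seq, sizeof seq) == 0)
        if (write_full(fd, &seq, sizeof seq) != 0)
            break;
    close(fd);
}

/* Send `count` numbered messages and close the socket only after all of
 * them have been received back.  Returns the number received, or -1 if
 * the socket pair or the echo process could not be created. */
int send_and_drain(int count)
{
    int sv[2];
    pid_t pid;
    uint32_t sent = 0, received = 0, seq;

    assert(count >= 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {          /* child: plays the echoing peer */
        close(sv[0]);
        echo_peer(sv[1]);
        _exit(0);
    }
    close(sv[1]);

    while (received < (uint32_t)count) {
        if (sent < (uint32_t)count && sent - received < WINDOW) {
            seq = sent + 1;  /* sequence numbers start at 1 */
            if (write_full(sv[0], &seq, sizeof seq) != 0)
                break;
            sent++;
        } else {
            /* Window full or everything sent: drain one echoed message. */
            if (read_full(sv[0], &seq, sizeof seq) != 0 || seq != received + 1)
                break;
            received++;
        }
    }

    close(sv[0]);            /* only now is it safe to go away */
    waitpid(pid, NULL, 0);
    return (int)received;
}
```

Calling send_and_drain(500) from main() should return 500 here.  In a real 
flooder the same loop would presumably be built on SP_multicast() and 
SP_receive(), with the process exiting only once its own last sequence 
number has been delivered back to it.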

--Ryan

Kelvin Fedrick wrote:
>  
> 
> John Schultz wrote:
> 
>> Could it be that Spread sometimes detects that you have left (broken
>> pipe) before it actually sends your msgs out to the other daemons and
>> therefore discards your msgs? This would explain all of the symptoms. On
>> what kind of system are you running? On what kind of system is the
>> daemon running? What kind of communication are you using to the daemon
>> (TCP/IP remote or UNIX domain socket to local)?
> 
> 
> We've experienced this in several different configurations, including:
> 
>    - single daemon on Linux w/clients presumably using domain sockets
>      (whatever the standard test clients use)
>    - single daemon on Linux w/remote client from either same Linux box,
>      different Linux box, or Windows XP
>    - two daemons on different Linux boxes each with local client
> 
>>  
>>
>> BTW, Spread discarding your msgs like this, though not desirable, is
>> allowed by the safety and liveness properties of the system. To guarantee
>> that all of your messages are actually sent in the system, the sender
>> must stay in the group until it receives back its own messages. You can
>> do this by making your flooder both a sender and a receiver.
>>
> I'm not sure I understand what you're saying. In Spread, a sender need
> not even be a group member to multicast to it, so why would the sender,
> if it happened to be a member, be required to stay in the group for a
> delivery guarantee? Also, the default flooder is both a sender and a
> receiver, and that is exactly when the problem occurs. It has not
> occurred when I use the -wo write-only flag. I would assume (naively
> perhaps) that if SP_multicast returns with no error, the daemon should
> have the message and deliver it no matter what then happens to the
> sender.
> 
>>
>> John
>>
>> Kelvin Fedrick wrote:
>>
>> > I just tried it and it doesn't seem to help.
>> >
>> > ./spflooder -m 500 sends 500 messages with a sequence number and the
>> > spuser only received messages 1 - 498. The number of messages received
>> > is variable; occasionally it gets them all but there are usually a few
>> > missing (e.g. 4994 of 5000 were received on a run I just made).
>> >
>> > Kelvin
>> >
>> >
>> > Joshua Goodall wrote:
>> >
>> >     On Thu, May 08, 2003 at 02:44:37PM -0600, Kelvin Fedrick wrote:
>> >      > I saw a few previous posts on this from July 2002, but I never
>> >      > saw a definitive resolution. We've experienced the same
>> >      > problem. We modified the spflooder to send a message sequence
>> >      > number and find that often a small percentage of messages at
>> >      > the end are never delivered (e.g. 400 sent but only 398
>> >      > delivered).
>> >      > Placing a 1 second sleep at the end of the flooder program
>> >      > main(), just before exit, seems to fix this. Also, so far I
>> >      > haven't been able to reproduce it running spflooder as
>> >      > write-only. Any ideas?
>> >
>> >     Instead of a sleep, does adding a SP_disconnect(Mbox) also fix it
>> >     for you?
>> >
>> >     J
>> >
>> >     --
>> >     Joshua Goodall                                      "tea makes itself"
>> >     joshua at roughtrade.net                                       - Ana Susanj
>> >
>>