[Spread-users] lost flooder messages

Fri May 9 16:14:46 EDT 2003

Kelvin Fedrick wrote:

>  
> John Schultz wrote:
> 
>     Could it be that Spread sometimes detects that you have left (broken
>     pipe) before it actually sends your msgs out to the other daemons and
>     therefore discards your msgs? This would explain all of the symptoms.
>     On what kind of system are you running? On what kind of system is the
>     daemon running? What kind of communication are you using to the daemon
>     (TCP/IP remote or UNIX domain socket to local)?
> 

Jonathan, Yair, do you have any ideas why such behavior could be 
occurring? Could this be an operating system glitch where it drops msgs 
still in the OS buffers if the sending descriptor is closed "early"?
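
One quick way to check the OS part of this on a given box is a sketch like the one below. It is Python rather than Spread code, and the function name is made up for illustration. It only covers a local stream socket pair where the closing side has no unread incoming data, so it says nothing about the TCP/IP remote case or about a client that closes while messages are still queued for it.

```python
import socket


def bytes_surviving_early_close(payload: bytes) -> bytes:
    """Write payload on one end of a local stream socket pair, close that
    end right away, then read everything that arrives on the other end.

    Hypothetical helper for illustration only. Keep payload small enough
    to fit in the kernel socket buffer, since nothing reads while we send.
    """
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)  # data now sits in kernel buffers
    finally:
        sender.close()  # close "early", before the peer has read anything

    chunks = []
    with receiver:
        while True:
            chunk = receiver.recv(4096)
            if not chunk:  # empty read means end of stream
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

If the returned bytes match what was written, the local OS did not drop buffered data in this simple case, and the loss would have to come from somewhere else.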

> 
> We've experienced this in several different configurations including:
> 
> - single daemon on Linux w/clients presumably using domain sockets 
> (whatever the standard test clients use)
> - single daemon on Linux w/remote client from either same Linux box, 
> different Linux box, or Windows XP
> - two daemons on different Linux boxes each with local client
> 

Could you post your modified flooder to the list with instructions on 
how to run it? This way we can quickly reproduce and diagnose the 
problem you are describing.

>     BTW, Spread discarding your msgs like this, though not desirable, is
>     allowed by the safety and liveness properties of the system. To
>     guarantee that all of your messages are actually sent in the system,
>     the sender must stay in the group until it receives back its own
>     messages. You can do this by making your flooder both a sender and a
>     receiver.
> 
> I'm not sure I understand what you're saying. In Spread, a sender need not
> even be a group member to multicast to it, so why would the sender be
> required to stay in the group, if it happened to be a member, for a delivery
> guarantee? Also, the default flooder is both a sender and a receiver, and
> that is exactly when the problem occurs. It has not occurred when I use the
> -wo write-only flag. I would assume (naively perhaps) that if SP_multicast
> returns with no error, the daemon should have the message and deliver it no
> matter what then happens to the sender.
> 
I was just pointing out that the hard guarantees (the safety and liveness 
properties) of the system technically could allow Spread to drop messages 
when it detects a sender "crash." By those properties, the only thing that 
compels the system to deliver a sender's messages is the Self-Delivery 
property. This property states that the system must eventually deliver a 
sender's msg back to the sender if the sender doesn't crash.
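
To make the Self-Delivery point concrete, here is a rough sketch of a sender that stays around until all of its own messages have come back. It is Python rather than Spread code, and every name in it (send_and_await_self_delivery, LoopbackGroup, the send/receive callables) is hypothetical. With the C API, send would presumably wrap SP_multicast and receive would wrap SP_receive, comparing the sender field against the private group name returned by SP_connect.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Optional, Tuple

Delivery = Tuple[str, bytes]  # (sender name, payload)


def send_and_await_self_delivery(
    my_name: str,
    send: Callable[[bytes], None],
    receive: Callable[[], Optional[Delivery]],
    messages: Iterable[bytes],
    max_receives: int = 10_000,
) -> bool:
    """Send every message, then keep receiving until each one came back.

    Returns True once all of our own messages were delivered back to us,
    False if max_receives receive attempts pass first. Only disconnect
    after this returns True.
    """
    outstanding = 0
    for payload in messages:
        send(payload)
        outstanding += 1

    receives = 0
    while outstanding > 0 and receives < max_receives:
        delivery = receive()
        receives += 1
        # Count only deliveries that we ourselves sent.
        if delivery is not None and delivery[0] == my_name:
            outstanding -= 1
    return outstanding == 0


class LoopbackGroup:
    """In-memory stand-in for a daemon that delivers messages in order."""

    def __init__(self) -> None:
        self._queue: Deque[Delivery] = deque()

    def multicast(self, sender: str, payload: bytes) -> None:
        self._queue.append((sender, payload))

    def receive(self) -> Optional[Delivery]:
        return self._queue.popleft() if self._queue else None
```

The point of the design is that the sender does not close its connection on the strength of a successful send alone. It leaves only after it has seen its own traffic delivered.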

There are no formulated hard guarantees about open group sends (sending to 
groups of which you are not a member). Practically speaking, however, I'm sure 
Spread makes a best effort to deliver them.

John