[Spread-users] SP_kill() calls in SP_receive and multicast calls

Alec H. Peterson alec.peterson at messagesystems.com
Fri Sep 15 10:46:43 EDT 2006


Greetings all,

We use Spread as part of an event-based system, and we perform  
select()-ish behavior on the mbox file descriptor.  We ran into a  
problem recently where our scheduler hits a 'duplicate file  
descriptor' assertion and crashes when Spread is unexpectedly  
disconnected.  We  
tracked this down to the fact that if any of the SP_ calls detect a  
system call failure on the mbox, they call SP_kill() on the mbox  
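before returning.  This creates a serious race condition for  
systems (like ours) that track events based on file descriptor: the  
descriptor is closed, and its number can be reused, before the  
application has had a chance to deregister it.  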
We've modified our local version of Spread to just not call SP_kill()  
in these situations, but we're curious if this behavior could be  
modified in future releases of Spread.  Specifically, some sort of  
session-wide option could be set after connect to preclude the  
closing from happening, which would preserve the prior behavior for  
backward compatibility.  However, I am not sure just removing the  
SP_kill() calls is that bad a thing, as I believe an application is  
supposed to call SP_disconnect() whenever it receives a fatal error  
from one of the SP_ calls...
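
To illustrate the kind of descriptor reuse involved, here is a minimal
sketch that does not use Spread at all.  The function
library_closes_on_error() is a hypothetical stand-in for a library call
that closes a descriptor internally on failure, and registered_fd stands
for whatever number the application's scheduler is still tracking.  It
relies only on the POSIX rule that dup() returns the lowest-numbered
unused descriptor, and it assumes a single-threaded process.

```c
#include <unistd.h>

/* Hypothetical stand-in for a library routine that closes a descriptor
 * internally when it detects an error.  This is NOT a Spread function. */
static void library_closes_on_error(int fd)
{
    close(fd);
}

/* Returns 1 if a descriptor opened after the silent close gets the same
 * number the scheduler still has registered, 0 if it does not, and -1 if
 * the setup itself failed. */
int stale_fd_is_reused(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    /* The number an event scheduler would have on record. */
    int registered_fd = fds[0];

    /* The library closes it behind the application's back. */
    library_closes_on_error(registered_fd);

    /* Any later open/dup/accept in the process may now receive the same
     * number.  dup() returns the lowest unused descriptor, which here is
     * the one that was just closed. */
    int new_fd = dup(fds[1]);
    if (new_fd < 0) {
        close(fds[1]);
        return -1;
    }

    int reused = (new_fd == registered_fd);

    close(new_fd);
    close(fds[1]);
    return reused;
}
```

If stale_fd_is_reused() returns 1, a scheduler that registered the new
descriptor while the old entry was still on record would see the same
number twice, which is the sort of situation a 'duplicate file
descriptor' check is meant to catch.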

Thanks,

Alec




