[Spread-users] SP_kill() calls in SP_receive and multicast calls
Alec H. Peterson
alec.peterson at messagesystems.com
Fri Sep 15 10:46:43 EDT 2006
Greetings all,
We use Spread as part of an event-based system, and we perform
select()-ish polling on the mbox file descriptor. We recently ran
into a problem where our scheduler crashes on a 'duplicate file
descriptor' assertion when Spread is unexpectedly disconnected. We
tracked this down to the fact that if any of the SP_ calls detects a
system call failure on the mbox, it calls SP_kill() on the mbox
before returning. That closes the descriptor while our scheduler
still has it registered, which is a very bad race condition for
systems (like ours) that track events based on file descriptor.
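
The hazard behind this kind of crash can be reproduced without Spread at
all. The sketch below is only an illustration and is not taken from the
Spread sources: the function name fd_number_reused_after_close() is
hypothetical, and a plain pipe stands in for the mbox. It relies on the
POSIX rule that new descriptors get the lowest free numbers, so a number
closed behind the application's back is handed out again by the next call
that creates a descriptor.

```c
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a closed descriptor number is handed out again by the
 * next pipe() call, 0 if it is not, and -1 if pipe() fails.
 */
int fd_number_reused_after_close(void)
{
    int first[2];
    int second[2];
    int stale;
    int reused;

    if (pipe(first) != 0)
        return -1;

    /* Stands in for the mbox number an event scheduler still tracks. */
    stale = first[0];

    /* Stands in for a library closing the connection internally. */
    close(first[0]);
    close(first[1]);

    /* Any later descriptor-creating call gets the lowest free numbers. */
    if (pipe(second) != 0)
        return -1;

    /* The scheduler's stale entry now collides with a live descriptor. */
    reused = (second[0] == stale || second[1] == stale);

    close(second[0]);
    close(second[1]);
    return reused;
}
```

In a single-threaded process with no other descriptor activity this
should return 1. A scheduler keyed on descriptor number would then hold
the stale mbox entry and a new registration under the same number, which
is presumably how a 'duplicate file descriptor' check ends up firing.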
We've modified our local version of Spread to just not call SP_kill()
in these situations, but we're curious if this behavior could be
modified in future releases of Spread. Specifically, some sort of
session-wide option could be set after connect to suppress the
automatic close, leaving the current behavior as the default for
backward compatibility. However, I am not sure just removing the
SP_kill() calls is that bad a thing, as I believe an application is
supposed to call SP_disconnect() whenever it receives a fatal error
from one of the SP_ calls...
Thanks,
Alec