[Spread-users] 1 problem with Spread.pm and 1 problem with spread daemon, was

Tim Peters tim at zope.com
Thu Jan 17 22:59:08 EST 2002


[Tom Mornini]
> ...
> 2) Somewhat more mysteriously, receive() only processes get
> spontaneously disconnected and receive() returns a CONNECTION_CLOSED
> correctly whenever a receiver is hammered continuously from more than 1
> process on the same box. The sending processes DO NOT get disconnected,
> however.
> ...
> A debug ALL output from the Spread daemon itself can be found here:
>
> http://www.mornini.com/spread.log.gz
>
> The offending event seems to be summarized by these lines (prefixed by
> line numbers):
>
> 344821:[Fri 18 Jan 2002 00:52:53] Sess_write: killing mbox 9 for not
> reading
> 347827:[Fri 18 Jan 2002 00:52:53] Sess_kill: killing session r0-9
> ( mailbox 9 )

That pair of lines means the mailbox for the connection with private name
r0-9 has 1000 unread msgs in its queue.  Spread automatically disconnects the
offending client at that point.  The aggregate size of the messages doesn't
matter, and neither does their age; it's solely the message count that
matters.  This isn't documented (AFAIK), but Yair explained it here
recently.  We've been chasing the same thing in our project the past two
days (although we didn't know that's *what* we were chasing until today --
our clients are read/write and multi-threaded, and once that disconnect
happens, it has Obfuscating Effects on other threads also accessing the
suddenly-dead mbox).
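
To make the count-only rule concrete, here is a toy Python model of a
daemon-side session queue.  It is a sketch, not Spread's code: the class
ToyMailbox, its methods, and the choice to kill the session at the moment
the 1000th unread message is queued are all assumptions made for
illustration.  Only the figure of 1000 and the "count, not size or age"
behavior come from the explanation above.

```python
from collections import deque

# Threshold described in this thread; assumed to be a fixed constant here.
MAX_UNREAD = 1000


class ToyMailbox:
    """Toy stand-in for a daemon-side session queue (hypothetical, not Spread's API)."""

    def __init__(self, private_name, max_unread=MAX_UNREAD):
        self.private_name = private_name
        self.max_unread = max_unread
        self.unread = deque()
        self.alive = True

    def deliver(self, msg):
        """Queue one message; kill the session once the unread count reaches the limit.

        Message size and age are deliberately ignored: only the count matters.
        Returns True if the session is still alive afterwards.
        """
        if not self.alive:
            return False
        self.unread.append(msg)
        if len(self.unread) >= self.max_unread:
            # Analogous to "Sess_write: killing mbox N for not reading".
            self.alive = False
            self.unread.clear()
        return self.alive

    def receive(self):
        """Pop the oldest unread message, or raise if the session was killed.

        A real client would block on an empty queue; this toy version simply
        raises IndexError in that case.
        """
        if not self.alive:
            raise ConnectionError("connection closed: %s" % self.private_name)
        return self.unread.popleft()


if __name__ == "__main__":
    payload = b"x" * 100_000  # one large message, reused for every delivery

    big = ToyMailbox("r0-9")
    for _ in range(999):
        big.deliver(payload)
    print("999 large unread msgs, alive:", big.alive)

    small = ToyMailbox("r0-9")
    for _ in range(1000):
        small.deliver(b"")
    print("1000 empty unread msgs, alive:", small.alive)
```

In this model a receiver that drains its queue at least as fast as senders
fill it never approaches the limit, no matter how large the messages are,
while a receiver that falls 1000 messages behind is cut off even if every
message is empty.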

> 347895:[Fri 18 Jan 2002 00:52:53] G_handle_kill: #r0-9#localhost is
> killed
> 347896:[Fri 18 Jan 2002 00:52:53] G_handle_kill in GOP

Sorry, I haven't bumped into those yet; offhand it sure looks like they're
just telling you that Spread is removing r0-9 from the group(s) it's in, due
to disconnection (we've only had SESSION debugging turned on).





