[Spread-users] 1 problem with Spread.pm and 1 problem with spread daemon, was
Tom Mornini
tmornini at infomania.com
Thu Jan 17 21:47:21 EST 2002
On Thursday, January 17, 2002, at 03:00 PM, John David Duncan wrote:
> My spread error condition recurred again today -- under relatively heavy
> traffic, spread seemed to stop. I could not connect with spuser to
> spread on any server, but I could get output from spmonitor. I had to
> kill and restart spread on all three servers in the segment.
We are also continuing to have problems. Changes to our system have,
for the moment, prevented Spread from actually hanging up, so I cannot
provide further spmonitor output.
However, I have done some careful harness testing and have discovered
two interesting situations. Both situations were discovered while using
the Spread.pm module.
1) The Spread.pm Perl module returns a Perl undef from multicast() in
some circumstances. I've seen it when I call multicast() repeatedly
without calling receive() while there are incoming error messages. It
*might* be related to issue #2 below, however.
2) Somewhat more mysteriously, receive()-only processes get
spontaneously disconnected (receive() correctly returns
CONNECTION_CLOSED) whenever a receiver is hammered continuously by more
than one process on the same box. The sending processes DO NOT get
disconnected, however.
What would cause this? Is this a matter of not emptying the queue fast
enough and buffers overflowing? If so, it would seem better to me to
block on multicast() in the senders under this circumstance, subject to
the connect() timeout value, of course.
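To make the question concrete, here is a toy model of what I suspect is
happening. It is only a sketch and not Spread's actual code: the class
ToyDaemon, the max_queued limit, and the run() helper are all made up
for illustration, and the real per-session limit is unknown to me.

```python
from collections import deque


class ToyDaemon:
    """Toy stand-in for a daemon with one bounded queue per receiver mailbox."""

    def __init__(self, max_queued):
        self.max_queued = max_queued   # hypothetical per-session limit
        self.queues = {}               # mailbox id -> pending messages
        self.killed = set()            # mailboxes dropped for not reading

    def connect(self, mbox):
        self.queues[mbox] = deque()

    def multicast(self, mbox, message):
        """Queue a message for one receiver; kill the receiver if its queue is full."""
        if mbox in self.killed:
            return False
        queue = self.queues[mbox]
        if len(queue) >= self.max_queued:
            # Corresponds to the "killing mbox N for not reading" situation.
            self.killed.add(mbox)
            del self.queues[mbox]
            return False
        queue.append(message)
        return True

    def receive(self, mbox):
        """Return the next message, or None if the session was killed or is empty."""
        if mbox in self.killed:
            return None  # a real client would see CONNECTION_CLOSED here
        queue = self.queues[mbox]
        return queue.popleft() if queue else None


def run(sends_per_receive, rounds=100, max_queued=10):
    """Push `sends_per_receive` messages for every single receive() call."""
    daemon = ToyDaemon(max_queued)
    daemon.connect(9)
    for _ in range(rounds):
        for n in range(sends_per_receive):
            daemon.multicast(9, n)
        daemon.receive(9)
    return 9 in daemon.killed   # True if the receiver was dropped
```

In this model, run(1) leaves the receiver connected, while run(3) gets
it killed after a few rounds because the queue grows faster than it is
drained. Only the receiver is ever dropped, never a sender, which
matches what I observe. The alternative I am suggesting would make
multicast() wait for room in the queue instead of killing the reader.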
A debug ALL output from the Spread daemon itself can be found here:
http://www.mornini.com/spread.log.gz
The offending event seems to be summarized by these lines (prefixed by
line numbers):
344821:[Fri 18 Jan 2002 00:52:53] Sess_write: killing mbox 9 for not reading
347827:[Fri 18 Jan 2002 00:52:53] Sess_kill: killing session r0-9 ( mailbox 9 )
347895:[Fri 18 Jan 2002 00:52:53] G_handle_kill: #r0-9#localhost is killed
347896:[Fri 18 Jan 2002 00:52:53] G_handle_kill in GOP
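For anyone who wants to look for the same events in their own debug
output, a small scan along these lines should find them. This is a
rough sketch: the function name find_kill_events is made up, and the
pattern covers only the three message prefixes quoted above, not every
way the daemon might report a dropped session.

```python
import re

# Matches only the kill-related message prefixes quoted in this post.
KILL_EVENT = re.compile(
    r"Sess_write: killing mbox|Sess_kill: killing session|G_handle_kill"
)


def find_kill_events(lines):
    """Return (line_number, text) pairs for lines that mention a session kill."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if KILL_EVENT.search(line)
    ]
```

With a local copy of the log, something like
find_kill_events(gzip.open("spread.log.gz", "rt")) (after importing
gzip) would yield the line numbers and the matching text.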
--
-- Tom Mornini
-- eWingz Systems, Inc.