[Spread-users] spread daemon rejects connections from a disconnected client

Shlomi Yaakobovich Shlomi at exanet.com
Wed Nov 10 07:15:05 EST 2004


Hi all,

We are still running spread 3.17.2 on our systems, but I don't think the version matters for this specific problem.

There are two nodes in our system, each running a spread daemon, and a client is connected locally to each daemon, so the group has only two members. Every few minutes we run spflooder on each node to check that the daemon is responsive and that messages are being sent. The nodes have been up without a problem for about a month, and the spread client in particular has been running for that whole month.

The problem appeared when one of the nodes was shut down and the other node started to experience spread problems. The spread client disconnected from the spread daemon (good), but it failed to reconnect to it (bad). SP_connect returned -2 (COULD_NOT_CONNECT), and the same error came back every time this specific client tried to reconnect. The funny thing is that spflooder seems to be unaffected: everything works well for it.
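
A reconnect loop of the kind described above might look roughly like the following sketch. It is only an illustration: connect_fn, reconnect_with_backoff and fail_n_times are hypothetical names, and the callback stands in for whatever wraps SP_connect in a real client. The only value taken from this report is -2 for COULD_NOT_CONNECT. Logging the return code on every attempt makes it easier to see whether the daemon keeps rejecting with the same error.

```c
#include <stdio.h>
#include <unistd.h>

#define COULD_NOT_CONNECT (-2)  /* value reported in this post */

/* Hypothetical callback: returns a negative code on failure and a
 * non-negative value on success. A real client would call SP_connect here. */
typedef int (*connect_fn)(void *ctx);

/* Try to connect up to max_attempts times, doubling the delay between
 * attempts up to max_delay_sec. Returns the last code from try_connect. */
static int reconnect_with_backoff(connect_fn try_connect, void *ctx,
                                  int max_attempts, unsigned max_delay_sec)
{
    unsigned delay = 1;
    int rc = COULD_NOT_CONNECT;

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        rc = try_connect(ctx);
        if (rc >= 0)
            return rc;  /* connected */

        fprintf(stderr, "connect attempt %d failed with code %d\n",
                attempt, rc);

        if (attempt < max_attempts) {
            if (delay > max_delay_sec)
                delay = max_delay_sec;  /* cap the back-off */
            sleep(delay);
            delay *= 2;
        }
    }
    return rc;  /* still failing after max_attempts */
}

/* Stand-in connector used only to exercise the loop: fails with
 * COULD_NOT_CONNECT until the counter behind ctx reaches zero. */
static int fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return COULD_NOT_CONNECT;
    }
    return 1;
}
```

With a counter of 2, reconnect_with_backoff(fail_n_times, &counter, 5, 8) fails twice and succeeds on the third attempt. In the situation described here every attempt would keep failing until the daemon is restarted, so a loop like this helps with logging but does not fix anything.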

I am attaching spread.log and an strace report that shows the problem. The first accepted connection is from spflooder and it works well; the next connection is rejected. Looking at the code, I believe this happens in session.c: Sess_accept calls Sess_accept_continue, which fails in the block that deals with REJECT_VERSION. I don't know if this is accurate, but if it is, it is strange, because the client sends the same data over and over again, and it usually works. Looking at the strace, recv returned 0, and I don't see this case handled in Sess_accept_continue...
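
To make the recv point concrete: on a stream socket, recv returning 0 means the peer closed the connection in an orderly way, which is a different case from an error (-1) and from receiving data. The sketch below shows one way to keep the three cases apart. read_exact and its status codes are hypothetical names for illustration, not code from session.c, and the sketch assumes a blocking socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result codes for this sketch; not taken from session.c. */
enum read_status { READ_OK = 0, READ_PEER_CLOSED = 1, READ_ERROR = 2 };

/* Read exactly len bytes from fd, distinguishing the three outcomes
 * of recv(): data, orderly shutdown by the peer, and error. */
static enum read_status read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);

        if (n > 0) {
            got += (size_t)n;          /* partial or complete data */
        } else if (n == 0) {
            /* The peer closed before sending everything we expected. */
            return READ_PEER_CLOSED;
        } else if (errno == EINTR) {
            continue;                  /* interrupted by a signal, retry */
        } else {
            return READ_ERROR;         /* real socket error, see errno */
        }
    }
    return READ_OK;
}
```

If the handshake code treats a 0 return like a short or malformed read, a client that gave up early could be reported as a version mismatch. Whether that is what happens in Sess_accept_continue is a guess on my part.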

The workaround was to restart the spread daemon, after which everything went back to normal and the client managed to reconnect. The fact that a restart solved the problem might suggest that something inside the daemon had ended up in a bad state.

Any ideas?

Shlomi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spread.log
Type: application/octet-stream
Size: 7345 bytes
Desc: spread.log
Url : http://lists.spread.org/pipermail/spread-users/attachments/20041110/34f345ca/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spread.strace
Type: application/octet-stream
Size: 79705 bytes
Desc: spread.strace
Url : http://lists.spread.org/pipermail/spread-users/attachments/20041110/34f345ca/attachment-0001.obj 

