[Spread-users] Spread Pause (Then Burst) On Daemon Disconnect

John Schultz jschultz at spreadconcepts.com
Wed Dec 1 15:36:19 EST 2010


It is expected that when a daemon fails there will be a pause in the system as it (a) determines there has been a membership failure and (b) establishes a new membership.

How long that process takes is determined by the timeouts at the top of membership.c. If you are on a decent LAN, you could try lowering all the timeouts by a constant factor (e.g., 2 or 3).

Your system will still pause, and there will be a burst of messages after the pause, but the pause will be shorter and the burst should be smaller.
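
As a rough illustration of dividing all the timeouts by a constant factor, the sketch below works through the arithmetic. It is written in Python purely to show the calculation. The timeout names and values are hypothetical placeholders, not the actual definitions in membership.c, and the (seconds, microseconds) layout is an assumption. The real values would still have to be edited by hand in the source.

```python
# Hypothetical timeouts as (seconds, microseconds) pairs.
# These names and values are placeholders for illustration only;
# they are NOT copied from membership.c.
EXAMPLE_TIMEOUTS = {
    "token": (5, 0),
    "gather": (5, 0),
    "form": (10, 0),
    "alive": (1, 0),
}


def scale_timeout(sec, usec, factor):
    """Divide a (sec, usec) timeout by an integer factor.

    The result is rounded up to the next whole microsecond so that a
    timeout never becomes shorter than the exact quotient.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    total_usec = sec * 1_000_000 + usec
    scaled_usec = -(-total_usec // factor)  # ceiling division
    return divmod(scaled_usec, 1_000_000)


def scale_all(timeouts, factor):
    """Apply the same factor to every timeout so their ratios are preserved."""
    return {
        name: scale_timeout(sec, usec, factor)
        for name, (sec, usec) in timeouts.items()
    }


if __name__ == "__main__":
    for name, (sec, usec) in scale_all(EXAMPLE_TIMEOUTS, 2).items():
        print(f"{name}: {sec} s {usec} us")
```

With a factor of 2, a 5 second placeholder timeout becomes 2.5 seconds. Applying the same factor to every timeout keeps their relative proportions unchanged, which is presumably why a single constant factor is the suggestion here.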

Cheers!

-----
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200

On Dec 1, 2010, at 11:46 AM, echristianson at nssc.com wrote:

Hello,

I am working on a distributed system which incorporates Spread, and I have run into the following issue: when the Spread daemon running on a machine is shut down, the entire Spread network pauses (for 5 to 15 seconds), and then a burst of packets is received. The system then returns to operating normally. This does not happen when a Spread daemon connects.

When I was looking through the mailing list archives I found a similar issue noted, with no replies, here:
http://lists.spread.org/pipermail/spread-users/2010-September/004335.html

My issue is different in that connectivity returns after a pause and burst.

I was hoping that more information might be available about this issue (or about whether this behavior is expected) so that I could resolve the problem by reconfiguring Spread, switching to a different networking model, or some other solution. At any rate, I hope I can provide some information that contributes to a better understanding of the issue.

Here is the configuration file used on each of the machines (there is one daemon per machine):

Spread_Segment  255.255.255.255:4803 {
        CASSRVR         10.46.97.1
        SASSRVR         10.46.97.2
        CAS-WS1         10.46.97.40
        CAS-WS2         10.46.97.41
        CAS-WS3         10.46.97.42
        CAS-WS4         10.46.97.43
        CAS-WS5         10.46.97.44
        CAS-WS6         10.46.97.45
        SAS-WS1         10.46.97.46
        SAS-WS2         10.46.97.47
        SAS-WS3         10.46.97.48
        SAS-WS4         10.46.97.49
        SAS-WS5         10.46.97.50
        SAS-WS6         10.46.97.51
        CASCTRL         10.46.97.52
        SASCTRL         10.46.97.53
        FAC             10.46.97.60
}

CASSRVR and SASSRVR are running Windows Server 2008.
CASCTRL and SASCTRL are running Windows 7 Ultimate.
All the other workstations are running Windows XP.

Is there any other information that would be helpful?

Cheers,
Erik Christianson



_______________________________________________
Spread-users mailing list
Spread-users at lists.spread.org
http://lists.spread.org/mailman/listinfo/spread-users
