[Spread-users] Spread delay on daemon death

John Schultz jschultz at spreadconcepts.com
Wed Apr 8 09:10:51 EDT 2009


Token_timeout controls how long it takes for daemons to suspect and
declare a failure.  Alive_timeout through Form_timeout control how
long it takes to form a new membership.

I recommend that you keep the relative ratios between your timeouts
as they are now.  So, for example, you could divide the original timeouts
by a factor of 2 and the time it would take to declare and heal from
a failure would be cut in about half.  I noticed you've already
lowered Hurry_timeout considerably, so you might want to go back to
the vanilla timeouts before you divide by a factor, or leave
Hurry_timeout where it is and just adjust the others.

The only problem with lowering the timeouts is that if you take them
too low, you can get false positives on failure detection and/or
cause the forming of a membership to fail repeatedly.  So, you
need to experiment with your network and make sure you don't take them
too low.

On a clean LAN, you could probably divide all the !Wide_network
timeouts by a factor of 5 and be OK.  But you'd need to test to
verify this for your environment.
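
The scaling described above can be sketched in C as follows.  This is
only an illustration, not code from Spread: demo_time,
divide_timeout() and print_scaled_lan_timeouts() are hypothetical names,
and the table holds the non-Wide_network values quoted in this thread.
Hurry_timeout is left out of the table on the assumption that it stays
at its already-lowered value.

```c
#include <stdio.h>

/* Hypothetical stand-in for the sec/usec pairs set in membership.c. */
typedef struct {
    long sec;
    long usec;
} demo_time;

/* Divide a timeout by an integer factor.  The arithmetic is done in
 * microseconds so fractional seconds carry over into the usec field. */
demo_time divide_timeout(demo_time t, long factor)
{
    long long total_usec = (long long)t.sec * 1000000LL + t.usec;
    demo_time out;

    total_usec /= factor;
    out.sec  = (long)(total_usec / 1000000LL);
    out.usec = (long)(total_usec % 1000000LL);
    return out;
}

/* Print the quoted non-Wide_network timeouts divided by `factor`, in
 * the same form as the assignments in membership.c.  Every entry is
 * divided by the same factor, so the relative ratios are preserved. */
void print_scaled_lan_timeouts(long factor)
{
    static const struct {
        const char *name;
        demo_time   value;
    } defaults[] = {
        { "Token_timeout",  {  5, 0 } },
        { "Alive_timeout",  {  1, 0 } },
        { "Join_timeout",   {  1, 0 } },
        { "Rep_timeout",    {  2, 500000 } },
        { "Seg_timeout",    {  2, 0 } },
        { "Gather_timeout", {  5, 0 } },
        { "Form_timeout",   {  5, 0 } },
        { "Lookup_timeout", { 60, 0 } },
    };
    size_t i;

    for (i = 0; i < sizeof(defaults) / sizeof(defaults[0]); i++) {
        demo_time s = divide_timeout(defaults[i].value, factor);
        printf("%s.sec = %ld; %s.usec = %ld;\n",
               defaults[i].name, s.sec, defaults[i].name, s.usec);
    }
}
```

Calling print_scaled_lan_timeouts(2) or print_scaled_lan_timeouts(5)
gives candidate values to paste into the else branch of membership.c;
they still need to be tested on the real network before being trusted.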

Cheers!
John

---
John Lane Schultz
Spread Concepts LLC
Phn: 443 838 2200 
Fax: 301 560 8875

Wednesday, April 8, 2009, 7:05:08 AM, you wrote:

> Hi

> We have a single-segment Spread network, and when a box dies we get a
> pause in the Spread messages of approximately 12s.
> We would like to reduce this to 1s.

> Which timeouts control this?

> Regards
> Adrian

> The relevant part of membership.c is:

>     if( Wide_network )
>     {
>         Token_timeout.sec  =  20; Token_timeout.usec  = 0;
>         Hurry_timeout.sec  =   6; Hurry_timeout.usec  = 0;

>         Alive_timeout.sec  =   1; Alive_timeout.usec  = 0;
>         Join_timeout.sec   =   1; Join_timeout.usec   = 0;
>         Rep_timeout.sec    =   5; Rep_timeout.usec    = 0;
>         Seg_timeout.sec    =   2; Seg_timeout.usec    = 0;
>         Gather_timeout.sec =  10; Gather_timeout.usec = 0;
>         Form_timeout.sec   =  10; Form_timeout.usec   = 0;
>         Lookup_timeout.sec =  90; Lookup_timeout.usec = 0;
>     }else{
>         Token_timeout.sec  =   5; Token_timeout.usec  = 0;
>         Hurry_timeout.sec  =   0; Hurry_timeout.usec  = 100000;

>         Alive_timeout.sec  =   1; Alive_timeout.usec  = 0;
>         Join_timeout.sec   =   1; Join_timeout.usec   = 0;
>         Rep_timeout.sec    =   2; Rep_timeout.usec    = 500000;
>         Seg_timeout.sec    =   2; Seg_timeout.usec    = 0;
>         Gather_timeout.sec =   5; Gather_timeout.usec = 0;
>         Form_timeout.sec   =   5; Form_timeout.usec   = 0;
>         Lookup_timeout.sec =  60; Lookup_timeout.usec = 0;
>     }







