[Spread-users] Circular token over spread: 2 seconds lap time?

Andreu Moreno i Vendrell amvendrell at yahoo.es
Wed Jul 28 13:01:33 EDT 2004


Hello,

We changed only Hurry_timeout, to 40 ms, and the lap time dropped to the
millisecond range.

Thanks for your help.

So the decision is to select a suitable value of Hurry_timeout!

Thanks,

Andreu Moreno


>The protocol layer code isn't the area of Spread that I'm most
>familiar with, but as far as I understand it, if the network is doing
>great and there aren't a lot of packets being sent/lost, the network
>leader will hold the token for Hurry_timeout.  To see the code I just
>scanned through to try to figure this out, look at
>Prot_handle_token(), To_hold_token(), and Prot_token_hurry() in
>protocol.c.
>
>This is (I think) a performance decision, to avoid wasting too many
>resources rotating the token when it isn't necessary... the goal of
>the system is more throughput than latency.
>
>The other timeouts in membership.c are definitely unrelated...
>collectively, they represent the time at which Spread assumes the
>token is lost, and the times of several phases of the daemon
>membership algorithm.  You may want to update them (carefully) in
>order to improve the performance of the daemon membership algorithm on
>a low-latency network.  In general, only do so proportionally.
>
>I suspect that decreasing Hurry_timeout should make your problem go
>away, although there are reasons not to do so if you're using Spread
>for real.  Let the list know if this works for you.
>
>Cheers,
>Ryan
>
>
>On Tue, 27 Jul 2004 14:15:47 -0700, Steven Dake <sdake at mvista.com> wrote:
>  
>
>>Gautam
>>
>>I have tried a similar protocol (http://developer.osdl.org/dev/openais)
>>with 12 processors and find the token rotation time under full network
>>load to be about 10msec.  Thus, I'd suggest not setting your token
>>timeout lower than this value, and you may even want a much larger
>>value.  I settled on 100msec, although I do not support WAN
>>configurations which would warrant much larger timeouts.
>>
>>I find each node takes about 0.5msec to handle the token under full
>>network load (10MB/sec throughput on a 100mbit network, 1472-byte
>>packets), and it takes about 6msec for one node to send the messages and
>>the other nodes to process them per token rotation.
>>
>>If Spread uses the same algorithms as Yair Amir's PhD thesis suggests,
>>none of those timer values should have any effect on performance of the
>>token rotation.  These timeouts are only for determining a configuration
>>and determining a faulty processor.
>>btw, I am not familiar with some of the timeouts below so I could be
>>wrong :).
>>
>>Thanks
>>-steve
>>
>>
>>
>>On Tue, 2004-07-27 at 08:19, Gautam H. Thaker wrote:
>>    
>>
>>>The "2 second" value is a result of the default Spread timing
>>>parameters which are:
>>>
>>>Default Spread parameters:
>>>
>>>Token_timeout.sec  =   5; Token_timeout.usec  = 0;
>>>Hurry_timeout.sec  =   2; Hurry_timeout.usec  = 0;
>>>Alive_timeout.sec  =   1; Alive_timeout.usec  = 0;
>>>Join_timeout.sec   =   1; Join_timeout.usec   = 0;
>>>Rep_timeout.sec    =   2; Rep_timeout.usec    = 500000;
>>>Seg_timeout.sec    =   2; Seg_timeout.usec    = 0;
>>>Gather_timeout.sec =   5; Gather_timeout.usec = 0;
>>>Form_timeout.sec   =   5; Form_timeout.usec   = 0;
>>>Lookup_timeout.sec =  60; Lookup_timeout.usec = 0;
>>>
>>>In my tests I have noted that these values result in Spread
>>>communications suffering a maximum latency of 2 seconds. When I change
>>>these parameters to the values below, the maximum latencies I observe
>>>are much lower.
>>>
>>>"Very Fast" Spread parameters:
>>>
>>>Token_timeout.sec  =   0; Token_timeout.usec  = 100000;
>>>Hurry_timeout.sec  =   0; Hurry_timeout.usec  =  40000;
>>>Alive_timeout.sec  =   0; Alive_timeout.usec  =  20000;
>>>Join_timeout.sec   =   0; Join_timeout.usec   =  20000;
>>>Rep_timeout.sec    =   0; Rep_timeout.usec    =  60000;
>>>Seg_timeout.sec    =   0; Seg_timeout.usec    =  40000;
>>>Gather_timeout.sec =   0; Gather_timeout.usec = 100000;
>>>Form_timeout.sec   =   0; Form_timeout.usec   = 100000;
>>>Lookup_timeout.sec =   1; Lookup_timeout.usec = 200000;
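
The relationship between the two parameter sets above can be checked with a small script. This is only a sketch: `scale_timeouts` and `format_timeouts` are hypothetical helpers, not part of Spread, and the dictionary simply restates the default values listed in this message. It divides every default by the same factor, which is one way to read the advice to change the timeouts proportionally.

```python
# Default timeouts as listed in this thread, stored as (seconds, microseconds).
DEFAULTS = {
    "Token_timeout": (5, 0),
    "Hurry_timeout": (2, 0),
    "Alive_timeout": (1, 0),
    "Join_timeout": (1, 0),
    "Rep_timeout": (2, 500000),
    "Seg_timeout": (2, 0),
    "Gather_timeout": (5, 0),
    "Form_timeout": (5, 0),
    "Lookup_timeout": (60, 0),
}


def scale_timeouts(timeouts, divisor):
    """Divide every timeout by the same integer divisor (hypothetical helper)."""
    scaled = {}
    for name, (sec, usec) in timeouts.items():
        total_usec = (sec * 1_000_000 + usec) // divisor
        # Split back into whole seconds and leftover microseconds.
        scaled[name] = divmod(total_usec, 1_000_000)
    return scaled


def format_timeouts(timeouts):
    """Render the values in the same 'X.sec = ..; X.usec = ..;' style used above."""
    return "\n".join(
        f"{name}.sec = {sec}; {name}.usec = {usec};"
        for name, (sec, usec) in timeouts.items()
    )


if __name__ == "__main__":
    print(format_timeouts(scale_timeouts(DEFAULTS, 50)))
```

Dividing the defaults by 50 reproduces every "very fast" value except Rep_timeout, which comes out as 50000 usec rather than the 60000 usec listed.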
>>>
>>>
>>>The latency ranges observed for a variety of message sizes with these
>>>two parameter sets are shown in the attached graphic. All our test
>>>results are also available online at:
>>>
>>>http://www.atl.external.lmco.com/projects/QoS/compare/cgi-bin/left2_part1.cgi?filter=emulab.*%28spread%7Ctcp%29
>>>
>>>I was wondering if anyone has pushed the Spread parameters even lower
>>>than the "very fast" values. Certainly on the Linux 2.6 kernel or on
>>>Solaris, both of which have 1000 HZ clocks, the lowest parameter value
>>>should be settable at about 2 msec (rather than the 20 msec in "very
>>>fast" above).
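
The link between clock rate and the smallest useful timeout can be illustrated with a little arithmetic. The sketch below assumes a deliberately simplified model in which a timer can only expire on a clock tick, so a requested timeout is rounded up to a whole number of ticks; real kernels are more subtle than this. The function names are hypothetical, and the 100 HZ figure used for comparison is an assumption about older kernels that does not come from this thread.

```python
def tick_usec(hz):
    """Length of one clock tick in microseconds for a given HZ value."""
    return 1_000_000 // hz


def effective_timeout_usec(requested_usec, hz):
    """Round a requested timeout up to a whole number of ticks.

    Simplified model (an assumption): timers fire only on tick boundaries,
    and a timeout always lasts at least one tick.
    """
    tick = tick_usec(hz)
    ticks = max(1, -(-requested_usec // tick))  # ceiling division
    return ticks * tick


if __name__ == "__main__":
    for hz in (100, 1000):
        for requested in (2_000, 20_000):
            got = effective_timeout_usec(requested, hz)
            print(f"HZ={hz}: requested {requested} usec, effective {got} usec")
```

Under this model a 2 msec timeout is representable at 1000 HZ, while at 100 HZ it would be stretched to a full 10 msec tick.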
>>>
>>>Gautam
>>>
>>>Andreu Moreno i Vendrell wrote:
>>>      
>>>
>>>>Hello,
>>>>
>>>>We see a lap time of 2 seconds in a circular token over Spread. Do you know
>>>>what's wrong?
>>>>
>>>>Test description:
>>>>
>>>>a) 3 computers in an isolated LAN: Machine 1, Machine 2 and Machine 3.
>>>>b) Spread version 3.17.2 installed on every machine.
>>>>c) RedHat 8.0 Linux installed on every machine.
>>>>d) Machine 1: runs a program that joins group "1" and on reception of a
>>>>message it sends a message to group "2".
>>>>e) Machine 2: runs a program that joins group "2" and on reception of a
>>>>message it sends a message to group "3".
>>>>f) Machine 3: runs a program that joins group "3" and on reception of a
>>>>message it sends a message to group "1". This program is the last to be
>>>>started and also sends a message to group "1" to start the token
>>>>circulating.
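
The test setup above can be modelled without Spread at all, which may help when reasoning about what a "lap" is. The sketch below is a pure simulation, not Spread client code: `FORWARD_TO`, `simulate_laps` and the fixed per-hop delay are all hypothetical. It assumes that Machine 3 forwards to group "1" so that the ring closes, and it counts a lap as complete when the message reaches group "3".

```python
# Which group each group's member forwards to on reception of a message.
# Assumption: Machine 3 forwards to group "1", closing the ring.
FORWARD_TO = {"1": "2", "2": "3", "3": "1"}


def simulate_laps(laps, hop_delay_ms):
    """Return the simulated times (ms) at which each lap completes.

    Every hop is charged the same fixed delivery delay. This is a
    modelling assumption, not a measurement of Spread.
    """
    group, now = "1", 0  # Machine 3's kick-off message goes to group "1"
    lap_times = []
    while len(lap_times) < laps:
        now += hop_delay_ms          # message is delivered to `group`
        if group == "3":             # delivery to group "3" completes a lap
            lap_times.append(now)
        group = FORWARD_TO[group]    # the receiver forwards to the next group
    return lap_times


if __name__ == "__main__":
    print(simulate_laps(laps=2, hop_delay_ms=5))
```

With a fixed per-hop delay the lap time is simply three times that delay; the model says nothing about where the delay comes from.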
>>>>
>>>>Results:
>>>>
>>>>The lap time is about 2 seconds?????
>>>>
>>>>Thanks,
>>>>
>>>>Andreu
>>>>
>>>>        
>>>>





