[Spread-users] Disabling Spread's WAN flow control ...

John Schultz jschultz at spreadconcepts.com
Mon Mar 28 18:24:08 EST 2005


A Spread user recently found that his Spread performance 
(throughput/latency) dropped considerably when he used multiple 
Spread_Segments in his configuration compared to a single Spread_Segment 
configuration.  His performance drop occurred even if all the active 
daemons were running in a single segment.

What he ran into was Spread's WAN flow control, which attempts to reduce 
the likelihood of Spread losing packets due to network congestion on 
WANs by globally slowing sending.  This purposeful behavior only occurs 
in configurations where two or more Spread_Segments have machines in 
different class B network IP address ranges, i.e. - the first two octets 
(a.b of a.b.c.d) differ (e.g. - one segment has a machine with an IP 
address starting with 128.220 and a machine in another segment has an IP 
address starting with 155.100). 
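
To illustrate the check described above, here is a minimal standalone 
sketch in C. It is not Spread's actual code: the helper names make_ip() 
and is_wide_network() are hypothetical, and it assumes the reference 
subnet is taken from the first machine of the first segment and that 
addresses are held as host-order 32-bit values.

```c
#include <stdint.h>

/* Hypothetical helper: pack a dotted quad a.b.c.d into a host-order
 * 32-bit value (a in the top byte). */
static uint32_t make_ip(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Hypothetical sketch of the class B comparison: first_ip[i] is the
 * address of the first machine listed in segment i. Returns 1 if any
 * segment's first two octets differ from those of segment 0 (assumed
 * here to be the reference), otherwise 0. */
static int is_wide_network(const uint32_t *first_ip, int num_segments)
{
    uint32_t reference_subnet;
    int i;

    if (num_segments < 1)
        return 0;

    /* Keep only the first two octets (a.b) of the reference address. */
    reference_subnet = first_ip[0] & 0xffff0000u;

    for (i = 1; i < num_segments; i++) {
        if ((first_ip[i] & 0xffff0000u) != reference_subnet)
            return 1;   /* some segment is in a different a.b range */
    }
    return 0;
}
```

With this sketch, first addresses of 128.220.0.1 and 155.100.0.1 yield 
1, and so do 10.1.0.1 and 10.2.0.1, even though the latter pair may sit 
on the same LAN. Two segments in 10.1.0.x and 10.1.5.x yield 0.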

After examining the code we believe that the current WAN flow control 
can be improved, and we are considering doing so in one of the next 
releases of Spread. In the meantime, if you are using a multi-segment 
configuration with machines in different class B IP address ranges, but 
your configuration is not actually a WAN (e.g. - one segment has 
machines in the 10.1 range and another has machines in the 10.2 range, 
but all these private IPs are routable to one another within a LAN-type 
environment), then a short-term fix to disable this behavior is at the 
bottom of this email.

Please note that if you disable this behavior on an actual WAN network, 
the stability (i.e. - liveness: the ability to make good forward 
progress in passing messages) of Spread may be impaired.

On line 165 of membership.c (v3.17.3), immediately after the loop that 
sets Wide_network, add the line:

   Wide_network = 0;

In diff/patch format:

*** membership.c       2005-03-28 15:59:43.260869935 -0500
--- membership2.c       2005-03-28 15:59:56.571312424 -0500
***************
*** 155,174 ****
--- 155,175 ----
       for( i=1; i < Cn.num_segments; i++ )
       {
               current_subnet = Cn.segments[i].procs[0]->id;
               current_subnet = current_subnet & 0xffff0000;
               if( current_subnet != reference_subnet )
               {
                       Wide_network = 1;
                       break;
               }
       }
+      Wide_network = 0;

       if( Wide_network )
       {
               Token_timeout.sec  =  20; Token_timeout.usec  = 0;
               Hurry_timeout.sec  =   6; Hurry_timeout.usec  = 0;

               Alive_timeout.sec  =   1; Alive_timeout.usec  = 0;
               Join_timeout.sec   =   1; Join_timeout.usec   = 0;
               Rep_timeout.sec    =   5; Rep_timeout.usec    = 0;
               Seg_timeout.sec    =   2; Seg_timeout.usec    = 0;

-- 
John Lane Schultz
Spread Concepts LLC
Phn: 443 838 2200





