[Spread-users] controlling network split and merge

Thu Feb 12 17:59:37 EST 2004

Xudong,

The timings of the membership protocol are set in the Memb_init
function in membership.c.  When changing them, I'd suggest scaling all
of the timeouts by the same percentage.

The details of the timing variables are explained partly in Yair Amir's
PhD thesis on the Spread website.  Others are not so clear, as the
Spread membership algorithm is not completely documented.  I'd suggest
keeping the minimum at 100 msec, since most Unix-like platforms have 10
msec scheduling (and hence 10 msec is the minimum poll can sleep).
Sometimes bad drivers may cause real-time performance to be poor
(greater than 40 msec on platforms such as Linux 2.4).  If you're using
Linux 2.6, you should be fine with lower timeouts.  Your mileage may
vary on other platforms.
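
As a rough illustration of scaling by an equal percentage while keeping
a 100 msec floor, here is a minimal sketch in C.  The timeout_t type and
the function names are made up for this example and are not taken from
membership.c; the real code in Memb_init uses its own time type and
variable names, so treat this only as the arithmetic you would apply to
each timeout there.

```c
#include <assert.h>

/* Hypothetical seconds/microseconds pair, standing in for whatever
 * time type the daemon really uses. */
typedef struct {
    long sec;
    long usec;
} timeout_t;

/* Floor suggested in this message: 100 msec, expressed in usec. */
#define MIN_TIMEOUT_USEC 100000LL

/* Scale one timeout to `percent` of its current value (50 = half),
 * never going below the 100 msec floor. */
timeout_t scale_timeout(timeout_t t, int percent)
{
    long long us;
    timeout_t out;

    assert(percent > 0);

    us = ((long long)t.sec * 1000000LL + t.usec) * percent / 100;
    if (us < MIN_TIMEOUT_USEC)
        us = MIN_TIMEOUT_USEC;

    out.sec  = (long)(us / 1000000LL);
    out.usec = (long)(us % 1000000LL);
    return out;
}

/* Apply the same percentage to every timeout in an array, so the
 * ratios between them are preserved (except where the floor applies). */
void scale_all_timeouts(timeout_t *timeouts, int count, int percent)
{
    int i;

    for (i = 0; i < count; i++)
        timeouts[i] = scale_timeout(timeouts[i], percent);
}
```

For example, scaling a 5 second timeout to 50 percent gives 2.5
seconds, while scaling a 150 msec timeout to 50 percent is clamped to
100 msec rather than dropping to 75 msec.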

Thanks
-steve

On Thu, 2004-02-12 at 12:42, Xudong Yan wrote:
> Hi,
> 
> I am trying to figure out how to control the timing of the network split / merge messages.
> 
> I have a pretty simple config. 2 machines, spread daemon running on each machine.
> 
> Then the connection between the two machines gets disrupted for some time and then gets restored.
> 
> 
> When the connection breaks, I get a TRANSITION_MESS followed by a REG_MEMB_MESS | CAUSED_BY_NETWORK
> 
> 
> When the connection restores, I get a REG_MEMB_MESS | CAUSED_BY_NETWORK
> 
> 
> The question: how do I control (I assume patches to Spread are needed) how fast I get the two messages
> upon the network event?
> 
> The problem I am running into right now is that it takes around 10 to 20 seconds after the network
> connection is restored before I get the REG_MEMB_MESS.
> 
> 
> Thanks
> 
> 
> Xudong Yan
> 
> xudong at neoteris.com