[Spread-users] More on message freezes

Doug Palmer Doug.Palmer at csiro.au
Wed Jun 20 21:28:15 EDT 2007


On Tue, 2007-06-19 at 09:03 -0400, Matthew Gillen wrote:

> There could be a few things going on.  If one of the machines has a high load
> (that also happens to have high priority) and the spread daemon doesn't get
> any cpu time to execute, I imagine you might see behavior like you describe
> (ie sending traffic one direction is very low latency, the other very high).

I've been watching the CPU level on the machines and nothing seems
particularly noteworthy. The daemons are set to Normal priority (on
Windows XP).

> You might double check to see if the broadcast and/or multicast addresses are
> set up right on each machine (depending on which you're using).

I think they are:

The broadcast address is x.y.z.255, the netmask is x.y.z.192, and the four
machines are x.y.z.77, x.y.z.78, x.y.z.79 and x.y.z.99. I've checked the
registry settings and all the interfaces are set to use 1s for broadcast
addresses.
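
As a sanity check on those numbers, here is a minimal Python sketch that
works out which subnet each machine falls in and what the subnet-directed
broadcast address would be. It rests on two assumptions that are not from
the actual setup: the prefix 192.0.2 is only a stand-in for x.y.z, and the
netmask is taken to be 255.255.255.192 (last octet 192, i.e. a /26). If
that reading of the mask is right, the directed broadcast for these hosts
would end in .127 rather than .255, which may be worth comparing against
what spread.conf uses.

```python
import ipaddress

PREFIX = "192.0.2"            # hypothetical stand-in for x.y.z
NETMASK = "255.255.255.192"   # assumption: last octet of the mask is 192 (/26)
HOSTS = [77, 78, 79, 99]      # last octets of the four machines


def subnet_of(last_octet):
    """Return the network that PREFIX.last_octet belongs to under NETMASK."""
    iface = ipaddress.ip_interface(f"{PREFIX}.{last_octet}/{NETMASK}")
    return iface.network


networks = {host: subnet_of(host) for host in HOSTS}

for host, net in networks.items():
    print(f"{PREFIX}.{host}: subnet {net}, "
          f"directed broadcast {net.broadcast_address}")

# True only if every machine lands in the same subnet.
same_subnet = len(set(networks.values())) == 1
print("all hosts share one subnet:", same_subnet)
```

Swapping in the real prefix and mask gives the values to compare against
each machine's interface settings and the addresses in spread.conf.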

> Finally, try pumping up the debug output level to see what's going on w.r.t.
> membership messages (i.e. if you notice a certain node keeps getting kicked out,
> then joining, and so on, you can be reasonably sure that node is the problem).
> For example, try "DebugFlags =  { PRINT EXIT STATUS FLOW_CONTROL MEMBERSHIP }" in your
> spread.conf

I've done that (after working out that I should be using pDEBUG and
removing the braces in EventPriority). I'm certainly getting data
logged, since I can see the spread daemons appear and disappear as I
start/stop them. However, nothing seems to be getting kicked out when I
run my test programs.



