[Spread-users] Issue with Spread going silent

Luke Marsden luke-lists at hybrid-logic.co.uk
Sun Nov 7 09:20:31 EST 2010


By the way, this is what the stable failure state looks like:

http://lukemarsden.net/stable-fail.png

To my untrained eye, this seems to match up somewhat with the view
given by the spmonitor output.

Is there anything we could do to stop Spread locking up in this way?

-- 
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Mobile: +447791750420


On Sun, 2010-11-07 at 14:13 +0000, Luke Marsden wrote:
> Hi Yair,
> 
> Thank you. I agree 4% packet loss is high. I get quite a bit of packet
> loss when saturating the network interfaces (spsend/recv or ping -f),
> but none at all when transmitting just a small amount of traffic. Since
> in normal operation Spread shouldn't go near saturating the network
> interfaces, I agree that this is unlikely to be the cause of the
> problem. An interesting artefact of the virtualisation though.
> 
> I have rearranged the machines in the spread.conf. They are using their
> public IPs for this test, not the 10.0.0.* addresses (although they
> exhibit the same behaviour either way):
> 
> Spread_Segment 178.22.66.147:4803 {
>     2f20196c853548e7 178.22.66.147
> }
> Spread_Segment 178.22.67.102:4803 {
>     27edda570dce48bb 178.22.67.102
> }
> Spread_Segment 178.22.67.48:4803 {
>     fff0bbd5e0da4103 178.22.67.48
> }
> 
> I've added the MEMBERSHIP debug flag, and this is the output. I started
> the spread daemons from left-to-right, which now corresponds to
> top-to-bottom :-)
> 
> http://lukemarsden.net/debugging.png
> 
> Does this shed any light?
> 




