[Spread-users] 3 servers but 2 can't communicate (text...)

John Schultz jschultz at spreadconcepts.com
Tue Jan 15 16:55:23 EST 2013


Fabrice,

There is nothing wrong with your configuration and, unfortunately, you cannot *generally* solve this problem with Spread.

Spread assumes that it is possible for each and every daemon to bidirectionally communicate point-to-point with every other daemon in the configuration.  If you violate this assumption, then you can get unexpected behavior.  Spread does not actively try to verify that this assumption holds true in your deployment -- that is left to system administrators.
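
As a rough illustration of that assumption, here is a minimal Python sketch (not part of Spread) that lists the directed daemon pairs that cannot talk to each other. The `reachable` set is a hypothetical input that an administrator would have to fill in from their own connectivity tests; the helper name `missing_links` is made up for this example.

```python
from itertools import permutations

def missing_links(daemons, reachable):
    """Return the ordered (src, dst) pairs that cannot communicate directly."""
    return [(a, b) for a, b in permutations(daemons, 2) if (a, b) not in reachable]

daemons = [".233", ".232", ".229"]
# Hypothetical test result: a full mesh minus the .233 <-> .232 link in both directions.
reachable = set(permutations(daemons, 2)) - {(".233", ".232"), (".232", ".233")}

# Any pair printed here violates the full-mesh assumption.
print(missing_links(daemons, reachable))
```

An empty list would mean the deployment satisfies the assumption; anything else is a configuration Spread is not designed to handle.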

Spread does not explore all possible overlay paths of interconnecting daemons.  Instead, Spread uses a token ring for its control traffic.  The token crosses the network paths implicitly defined by the order in which the daemons are listed in your configuration file.  Using your .conf file as an example, if all the links and daemons were up, the token would traverse the following links:

.233 -> .232 -> .229 -> .233 -> ... and so on.
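
The hop list above can be derived mechanically from the configuration order. A minimal Python sketch, assuming the daemons are simply taken in the order they appear in the .conf file (the function name `ring_hops` is hypothetical, not a Spread API):

```python
def ring_hops(daemons):
    """Return the (sender, receiver) token hops for daemons listed in config order."""
    n = len(daemons)
    # Each daemon passes the token to the next one; the last wraps to the first.
    return [(daemons[i], daemons[(i + 1) % n]) for i in range(n)]

for src, dst in ring_hops([".233", ".232", ".229"]):
    print(f"{src} -> {dst}")
```

Every hop in this list is a link the ring depends on, so losing any one of them prevents this particular ring from forming.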

If you now disable one of these network links, that ring cannot form, so Spread will try to find one or more other rings that can.  If many different rings could form (imagine a far larger potential membership), there is a race to see which ones form first.  Once rings have formed, they persist until a larger ring proves that it can form.

Going back to your example, after you broke the link .233 <-> .232, let's say .233 and .229 reformed a ring first and .232 formed a singleton membership by itself.  Then, even though .232 and .229 can communicate and are probing one another, none of the already established rings will be abandoned, because no larger ring can be established: a ring of .232 and .229 would be no larger than the existing ring of .233 and .229, and a ring of all three is impossible while .233 and .232 cannot talk.
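
To see why the memberships get stuck, here is a small Python sketch of the "only a larger ring replaces an existing one" rule described above. It is a simplified model rather than Spread's actual membership algorithm: it assumes a ring needs a working bidirectional link between every pair of members, and both function names are hypothetical.

```python
from itertools import combinations

def fully_connected(members, links):
    """True if every pair in members has a working bidirectional link."""
    return all(frozenset(pair) in links for pair in combinations(members, 2))

def larger_ring_possible(current_ring, daemons, links):
    """True if some fully connected set of daemons is larger than current_ring."""
    for size in range(len(current_ring) + 1, len(daemons) + 1):
        if any(fully_connected(c, links) for c in combinations(daemons, size)):
            return True
    return False

daemons = [".233", ".232", ".229"]
# Working links after .233 <-> .232 has been broken.
links = {frozenset((".233", ".229")), frozenset((".232", ".229"))}

print(fully_connected([".232", ".229"], links))                # this pair can talk
print(larger_ring_possible([".233", ".229"], daemons, links))  # but nothing larger exists
```

In this model the pair .232/.229 is connected, yet no set larger than the existing two-member ring qualifies, so the existing memberships stay as they are.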

Let me give you an even nastier scenario.  You don't bidirectionally break the link .233 <-> .232.  Instead, you only unidirectionally break .233 <- .232.  

This means the .233 -> .232 -> .229 token ring can and will form.  However, if/when .232 sends a data message, .233 will never receive it (because it is sent unicast from .232).  .233 will then see from the control token that it is missing a message and will request retransmission on the token.  I believe .232 will see this retransmit request on the token, retransmit the message (hopelessly) and remove the retransmit request from the token.  And round and round it will go.  Eventually, the transmission window will fill up and all forward progress will stop because of this pathological retransmission failure.
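
The stall can be illustrated with a toy Python model. This is only a sketch under strong assumptions and does not reproduce Spread's real protocol: it assumes .232 may have at most `window` messages outstanding, that one new message is attempted per token rotation, and that a message stops counting as outstanding only once every other daemon has received it. All names here are hypothetical.

```python
def simulate_sender(window, rotations, dropped):
    """Return, per token rotation, whether .232 could send a new message.

    dropped is a set of (src, dst) pairs whose data packets are silently lost.
    """
    ring = [".233", ".232", ".229"]
    sender = ".232"
    # True if at least one ring member can never receive the sender's data.
    blocked = any((sender, dst) in dropped for dst in ring if dst != sender)
    outstanding = 0
    progress = []
    for _ in range(rotations):
        can_send = outstanding < window
        if can_send:
            outstanding += 1
        if not blocked:
            # Everyone received the data (or its retransmission) this rotation.
            outstanding = 0
        progress.append(can_send)
    return progress

print(simulate_sender(window=3, rotations=5, dropped={(".232", ".233")}))
print(simulate_sender(window=3, rotations=5, dropped=set()))
```

With the one-way break, the sender makes progress only until the window is full and then stops; with no dropped links it can send on every rotation.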

Cheers!

-----
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200

On Jan 15, 2013, at 12:17 PM, Fabrice Clement wrote:

Hi,

    I would like to use Spread in a WAN (but I'm testing it on a 
LAN) with the following configuration:

Spread_Segment X.Y.Z.233:4803 {
        s1 X.Y.Z.233
}
Spread_Segment X.Y.Z.232:4803 {
        s2 X.Y.Z.232
}
Spread_Segment X.Y.Z.229:4803 {
        s3 X.Y.Z.229
}

    If I disable the link (using a firewall rule) between s1 and s2, 
then s1 or s2 leaves the group.

    Can I configure Spread to use s3 as a relay between s1 and s2, or is 
there something wrong with my configuration?

Regards
--
Fabrice


