[Spread-users] Partition problem
aalmeida at bbn.com
Tue Apr 8 21:04:29 EDT 2003
Surprisingly, we discovered that the problem was indeed due to specifying "127.0.0.1" per _segment_ in our spread.conf file.
This was the result of mistakenly including "127.0.0.1" per segment when dynamically generating spread.conf files. We would dynamically generate a new spread.conf file for each experiment run, including only those hosts which were involved.
Of interest was that this partitioning behavior (and later merge) would happen across two WAN sites with a _small_ group. With a larger group size (initial and joining), we did not see this behavior.
Yair did not know offhand why this would be the case, but noted that the 127.0.0.1 entry is only valid when you have a single Spread segment containing a single host... for example, when you are testing Spread on a single machine. (Makes sense.)
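
A generated spread.conf can be checked for this mistake before an experiment run. The following is only a minimal sketch in Python, not part of Spread: the function names, host names and addresses are hypothetical, and the parser assumes the simple "Spread_Segment <addr>:<port> { <name> <address> }" layout with one host per line, so it would need adapting for configs that use other features.

```python
LOOPBACK = "127.0.0.1"


def parse_segments(conf_text):
    """Return a list of segments, each a list of (name, address) pairs.

    Simplified assumption: a segment starts on a line beginning with
    "Spread_Segment" and ends on a line beginning with "}"; every line
    in between is "<name> <address>". Text after '#' is ignored.
    """
    segments = []
    current = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line == "{":
            continue
        if line.startswith("Spread_Segment"):
            current = []
            segments.append(current)
        elif line.startswith("}"):
            current = None
        elif current is not None:
            parts = line.split()
            # A host line may give only a name; treat it as the address too.
            address = parts[1] if len(parts) > 1 else parts[0]
            current.append((parts[0], address))
    return segments


def segments_with_loopback(conf_text):
    """Return indexes of segments that list 127.0.0.1 where it is not valid.

    Per the explanation above, loopback is only acceptable when the whole
    config is one segment containing one host.
    """
    segments = parse_segments(conf_text)
    total_hosts = sum(len(seg) for seg in segments)
    if len(segments) == 1 and total_hosts == 1:
        return []
    return [
        index
        for index, seg in enumerate(segments)
        if any(address == LOOPBACK for _, address in seg)
    ]


# Hypothetical two-site config that repeats the mistake described above.
sample = """
Spread_Segment 10.1.1.255:4803 {
    localhost 127.0.0.1
    hostA     10.1.1.5
}
Spread_Segment 10.2.2.255:4803 {
    localhost 127.0.0.1
    hostB     10.2.2.7
}
"""
print(segments_with_loopback(sample))  # [0, 1]
```

Running such a check in the script that generates spread.conf, and refusing to start the experiment when the list is non-empty, would flag the misconfiguration up front.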
Neither JHU nor BBN had the time to investigate further how the misconfiguration produced such an error.
However, if you run into a similar problem and do _not_ have 127.0.0.1 incorrectly specified for each of your Spread Segments in your spread.conf, I'm sure the Spread community will want to know <:
If you want more details as to how we produced this error, I'll be happy to share. As for the s, r utilities, we saw instances where links with high reported loss rates still showed consistent join times... so Yair might be the better person to ask about those utilities. I recall older messages on the Spread mailing list that discussed s, r as well. If you joined the mailing list more recently, search the archives at
http://lists.spread.org/pipermail/spread-users/ (grab the full raw archive).
At 09:20 AM 04/07/2003 -0400, G. Naik wrote:
>Although my test environment is noticeably different from yours, I am
>interested in how spread performs in dynamic network environments
>(continuously changing routes, intermittent connectivity). Of course in
>your case, it is packet loss. (In fact, does anyone have any numbers on
>how Spread performs in environments prone to packet loss?)
>Recently, my colleagues have been conducting various tests to
>determine join/leave times. Unfortunately, I cannot confirm if we are
>experiencing similar partition behaviors, however, I will continue to
>At some point in the next two weeks, I will run the s, r utilities to
>determine how my testing environment performs. I will post those results
>when they are available.
>In the meantime, please continue to post your results/experiences.
>On Wed, 2 Apr 2003, Aswin Almeida wrote:
>> Hello folks.
>> BBN Technologies is conducting experiments on the Spread and Secure Spread (layered architecture) for DARPA.
>> Recently, we experienced problems with a "partitioning issue" which can affect the measurement of join times.
>> Yair is aware of these problems via our BBN-JHU-SRI experiment mailing list.
>> I wanted to appeal to a _wider audience_ as well to see if anyone else has ideas.
><-snip, snip, snip->
>Gaurav Naik ("g") | SWAT: Secure Wireless Agent Testbed
>Data Fusion Lab | Drexel University - Philadelphia, PA, USA