[Spread-users] Partition problem

Sara Kaufman skaufman at bbn.com
Thu Apr 3 11:48:50 EST 2003


At 11:28 AM 04/03/2003 -0500, Yair Amir wrote:

>It's not likely to be a "bug" in Spread because the same configuration
>works for them with the layered architecture of Secure Spread, which
>has a regular Spread 3.17.0 underneath. In theory there should be no
>difference when running only Spread without Secure Spread. In
>practice, native Spread sends these join messages much faster than
>Secure Spread (that needs to compute the keys). Sending these messages
>at that speed creates UDP loss. It is a pure flow control problem, but
>in my opinion it might not be a Spread flow control problem. Last time I
>checked, we traced it to a very high loss rate on all links coming to one
>site that participated in all the experiments at the time.

Yair,

This is in response to your statement above:
    "Last time I checked it we traced it to a very high loss rate on all
links coming to one site which participated in all the experiments at the
time."

The more recent information about the partition issue, as seen and
discussed within the SecureSpread Experiment, is that we are now seeing
the partition on links and sites other than the noisy link/site you
refer to.

We can completely bypass the site you refer to [the TIC] and the noisy
links into that site, and the partition still occurs.

It does not occur all of the time, which is probably why it was not seen
earlier, especially on the day we were at the TIC, but it does happen.

The new information is that the partitioning issue has also been seen on
the Experiment link between the Cambridge site and the New York [AFRL] site.
This was one of the links that we checked using your send and receive
utility on the day we [you, Aswin and myself] were all at the TIC.
On that day, as I recall, you thought the loss on the Cambridge-AFRL link
was acceptable.
The send and receive utility still shows roughly the same loss rate of
5-7% on this Cambridge-AFRL link.
[The links into the TIC were reporting a 30% loss.]
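
For anyone who wants to reproduce the loss figures independently, here is a
minimal sketch of how a per-link loss rate can be computed from a probe run.
It is not the send and receive utility itself: the function name loss_rate
and the idea of numbering probes from 0 are assumptions made only for
illustration.

```python
def loss_rate(sent_count, received_seqs):
    """Return the fraction of numbered probes that never arrived.

    sent_count    -- how many probes the sender emitted (numbered 0..sent_count-1)
    received_seqs -- sequence numbers the receiver actually logged
    Both names are hypothetical; they are not part of Spread or its tools.
    """
    if sent_count <= 0:
        raise ValueError("sent_count must be positive")
    # Use a set so a duplicated datagram is only counted once,
    # and ignore sequence numbers outside the range that was sent.
    delivered = len({s for s in received_seqs if 0 <= s < sent_count})
    return 1.0 - delivered / sent_count


# Hypothetical run: 1000 probes sent, every 16th one dropped on the way.
received = [s for s in range(1000) if s % 16 != 0]
print(f"loss: {loss_rate(1000, received):.1%}")
```

Running the same measurement in both directions on a link, and at several
times of day, would make the numbers easier to compare across sites.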

In addition, on the Cambridge-AFRL link there seems to be some
correlation between the partition occurrence and the time of day, and
presumably the network load. When the network [presumably] has a lighter
load, say at 5 AM, the partitioning happens more often. If the same type of
Run is executed at 3 PM, the partitioning occurs less often.
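
One way to put numbers on that time-of-day pattern is to tally the Runs by
starting hour. The sketch below assumes each Run is logged as a pair of
(hour of day, whether a partition occurred); that log format and the name
partition_rate_by_hour are assumptions, and the sample entries are made-up
placeholders, not our measurements.

```python
from collections import defaultdict


def partition_rate_by_hour(runs):
    """Map each starting hour to the fraction of Runs that partitioned.

    runs -- iterable of (hour_of_day, had_partition) pairs (hypothetical format)
    """
    totals = defaultdict(int)
    partitioned = defaultdict(int)
    for hour, had_partition in runs:
        totals[hour] += 1
        if had_partition:
            partitioned[hour] += 1
    # Only hours that had at least one Run appear in the result.
    return {hour: partitioned[hour] / totals[hour] for hour in sorted(totals)}


# Placeholder log for illustration only (NOT real data).
sample_runs = [(5, True), (5, True), (5, False),
               (15, False), (15, True), (15, False)]
for hour, rate in partition_rate_by_hour(sample_runs).items():
    print(f"{hour:02d}:00  partition rate {rate:.0%}")
```

With enough Runs per hour, a table like this would show whether the 5 AM
versus 3 PM difference holds up or is just noise.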

This new information is why we raised the question again: it seems to
conflict with the previous determination.
I'm not sure I'm making myself clear without a white board to draw on.
If the new information, and how it seems to conflict with the previous
determination, is not clear, please let me know and I'll try again.
We are still able to conduct our Experiment Runs, but were hoping that you
or someone on the mailing list could shed more light on this new
information.

Thank you,

Sara


>Aswin's e-mail suggests they get this now with other sites as well.
>
>    Cheers,
>
>    :) Yair.
>
>
>On Thursday, April 03, 2003 5:09 AM
>Ben Laurie ben at algroup.co.uk wrote:
>
>Ben> I'm curious to know why you don't think this is a bug in Spread?
>
>Ben> Cheers,
>
>Ben> Ben.
>
>