[Spread-users] More Performance Issues/Questions

Wed Dec 22 12:53:36 EST 2004

I've had my network personnel look at our network and they have found no
problems.  I removed the machine you recommended removing and the problem
persisted.  I wasn't aware of the spsend and sprecv utilities, but I will
continue to investigate the possibility of a network problem.
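
In the meantime, a rough standalone way to count UDP losses, independent of
spsend and sprecv (whose options I have not checked), could look like the
Python sketch below.  The function names, the 4-byte sequence-number format
and the loopback self-test are my own assumptions; a real check would run the
sending and receiving halves on different machines.

```python
import socket


def missing_sequence_numbers(received, total):
    """Return the sorted sequence numbers in range(total) that never arrived."""
    return sorted(set(range(total)) - set(received))


def loopback_udp_trial(total=50, timeout=0.5):
    """Send `total` numbered datagrams to ourselves and list what arrived.

    Loopback only proves the counting logic works; it says nothing about
    losses on the real network.
    """
    recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv_sock.bind(("127.0.0.1", 0))      # let the OS pick a free port
    recv_sock.settimeout(timeout)
    send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    try:
        for seq in range(total):
            send_sock.sendto(seq.to_bytes(4, "big"), recv_sock.getsockname())
            try:
                data, _ = recv_sock.recvfrom(16)
                received.append(int.from_bytes(data, "big"))
            except socket.timeout:
                pass                      # nothing arrived in time: count as lost
    finally:
        send_sock.close()
        recv_sock.close()
    return received


if __name__ == "__main__":
    got = loopback_udp_trial()
    print("lost:", missing_sequence_numbers(got, 50))
```

If one machine shows gaps in the sequence numbers while the others do not,
that would point at the network rather than at Spread.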

The questions I have that I feel haven't been answered are:

1) If the leader is the talker/publisher, why don't I see the problem?
2) If I increase the Hurry_timeout to 4 seconds, why does the hang correlate
with it?
3) If someone else wants to talk, is there a "request for token" protocol?

Question #2 seems to imply that there are side effects of, or a bug in, how
the leader behaves when it thinks that no one needs the token.
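
For comparison, the packet-loss explanation in the reply quoted below would
also make the hang scale with Hurry_timeout.  Here is a toy model in Python.
It is only my assumption of the behaviour, not Spread's actual algorithm, and
`rotation_time`, `loss_prob` and the function names are made up for
illustration.

```python
import random


def simulate_delays(hurry_timeout, loss_prob, rotation_time=0.001,
                    rounds=10_000, seed=1):
    """Toy model: in each round the token either arrives after
    `rotation_time`, or is lost and is only resent once `hurry_timeout`
    has expired."""
    rng = random.Random(seed)             # fixed seed so runs are repeatable
    delays = []
    for _ in range(rounds):
        if rng.random() < loss_prob:      # token lost in this round
            delays.append(hurry_timeout + rotation_time)
        else:                             # token delivered normally
            delays.append(rotation_time)
    return delays


def worst_delay(hurry_timeout, loss_prob):
    """Longest stall seen in one simulated run."""
    return max(simulate_delays(hurry_timeout, loss_prob))


if __name__ == "__main__":
    for timeout in (2.0, 4.0, 10.0):
        print(timeout, worst_delay(timeout, loss_prob=0.01))
```

In this model the worst stall is always about one Hurry_timeout when there is
any loss at all, and disappears when the loss probability is zero, so the
correlation by itself does not distinguish a leader bug from a lossy network.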

Paul DeMarco sent this URL to the mailing list:

http://lists.spread.org/pipermail/spread-users/2004-July/002119.html

Is there a way to observe a talker that is not the leader requesting the
token?  I'm thinking that I could set the Hurry_timeout to 10 seconds and I
should then see the talker request that the token be sent.  Correct?

I would be interested in running the code from your "The Spread Toolkit:
Architecture and Performance" paper on my network.  I would like to see what
kind of results I get when the publisher is not the leader.  Would it be
possible to get the code?

I would like to use Spread, since it provides a lot of what I need, but I
need to know what its limitations are with the way we intend to use it.
Maybe it's a problem with my network, but maybe it's not, and that's what I'm
trying to get to the bottom of.

Thank you for your help,
Mike

On Wednesday 22 December 2004 11:23 am, Yair Amir wrote:
> Mike,
>
> In my opinion, there are no side effects of tokens, leaders or anything
> like what you are describing, in the Spread protocols. Your previous
> e-mails describing how the protocol works do not reflect the algorithms
> or their implementation.
>
> In my opinion the only issue with your latency is your network losing
> packets, especially on one machine. The token is lost with some probability
> for this machine (as is any other UDP message), and thanks to the Spread
> protocol, you do not feel this beyond a token_hurry latency for some of the
> messages, as Spread recovers from that as part of its basic protocol.
> When you reduce the hurry_timeout, you just make Spread more
> aggressive and this compensates for your network problem.
>
> You could check your network UDP losses directly using spsend and sprecv
> that are provided in the Spread package.
>
> If you want to use Spread and are not happy with the latency then either
> fix your network, or make Spread more aggressive by lowering the
> hurry_timeout. I really don't know how to help you beyond this.
>
> Cheers,
>
>  :) Yair.
>
> Mike Perik wrote:
> > Shouldn't the leader give up the token before going into the select?
> >
> > What's the purpose of the leader?
> >
> > Is this a bug or just a side effect of the implementation?
> >
> > Seems to me that this should be documented, especially for situations
> > where you have 1 talker and many listeners.  The leader needs to be the
> > talker. Couldn't there be some kind of agreement made in the ring that
> > whoever is talking a lot becomes the leader?  Or the leader could
> > determine that someone else out there is doing the talking and it is
> > not, so it gives up the token a little quicker.
> >
> > Thanks,
> > Mike
> >
> > On Tuesday 21 December 2004 01:27 pm, Mike Perik wrote:
> >>Ok,  I think I've found the problem.
> >>
> >>In the spread.conf I had two machines.  The leader is always the first
> >>machine.  The leader is the one who holds onto the token, and it holds
> >>onto the token for the Hurry_timeout.  Since the first machine in the
> >>configuration file is the client machine, the client holds onto it for
> >> Hurry_timeout seconds.  It goes into a select with the Hurry_timeout, and
> >> since the server/publisher is waiting for the token to publish, the
> >> client waits the whole time (Hurry_timeout, or 2 seconds by default)
> >> since there is nothing to read.  I'm assuming the server queues up all
> >> the messages that are being sent and, when it gets the token, sends
> >> them all.
> >>
> >>I switched the order of the two machines in the configuration file around
> >>and the problem essentially went away.
> >>
> >>If I'm correct on how this is working, I have a couple of questions:
> >>
> >>What if I have two servers that are publishing data at a high rate and
> >>neither of them is the "leader"?
> >>What kind of delay is this going to cause?
> >>If I have 20-30 spread daemons in my segment how much additional latency
> >> is there going to be?
> >>
> >>I believe this is why spmonitor shows the "last" machine, which was the
> >>server, having a high number of retransmits.
> >>
> >>Is this a known issue?
> >>
> >>How would I best design my system around this problem?
> >>
> >>Thanks,
> >>Mike
> >>
> >>_______________________________________________
> >>Spread-users mailing list
> >>Spread-users at lists.spread.org
> >>http://lists.spread.org/mailman/listinfo/spread-users