[Spread-users] 1 problem with Spread.pm and 1 problem with spread daemon, was

Fri Jan 18 02:17:30 EST 2002

On Fri, Jan 18, 2002 at 12:33:39AM -0500, Tim Peters wrote:
> [Tom Mornini, to Guido van Rossum]
> > How can you rewrite the app to fix this? It seems like the only way to
> > fix it is to: 1) Spread less, or 2) empty mailboxes faster...
> 
> Guido responded that we're reworking our protocol (on top of Spread) to
> batch up the thousands of very small messages we sometimes send now.  I
> think it will still be possible for us to hit the limit, though, just not
> nearly as often.  So batching isn't enough.  Fortunately, this part of our

As Theo mentioned, Spread does batch together small messages if it can,
but I agree that your custom batching will surely do a better job of it, as
you have more information and you do not need the standard Spread
message header on each message if you batch them.
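
To illustrate the kind of application-level batching being discussed, here
is a minimal sketch in Python. It assumes a simple 4-byte length prefix per
small message; pack_batch and unpack_batch are hypothetical names, not part
of Spread or its Python wrapper, and the real protocol may frame messages
quite differently.

```python
import struct

# Hypothetical framing: each small message is prefixed with its length
# as a 4-byte big-endian unsigned integer, then all are concatenated.
_LEN = struct.Struct("!I")


def pack_batch(messages):
    """Combine many small byte strings into one payload to send at once."""
    return b"".join(_LEN.pack(len(m)) + m for m in messages)


def unpack_batch(blob):
    """Split a payload produced by pack_batch back into its messages."""
    messages = []
    offset = 0
    while offset < len(blob):
        (size,) = _LEN.unpack_from(blob, offset)
        offset += _LEN.size
        messages.append(blob[offset:offset + size])
        offset += size
    return messages
```

The batched payload then travels as a single message, so the per-message
header cost is paid once per batch rather than once per small message.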

> app is receiving messages from a source that's capable of replaying them, so
> we can catch the disconnection exception (we're using a Python wrapper --
> none of this "guess what undef might mean" business <wink>) and effectively
> restart it, requesting a resend of the msgs just beyond the last one we saw
> (something *our* protocol handles, on top of Spread).  I expect the latter
> is going to be more painful than batching, though..

I think this is the right approach. Every application will have a
different way of recovering from the situation of some receiver being
slower than the rest, or all the receivers being slower than the senders.
Basically, application-level flow control is required.

Even if Spread did flow control (and it does, but only for its internal
buffers), Spread cannot know about the buffer sizes and limits in the
application, so an application will have to have flow control anyway to
avoid overflow of its own buffers. (I agree there might be some
specialized apps which could rely on Spread doing flow control -- if they
do not have any buffers of their own -- but in general that doesn't work.)
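
One common shape for application-level flow control is a send window: the
sender stops once a fixed number of messages are unacknowledged and resumes
when the receivers report progress. The sketch below is only an
illustration under that assumption; SendWindow and its methods are
hypothetical names, and how acknowledgements travel (for example, as
ordinary messages back to the sender) is left to the application.

```python
class SendWindow:
    """Track unacknowledged messages and refuse to exceed a limit."""

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.in_flight = 0  # messages sent but not yet acknowledged

    def can_send(self):
        """True while another message may be sent without overrunning."""
        return self.in_flight < self.limit

    def on_send(self):
        """Record one sent message; the caller must check can_send first."""
        if not self.can_send():
            raise RuntimeError("send window is full; wait for acknowledgements")
        self.in_flight += 1

    def on_ack(self, count=1):
        """Record that receivers have consumed `count` messages."""
        self.in_flight = max(0, self.in_flight - count)
```

The limit would be chosen from what the slowest receiver can buffer, which
is exactly the information Spread itself does not have.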

>
> Alas, I'm not sure what else the Spread folks can do about this.  For our
> specific app, booting a client based on aggregate unread message size, or
> even on time of oldest unread, would make a lot more sense than booting it
> based on raw count.  But despite that it would be self-serving, I can't
> think of any reason that would be universally true (then again, I can't
> think of any reason for why raw count is a particularly good criterion
> either ...).
> 
The reason for counting messages is just that it is a fairly direct metric
for how much memory we are spending on buffering -- and memory use is the
reason we cut them off. Counting the total bytes could be a better metric
(if we counted header bytes it would be more accurate); the message count
was just the way it was done originally. The only disadvantage of byte
counts is more complexity in configuration (the number of bytes you will be
sending is trickier to estimate than the number of outstanding messages in
a burst, I think).
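
To make the difference between the two metrics concrete, here is a small
sketch of a per-client backlog that can be limited by message count, by
total bytes, or both. This is not how the Spread daemon is implemented;
Backlog, max_messages, max_bytes, and header_bytes are all hypothetical
names chosen for illustration.

```python
from collections import deque


class Backlog:
    """Queue of unread messages with optional count and byte limits."""

    def __init__(self, max_messages=None, max_bytes=None, header_bytes=0):
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self.header_bytes = header_bytes  # assumed fixed per-message overhead
        self.queue = deque()
        self.total_bytes = 0

    def enqueue(self, payload):
        """Buffer one unread message and account for its size."""
        self.queue.append(payload)
        self.total_bytes += len(payload) + self.header_bytes

    def dequeue(self):
        """Hand the oldest message to the client and release its bytes."""
        payload = self.queue.popleft()
        self.total_bytes -= len(payload) + self.header_bytes
        return payload

    def over_limit(self):
        """True if the client exceeds any limit that was configured."""
        if self.max_messages is not None and len(self.queue) > self.max_messages:
            return True
        if self.max_bytes is not None and self.total_bytes > self.max_bytes:
            return True
        return False
```

With a count limit alone, a thousand one-byte messages and a thousand
large messages look identical, even though they use very different amounts
of memory; the byte limit distinguishes them, at the cost of being harder
to choose well.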

Oldest unread is more specialized. I'm not sure how generally useful it
is. It is probably the right metric for 'stock update' type apps that
don't care about older info, and just want the most current. In those
cases, you just want to use "reliable" or maybe "unreliable" service
because you actually don't want any queueing.

Jonathan

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------