[Spread-users] Spread Reliability

John Lane Schultz jschultz at spreadconcepts.com
Wed Mar 28 11:32:19 EDT 2007


Gordan Bobic wrote:
> On Wed, 28 Mar 2007, Jim Vickroy wrote:
> 
> Nothing at all, but the problem remains - just one very fast machine 
> sending will equally overwhelm all the receivers that are trying to keep 
> up (which will likely be all of them if all the machines have the same 
> spec, which is reasonably likely in a real environment).
> 
> This means that there would need to be a sideband/auxiliary method of
> establishing the performance of receivers, and then applying rate limiting
> on the sender(s). This is something that one might expect the message queue
> (and Spread does try to carry out the job of a publish/subscribe message
> queue) to handle a little more gracefully.
> 

Yes, this is why users who intend to send very fast should implement 
(piggybacked) ACK-based, client-level flow control.  Spread doesn't do it for 
you because there is no one-size-fits-all version of flow control.  Some 
applications would want the system to slow to the pace of the slowest receiver 
(which can block the whole group); others would want really slow receivers to 
be kicked to keep performance at a certain level.  That decision is best left 
to the application designer.
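
To make the two policies concrete, here is a minimal sketch in plain Python. 
It does not use the Spread API; the function name, the `max_lag` threshold and 
the receiver names are all hypothetical, chosen only for illustration.  Each 
receiver reports how many messages it has processed, and the sender either 
waits for the slowest one or drops anyone who has fallen too far behind.

```python
def apply_policy(acked, next_seq, window, max_lag=None):
    """Decide whether the sender may send, and which receivers to kick.

    acked    -- dict: receiver name -> number of messages it has acknowledged
    next_seq -- number of messages the sender has sent so far
    window   -- max messages allowed in flight beyond the slowest receiver
    max_lag  -- hypothetical threshold; None means "never kick anyone"
    """
    kicked = []
    if max_lag is not None:
        # "Kick" policy: receivers lagging more than max_lag are dropped.
        kicked = sorted(r for r, n in acked.items() if next_seq - n > max_lag)
    # Only receivers that were not kicked can hold the sender back.
    active = [n for r, n in acked.items() if r not in kicked]
    slowest = min(active) if active else next_seq
    return next_seq - slowest < window, kicked


acked = {"fast": 100, "slow": 10}

# Slowest-receiver policy: the sender is blocked by "slow".
block_all = apply_policy(acked, next_seq=105, window=20)

# Kick policy: "slow" is dropped and the sender keeps going.
kick = apply_policy(acked, next_seq=105, window=20, max_lag=50)
```

Which threshold counts as "too slow", and what kicking actually means (an 
application-level leave, a disconnect, something else), is exactly the part 
that depends on the application.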

Here is a recent email describing how you could implement a "slowest receiver" 
version of flow control: 
http://commedia.cnds.jhu.edu/pipermail/spread-users/2007-March/003263.html
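
As a rough illustration of the sender side (not taken from the linked email, 
and not using the Spread API), a "slowest receiver" window could be tracked 
like this.  The class name, the `window` parameter and the receiver names are 
assumptions made for the sketch; in a real client the ACK counts would arrive 
piggybacked on the receivers' own messages or as small dedicated messages.

```python
class WindowedSender:
    """Sender-side state for 'slowest receiver' ACK-based flow control."""

    def __init__(self, receivers, window):
        self.window = window                    # max unacknowledged messages
        self.next_seq = 0                       # sequence number of next message
        self.acked = {r: 0 for r in receivers}  # messages acked per receiver

    def can_send(self):
        # Outstanding = sent but not yet acknowledged by the slowest receiver.
        return self.next_seq - min(self.acked.values()) < self.window

    def send(self):
        """Return the sequence number to multicast, or None if blocked."""
        if not self.can_send():
            return None  # caller should wait for more ACKs before sending
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def on_ack(self, receiver, count):
        # ACKs are cumulative: 'count' messages processed so far.
        self.acked[receiver] = max(self.acked[receiver], count)


sender = WindowedSender(["a", "b"], window=3)
sent = [sender.send() for _ in range(5)]  # window fills after three messages

sender.on_ack("a", 3)
blocked = sender.send()   # still blocked: "b" has acknowledged nothing

sender.on_ack("b", 2)
resumed = sender.send()   # "b" caught up enough, so sending resumes
```

The window size trades latency for throughput: a larger window keeps the 
sender busy while ACKs are in transit, but lets more messages queue up for a 
slow receiver.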

> The existing method seems to offer little advantage in terms of 
> reliability under high load compared to simply multicast UDP flooding. At 
> least in that case you'd only lose the messages the listener is too slow 
> to catch, as opposed to having to waste CPU time (which is already likely 
> to be in short supply if the server is so overwhelmed that it's starting 
> to drop packets) on re-connecting, and throwing away all the messages in 
> the meantime. Not to mention that the Spread daemon isn't all that cheap in 
> terms of CPU time, which also won't help the listener keep up when it is 
> already falling behind.
> 

Do ACK-based flow control and you should have no problems.  On a Gb LAN 
configuration of 20 daemons+clients, my product, Congruity, does ACK-based flow 
control and can reliably pass ~40K 200-byte (or ~20K 1400-byte) messages per 
second to the entire configuration with *ZERO* disconnections.  If one 
daemon/client gets loaded, then the whole system slows down.
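
For a sense of scale, the figures above can be converted to application 
payload bandwidth.  This is a back-of-the-envelope calculation only: it 
assumes "Gb" means 1000 Mbit/s and ignores Spread, UDP/IP and Ethernet header 
overhead as well as ACK traffic.

```python
def payload_mbit_per_s(msgs_per_s, msg_bytes):
    """Application payload rate in Mbit/s (headers and ACKs not counted)."""
    return msgs_per_s * msg_bytes * 8 / 1_000_000


small = payload_mbit_per_s(40_000, 200)    # 64.0 Mbit/s
large = payload_mbit_per_s(20_000, 1_400)  # 224.0 Mbit/s

# Share of a nominal 1000 Mbit/s link taken by payload alone.
small_share = small / 1000   # 0.064
large_share = large / 1000   # 0.224
```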

> Of course, a 
> threadsafe library would help with that (it would remove the need for a 
> separate secondary queue), but sadly, the one available for Perl crashes 
> pretty solidly when you try to use it after connecting and forking. 

That sounds like a problem with the Perl library, as Spread 4 does support forking.

-- 
John Schultz
Spread Concepts LLC
Phn: 443 838 2200
Fax: 301 560 8875



