[Spread-users] Process models, connection rates and channel counts?

Ryan Caudy rcaudy at gmail.com
Thu Aug 12 20:32:08 EDT 2004

I'm not sure I have enough of the details of your model in my head to
answer clearly, but I'll take a stab at it.  Answers inlined below.


On Thu, 12 Aug 2004 18:30:44 -0400, J C Lawrence <claw at kanga.nu> wrote:
> How well does spread deal with:
>   a) Forking processes?  In particular I'm looking at a server model
>   which listens on a TCP port which connected to a spread message bus,
>   and which fork()s on accept().

I understand the above description to say that you have a server that
listens on one socket (and forks after accepting), and has,
separately, a connection to a Spread daemon (i.e. an mbox which is
actually a TCP socket, maybe associated with some groups, etc).

Spread shouldn't have any problem with this model, as long as you do
I/O multiplexing on the socket and the mbox.  You can use the event
system, or for more control you can directly use select or poll on the
socket and the Spread mbox without a problem.  The one caveat is that
if you do extremely intense setup for the accept/fork, you might be
disconnected from Spread for not receiving quickly enough: the queue
of outstanding messages the daemon will hold for you has a limited
size in number of messages.

Also, I wouldn't recommend trying to use a given mbox in multiple
processes without some synchronization, since it's a TCP socket.  A
bit of work on mutex.h should give you what you need for
multiple-process safety, i.e. changing the pthread mutexes to be
process-shared on systems that support it, or changing the macros to
use semaphores.

>   b) High(er) connection rates?  I could have the front-end server
>   connect to spread after the fork(), but that would mean a moderately
>   high spread connection rate.  Problem?

Spread can handle a pretty high rate of connections.  The hard limit
is on the number of simultaneous connections: the upper bound in the
code is just a bit under FD_SETSIZE.  The major concern is that
Spread must send to each of these connected processes over TCP or
Unix domain sockets.  So, for a high number of connections, I would
recommend having the processes on the same machine as the Spread
daemon they connect to.  Other than that, you only have to be
concerned with memory (buffer) and CPU usage on the daemon's machine,
for which I can't give you any reasonable advice beyond recommending
that you test your own environment empirically.

>   c) High (transient) channel/mbox counts?  One of the internal
>   communications models I can use would have fairly rapid creation and
>   abandonment of ad-hoc mboxes (each channel would negotiate and
>   establish name and membership of the future channels).

This shouldn't be a major problem for Spread.  Its membership
algorithm only needs to send one AGREED message to the other daemons
for adding a member to a group, and no messages to the other daemons
to accept a new connection (which implicitly creates a private group
for the connecting process).

> My current assumption is that high connection rates are Very Bad, high
> transient channel rates are Very Bad, and forking a process which is
> connected to spread with the expectation that both parent and child, or
> multiple children would then communicate via spread might be kinda Bad.

I think I've addressed your questions, but let the list and me know if
you need more.

> --
> J C Lawrence
> ---------(*)                Satan, oscillate my metallic sonatas.
> claw at kanga.nu               He lived as a devil, eh?
> http://www.kanga.nu/~claw/  Evil is a name of a foeman, as I live.
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users

Ryan W. Caudy
<rcaudy at gmail.com>
Bloomberg L.P.
<rcaudy1 at bloomberg.net>
<caudy at cnds.jhu.edu>         
Center for Networking and Distributed Systems
Department of Computer Science
Johns Hopkins University          
