[Spread-users] Process models, connection rates and channel counts?

Fri Aug 13 12:24:09 EDT 2004

Hello,

I just have a few things to add to what Ryan said.

On Thu, Aug 12, 2004 at 08:32:08PM -0400, Ryan Caudy wrote:
> On Thu, 12 Aug 2004 18:30:44 -0400, J C Lawrence <claw at kanga.nu> wrote:
> > 
> > How well does spread deal with:
> > 
> >   a) Forking processes?  In particular I'm looking at a server model
> >   which listens on a TCP port which connected to a spread message bus,
> >   and which fork()s on accept().

Could you clarify what you mean by this? For example, is the TCP port you 
refer to a non-Spread network connection? One of the current active uses 
of Spread is to consolidate Apache web access log records from a cluster 
of web servers. In this model each Apache process is forked on accept (or 
preforked, but that doesn't change anything) and then the Apache process 
establishes a Spread connection (with SP_connect()) and sends the log 
records from the HTTP requests as Spread messages instead of writing them 
to local disk. As long as that particular Apache process stays alive (it 
may serve multiple HTTP requests) it maintains its Spread connection. The 
connection is closed when the Apache process quits. 

So in this case each Apache process on a web server (maybe hundreds) has a 
separate connection to a Spread daemon (which is often co-located on the 
web server, but it may be remote).

If that is what you are talking about then I think it works pretty well. 
If something else, could you explain more?
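
To make the ordering concrete, here is a minimal sketch of the 
connect-after-fork model. It makes no Spread calls: bus_connect() and 
bus_disconnect() are hypothetical stand-ins for whatever wraps 
SP_connect() and the matching disconnect in a real server, and 
spawn_workers() is a made-up name standing in for the accept()/fork() 
loop. The only point is that the connection is opened in the child, after 
the fork, so no connection is shared between processes.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a wrapper around SP_connect(): returns a
 * per-process connection handle. In this sketch it is just the pid. */
static long bus_connect(void) { return (long)getpid(); }

/* Hypothetical stand-in for closing the connection again. */
static void bus_disconnect(long conn) { (void)conn; }

/* Fork n workers, as a server would after each accept(). Each child
 * connects only after the fork, reports its handle through a pipe and
 * exits. Returns the number of distinct handles seen, or -1 on error. */
int spawn_workers(int n)
{
    int fds[2];
    if (n <= 0)
        return -1;
    long *seen = calloc((size_t)n, sizeof *seen);
    if (seen == NULL)
        return -1;
    if (pipe(fds) != 0) {
        free(seen);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {                     /* child */
            close(fds[0]);
            long conn = bus_connect();      /* connect AFTER the fork */
            ssize_t w = write(fds[1], &conn, sizeof conn);
            bus_disconnect(conn);           /* closed when the worker quits */
            _exit(w == (ssize_t)sizeof conn ? 0 : 1);
        }
        started++;
    }
    close(fds[1]);                          /* parent only reads */

    int distinct = 0;
    long conn;
    while (read(fds[0], &conn, sizeof conn) == (ssize_t)sizeof conn) {
        int dup = 0;
        for (int j = 0; j < distinct; j++)
            if (seen[j] == conn)
                dup = 1;
        if (!dup && distinct < n)
            seen[distinct++] = conn;
    }
    close(fds[0]);
    free(seen);
    while (wait(NULL) > 0) { }              /* reap every worker */

    assert(distinct <= n);
    return started == n ? distinct : -1;
}
```

Calling spawn_workers(5) should report five distinct handles, one per 
child. In a real server the stub bodies would be replaced by the actual 
Spread calls, and the child would do its request handling between the 
connect and the disconnect.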

> Also, I wouldn't recommend trying to use a given mbox in multiple
> processes, without some synchronization, since it's a TCP socket.  A
> bit of work on mutex.h should give you what you need to for
> multiple-process safety, i.e. changing the pthread mutexes to be
> process-shared on systems that support it, or changing the macros to
> use semaphores.

Just to emphasize, this synchronization is absolutely required for shared 
Spread connections.

> 
> >   b) High(er) connection rates?  I could have the front-end server
> >   connect to spread after the fork(), but that would mean a moderately
> >   high spread connection rate.  Problem?
> > 

As I mentioned above that is how some other forking servers use Spread -- 
you might have to just try it and see whether the connection time is low 
enough for your specific circumstances.

> Spread can handle a pretty high rate of connections... the upper-bound
> in the code is just a bit under FD_SETSIZE.  The major concern is that

Actually it can handle more than that (although it requires a re-compile). 
The FD_SETSIZE limit is because of select(), but there is also a poll() 
implementation of our event handling that can handle an unlimited number 
(memory bound) of connections.
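
For readers wondering why poll() avoids the limit: select() works on a 
fixed-size fd_set bitmap that can only describe descriptors numbered 
below FD_SETSIZE, while poll() takes an array whose length the caller 
chooses. The sketch below is not Spread's event code, and 
count_readable() is a name made up for the example. It just shows the 
poll() side with a heap-allocated array.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Return how many of the n descriptors in fds become readable within
 * timeout_ms milliseconds, or -1 on error. The pollfd array is sized
 * at run time, so n is limited by memory rather than by FD_SETSIZE. */
int count_readable(const int *fds, size_t n, int timeout_ms)
{
    struct pollfd *pfds = calloc(n ? n : 1, sizeof *pfds);
    if (pfds == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;            /* interested in readability */
    }

    int ready = 0;
    if (poll(pfds, (nfds_t)n, timeout_ms) < 0) {
        ready = -1;
    } else {
        for (size_t i = 0; i < n; i++)
            if (pfds[i].revents & POLLIN)
                ready++;
    }
    free(pfds);
    return ready;
}
```

Given the read ends of two pipes, this returns 0 while both are empty 
and 1 once a byte has been written to one of them.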

Spread is pretty fast at accepting and establishing connections, but that 
is much more costly than sending a message over an established connection, 
so reusing connections when possible is more efficient. What kind of rates 
are you talking about?

> >   c) High (transient) channel/mbox counts?  One of the internal
> >   communications models I can use would have fairly rapid creation and
> >   abandonment of ad-hoc mboxes (each channel would negotiate and
> >   establish name and membership of the future channels).
> > 

Can you explain why you need to negotiate and then establish an ad-hoc 
channel? The reason I ask is that Spread has some specific features that 
reduce the need for ad-hoc connections/groups, and I wonder if you could 
avoid them altogether. 

For example, Spread supports open groups, so you do not have to join a
group to send messages to it. This makes client --> group-of-servers
interactions easier, as the client sends a message to a well-known server
group name and then one (or more) servers reply directly with a unicast
Spread message to the client. 

Also, joining and leaving groups is very lightweight in Spread, so a
single client connection can join and leave groups to change what services
it is requesting or using without closing and re-opening the actual Spread
connection (mbox).

> 
> This shouldn't be a major problem for Spread.  Its membership
> algorithm only needs to send one AGREED message to the other daemons
> for adding a member to a group, and no messages to the other daemons
> to accept a new connection (which implicitly creates a private group
> for the connecting process).
> 
> > My current assumption is that high connection rates are Very Bad, high
> > transient channel rates are Very Bad, and forking a process which is
> > connected to spread with the expectation that both parent and child, or
> > multiple children would then communicate via spread might be kinda Bad.

The first one is not necessarily true: it depends on how high the rates 
are and what "bad" means to you :-) (i.e. latency to connect, overrunning 
internal Spread state, crashing...).

If I understand you correctly, then high transient channel rates are just 
another case of high connection rates?

The third can be done, but as noted above requires you to do the 
synchronization yourself. You can also link with the 'thread safe' Spread 
library (libtspread.a) which provides basic internal locking of SP_* calls 
to allow use in multi-threaded programs. So it's only bad if you don't do 
the locking right. I will say that I think the way most forking apps use 
Spread is to connect after the fork.

Hope this helps,

Jonathan

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------