[Spread-users] Re[2]: [mls-users] where to run spread daemons?

Yair Amir yairamir at cnds.jhu.edu
Sun Jul 21 10:04:37 EDT 2002


Hi,

I thought this message that was sent on the mod_log_spread list could
be of interest here.

:) Yair.
-----------------------------------------------------------

Hi,

Theo said it well. I would like to add a bit as I saw several cases
lately of people configuring their systems with one Spread daemon in
the whole system and all of the clients connecting remotely (locally
for the clients that happen to be on that one machine) to that one
daemon.

While Spread should work with that configuration, I wanted to make the
following comments:

1. Working like that, you let many point-to-point TCP/IP connections
regulate the flow control and be responsible for reliability, instead
of letting a Spread network do that work.

2. One point of failure - if a client is disconnected from that one
daemon or that one daemon is dead - the whole system is dead.

3. If you have just one receiver in a group and it is not on the same
machine as the daemon and the sender - the message will cross the
network twice (instead of once in a well configured Spread
configuration). If you have more than one receiver in a group, the
message will cross the network one more time for each additional
receiver (as opposed to just once total).

4. We did not optimize Spread to work like that and it will have a
lot of overhead in that kind of work. In my opinion, especially in
the case where there is only one daemon (as opposed to two or more),
Spread currently has huge overhead because of priority tuning that is
not meant for single-daemon situations. For example, I would think
that a well configured Spread network on a cluster could achieve in
the range of 70 Mbits/sec or so (maybe a bit more). For that one
daemon - I would not be surprised if it achieved around 1 Mbits/sec
because of the bad tuning for this case.
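
Point 3 can be put as a small calculation. What follows is only a
sketch in Python, assuming that the sender and every receiver sit on
machines other than the single daemon's, and that a well configured
Spread network puts each message on the local network once. The
function names are made up for illustration.

```python
def crossings_single_daemon(receivers: int) -> int:
    """Network crossings for one message when all clients use one remote daemon.

    Assumption: the sender and all receivers are on other machines, so the
    message travels sender -> daemon once, then daemon -> receiver once per
    receiver.
    """
    if receivers < 1:
        raise ValueError("need at least one receiver")
    return 1 + receivers


def crossings_daemon_per_machine(receivers: int) -> int:
    """Network crossings when every machine runs its own daemon.

    Assumption: the daemons share one network segment, so the message is
    transmitted once regardless of how many receivers there are.
    """
    if receivers < 1:
        raise ValueError("need at least one receiver")
    return 1


for n in (1, 2, 5):
    print(n, crossings_single_daemon(n), crossings_daemon_per_machine(n))
```

With one receiver the single-daemon setup gives two crossings, and each
additional receiver adds one more, while the daemon-per-machine count
stays at one.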

I have seen cases where the Spread configuration even included all of
the machines but the daemon was only run on one of them, and all the
clients connected to that one. So I thought to clarify that the
configuration should only include machines that run a daemon. Client
machines need not appear in the configuration if they are only
serving clients.
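
As an illustration of a configuration that lists only daemon machines,
here is a small Python sketch that prints a segment block in the style
of the sample spread.conf shipped with Spread. The host names, the
addresses and the helper name spread_segment are hypothetical; 4803 is
the usual default port. Please check the exact syntax against the user
guide for your version.

```python
def spread_segment(broadcast: str, port: int, daemons: dict) -> str:
    """Render one Spread_Segment block from a {daemon_name: ip} mapping."""
    lines = [f"Spread_Segment  {broadcast}:{port} {{"]
    for name, ip in daemons.items():
        lines.append(f"    {name:<12}{ip}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical machines that each run a Spread daemon.
# A machine that only hosts clients (connecting to a daemon elsewhere)
# is deliberately not listed.
DAEMONS = {
    "web1": "192.168.1.11",
    "web2": "192.168.1.12",
    "logger1": "192.168.1.21",
}

conf = spread_segment("192.168.1.255", 4803, DAEMONS)
print(conf)
```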

For those that are interested in how to configure Spread - read the user guide.

    Cheers,

    :) Yair.
    
Theo> On Saturday, July 20, 2002, at 03:19 , Aditya wrote:
>> be a problem right now. What other horrible things can I expect to 
>> happen if
>> I don't run spread locally on each webserver?

Theo> If you have a cluster of 10 web servers on which each Apache instance 
Theo> has 256 children, that is 2560 TCP connections to one box.  That is not 
Theo> optimal.  Spread (with poll patch) should be able to handle that, but it 
Theo> will be abusively slow.  You also have to consider that the remote 
Theo> Spread daemon's OS now has to maintain 2560 active, long-term TCP/IP 
Theo> sessions in addition to its normal workload (as opposed to a single unix 
Theo> domain socket session).

Theo> Now consider that you want more than just spreadlogd listening to the 
Theo> log stream.  You want a monitor here and there and an additional logger 
Theo> for fault-tolerance on another machine.  This too can be done over TCP/IP 
Theo> all to one Spread daemon, but you lose your fault tolerance AND all the 
Theo> log streams have to be sent back out on the network to be delivered to 
Theo> the monitor Spread clients.

Theo> TCP/IP isn't the optimal networking protocol to transmit these packets 
Theo> (log messages) across a local area network.  Spread, however, is close to 
Theo> "as good as it gets".  So, if you have a Spread daemon running on each 
Theo> web server and on each monitoring server, you get all of the messages 
Theo> going everywhere at almost no additional cost.  You also only have 256 
Theo> or so open sockets to each webserver's Spread daemon and one or two unix 
Theo> domain connections on the logging and monitoring hosts.  You can have 
Theo> multiple sources (web servers) and multiple sinks (loggers, monitors, 
Theo> graphers, etc.) and it is still cheaper than the single logger model to 
Theo> which you currently subscribe.
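
The connection counts quoted above follow from simple multiplication.
The Python sketch below just restates that arithmetic; the function
names are invented here, and it assumes one Spread connection per
Apache child, as in the quoted figures.

```python
def connections_to_single_daemon(web_servers: int, children_per_server: int) -> int:
    """TCP sessions one remote daemon must hold when every Apache child connects to it."""
    return web_servers * children_per_server


def connections_to_local_daemon(children_per_server: int) -> int:
    """Local sessions each web server's own daemon holds when run per machine."""
    return children_per_server


print(connections_to_single_daemon(10, 256))  # one box holds all sessions
print(connections_to_local_daemon(256))       # each daemon holds only its own
```

The first figure grows with every web server added to the cluster; the
second does not.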

Theo> If you are not pushing much traffic (less than 10 million log lines/day) 
Theo> you probably won't notice any logging bottlenecks the way you are doing 
Theo> it.  And, the way you are doing it now is a little easier to 
Theo> administer if you aren't a networking/systems admin -- configuring and 
Theo> managing high traffic Spread rings (and keeping them stable) can be 
Theo> challenging at times.  So, if you are low traffic, then you probably 
Theo> won't run into any "show-stoppers" running with one remote Spread daemon.
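
To get a feel for the 10 million lines/day threshold, it helps to turn
it into a rate. The Python sketch below does that. The 200 bytes per
log line is purely an assumption for illustration, and the result is a
daily average; peak-hour traffic will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def lines_per_second(lines_per_day: float) -> float:
    """Average log lines per second for a given daily volume."""
    return lines_per_day / SECONDS_PER_DAY


def mbits_per_second(lines_per_day: float, bytes_per_line: int) -> float:
    """Average payload bandwidth in Mbits/sec, ignoring protocol overhead."""
    return lines_per_second(lines_per_day) * bytes_per_line * 8 / 1_000_000


ASSUMED_BYTES_PER_LINE = 200  # assumption, not a measured value

print(round(lines_per_second(10_000_000)))
print(round(mbits_per_second(10_000_000, ASSUMED_BYTES_PER_LINE), 3))
```

Averaged over a day, 10 million lines comes to about 116 lines per
second.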

Theo> On a side note, the messages are tagged by site (as the private spread 
Theo> user name is), so if you have one machine running spread, it can be hard 
Theo> to tell which web servers are logging which hits.  If you run an 
Theo> individual spread daemon on each, the messages that are sent will have 
Theo> the hostname as a part of the sender's name, so you can actually break 
Theo> out which hits were served by which hosts.  This technique can be 
Theo> invaluable in a real-time (top-like) monitor.  With something like this, 
Theo> you can see if your load-balancing configuration has resource contention 
Theo> issues like assigning too many requests to a single machine.  These 
Theo> sorts of problems often present hard-to-decipher side effects when 
Theo> looking at daily, hourly or even minute-to-minute summary information.
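
A per-host breakdown like the one described above could be sketched in
Python as follows. It assumes each sender name has the form
#private_name#daemon_name and that every daemon is named after its
host; the sample sender names and the helper names are hypothetical,
and no Spread client library is used here.

```python
from collections import Counter


def host_of(sender: str) -> str:
    """Extract the daemon (host) part from a '#private_name#daemon_name' sender."""
    parts = sender.split("#")
    if len(parts) != 3 or parts[0] != "" or not parts[1] or not parts[2]:
        raise ValueError(f"unexpected sender name: {sender!r}")
    return parts[2]


def hits_per_host(senders) -> Counter:
    """Count received log messages per originating host."""
    return Counter(host_of(s) for s in senders)


# Hypothetical sender names, one per received log message.
sample = ["#apache1#web1", "#apache2#web1", "#apache1#web2", "#apache7#web1"]
for host, hits in hits_per_host(sample).most_common():
    print(host, hits)
```

A real-time monitor would feed this from the sender field of each
received message instead of a fixed list.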

Theo> And lastly, Spread is designed to be a distributed message distribution 
Theo> system.  In your model, you are not using it as intended.  It will work 
Theo> that way, but it isn't the suggested configuration. So, if something 
Theo> goes wrong and you try to troubleshoot it and look for help in the user 
Theo> community, most likely they will tell you first to set it up in a 
Theo> distributed fashion and see if your problems go away :-D

Theo> Hope this helps a bit!

Theo> --
Theo> Theo Schlossnagle
Theo> Principal Consultant
Theo> OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Theo> Phone:  +1 301 776 6376       Fax:  +1 410 880 4879
Theo> 1024D/82844984/95FD 30F1 489E 4613 F22E  491A 7E88 364C 8284 4984
Theo> 2047R/33131B65/71 F7 95 64 49 76 5D BA  3D 90 B9 9F BE 27 24 E7


Theo> _______________________________________________
Theo> mls-users mailing list
Theo> mls-users at lists.backhand.org
Theo> http://lists.backhand.org/mailman/listinfo/mls-users






