[Spread-users] mod_log_spread errors in error_log

Monte Ohrt monte at ispi.net
Thu Aug 16 11:49:21 EDT 2001


Ok great! This is my first time putting spread and mod_log_spread to the
test with production systems, so I have a couple more questions:

* How do you suggest the spread daemon be started on the apache hosts?
Right now I'm starting with an init.d start/stop script. The problem is
if the spread daemon dies, the logs will be lost for this host, yes?
Should I run spread from inittab, or should I monitor the daemon
process? What are others doing?
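
For the "monitor the daemon process" option, a minimal sketch of a supervisor in C is below. It assumes the spread binary stays in the foreground when started, so that waitpid() only returns when the daemon has really exited. The names restart_delay() and supervise() are made up for illustration, and the backoff cap of 60 seconds is an arbitrary choice.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Seconds to wait before restart number `failures` (0-based):
 * 1, 2, 4, 8, ... capped at 60 so a crash loop does not spin. */
unsigned restart_delay(unsigned failures)
{
    unsigned delay = 1;
    while (failures-- > 0 && delay < 60)
        delay *= 2;
    return delay > 60 ? 60 : delay;
}

/* Run argv[0] as a child and start it again whenever it exits.
 * Gives up after max_restarts restarts and returns the last wait
 * status, or -1 if fork() or waitpid() fails. */
int supervise(char *const argv[], unsigned max_restarts)
{
    int status = 0;
    unsigned restarts = 0;

    for (;;) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execv(argv[0], argv);
            _exit(127);             /* only reached if exec failed */
        }
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (restarts >= max_restarts)
            return status;
        fprintf(stderr, "%s exited (status %d), restarting\n",
                argv[0], status);
        sleep(restart_delay(restarts++));
    }
}
```

A caller would pass an argv such as { "/usr/local/sbin/spread", "-c", "/usr/local/etc/spread/spread.conf", NULL }, where the binary path is a guess and only the config path comes from my setup. Running spread from inittab with the respawn action gets the same restart behaviour without any extra code, but without the backoff.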

* heavy traffic configuration
All my spread hosts are on one network segment (100 Mb). Currently I have
about 100 virtual hosts, and we're probably seeing 3-5 million hits a day
collectively. I only have a few servers now, but this will grow over
time, as will the hits. Is this a walk in the park for spread, or is
this considered to be heavy traffic?
  * What is the most efficient way to run spread? Should I put all the
machines into one segment in the config file? I am limited to 128
systems per segment, correct?
  * Are the default WATER_MARK and MAX_SESSION_MESSAGES values adequate?

Thanks!
Monte

George Schlossnagle wrote:
> 
> Yep.  Looks like it's working.  If those "name not unique" (-6) messages only
> occur at startup, I would just consider it a benign but annoying bug.  It
> should still be identified and fixed, but it doesn't seem to cause problems.
> I know you mentioned earlier that you had made changes to m_l_s to make it
> ANSI compliant.  Can I see the patch?  Maybe there's something going on
> there.
> 
> Also, it might be interesting to add the following code at line 857 of
> mod_log_spread.c:
> 
> else {
>     ap_log_error(APLOG_MARK, APLOG_ERR, s,
>                  "Connected to spread with priv name (%s)", private_name);
> }
> 
> This will (rather noisily) log every time a successful spread connect is
> done.  It may help identify the source of the problem.
> 
> george
> 
> ----- Original Message -----
> From: "Monte Ohrt" <monte at ispi.net>
> To: "George Schlossnagle" <george at omniti.com>
> Cc: "Jonathan Stanton" <jonathan at cnds.jhu.edu>;
> <spread-users at lists.spread.org>
> Sent: Thursday, August 16, 2001 11:01 AM
> Subject: Re: [Spread-users] mod_log_spread errors in error_log
> 
> > I'm pretty sure everything is working OK. Here is my test.
> >
> > spread.conf
> > -----------
> >
> > Spread_Segment  10.131.192.255:4803 {
> >
> > getz-prv        10.131.192.114
> > drew-prv        10.131.192.115
> > }
> >
> >
> > apache httpd.conf
> > -----------------
> >
> > SpreadDaemon 4803
> > CustomLog $test combined
> >
> >
> > * Here is the message when I start spread:
> >
> >  Conf_init: using file: /usr/local/etc/spread/spread.conf
> > Successfully configured Segment 0 [10.131.192.255:4803] with 2 procs:
> >             getz-prv: 10.131.192.114
> >             drew-prv: 10.131.192.115
> > Finished configuration file.
> > Conf_init: My name: drew-prv, id: 10.131.192.115, port: 4803
> > Spread: not running as root, won't chroot
> > Membership id is ( 176406643, 997973726)
> > --------------------
> > Configuration at drew-prv is:
> > Num Segments 1
> > 1 10.131.192.255    4803
> > drew-prv            10.131.192.115
> > ====================
> >
> >
> >
> > * Then I start apache (see error_log errors from previous e-mails)
> >
> > * Then I log into spread from the command line to watch the traffic:
> >
> > 9:59 drew[237] /usr/local/etc/spread/tuser -s 4803
> > Spread library version is 3.15.2
> > User: connected to 4803 with private group #user#drew-prv
> >
> > ==========
> > User Menu:
> > ----------
> >
> >         j <group> -- join a group
> >         l <group> -- leave a group
> >
> >         s <group> -- send a message
> >         b <group> -- send a burst of messages
> >
> >         r -- receive a message (stuck)
> >         p -- poll for a message
> >         e -- enable asynchonous read (default)
> >         d -- disable asynchronous read
> >
> >         q -- quit
> >
> > User> j test
> >
> > User>
> > ============================
> > Received REGULAR membership for group test with 1 members, where I am
> > member 0:
> >         #user#drew-prv
> > grp id is 176406643 997973726 1
> > Due to the JOIN of #user#drew-prv
> >
> > User>
> >
> >
> > Now, I hit the web server with my browser, and this is what comes up in
> > the terminal:
> >
> > ============================
> > received RELIABLE message from #ap28008#drew-prv, of type 1, (endian 0)
> > to 1 groups
> > (127 bytes): 206.131.193.10 - - [16/Aug/2001:10:00:55 -0500] "GET /
> > HTTP/1.0" 200 1251 "-" "Mozilla/4.77 [en] (X11; U; Linux 2.4.2-2 i686)"
> >
> > User>
> >
> >
> >
> > So this tells me that everything is working, yes?
> >
> >
> > George Schlossnagle wrote:
> > >
> > > I've never seen this behaviour, actually.  The SP_connect is only done in
> > > child_init, so it shouldn't be due to Apache's double-loading of
> > > modules.  It is possible for this to occur if a connection to spread is
> > > broken, I guess.  m_l_s works like this:
> > >
> > > if(SP_multicast() < 0) {
> > >     error_log();
> > >     SP_disconnect();
> > >     SP_connect();
> > >     if(SP_multicast()< 0) {
> > >         error_log();
> > >     }
> > > }
> > >
> > > Still, it's weird that that would be the first error.  Also, the lack of
> > > multicast errors is strange as well (this implies that the sending is
> > > working).  Logging is working, right?
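
The send, reconnect, send-again pattern in the pseudocode above can be sketched as runnable C. This is not the mod_log_spread source. The struct transport interface is a hypothetical stand-in for the Spread calls so the logic can be exercised without a daemon, and unlike the pseudocode it also checks the result of the reconnect. The fake transport exists only to simulate failed sends.

```c
#include <stddef.h>

/* Hypothetical stand-in for the Spread calls; not a real library API. */
struct transport {
    int  (*send)(void *ctx, const char *msg, size_t len);  /* < 0 on error */
    void (*disconnect)(void *ctx);
    int  (*connect)(void *ctx);                             /* < 0 on error */
    void *ctx;
};

/* Returns 0 if the first send worked, 1 if it worked after one
 * reconnect, and -1 if the message was lost. */
int send_with_retry(const struct transport *t, const char *msg, size_t len)
{
    if (t->send(t->ctx, msg, len) >= 0)
        return 0;
    t->disconnect(t->ctx);
    if (t->connect(t->ctx) < 0)
        return -1;
    return t->send(t->ctx, msg, len) >= 0 ? 1 : -1;
}

/* Fake transport: the first `fail_sends` sends fail, the rest succeed. */
struct fake { int fail_sends; int sends; int connects; };

static int fake_send(void *ctx, const char *msg, size_t len)
{
    struct fake *f = (struct fake *)ctx;
    (void)msg; (void)len;
    f->sends++;
    if (f->fail_sends > 0) { f->fail_sends--; return -1; }
    return 0;
}

static void fake_disconnect(void *ctx) { (void)ctx; }

static int fake_connect(void *ctx)
{
    ((struct fake *)ctx)->connects++;
    return 0;
}

struct transport fake_transport(struct fake *f)
{
    struct transport t = { fake_send, fake_disconnect, fake_connect, f };
    return t;
}
```

With a fake that fails once, send_with_retry() returns 1 after a single reconnect. With one that fails twice, the message is dropped and the function returns -1, which is the case where a log line would be lost.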
> > >
> > > ----- Original Message -----
> > > From: "Jonathan Stanton" <jonathan at cnds.jhu.edu>
> > > To: <spread-users at lists.spread.org>
> > > Sent: Thursday, August 16, 2001 10:35 AM
> > > Subject: Re: [Spread-users] mod_log_spread errors in error_log
> > >
> > > > On Thu, Aug 16, 2001 at 09:16:20AM -0500, Monte Ohrt wrote:
> > > > > Hi,
> > > > >
> > > > > I got spread 3.15.2 and mod_log_spread working, however there are some
> > > > > errors I am seeing in the Apache error_log that concern me:
> > > > >
> > > > > Here is the output to error_log when I start the server:
> > > > >
> > > > > [Thu Aug 16 09:05:43 2001] [notice] Create log to group test for daemon 0
> > > > > [Thu Aug 16 09:05:44 2001] [notice] set_spread_daemon(4803) for index 0
> > > > > [Thu Aug 16 09:05:44 2001] [notice] Create log to group test for daemon 0
> > > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand -- UnixSocketDir set to /export/apache/backhand
> > > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand -- Broadcast 10.131.192.255:4445 added
> > > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand -- Multicast accept 10.131.192.0/24
> > > > > [Thu Aug 16 09:05:45 2001] [notice] backhand_init(12292) spawning moderator (PID 12293)
> > > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand moderator ready to go
> > > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12294. Error -6
> > > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12295. Error -6
> > > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12296. Error -6
> > > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12297. Error -6
> > > > > [Thu Aug 16 09:05:45 2001] [notice] Apache/1.3.20 (Unix) mod_ssl/2.8.4 OpenSSL/0.9.6b mod_gzip/1.3.17.1a balanced_by_mod_backhand/1.2.0 configured -- resuming normal operations
> > > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12298. Error -6
> > > > >
> > > > >
> > > > > Although spread seems to be working fine, the "Bad file number" errors
> > > > > are what concern me. What could be causing this?
> > > >
> > > > This error means that the private name used to connect to spread was not
> > > > "unique", meaning some other connection using the same name was already
> > > > established. It means the attempt to connect failed. If they only show up
> > > > transiently when the system starts up I wouldn't worry about it. I'll think
> > > > and see why they happen -- probably an interaction between mod_log_spread,
> > > > the way Apache starts processes and how spread accepts connections.
> > > >
> > > > If they continue regularly after it has started then tell me. As long as
> > > > it does successfully connect 'quickly' (i.e. it doesn't keep failing for
> > > > seconds) you should be ok.
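
For reading the numeric code in those log lines, a small lookup helper in C might look like the sketch below. The code-to-name pairs are the connect errors as I recall them from sp.h in the 3.x series, so check them against your own header before relying on them. The function names are made up for illustration and are not part of the Spread API.

```c
/* Map a connect error code to its symbolic name.
 * Values are recalled from sp.h (3.x); verify against your header. */
const char *connect_error_name(int code)
{
    switch (code) {
    case -1: return "ILLEGAL_SPREAD";
    case -2: return "COULD_NOT_CONNECT";
    case -3: return "REJECT_QUOTA";
    case -4: return "REJECT_NO_NAME";
    case -5: return "REJECT_ILLEGAL_NAME";
    case -6: return "REJECT_NOT_UNIQUE";
    case -7: return "REJECT_VERSION";
    case -8: return "CONNECTION_CLOSED";
    default: return "UNKNOWN";
    }
}

/* True when the failure is the "private name not unique" case,
 * i.e. another connection already holds the same private name. */
int is_name_collision(int code)
{
    return code == -6;
}
```

Under that mapping, the "Error -6" entries in the error_log correspond to REJECT_NOT_UNIQUE, which matches the explanation above.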
> > > >
> > > > The mod_log_spread authors are here on this list also, they might have seen
> > > > this error before and have a better answer.
> > > >
> > > > Jonathan
> > > > --
> > > > -------------------------------------------------------
> > > > Jonathan R. Stanton         jonathan at cs.jhu.edu
> > > > Dept. of Computer Science
> > > > Johns Hopkins University
> > > > -------------------------------------------------------
> > > >
> > > >
> > > > _______________________________________________
> > > > spread-users mailing list
> > > > spread-users at lists.spread.org
> > > > http://lists.spread.org/mailman/listinfo/spread-users
> > > >
> > >
> >
> > --
> > Monte Ohrt <monte at ispi.net>
> > http://www.ispi.net/
> >
> 

--
Monte Ohrt <monte at ispi.net>
http://www.ispi.net/




