[Spread-users] mod_log_spread errors in error_log

George Schlossnagle george at omniti.com
Thu Aug 16 11:10:40 EDT 2001


Yep.  Looks like it's working.  If those "name not unique" (-6) messages only
occur at startup, I would just consider it a benign but annoying bug.  It
should still be identified and fixed, but it doesn't seem to cause problems.  I
know you mentioned earlier that you had made changes to m_l_s to make it ANSI
compliant.  Can I see the patch?  Maybe there's something going on there.
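
To make the error_log easier to read, a small lookup like the one below could turn the numeric code into a name, so the line says REJECT_NOT_UNIQUE instead of a bare -6.  This is only a sketch: sp_error_name is a made-up helper (it is not in mod_log_spread or libspread), and the values are written out by hand on the assumption that they match the connect-time error codes in sp.h, so check them against the header that ships with your Spread version.

```c
#include <stddef.h>

/* Hypothetical helper, not part of mod_log_spread or libspread.
 * The numeric values are assumed to match the error codes defined in
 * sp.h; verify them against your installed header before relying on it. */
const char *sp_error_name(int code)
{
    switch (code) {
    case -1: return "ILLEGAL_SPREAD";
    case -2: return "COULD_NOT_CONNECT";
    case -3: return "REJECT_QUOTA";
    case -4: return "REJECT_NO_NAME";
    case -5: return "REJECT_ILLEGAL_NAME";
    case -6: return "REJECT_NOT_UNIQUE";   /* the code seen in this thread */
    case -7: return "REJECT_VERSION";
    case -8: return "CONNECTION_CLOSED";
    default: return "UNKNOWN";             /* anything not listed above */
    }
}
```

The existing "Could not connect to spread" message could then print both the name and the number, which makes it obvious at a glance whether every failure is the same kind.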

Also, it might be interesting to add the following code at line 857 of
mod_log_spread.c:

else {
    ap_log_error(APLOG_MARK, APLOG_ERR, s,
                 "Connected to spread with priv name (%s)", private_name);
}

This will (rather noisily) log every time a successful spread connect is
done.  It may help identify the source of the problem.
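
The other place a connect can happen is the send/reconnect/retry path quoted further down, so it is worth logging there too.  Here is a self-contained sketch of that pattern with a log line on each reconnect.  Everything in it is a stand-in so that it compiles without libspread: struct fake_conn, fake_send, fake_reconnect and send_with_retry are hypothetical names, not the real SP_* signatures and not the actual m_l_s code, and the -8 return value is only assumed to mirror CONNECTION_CLOSED.

```c
#include <stdio.h>

/* Hypothetical stand-in for a Spread connection; the real module would
 * hold the mailbox returned by SP_connect() instead. */
struct fake_conn {
    int failures_left;   /* how many sends should fail before succeeding */
    int reconnects;      /* number of reconnects performed so far */
};

/* Stand-in for SP_multicast(): returns a negative code on failure. */
int fake_send(struct fake_conn *c, const char *msg)
{
    (void)msg;
    if (c->failures_left > 0) {
        c->failures_left--;
        return -8;       /* assumed to mirror CONNECTION_CLOSED */
    }
    return 0;
}

/* Stand-in for SP_disconnect() followed by SP_connect(). */
int fake_reconnect(struct fake_conn *c)
{
    c->reconnects++;
    fprintf(stderr, "reconnected (total %d)\n", c->reconnects);
    return 0;
}

/* Send once; on failure, log, reconnect, and retry exactly once.
 * Returns 0 on success or the last negative error code. */
int send_with_retry(struct fake_conn *c, const char *msg)
{
    int rc = fake_send(c, msg);
    if (rc >= 0)
        return 0;
    fprintf(stderr, "send failed (%d), reconnecting\n", rc);
    rc = fake_reconnect(c);
    if (rc < 0)
        return rc;
    rc = fake_send(c, msg);
    if (rc < 0)
        fprintf(stderr, "send failed again (%d)\n", rc);
    return rc < 0 ? rc : 0;
}
```

With one simulated failure the retry succeeds after a single reconnect; with two, the second send fails as well and the error code is handed back to the caller.  If reconnect lines show up in the real log right at startup, that would point at this path rather than child_init.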

george


----- Original Message -----
From: "Monte Ohrt" <monte at ispi.net>
To: "George Schlossnagle" <george at omniti.com>
Cc: "Jonathan Stanton" <jonathan at cnds.jhu.edu>;
<spread-users at lists.spread.org>
Sent: Thursday, August 16, 2001 11:01 AM
Subject: Re: [Spread-users] mod_log_spread errors in error_log


> I'm pretty sure everything is working OK. Here is my test.
>
> spread.conf
> -----------
>
> Spread_Segment  10.131.192.255:4803 {
>
> getz-prv        10.131.192.114
> drew-prv        10.131.192.115
> }
>
>
> apache httpd.conf
> -----------------
>
> SpreadDaemon 4803
> CustomLog $test combined
>
>
> * Here is the message when I start spread:
>
>  Conf_init: using file: /usr/local/etc/spread/spread.conf
> Successfully configured Segment 0 [10.131.192.255:4803] with 2 procs:
>             getz-prv: 10.131.192.114
>             drew-prv: 10.131.192.115
> Finished configuration file.
> Conf_init: My name: drew-prv, id: 10.131.192.115, port: 4803
> Spread: not running as root, won't chroot
> Membership id is ( 176406643, 997973726)
> --------------------
> Configuration at drew-prv is:
> Num Segments 1
> 1 10.131.192.255    4803
> drew-prv            10.131.192.115
> ====================
>
>
>
> * Then I start apache (see error_log errors from previous e-mails)
>
> * Then I log into spread from the command line to watch the traffic:
>
> 9:59 drew[237] /usr/local/etc/spread/tuser -s 4803
> Spread library version is 3.15.2
> User: connected to 4803 with private group #user#drew-prv
>
> ==========
> User Menu:
> ----------
>
>         j <group> -- join a group
>         l <group> -- leave a group
>
>         s <group> -- send a message
>         b <group> -- send a burst of messages
>
>         r -- receive a message (stuck)
>         p -- poll for a message
>         e -- enable asynchonous read (default)
>         d -- disable asynchronous read
>
>         q -- quit
>
> User> j test
>
> User>
> ============================
> Received REGULAR membership for group test with 1 members, where I am
> member 0:
>         #user#drew-prv
> grp id is 176406643 997973726 1
> Due to the JOIN of #user#drew-prv
>
> User>
>
>
> Now, I hit the web server with my browser, and this is what comes up in
> the terminal:
>
> ============================
> received RELIABLE message from #ap28008#drew-prv, of type 1, (endian 0)
> to 1 groups
> (127 bytes): 206.131.193.10 - - [16/Aug/2001:10:00:55 -0500] "GET /
> HTTP/1.0" 200 1251 "-" "Mozilla/4.77 [en] (X11; U; Linux 2.4.2-2 i686)"
>
> User>
>
>
>
> So this tells me that everything is working, yes?
>
>
> George Schlossnagle wrote:
> >
> > I've never seen this behaviour actually.  The SP_connect is only done in
> > child_init, so it shouldn't be due to Apache's double-loading of
> > modules.  I guess it is possible for this to occur if a connection to
> > spread is broken, since m_l_s works like
> >
> > if(SP_multicast() < 0) {
> >     error_log();
> >     SP_disconnect();
> >     SP_connect();
> >     if(SP_multicast()< 0) {
> >         error_log();
> >     }
> > }
> >
> > still, weird that that would be the first error.   Also, the lack of
> > multicast errors is strange as well (this implies that the sending is
> > working).  Logging is working, right?
> >
> > ----- Original Message -----
> > From: "Jonathan Stanton" <jonathan at cnds.jhu.edu>
> > To: <spread-users at lists.spread.org>
> > Sent: Thursday, August 16, 2001 10:35 AM
> > Subject: Re: [Spread-users] mod_log_spread errors in error_log
> >
> > > On Thu, Aug 16, 2001 at 09:16:20AM -0500, Monte Ohrt wrote:
> > > > Hi,
> > > >
> > > > I got spread 3.15.2 and mod_log_spread working, however there are some
> > > > errors I am seeing in the Apache error_log that concern me:
> > > >
> > > > Here is the output to error_log when I start the server:
> > > >
> > > > [Thu Aug 16 09:05:43 2001] [notice] Create log to group test for daemon 0
> > > > [Thu Aug 16 09:05:44 2001] [notice] set_spread_daemon(4803) for index 0
> > > > [Thu Aug 16 09:05:44 2001] [notice] Create log to group test for daemon 0
> > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand -- UnixSocketDir set to /export/apache/backhand
> > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand -- Broadcast 10.131.192.255:4445 added
> > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand -- Multicast accept 10.131.192.0/24
> > > > [Thu Aug 16 09:05:45 2001] [notice] backhand_init(12292) spawning moderator (PID 12293)
> > > > [Thu Aug 16 09:05:45 2001] [notice] mod_backhand moderator ready to go
> > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12294. Error -6
> > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12295. Error -6
> > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12296. Error -6
> > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12297. Error -6
> > > > [Thu Aug 16 09:05:45 2001] [notice] Apache/1.3.20 (Unix) mod_ssl/2.8.4 OpenSSL/0.9.6b mod_gzip/1.3.17.1a balanced_by_mod_backhand/1.2.0 configured -- resuming normal operations
> > > > [Thu Aug 16 09:05:45 2001] [error] (9)Bad file number: Could not connect to spread  with private_name ap12298. Error -6
> > > >
> > > >
> > > > Although spread seems to be working fine, the "Bad file number" errors
> > > > are what concern me, what could be causing this?
> > >
> > > This error means that the private name used to connect to spread was not
> > > "unique", meaning some other connection using the same name was already
> > > established. It means the attempt to connect failed. If they only show up
> > > transiently when the system starts up I wouldn't worry about it. I'll think
> > > and see why they happen -- probably an interaction between mod_log_spread,
> > > the way Apache starts processes and how spread accepts connections.
> > >
> > > If they continue regularly after it has started then tell me. As long as
> > > it does successfully connect 'quickly' (i.e. it doesn't keep failing for
> > > seconds) you should be ok.
> > >
> > > The mod_log_spread authors are here on this list also, they might have seen
> > > this error before and have a better answer.
> > >
> > > Jonathan
> > > --
> > > -------------------------------------------------------
> > > Jonathan R. Stanton         jonathan at cs.jhu.edu
> > > Dept. of Computer Science
> > > Johns Hopkins University
> > > -------------------------------------------------------
> > >
> > >
> > > _______________________________________________
> > > spread-users mailing list
> > > spread-users at lists.spread.org
> > > http://lists.spread.org/mailman/listinfo/spread-users
> > >
> >
>
> --
> Monte Ohrt <monte at ispi.net>
> http://www.ispi.net/
>






