[Spread-users] persistent multicast error(-11)

Marius marius at mail.communityconnect.com
Thu Apr 18 13:04:34 EDT 2002


I have been trying to debug this mess for about two weeks now and have had
little success.  I am using mod_log_spread and am seeing constant
"SP_multicast error(-11) in config_log_tranaction" errors in my Apache
error logs.  These errors generally come in every few seconds, but
under lower load minutes sometimes pass without any.  I tried upgrading
Spread to the most current version, and that has shown little or no
improvement.
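
In case it helps anyone compare the error rate against load, here is a
rough sketch of how the errors could be tallied per minute from the error
log.  It assumes the stock Apache 1.3 timestamp prefix (for example
"[Thu Apr 18 13:04:34 2002]") and matches the error string quoted above;
the function name errors_per_minute is just illustrative.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: each error_log line starts with an Apache 1.3 style stamp,
# e.g. "[Thu Apr 18 13:04:34 2002] [error] ...".
STAMP = re.compile(r"^\[(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4})\]")
NEEDLE = "SP_multicast error(-11)"


def errors_per_minute(lines):
    """Count lines containing NEEDLE, grouped by the minute they were logged."""
    counts = Counter()
    for line in lines:
        if NEEDLE not in line:
            continue
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a recognisable timestamp
        when = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
        counts[when.replace(second=0)] += 1
    return counts
```

Feeding it an open log file, e.g. errors_per_minute(open(path)), and
printing the sorted items gives a per-minute count that can be lined up
against the session numbers from 'monitor'.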

From the Spread side I do not see much that indicates exactly what is
wrong.  The 'user' application indicates that content is indeed being
broadcast around.  The 'monitor' application shows more retransmissions
than I would like, but nothing too alarming IMHO.

leader has 18 retransmissions in the last 48 hours
host2 has had 2000+ retransmissions for same time frame
host3 & host4 no transmission errors in that time.

Now I have been told that the retransmission problems on host2 are likely
indicative of a problem on the host that precedes it in the spread.conf
file.  (In this case the preceding host is the leader.)  This is
likely, as hosts 2 and 3 are not in active service and so would not have
much in the way of logs to transmit of their own.  (host4, which is in
active service, might be buffered by the non-active host3.)

My sessions according to the 'monitor' are generally around 200 during
the day (give or take 40 or so depending on traffic).  I do not know if
that is particularly high.

My setup is as follows:
4 FreeBSD 4.5-stable machines with dual CPUs ranging from 500MHz-800MHz
Spread 3.16.2rc1 (but the error occurred in 3.12 also)
Apache/1.3.20

Spread runs on a backend interface that is only actively used for
Spread and NFS traffic through Cisco 5500 and 6500 switches (100 Mb/s).
This problem started occurring when I upgraded FreeBSD from 4.2-stable to
4.5-stable, but I do not want to downgrade as 4.2 was producing fairly
frequent crashes under the load.  (I will take Spread errors over
mysterious crashes any day.)  The specific version of FreeBSD (4.2-stable)
is running just fine on 6 similar machines without Spread, so I think it
is all connected.  But even with a great deal of searching I cannot figure
out what it is about the upgrade that caused this.  The people on the
FreeBSD list have not had too many ideas on the matter either.

Does anyone have a clue how I can debug this?  What other information
should I collect to figure this out?  I am at my wits' end here.


Marius M. Rex
Community Connect Inc.
marius at mail.communityconnect.com








