[Spread-users] Spread contractors in SF Bay area?
Tom Mornini
tom at quios.net
Thu Jan 10 13:40:37 EST 2002
On Wednesday, January 9, 2002, at 06:17 PM, John David Duncan wrote:
> On Wednesday, January 9, 2002, at 05:15 PM, Tom Mornini wrote:
>
>> We've been quite careful in our implementation, and I believe we're
>> using Spread as intended. We have written a Perl OO wrapper for the
>> supplied Perl module and it could very well be doing something that
>> is causing the problems.
>
> I am also in San Francisco having a very similar problem. We have a
> small custom-written perl module, and run spread for logging, and have
> an occasional problem with the system hanging (more like every 4 or 5
> days than every 10, unfortunately). When this happens it becomes
> impossible to connect using spuser, and though spmonitor will connect,
> it will not report any message activity. After one server is
> restarted, it appears that all of the "held" spread messages from the
> other servers do get delivered to spreadlogd.
>
> I've attached the perl module for comparison, but I'm pretty convinced
> it's not a part of the problem.
>
> All of this is on Spread 3.16.1 and FreeBSD 4.4
Hey, I think I've solved our problems! Must be a West Coast thing indeed!
The funny thing in our case is that I had solved this once before!
We have two ways that we log with Spread:
1) STDIN to Spread for logging Apache access and error logs via a
CustomLog pipe
2) Our own application logging system
When this problem first started, I scratched my head and realized that
you can't just open a connection and write to it forever! Spread sends
special membership and perhaps some other messages (been a while since I
worked on the Spread details) to each and every mailbox each time
someone joins or leaves a shared mailbox.
If those messages aren't read on a regular basis, then surely a buffer
is eventually going to fill up and cause some grief.
So, last night it occurred to me that I had realized this and corrected
it in #1 above, but somehow had completely missed the fact that #2 does
exactly the same thing and has the same problem!
I looked at your code and you do the same thing.
So, here's what I do: I set the timeout value to zero, and do a receive
for each message I send. I don't DO anything with the received messages,
as I don't care about them, but just receiving them should tidy things
up for the Spread daemon.
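
To illustrate the idea, here is a minimal sketch in Python. It is a toy
model only, not the Spread API: ToyDaemon, deliver, receive_nowait and
run_logger are hypothetical names, the queue capacity is arbitrary, and
the assumption that one membership notice arrives per log line sent is
made up purely to keep the example short. It shows a write-only client
filling a bounded per-connection queue, and the same client staying
healthy when it does one zero-timeout receive per send.

```python
from collections import deque


class ToyDaemon:
    """Hypothetical stand-in for a daemon with a bounded per-client queue."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()  # messages waiting for the client to read
        self.stalled = False    # set once the queue has overflowed

    def deliver(self, msg):
        """Queue a message (e.g. a membership notice) for the client."""
        if len(self.pending) >= self.capacity:
            self.stalled = True
            return
        self.pending.append(msg)

    def receive_nowait(self):
        """Zero-timeout receive: return a pending message, or None."""
        return self.pending.popleft() if self.pending else None


def run_logger(daemon, sends, drain):
    """Send `sends` log lines; optionally do one receive per send."""
    for i in range(sends):
        # Assumption for the model: one notice arrives per line sent.
        daemon.deliver(("membership", i))
        if drain:
            daemon.receive_nowait()  # read and discard the message
    return daemon.stalled


write_only = run_logger(ToyDaemon(capacity=8), sends=100, drain=False)
draining = run_logger(ToyDaemon(capacity=8), sends=100, drain=True)
print("write-only client stalled:", write_only)
print("draining client stalled:", draining)
```

In a real client the discard step would be whatever non-blocking or
zero-timeout receive call the Spread binding in use provides.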
I've applied this to our system, and we'll know in a couple of days if
this truly solves the problem 100%. Here's hoping!
--
-- Tom Mornini
-- eWingz Systems, Inc.
--
-- ICQ: 113526784, AOL: tmornini, Yahoo: tmornini, MSN: tmornini