[Spread-users] unix domain socket in /tmp and diskfull

Aditya aditya at grot.org
Thu Apr 10 15:47:13 EDT 2003


I just had a cascading failure of several webservers that all use Spread to
log. The webservers became sluggish and unresponsive, and I noticed that
nothing was being logged via Spread. I use Spread 3.16.2 on FreeBSD 4-STABLE.
This cluster of Spread-enabled servers has been up and running for around 7
months without any Spread problems. It's possible that over the last 3 weeks
we have been seeing a lot more traffic, and consequently many more Spread
messages.

Typically, running spmonitor yields something like:

  Status at server1 V 3.16. 2 (state 1, gstate 1) after 2673 seconds :

for each Spread daemon (one per server in my case).

However, when this problem occurred, all the Spread daemons were in gstate 3,
i.e.:

  Status at server1 V 3.16. 2 (state 1, gstate 3) after 12321 seconds :
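
To compare the two status lines above across many daemons, a rough sketch like
the following could pull the fields out of saved spmonitor output. It assumes
the header format is exactly as quoted here; the names parse_status,
unusual_hosts and NORMAL_GSTATE are hypothetical, and treating gstate 1 as
"normal" is only what was observed on this cluster, not a documented meaning.

```python
import re

# Pattern for the status header lines quoted in this post.
# Assumption: the version field may contain padding spaces (e.g. "3.16. 2").
STATUS_RE = re.compile(
    r"Status at (?P<host>\S+)\s+V\s*(?P<version>[\d. ]+?)\s*"
    r"\(state (?P<state>\d+), gstate (?P<gstate>\d+)\)\s+"
    r"after (?P<uptime>\d+) seconds"
)

# The gstate value seen during normal operation in this post (an observation,
# not a documented constant).
NORMAL_GSTATE = 1


def parse_status(line):
    """Return a dict of fields from one status line, or None if no match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "host": m.group("host"),
        "version": m.group("version").replace(" ", ""),
        "state": int(m.group("state")),
        "gstate": int(m.group("gstate")),
        "uptime": int(m.group("uptime")),
    }


def unusual_hosts(lines):
    """List hosts whose gstate differs from NORMAL_GSTATE."""
    parsed = (parse_status(line) for line in lines)
    return [p["host"] for p in parsed if p and p["gstate"] != NORMAL_GSTATE]
```

Feeding it the captured status lines from every server would give a quick list
of daemons that are not in the usual gstate.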

Since stopping and restarting individual Spread daemons did not seem to change
that, I stopped all the Spread clients and daemons and restarted them all, and
it seems "okay" now. The only thing I can think of is that *after* I noticed
all the daemons in gstate 3, one of the servers started complaining that /tmp
was full. The default Spread socket is put in /tmp/4803, and that makes me
suspicious...
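
Whether the full /tmp is cause, effect, or coincidence is not established here.
As a way to watch for the condition, a minimal sketch along these lines could
be run periodically on each server. The function name check_socket_dir and the
10 MiB threshold are hypothetical choices for illustration; only the socket
path /tmp/4803 comes from the setup described above.

```python
import os
import shutil
import stat


def check_socket_dir(sock_path="/tmp/4803", min_free_bytes=10 * 1024 * 1024):
    """Report free space on the socket's filesystem and whether the socket exists.

    The 10 MiB default threshold is an arbitrary illustrative value.
    """
    directory = os.path.dirname(sock_path) or "."
    usage = shutil.disk_usage(directory)
    try:
        # True only if the path exists and is a unix domain socket.
        is_socket = stat.S_ISSOCK(os.stat(sock_path).st_mode)
    except FileNotFoundError:
        is_socket = False
    return {
        "free_bytes": usage.free,
        "low_space": usage.free < min_free_bytes,
        "socket_present": is_socket,
    }


if __name__ == "__main__":
    print(check_socket_dir())
```

Logging this result next to the spmonitor status would at least show which
came first the next time it happens: the gstate change or the full /tmp.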

Any clues to what I should be looking for?

Thanks,
Adi



