[Spread-users] Spread daemon ceases to work on mac, stuck processes
jschultz at spreadconcepts.com
Tue May 7 12:15:03 EDT 2013
Please post the config file you are actually using and how you are running spread from the command line.
I waded through your log file. It looks like what is happening is that the daemon is running through its normal membership algorithm, but when it sends the token to itself it never receives it back. So, it thinks it has lost the token and then restarts its membership algorithm. But because it will always fail to get its token back, these membership attempts will always fail and loop like this forever.
It seems the problem is that when the daemon sends to itself at 127.0.1.1, port 4804, it never receives the packet back. Are you sure your addresses are correct? Are you sure there isn't a firewall or something similar interfering?
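One rough way to test this outside of Spread is to bind a UDP socket to the address in question and send a datagram to yourself. The following is only a minimal sketch: the helper name udp_loopback_ok is made up for illustration, it assumes the daemon traffic in question is plain UDP, and it says nothing about what Spread does internally. Stop the daemon first, otherwise the bind on port 4804 will most likely fail simply because the port is in use.

```python
import socket


def udp_loopback_ok(addr, port, timeout=1.0):
    """Return True if a UDP datagram sent from a socket bound to
    (addr, port) back to that same socket arrives within `timeout` seconds.

    Hypothetical helper for illustration; not part of Spread.
    """
    payload = b"loopback-probe"
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout)
        # bind() raises OSError if addr is not configured on this host
        # or if the port is already taken.
        sock.bind((addr, port))
        # Send to whatever address/port we actually got (matters for port 0).
        sock.sendto(payload, sock.getsockname())
        data, _ = sock.recvfrom(1024)
        return data == payload
    except OSError:
        # Covers bind failures as well as the receive timeout.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    for addr in ("127.0.0.1", "127.0.1.1"):
        print(addr, udp_loopback_ok(addr, 4804))
```

If 127.0.0.1 passes and 127.0.1.1 does not, that would suggest the address itself is not usable on the machine, rather than something specific to Spread.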
I noticed that your id is 127.0.1.1 rather than 127.0.0.1. How did that happen? Normally, I would think you would need an entry in /etc/hosts that maps "localhost" to 127.0.1.1, and would also have to change Spread's config file, for it to start up happily. Is that what you did? Or is something else going on? Are you running Spread by forcing it to use a particular name from the config (e.g., spread -n localhost)?
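To see how names resolve on the machine, a short check with the system resolver can help. This is just a sketch; resolve_ipv4 is a hypothetical helper name, and which names matter depends on what the config file and the -n option actually use.

```python
import socket


def resolve_ipv4(name):
    """Return the IPv4 address the system resolver gives for `name`,
    or None if the name cannot be resolved.

    Hypothetical helper for illustration; not part of Spread.
    """
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return None


if __name__ == "__main__":
    # Check both "localhost" and the machine's own host name.
    for name in ("localhost", socket.gethostname()):
        print(f"{name!r} -> {resolve_ipv4(name)}")
```

If either name comes back as 127.0.1.1, that would point to where the daemon's id comes from.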
Are you running on a 64-bit machine? There is a known bug in 4.1.0 on such machines that can exhibit behavior like this. You should probably try 4.2.0 or even the 4.3.0 release candidate and see if you have the same problem.
As to why it seems to work a little bit: Spread accepts messages from users even when it can't send them immediately (e.g., during a membership algorithm), but it will eventually refuse more messages (e.g., after 500) if you send it a lot. So, while Spread accepts the messages from your load application for a while, it eventually refuses any more, so it seems to lock up.
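The effect can be pictured with a toy model. To be clear, this is not Spread's code or its real data structures: the ToyDaemon class, the max_pending name, and the helper count_accepted are all invented for illustration, and the limit of 500 is only the example figure mentioned above.

```python
from collections import deque


class ToyDaemon:
    """Toy model of a daemon that buffers client messages while it cannot
    send them. Illustration only; not how Spread is implemented."""

    def __init__(self, max_pending=500):
        self.max_pending = max_pending   # assumed example limit
        self.pending = deque()
        self.membership_stable = False   # stuck in the membership algorithm

    def submit(self, msg):
        """Return True if the message is accepted, False if it is refused."""
        if self.membership_stable:
            return True                  # sent right away, nothing queues up
        if len(self.pending) >= self.max_pending:
            return False                 # buffer full: refuse further messages
        self.pending.append(msg)
        return True


def count_accepted(daemon, n):
    """Submit up to n messages and count how many are accepted
    before the first refusal."""
    accepted = 0
    for i in range(n):
        if not daemon.submit(i):
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    print(count_accepted(ToyDaemon(), 600))
```

In this model the sender gets through a burst of messages and is then refused, which from the client side looks like working briefly and then hanging.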
If you have your receive client join before the load client, does it ever receive anything? My hunch is that it would not.
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200
On May 7, 2013, at 6:44 AM, Johannes Wienke wrote:
I have a quite simple setup on a Mac running Mountain Lion with spread
4.1, where the spread operations stop working after some time.
I have launched a spread daemon using the default configuration file and
then used a process to send events on that daemon without any clients.
After a few seconds the sending process blocks and also no new clients
can connect to the spread process before relaunching it. The sending
process might impose some good amount of load as image data are sent,
however I would expect a clear error message and not such a stuck
behavior. Any idea what is wrong here?
In order to provide some debugging hints I have enabled full debugging
of the daemon. The log can be found here:
I have stopped the spread process 3 or 4 seconds after the sending
process got stuck. At that time I also tried to connect a second client
which got stuck, too, until spread was killed.