[Spread-users] disconnect by heartbeat?

Chang Song changsong at nhn.com
Sat Sep 18 23:41:13 EDT 2010


Hello.
I was wondering if anyone can help me with this.

I am developing an app that relies on Spread for automatic reconfiguration.
The app depends on being notified of application failures, server failures,
network failures, and so on.

Each app is a Spread client that joins a single group.
I simulated this by using the "spuser" command to join the group from
several servers.

Then I disconnected one server from the network.
It doesn't look like Spread detects the failure.
I sent a message to the group, and after about 50 seconds I got a
membership change notification due to DISCONNECT (of the disconnected
server).

When I reconnected the server and sent a message, the server did not
receive it.
I repeated this test, and there does not appear to be any deterministic
heartbeat timeout between the Spread node and the daemon.

What I want is fast detection (under 5 seconds) of membership changes
caused by any failure (network disconnect, OS hang, app hang, server down).
This is different from a simple graceful application exit.

Maybe I am missing something in the configuration.
Please help me with this.

Thank you very much in advance.

Chang