[Spread-users] How to detect if a Spread node fails
John Schultz
jschultz at spreadconcepts.com
Wed Jul 13 13:45:23 EDT 2011
You don't get a leave message, but you do get a membership change message that is caused by network. I'm saying you have to remember what the previous membership was, inspect the new membership, and notice that the *#able guys are no longer there.
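To make that concrete, here is a rough sketch of the bookkeeping in Python. It is plain set logic and does not call the Spread API. The names daemon_of, vanished_daemons and GroupTracker are made up for illustration, and it assumes your client hands on_membership the member list from each membership message plus a flag saying whether the message was caused by network (in the C API, what Is_caused_network_mess reports).

```python
def daemon_of(member):
    """Return the daemon part of a private group name such as '#fred#able'."""
    # Private group names look like '#<private_name>#<daemon_name>'.
    parts = member.split("#")
    if len(parts) != 3 or parts[0] != "" or not parts[1] or not parts[2]:
        raise ValueError(f"not a private group name: {member!r}")
    return parts[2]


def vanished_daemons(previous, current):
    """Daemons that had members in `previous` but have none in `current`."""
    before = {daemon_of(m) for m in previous}
    after = {daemon_of(m) for m in current}
    return before - after


class GroupTracker:
    """Hypothetical helper: remembers the last membership seen per group."""

    def __init__(self):
        self.members = {}  # group name -> set of member names

    def on_membership(self, group, members, caused_by_network):
        """Record the new membership and return the daemons that dropped out.

        Only network-caused changes are reported; ordinary joins and
        leaves just update the remembered membership.
        """
        previous = self.members.get(group, set())
        current = set(members)
        self.members[group] = current
        if not caused_by_network:
            return set()
        return vanished_daemons(previous, current)
```

A non-empty result only tells you that every tracked member on that daemon disappeared at once, so the daemon has probably crashed or partitioned away. It is a good bet, not proof.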
Cheers!
-----
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200
On Jul 13, 2011, at 1:03 PM, Andrew Holt wrote:
Hi,
The interesting thing is I don’t get any leave messages.
To use my example: I kill Spread on node able and I get join messages for
#bill#baker
#who#baker
I presume this is caused by the spread network reforming around the failed node.
If I got leave messages for:
#fred#able
#joe#able
That would be fine.
Regards,
Andrew
On 13 Jul 2011, at 17:20, John Schultz wrote:
> Have at least one process per daemon join the groups you are interested in tracking. Track the membership of each group. If you see a caused by network membership change where all the *#able members leave, then it's a good bet that daemon has either crashed or partitioned away.
>
> If that's not good enough then you might be able to administratively, through code changes, get a report of the actual daemon membership which can tell you if able is in the daemon membership or not.
>
> Cheers!
>
> -----
> John Lane Schultz
> Spread Concepts LLC
> Phn: 301 830 8100
> Cell: 443 838 2200
>
> On Jul 13, 2011, at 11:27 AM, Andrew Holt wrote:
>
> Hi,
>
> I have two machines, each running the Spread server, and each set up so they know about each other. Let’s call them able & baker.
>
> I have a number of client programs, in the same group, on each.
>
> So:
>
> #fred#able
> #joe#able
> #bill#baker
> #who#baker
>
> If Spread dies or is killed, or a machine dies, how can I detect that the node has gone ‘offline’?
>
> baker receives a message of the type Is_caused_network_mess for each group. What I need to know is that the source is ‘able’.
>
> Any hints ?
>
> Thanks,
> Andrew
>
> =============================
> Andrew Holt
>
> Email: andrew.holt at 4asolutions.co.uk
>
> De Omnibus Dubitandum
> =============================
=============================
Andrew Holt
Email: andrew.holt at 4asolutions.co.uk
De Omnibus Dubitandum
=============================