[Spread-users] When do I need to rejoin a group?
Dustin J. Mitchell
dustin at v.igoro.us
Wed Nov 8 22:01:04 EST 2006
I have an app that's sending quite a lot of large packets via Spread to two
highly-loaded boxes. The sending app is on 'belmont', and the receiving boxes
are 'throop' and 'foster'. The app is, basically, a distributed filesystem.
There are SAFE_MESS metadata updates (new file, delete file, etc.), which I
assume all nodes in the group receive, and then there are data messages
(UNRELIABLE_MESS) containing chunks of data for the files themselves.
Obviously there's more to it than this, but I think those are the key
attributes for this question.
While this process is ongoing, I see the following on belmont:

    2006/11/08 00:35:07 CST group 'test' will undergo a membership change shortly
    2006/11/08 00:35:07 CST #r3342-9#throop disappeared from group 'test'
    2006/11/08 00:35:07 CST #r4647-9#foster disappeared from group 'test'
    2006/11/08 00:35:07 CST #r742-9#belmont disappeared from group 'test'

which seems a little odd -- "I", belmont, am being told that I have
disconnected from the group 'test'? My app blithely continues to send data to
the group, and other boxes seem to receive it. My (apparently incorrect)
understanding was that, so long as Spread doesn't forcefully disconnect me, I
can assume that *I* am still a member of the group, and that I will receive
all messages. But I have seen at least one case where a SAFE_MESS was lost,
and I suspect it has to do with my incorrect understanding.
I didn't find much explanation of this in the documentation -- what do the
"disappeared" messages mean for my application? I now *think* the correct
interpretation is that when I see my own name (#r742-9#belmont in the
above) come up in a disconnection notification, I should assume I've
missed some packets and enter my app's just-joined-the-group state.
Is this correct? I'm sorry this explanation isn't too clear -- obviously
there are some holes in my understanding.
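
To make the question concrete, here is a minimal sketch of the interpretation
proposed above, written in Python and deliberately not using the Spread API.
The names State, JUST_JOINED, IN_SYNC and on_disappeared are hypothetical and
exist only for illustration; whether a "disappeared" notification that names
the local member really implies missed messages is exactly the open question,
so treat this as a statement of the assumption rather than confirmed Spread
behaviour.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical application states, not part of Spread.
    JUST_JOINED = auto()  # local metadata is untrusted; resynchronize first
    IN_SYNC = auto()      # local metadata is believed to be complete


def on_disappeared(state, my_name, disappeared):
    """Return the state to use after a batch of 'disappeared' notifications.

    Assumption being tested by this post: if our own private name is in
    the batch, we may have missed messages and should behave as if we had
    just joined the group.
    """
    if my_name in disappeared:
        return State.JUST_JOINED
    # Only other members disappeared; keep whatever state we were in.
    return state


if __name__ == "__main__":
    me = "#r742-9#belmont"
    batch = ["#r3342-9#throop", "#r4647-9#foster", "#r742-9#belmont"]
    print(on_disappeared(State.IN_SYNC, me, batch))
```

With the three notifications from the log above, the local name is in the
batch, so under this assumption the app would drop back to its just-joined
state; a batch naming only throop or foster would leave the state unchanged.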