[Spread-users] Failure detection

Anders.Lindstrom at ubsw.com
Wed May 28 01:29:14 EDT 2003


I am thinking about using Spread to implement a replicated service. It is very important that this service never processes the same request more than once. It seems to me that the timeout-based failure detection mechanism used by Spread can lead to a situation where a machine is falsely suspected of having failed.

The specific example is this. There are two servers, A and B, where A is the primary and B is the secondary. Both machines are members of the "server" group. There is an arbitrary number of clients that are not part of the group but nevertheless send requests to it.

At some point in time, both A and B time out with respect to one another but they do not crash. That is, they are partitioned from one another but still alive. What happens when the clients send to the "server" group? Which one of them gets the request? Is it ever possible for both A and B to get a request from a client while they are partitioned?

If so, does this mean that I really need three servers, so that if a partition ever happens, the component with two members becomes the primary component? Would any server that finds itself isolated then deem itself failed and stop responding to requests?
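
To make the three-server idea concrete, here is a minimal sketch of the majority test I have in mind. It is plain Python and does not call Spread at all: the function name in_primary_component, the server names, and the assumption that the application gets the current member list from a membership notification are all hypothetical and only for illustration.

```python
def in_primary_component(view_members, all_servers):
    """Return True if the current view holds a strict majority of the
    statically configured servers (hypothetical helper, not a Spread call)."""
    configured = set(all_servers)
    # Count only the configured servers that are visible in this view.
    present = set(view_members) & configured
    # Strict majority: more than half of the configured servers.
    return len(present) > len(configured) // 2


# Hypothetical three-server configuration.
SERVERS = ["A", "B", "C"]

# A is cut off on its own; B and C can still see each other.
isolated_side = in_primary_component(["A"], SERVERS)
majority_side = in_primary_component(["B", "C"], SERVERS)
```

With three servers, at most one side of any partition can pass this test, so an isolated server would refuse requests. With only A and B, neither side of a split has a strict majority, so both would stop serving, which is why two servers do not seem to be enough.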

Anders.
