[Spread-users] Fault Tolerant Server

Thu Dec 23 02:06:30 EST 2004

You should also check "wackamole" (look at www.backhand.org). The truth is
that I ran into many problems (compilation errors, crashes...) trying to use
it on Linux (I gave up eventually), but the main idea there is using virtual
IPs.
A "service" is provided by a set of virtual IPs, which enables both fault
tolerance and load balancing.
There is still one problem in the way of deciding which is the "active"
service provider. Ordering is good, and works fine in one direction:
when the 0-th node is down, the 1-st node takes over, and so on. But
when "0" is up again, there might be an ambiguous situation.
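
To make the takeover rule concrete, here is a minimal Python sketch. It is only one possible reading of the ambiguity: it assumes that for a short time the nodes disagree about who is up. The node names and the active_node helper are made up for illustration and are not taken from wackamole.

```python
def active_node(order, up):
    """Return the first node in the fixed ordering that is currently up."""
    for node in order:
        if node in up:
            return node
    return None


# Hypothetical fixed ordering of the service providers.
order = ["node0", "node1", "node2"]

# Everyone is up: the 0-th node is active.
print(active_node(order, {"node0", "node1", "node2"}))

# node0 goes down: the 1-st node takes over.
print(active_node(order, {"node1", "node2"}))

# Assumption: node0 is back, but node1 has not noticed yet,
# so the two nodes compute the answer from different views.
view_of_node0 = {"node0", "node1", "node2"}
view_of_node1 = {"node1", "node2"}
claimants = {active_node(order, view_of_node0),
             active_node(order, view_of_node1)}
print(sorted(claimants))  # more than one node believes it is active
```

As long as every node applies the rule to the same view, the answer is unique; the trouble starts only when the views differ.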
Another problem is that not all IT people (and some routers) like software
that messes with their IPs.
Since I'm working on it right now, I'll be glad to share more code/ideas on
that subject.

Yuval.

> David,
>
> A lot of what you do is not needed when using Spread. The group membership
> notifications give you the whole process of discovery for free. It is also
> guaranteed that the group membership lists will contain the members in the
> same order for all members.
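
As a rough illustration of why identical ordering is enough, the Python sketch below derives a rank straight from a membership list. It calls no Spread API: the list is assumed to be whatever the membership notification delivered, and the member names and the rank_of helper are hypothetical.

```python
def rank_of(members, me):
    """Rank of `me` is simply its position in the delivered membership list."""
    return members.index(me)


# Hypothetical membership list, assumed identical (same order) at every member.
members = ["#calc1#hostA", "#calc2#hostB", "#calc3#hostC"]
print(rank_of(members, "#calc2#hostB"))  # position 1 in the list

# After the first member leaves, a new list is delivered to everyone
# and the remaining members move up by one.
members_after_leave = ["#calc2#hostB", "#calc3#hostC"]
print(rank_of(members_after_leave, "#calc2#hostB"))  # position 0
```

Because every member sees the same list, every member computes the same ranking without exchanging any discovery messages of its own.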
>
> Play a bit with sptuser or spuser and try it with several members and
> several groups and see how it works.
>
> If I missed something, let me know.
>
> Cheers,
>
>	:) Yair.
>
> David Avraamides wrote:
> The basic approach I use is what I call "discovery". In our world each 
> type of service would publish on a different spread group and whenever 
> a service comes up it broadcasts a discovery request message on the group.
> Any other instances of the service respond with a discovery reply and 
> the requesting service adds their private group name to a list. This 
> service then adds itself to the list. The list is sorted 
> alphabetically and this determines the ranking of the peers. 
> Additionally, each service sends out heartbeats and a listening 
> service updates the last heartbeat time of that peer. A background 
> timer removes stale peers from the peer list. The net effect is that 
> each instance of a service should maintain an identical list of the active
> peers in the peer group.
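
The bookkeeping described above could be sketched in Python roughly as follows. This is not the original C# code: the PeerList class, its method names and the peer names are invented for illustration, and the 5 second stale timeout is the value mentioned under "Issues" further down. Times are passed in explicitly so the example is deterministic.

```python
class PeerList:
    """Tracks peers by last heartbeat and ranks them alphabetically."""

    def __init__(self, me, stale_after=5.0):
        self.me = me                    # this instance's private group name
        self.stale_after = stale_after  # seconds without a heartbeat
        self.last_seen = {}             # peer name -> time of last message

    def on_discovery_reply(self, peer, now):
        self.last_seen[peer] = now

    def on_heartbeat(self, peer, now):
        self.last_seen[peer] = now

    def prune(self, now):
        """Drop peers whose last heartbeat is older than stale_after."""
        self.last_seen = {p: t for p, t in self.last_seen.items()
                          if now - t <= self.stale_after}

    def ranked(self):
        """Alphabetically sorted list of live peers, including ourselves."""
        return sorted(set(self.last_seen) | {self.me})

    def rank(self):
        return self.ranked().index(self.me)


peers = PeerList("#calc2#hostB")
peers.on_discovery_reply("#calc1#hostA", now=0.0)
peers.on_heartbeat("#calc3#hostC", now=1.0)
print(peers.ranked(), peers.rank())

# calc1 goes silent, calc3 keeps sending heartbeats.
peers.on_heartbeat("#calc3#hostC", now=4.0)
peers.prune(now=6.0)
print(peers.ranked(), peers.rank())
```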
> 
> All services go through this process and the decision of whether a 
> service should implement fault-tolerance, horizontal scaling, or both 
> is up to the derived class. In the fault-tolerant only case, the peer 
> with rank 0 is the only one that publishes messages. All of the peers 
> will "hear" requests but only the 0-th peer in the ranking is active (i.e.
> the master). If it goes down, planned or unplanned, the rest of the 
> peers will adjust their list and one of them will become the new master 
> and start publishing. This works whether the service is a 
> request-reply model or a pub-sub model. In the pub-sub model, after 
> discovery, the publisher sends out a topic list request so all clients 
> will let it know what topics they are listening to. This way it should 
> maintain the same subscription list as the master publisher.
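
A small Python sketch of the fault-tolerant pub-sub case follows, under the assumption that every peer (master or standby) records the topic list replies it hears. The SubscriptionTable class, the should_publish helper and the client and topic names are hypothetical, not the original implementation.

```python
from collections import defaultdict


class SubscriptionTable:
    """Topic -> set of subscribed clients, rebuilt from topic list replies."""

    def __init__(self):
        self.topics = defaultdict(set)

    def on_topic_list_reply(self, client, topics):
        for topic in topics:
            self.topics[topic].add(client)

    def subscribers(self, topic):
        return sorted(self.topics.get(topic, ()))


def should_publish(rank):
    """Only the peer with rank 0 (the master) publishes."""
    return rank == 0


# A newly started standby asks for topic lists and records the replies.
table = SubscriptionTable()
table.on_topic_list_reply("#client1#hostX", ["IBM", "MSFT"])
table.on_topic_list_reply("#client2#hostY", ["IBM"])
print(table.subscribers("IBM"))

# While its rank is 1 it stays silent; once the master is gone and its
# rank drops to 0 it already knows whom to publish to.
print(should_publish(1), should_publish(0))
```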
> 
> For the horizontal scaling case, I simply "distribute" the requests 
> among the peers by modding the request ID (set uniquely by the client) 
> with the number of peers in the peer list. If the remainder matches 
> this instance's rank, it processes the message, else it discards it. For 
> subscriptions, I don't have a request ID so I mod the hash of the 
> topic ID. Same diff - each client request is handled by one service.
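
The modding scheme can be sketched in Python as below. The function names and sample IDs are made up. One detail is an assumption of this sketch rather than something stated above: zlib.crc32 is used as the topic hash because Python's built-in hash() for strings varies between processes, and all peers must agree on the value.

```python
import zlib


def handles_request(request_id, rank, peer_count):
    """True if the peer with this rank owns the request.

    peer_count is at least 1 because the peer list includes this instance.
    """
    return request_id % peer_count == rank


def handles_topic(topic_id, rank, peer_count):
    """Same idea for subscriptions, using a process-independent hash."""
    return zlib.crc32(topic_id.encode("utf-8")) % peer_count == rank


peer_count = 3
for request_id in (100, 101, 102, 103):
    owners = [r for r in range(peer_count)
              if handles_request(request_id, r, peer_count)]
    print(request_id, owners)  # exactly one owning rank per request

for topic in ("IBM", "MSFT"):
    owners = [r for r in range(peer_count)
              if handles_topic(topic, r, peer_count)]
    print(topic, owners)       # exactly one owning rank per topic
```

Every peer sees every request and evaluates the same test, so no coordination is needed as long as all peers hold the same peer list.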
> 
> Issues:
> - I don't support true load balancing, it's really just dynamic request 
> partitioning. But that's fine for our needs (so far).
> - There are possible race conditions (server dies while processing a 
> request and no other service will pick up the request). I just let the 
> client handle it by retrying the request later. A typical client will 
> blast out a large number of requests and monitor the replies and after 
> some time period, time out and examine/retry any missing replies. If 
> two replies are received by a client, the last one wins. In our world 
> (hedge fund) the latest one is the best one to use.
> - I'm still playing with good timeouts for how long to wait before 
> marking a service as stale or waiting for discovery replies (right now 
> 5 seconds).
> - Some services partition better with a less random scheme. For example, our 
> calculation service is best partitioned by the type of model 
> (convertible bond, option, CDS, etc.) so it can look at the instrument 
> type of the request rather than modding the request ID. The point is 
> that it's up to the specific implementation of the service to make this 
> decision.
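
The client-side handling mentioned in the issues above (retry on timeout, last reply wins) might look roughly like this in Python. The ReplyTracker class and its method names are invented, and the 5 second client timeout is an assumption for the example; the source only gives 5 seconds for the service-side timeouts.

```python
class ReplyTracker:
    """Remembers outstanding requests and keeps the latest reply for each."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout  # assumed value, tune per client
        self.sent_at = {}       # request ID -> time the request was sent
        self.replies = {}       # request ID -> most recent reply

    def on_sent(self, request_id, now):
        self.sent_at[request_id] = now

    def on_reply(self, request_id, value):
        # A later reply overwrites an earlier one: the last one wins.
        self.replies[request_id] = value

    def missing(self, now):
        """Request IDs with no reply after the timeout; candidates for retry."""
        return sorted(r for r, t in self.sent_at.items()
                      if r not in self.replies and now - t >= self.timeout)


tracker = ReplyTracker()
for request_id in (1, 2, 3):
    tracker.on_sent(request_id, now=0.0)

tracker.on_reply(1, "price A")
tracker.on_reply(1, "price B")   # duplicate reply, replaces the first
tracker.on_reply(3, "price C")

print(tracker.replies[1])        # the later reply for request 1
print(tracker.missing(now=6.0))  # request 2 never got a reply
```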
> 
> I've also written a launcher service that can start/stop other 
> services proactively or reactively, thus our risk scenario process 
> could ask the launcher service to start up the calculation service on 
> every machine on the LAN, run our risk scenarios, then ask them to be 
> stopped (that hasn't really been tested yet, but it's designed in). My 
> plan is to deploy the launcher and "service" assembly (this is all in 
> C#) on every machine in the firm (client or server) and make them 
> available as worker machines if/when needed.
> 
> -Dave