[Spread-users] Newbie: Initial queries

Wed Mar 13 07:04:53 EST 2002

Hi,
I am briefly evaluating Spread as a system with the speed of multicast but with failover detection and guaranteed
delivery. I am particularly interested in having no single point of failure (meaning failure of the whole system rather than of one node).

I have naively been playing with Spread to see whether it can tolerate the spread binary being killed and restarted.

This is my spread.conf

Spread_Segment  192.168.42.255:4803 {

        master    192.168.42.91
        node100 192.168.42.100
        node101 192.168.42.101
        node102 192.168.42.102
}

This is what I 'think' I see, using the demo program spuser (user.c):

If the spread process on the master node dies, spuser exits on the other nodes as well,
but only because it chooses to, not because it is forced to. If I remove the exit(0) from user.c,
the nodes running spuser see the following when I kill the spread process on the master node:

received FIFO message from Üa@ðq@, of type 0, (endian 3) to 43432 groups

.

If I restart the spread process on the master node, the other nodes cannot interact with it.
I try:

j nodes

but nothing happens.

-----------------------------------------------

I realise I am probably not doing this the proper way, but I would love to know whether the spread process can be restarted
so that:
#1 the other nodes can rejoin the groups (in fact, do anything Spread related), or even nicer,
#2 the other nodes are automagically rejoined to the groups.
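
To make #1 concrete, here is a rough sketch of the kind of retry loop a client could run instead of calling exit(0). Nothing here is taken from user.c: the callback types, reconnect_and_rejoin() and the fake_daemon stubs are hypothetical names made up for illustration, and I am only assuming that in a real client the two callbacks would wrap SP_connect() and SP_join().

```c
#include <stddef.h>

/* Hypothetical callback types. In a real client these would
 * presumably wrap SP_connect() and SP_join(); here they are
 * placeholders so the loop can be tried without Spread. */
typedef int  (*connect_fn)(void *ctx);                    /* 0 on success */
typedef int  (*join_fn)(void *ctx, const char *group);    /* 0 on success */
typedef void (*wait_fn)(unsigned seconds);                /* may be NULL  */

/* Try to connect up to max_attempts times, waiting between tries.
 * On the first successful connect, rejoin every group in groups[].
 * Returns the attempt number that succeeded, or -1 on failure. */
int reconnect_and_rejoin(connect_fn do_connect, join_fn do_join,
                         wait_fn do_wait, void *ctx,
                         const char *const groups[], int num_groups,
                         int max_attempts, unsigned delay_seconds)
{
    int attempt, i;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (do_connect(ctx) == 0) {
            for (i = 0; i < num_groups; i++) {
                if (do_join(ctx, groups[i]) != 0)
                    return -1;      /* connected, but a join failed */
            }
            return attempt;
        }
        /* Daemon still down: wait before the next try, if asked to. */
        if (attempt < max_attempts && do_wait != NULL)
            do_wait(delay_seconds);
    }
    return -1;                      /* gave up */
}

/* Simulated daemon (illustration only): refuses the first
 * failures_left connection attempts, then accepts. */
struct fake_daemon {
    int failures_left;
    int joined;                     /* number of groups joined */
};

int fake_connect(void *ctx)
{
    struct fake_daemon *d = ctx;
    if (d->failures_left > 0) {
        d->failures_left--;
        return -1;
    }
    return 0;
}

int fake_join(void *ctx, const char *group)
{
    struct fake_daemon *d = ctx;
    (void)group;                    /* the stub ignores the group name */
    d->joined++;
    return 0;
}
```

The idea would be to call something like reconnect_and_rejoin() at the point where the client notices the connection has gone away, passing the list of groups it had joined (just "nodes" in my test). Whether this is enough for Spread to accept the client again after a daemon restart is exactly what I am unsure about.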

Comments welcome,

David T