[Spread-users] Spread 4.0 daemon "goes to sleep" on XP network disconnect

Ryan Caudy rcaudy at gmail.com
Wed Aug 8 07:12:52 EDT 2007


To better assess the issue, you might want to create logs of what's
going on, with membership information printed.  See the sample
spread.conf distributed with Spread, or the documentation at
spread.org, for more information on how to do this.
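
As a rough sketch, a spread.conf fragment along these lines should
turn on membership logging.  The option and flag names here are
assumed to match the sample spread.conf shipped with Spread 4.0, and
the log file name is made up, so please verify both against your own
copy:

```text
# Assumed option names; verify against the sample spread.conf for your version.
# PRINT and EXIT are the defaults; MEMBERSHIP adds membership-algorithm events.
DebugFlags = { PRINT EXIT MEMBERSHIP }

# Hypothetical file name; without this option the daemon logs to stdout.
EventLogFile = spread_A.log
```

Using a separate log file per daemon (A, B and C) makes it easier to
compare what each one does around the time the cable is pulled.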

It sounds to me like A is starting the membership change, but failing
to complete it, even though B and C are able to start the change and
install a new membership in the same period of time.  During the
execution of the membership algorithm, new client messages are blocked
until completion, which would explain why messages sent through A1 are
not delivered anywhere (even to A2) until A is reconnected.

Also, please check the regular membership messages received at clients
on each side of the partition you're creating.  Everyone should be
receiving A1 ... C2 in their new membership lists, but A1 and A2
should be in one of the VS sets, and B1 ... C2 should be in the other.
If this is not the case, then either my assumptions/understanding are
wrong or there's incorrect behavior in the daemon.
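
The check described above can be sketched in a few lines of Python.
This is only an illustration, not Spread API code: it assumes the
client has already pulled the member names and the VS sets out of a
regular membership message (however your client library exposes
them), and check_merge_view, side_a and side_bc are hypothetical
names.

```python
from typing import Iterable, List, Set


def check_merge_view(members: Iterable[str],
                     vs_sets: Iterable[Iterable[str]],
                     expected_sides: Iterable[Iterable[str]]) -> List[str]:
    """Compare one membership view against the expected partition sides.

    Returns a list of human-readable problems; an empty list means the
    view matches the expectation.
    """
    problems: List[str] = []
    member_set: Set[str] = set(members)
    expected = [frozenset(side) for side in expected_sides]
    actual = [frozenset(vs) for vs in vs_sets]

    # Every client from every side should appear in the membership list.
    everyone = frozenset().union(*expected)
    missing = everyone - member_set
    if missing:
        problems.append(f"missing from membership list: {sorted(missing)}")
    unexpected = member_set - everyone
    if unexpected:
        problems.append(f"unexpected members: {sorted(unexpected)}")

    # Each side of the partition should show up as its own VS set.
    for side in expected:
        if side not in actual:
            problems.append(f"no VS set equal to {sorted(side)}")
    return problems


side_a = ["A1", "A2"]
side_bc = ["B1", "B2", "C1", "C2"]

# A view with the two sides in separate VS sets, as expected.
good = check_merge_view(side_a + side_bc, [side_a, side_bc], [side_a, side_bc])

# A view where everyone is lumped into a single VS set.
bad = check_merge_view(side_a + side_bc, [side_a + side_bc], [side_a, side_bc])

print(good)
print(bad)
```

Running the same check on the membership message seen by a client on
each machine gives a quick way to spot which side, if any, disagrees
with the expectation.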

I think the most likely scenario at this point is that there's a
networking issue preventing A from completing the membership algorithm
and installing its new (solo) configuration.

Cheers,
Ryan

On 8/8/07, Steve Duff <Steve.Duff at vivista.sungard.com> wrote:
> I'm evaluating Spread 4.0 as a possible candidate for a group
> communication task, and I've noticed an unexpected behaviour (well,
> unexpected to me, anyway).
>
> I have three machines A, B, and C each running a daemon, and with two
> local clients (i.e. on the same machine), A1, A2, B1 etc. all subscribed
> to the same group.
>
> If I pull the network cable out of machine A, daemons B and C notice
> this and tell B1, B2, C1 and C2 of the membership change. A1 and A2 get
> no membership change.
>
> As expected, messages can be exchanged between clients on B and C, but
> when messages are sent through A1 they are not delivered anywhere (even
> to A2). If I then reconnect A, then clients on A, B, and C all get
> membership change messages, and any messages sent from A1 while
> disconnected are now delivered to all clients.
>
> My expectation was that daemon A would eventually tell A1 and A2 that
> they had been separated from the rest of the group, but this doesn't
> happen. This makes the membership change message that they DO receive on
> being reconnected seem spurious. Also the messages that are delivered on
> reconnect are out of context and would cause problems when they are
> delivered.
>
> I tried this initially on three virtual machines, but have confirmed the
> behaviour is the same with real machines. I've tried using a single
> spread segment, and three separate spread segments, but this also makes
> no difference. I've also tried using two switches and separating A from
> B and C by disconnecting the switches from each other - in that case A
> works entirely as expected.
>
> I think the possibilities are:
>
> A) My expectation that A should separate and notify its clients in this
> circumstance is simply wrong, in which case could somebody please
> explain why.
> B) This is a "feature" caused by Windows XP networking, i.e. it doesn't
> happen on other platforms.
> C) It is a feature of the Spread implementation, and I need to work
> around it at the application level.
>
> I would appreciate it if anyone could offer any help with this.
>
>
> Thanks
> Steve