[Spread-users] Re: partition detection

Mon Apr 22 16:18:29 EDT 2002

> > > 1) JOIN: This means a single member joined the group. No one failed and the
> > > 
> > > 2) LEAVE: A single member left the group (someone called SP_leave). the
> > > 
> > > 3) DISCONNECT: A single member 'disconnected' from the daemon it had been
> > 
> > I believe the API provides for multiple members joining, leaving or
> > disconnecting in this way.  Does Spread guarantee that you get a
> > separate message for each join/leave/disconnect event?  Could this
> > change in the future?  (IOW should I write code that can handle
> > multiple members, or can I safely assume this will never happen?)
> 
> Currently the API specifies that a caused_by_join, caused_by_leave and
> caused_by_disconnect will always be a single member. So changing that would
> be an API break and major change. Because of the way it is implemented I
> don't anticipate that changing anytime soon, if at all.

Thanks.  I think I was confused by the data structure returned by
SP_receive(), which is declared as char groups[][MAX_GROUP_NAME]
suggesting an array of group names.  But the explanation of
CAUSED_BY_JOIN clarifies that it's only one process that joined.

(As a result of this confusion, the Python API returns a list of
member names here. :-( )

> You may be thinking of SP_multigroup_multicast, which sends a
> message to multiple groups at once. Or it would be possible to have
> an API allowing you to join multiple groups in one "join" call, but
> that would still cause each group to see a join of a single member.

Agreed.

> Because join/leave/disconnect events are naturally generated one at
> a time for a single connection at a time, the only way they could be
> aggregated into multi-member joins is if we waited before handling
> one to see if more arrived and that would not give much gain at all
> for the added complexity (and increase in latency).

Good point!

> Now the Network events can easily have multiple members join or
> leave at once because they represent actual topology changes which
> affect lots of daemons and members. So if you receive a network
> event you must be able to handle any possible membership change of
> possibly many members.

Understood.
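
To make the distinction concrete, here is a minimal sketch in Python of
how an application might treat the two kinds of change. It does not use
the real Spread bindings: Cause, MembershipEvent and apply_event are
hypothetical names invented for illustration, and the idea that a
network event carries the complete new member list is an assumption of
the sketch, not something stated above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import FrozenSet, Tuple


class Cause(Enum):
    """Hypothetical stand-ins for the caused_by_* reasons discussed above."""
    JOIN = auto()
    LEAVE = auto()
    DISCONNECT = auto()
    NETWORK = auto()


@dataclass(frozen=True)
class MembershipEvent:
    """Hypothetical model of a membership message (not the real API)."""
    cause: Cause
    changed: Tuple[str, ...]   # names reported as joined/left/disconnected
    members: FrozenSet[str]    # assumed: full membership after the change


def apply_event(current: FrozenSet[str], event: MembershipEvent) -> FrozenSet[str]:
    """Return the new local view of the group after one membership event."""
    if event.cause is Cause.NETWORK:
        # A network event may add and remove many members at once, so
        # replace the whole view instead of patching it name by name.
        return frozenset(event.members)

    # JOIN, LEAVE and DISCONNECT are documented as single-member events;
    # fail loudly if that assumption is ever violated.
    if len(event.changed) != 1:
        raise ValueError(
            f"{event.cause.name} reported {len(event.changed)} members, expected 1"
        )
    (name,) = event.changed

    if event.cause is Cause.JOIN:
        return current | {name}
    return current - {name}   # LEAVE or DISCONNECT
```

Checking the length explicitly, rather than silently taking the first
name, means a future change in behaviour would show up as an error
instead of as a quietly wrong member list.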

Perhaps you can also write up a clarification of the TRANS_MEMB_MESS
message?  That's still a mystery to me.  I have no idea how I can
force it so as to test my application's response to it!  This makes it
hard to even think about writing code to handle it...

--Guido van Rossum (home page: http://www.python.org/~guido/)