[Spread-users] Sending messages to a fixed set of servers
Christian Schnell
lulli at cs.tu-berlin.de
Fri May 7 07:25:28 EDT 2004
Hello Ciprian,
thanks for your very helpful reply. I have looked through
"Impossibility of Distributed Consensus with One Faulty Process"
(Fischer-Lynch-Patterson) which without doubt is another important brick
of knowledge in all this (never thought about it on that general kind of
a level).
Well, I agree that my first quick-shot idea doesn't even live up to my
own requirements, just forget about it (forgive me, I'm still novice
with these matters and will remain so for unknown lengths). Still,
thanks for your suggestions.
I've read through "From Total Order to Database Replication" and it is
really inspiring for me in my current position. You and Yair propose a
lot of properties that I think also comply for the service that I'm
thinking about. I'm afraid I haven't yet understood all details (I'll
read it again), but I think I can now identify my consistency
requirements a lot better than before :)
Now, a general database replication service (as you and Yair propose so
nicely) clearly has stronger global consistency requirements than a
persistent message service. Most importantly: unlike a DBMS, a PMS does
not by itself demand the separation in primary and secondary components,
in a persistent message service all components are primary (well at
least that's what I have in mind, or more general: it should be up to
the application to decide whether it requires that level of
consistency). Do you agree?
The persistent messages of any node N can be delivered anywhere as long
as all previous persistent messages from all past components that
included N have been delivered there, that's my basic consistency
requirement, plus correct order (happened-before relations) of
persistent messages and components. Any objections?
I think I could introduce a "persistent session"-level above the
component level. Nodes must explicitly enter a new persistent session
whenever they detect the start of a new regular component. Each
persistent message is contained and ordered in exactly one persistent
session. Nodes store information about
a. persistent messages and their order within their session,
b. persistent sessions (a global unique session id and membership info) and
c. the "happened-before" relations between adjacent persistent sessions
I think it's natural to attempt to capture messages in the context of
their component. At the end, a component and its sequence of messages is
known to a (growing) set of servers and needs to be replicated to a
disjunct (shrinking) set of servers. This should leave the replication
layer with a lot less information to interchange between nodes in order
to determine what to replicate.
Since I get no information like a global unique component identifier or
about the directly preceeding components, I have to establish that level
of knowledge by hand, that's why I introduce sessions above components.
At my current level of knowledge, this makes sense to me and should be
possible to develop without the need for component-wide consensus. The
algorithm that I have in mind would
1) guarantee that a node becomes member of a current persistent session
as long as it can receive its own messages
2) block a node from sending out new messages until it is member of a
persistent session and
3) queue incoming persistent messages from another node N until all
preceeding persistent sessions that N participated in have been
delivered to the application
I'll of course post all my thoughts here to the list, but I feel here is
a good point to stop for now and to listen to your opinions about that.
Thanks,
Christian.
More information about the Spread-users
mailing list