[Spread-users] Sending messages to a fixed set of servers

Wed May 5 13:48:31 EDT 2004

Hi Christian,

CS> After thinking a bit more about Lamports logical clocks, I think I
CS> understood a very simple solution of how to place a correct and usefull
CS> total order on messages (it's rather obvious).

The solution you present will only work in well-behaved environments
where nodes do not partition from each other and do not even crash
(you mentioned storing the timestamp on shutdown/startup). You could
modify your idea to store the timestamp to disk each time a message is
received. Furthermore, this method does not allow you the detection of
"holes" in the message ordering. That is to say that when you receive
a message m, you do not know if you can deliver it or whether you need
to wait for some other message m' that is ordered before m, but that
was missed by the current receiver due to some network fault.

All these delicate issues stem, if you want, from the famous
Fischer-Lynch-Patterson impossibility result which says that you
cannot reach agreement in an asynchronous system susceptible to even
one crash fault. There are practical and theoretical ways that
circumvent the issue. The theoretical solutions involve the notion of
failure detectors. The practical solutions may rely on different
assumptions and they are usually designed such that the correctness of
the protocol is _never_ broken, but the liveness may be compromised
(ie all nodes block) in the presence of certain failure patterns.

For more detailed literature you can have a look at the paper
"From Total Order to Database Replication"
that you can find here: http://www.cnds.jhu.edu/publications
(direct link: http://www.cnds.jhu.edu/pub/papers/AT02_icdcs.pdf )
The paper presents exactly what the difference is between what seems
to be an obvious correct solution, and what a real solution that
provides consistent persistent order should be.

Ciprian

CS> A persistent, system-wide logical clock that spans across all messages
CS> for the entire run-time of a "persistent group". Before sending a 
CS> message m, a node increments its local logical-clock value and sends
CS> that with m. Upon receiving a message m, a node updates its local 
CS> logical-clock value: logical-clock = MAX(logical-clock, m.lctime + 1).
CS> On shutdown/startup, each node stores/restores its local logical-clock
CS> value. Changes in the configuration can be ignored, and by building the
CS> pair of (local-IP, logical-clock) I have a nice "primary key" for 
CS> message log records.

CS> With the totally ordered logical-clock in the message log records, it
CS> becomes *much* simpler to determine the set of mission messages at each
CS> node, correct?

CS> Thanks for any input,
CS> Christian.

CS> _______________________________________________
CS> Spread-users mailing list
CS> Spread-users at lists.spread.org
CS> http://lists.spread.org/mailman/listinfo/spread-users