[Spread-users] how spread persist messages

Mon Mar 17 10:13:27 EDT 2008

John Song,

Spread does not persist messages.  It keeps messages in main memory until 
it knows all the currently connected daemons have received it or a daemon 
level membership change is completed and message buffers are cleared.  In 
a default configuration, I believe Spread will keep at most a few thousand 
of such messages in main memory and these messages can be up to about 100K 
in size.

Spread Concepts offers a commercial product, Congruity, that offers client 
based message persistance on top of Spread.  It is somewhat similar in 
concept to JMS using persistant subscriptions and messages.  First, you 
define a group of processes with which you want reliability maintained. 
Every message sent by one of these members will be persisted and delivery 
will be retried until every participant has successfully received the 
messages (i.e. - replays after crashes, disconnects, network partitions + 
heals, etc.). When you send a message, it is persisted to disk 
synchronously before being sent to Spread and it offers features to 
resynchronize on crash recovery (e.g. - knowing which messages were sent + 
delivered before the crash so that no messages are lost).  Furthermore, it 
offers a powerful ordering service where every participant can receive 
every message in the exact same order (global, persistent, total order), 
which can make designing distributed algorithms or implementing state 
machine based replication much easier.

You can read more about it at:

http://www.spreadconcepts.com/replication_info.html

If you are interested in learning more about Congruity, then please write 
to us at info at spreadconcepts.com

Cheers!

---
John Schultz
Spread Concepts
Phn: 443 838 2200

On Sat, 15 Mar 2008, John Song wrote:

> It is doable.  One option is to have a message aggregator that listens for
> messages from multiple message clients.  The aggregator can do message
> persistence.  A side benefit of an aggregator is in preserving orders across
> distributed message clients.   Persistence is done to local file system.
> Then there is a message distributor daemon that actually distribute the
> message to actual message consumers who listens on a different group.
>
> While this method works, there are a couple of reason why it is not ideal:
> 1. if the aggregator is down, message is lost(?).  For asynchronous message
> delivery, ideally, a success return code means that message has been
> persisted (not necessarily delivered).  If the local spread client can do
> the persistence, then there is less of a chance of dropped messages.
> Pulling the existence of an aggregator on every message delivery is both
> expensive and still not 100% sure of persistence.
> 2. the aggregator that persists message need to synchronous with the
> dsitributor daemon while reading/writing to the local file system.  While it
> is not that difficult, it will be better it is not implemented case by case,
> but rather by a component of a mesaging platform.
>
> I read the spread user manual.  The user manual is more a spread
> introduction literature plus an api reference manual. I can't find more
> indepth architecture discussion on spread's internal.  E.g. spread will
> retry a message delivery if there is a network failure.  That means it
> maintains an internal buffere.  However, how big is the buffer and how large
> each message can be?
>
> Is there any white paper on spread?  Any related projects that has done file
> system based message persistence with reader/writer synchronization?
>
> thanks,
>
> -john