[Spread-users] Spread usage scenario for sending data
Ryan Caudy
caudy at jhu.edu
Tue Apr 27 08:27:20 EDT 2004
My answers are inlined.
--Ryan
Mayne, Peter wrote:
> I'm looking at Spread to implement a data replication scenario, but the
> documentation is ambiguous enough that I can't tell if Spread is
> suitable or not.
>
> The scenario has a central shared database with (say) 30 client sites
> sharing the data. Each of these client sites also has a site-specific
> database that is shared with the central site. Each of the 31 sites can
> update its own copy of the shared database and site-specific database,
> and each night, changes are copied to the central site. (Ignore possible
> conflicts for the purposes of this explanation.)
>
Maybe you have some sort of special data semantics, like commutative
updates... otherwise, the amount of complexity being ignored is fairly
significant.
> Spread would appear to be useful in that the client sites could fetch
> their copy of the shared database changes in a sequence of reliably
> delivered messages, and send their copy of the site-specific database
> changes likewise.
>
You would probably use at least FIFO.
> My initial experiment with Spread doesn't look promising. I set up a
> Spread daemon with two clients (spuser) in the same group, and sent
> messages between clients using spuser's default SAFE_MESS. If a
> receiving client dies, then further messages sent to that group are
> lost, which doesn't seem very safe. Membership of a group appears to
> require the client being connected, and the Spread daemon doesn't store
> the messages until the client comes online.
>
> The documentation makes it hard to figure out what is meant by
> "reliable". It appears that my concept of reliable (JMS persistent
> messages, for instance) isn't the same as Spread's. There's nothing
> wrong with that, but I'm not sure what Spread's definition of reliable is.
>
Reliable for Spread means the same thing as reliable for other network
protocols... if a message (UDP) is lost, it will be resent until it does
arrive. Depending on the service type chosen, such message losses may
delay delivery of many other messages, in order to maintain safety and
ordering guarantees.
For persistence, there is a robust JMS implementation that uses Spread.
I'm not sure if their persistent version has been released... see the
JMS4Spread site (linked from www.spread.org) if you're interested.
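
To illustrate why one lost message can delay delivery of later ones under
an ordering guarantee, here is a minimal sketch of a per-sender hold-back
queue. It is not Spread's actual protocol or API; the class name
InOrderReceiver and its methods are hypothetical, and it assumes each
sender numbers its messages 0, 1, 2, ...

```python
class InOrderReceiver:
    """Hypothetical receiver that delivers one sender's messages in order."""

    def __init__(self):
        self.next_seq = 0   # next sequence number that may be delivered
        self.held = {}      # out-of-order messages waiting for a gap to fill

    def receive(self, seq, payload):
        """Accept a message (possibly a retransmission).

        Returns the list of payloads that can be delivered right now.
        """
        if seq < self.next_seq or seq in self.held:
            return []       # duplicate caused by a retransmission: ignore it
        self.held[seq] = payload
        delivered = []
        # Deliver as long as there is no gap in the sequence.
        while self.next_seq in self.held:
            delivered.append(self.held.pop(self.next_seq))
            self.next_seq += 1
        return delivered


receiver = InOrderReceiver()
receiver.receive(0, "a")   # delivered immediately
receiver.receive(2, "c")   # held back: message 1 is missing
receiver.receive(3, "d")   # held back as well
receiver.receive(1, "b")   # the resent message releases b, c and d together
```

Messages 2 and 3 arrived intact but could not be delivered until the
resent message 1 showed up, which is the delay described above.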
> However, it appears that sending a list of changes via Spread, with the
> currently off-line clients fetching them whenever they are able, is not
> suited to the Spread paradigm.
>
Under these circumstances, there seems to be no reason not to implement
this persistence service centrally, at the master replica. Buffer
messages, and keep a vector of sequence numbers up to which each client
is up to date. Any message with a sequence number that is less than or
equal to the minimum from the vector may be discarded.
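
The buffering scheme just described can be sketched roughly as follows.
This is only an illustration under assumptions: the class UpdateBuffer,
its method names, and the choice to start sequence numbers at 1 are all
hypothetical, and storage is in memory only (a real master would need to
persist both the buffer and the vector).

```python
class UpdateBuffer:
    """Hypothetical central buffer kept at the master replica."""

    def __init__(self, client_ids):
        self.next_seq = 1                             # next sequence number to assign
        self.messages = {}                            # seq -> buffered update
        self.acked = {cid: 0 for cid in client_ids}   # vector: highest seq each client has

    def append(self, payload):
        """Buffer a new update and return its sequence number."""
        seq = self.next_seq
        self.messages[seq] = payload
        self.next_seq += 1
        return seq

    def pending_for(self, client_id):
        """Updates the client has not yet confirmed, in sequence order."""
        last = self.acked[client_id]
        return [(s, self.messages[s]) for s in sorted(self.messages) if s > last]

    def acknowledge(self, client_id, seq):
        """Record that the client is up to date through seq, then discard."""
        self.acked[client_id] = max(self.acked[client_id], seq)
        low_water = min(self.acked.values())
        # Anything at or below the minimum of the vector is no longer needed.
        for s in [s for s in self.messages if s <= low_water]:
            del self.messages[s]


buf = UpdateBuffer(["site-a", "site-b"])
for update in ("u1", "u2", "u3"):
    buf.append(update)
buf.acknowledge("site-a", 3)   # nothing discarded yet: site-b is still at 0
buf.acknowledge("site-b", 2)   # minimum is now 2, so updates 1 and 2 are dropped
```

A client that reconnects would call pending_for() to fetch what it missed
and acknowledge() once it has applied those updates.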
> If the client sites connect to the central site and ask for updates
> since the last time they connected, I'm not sure what extra benefit
> Spread provides over and above a simple TCP connection between sites. (A
> slightly provocative statement, but I'm trying to get educated. :-)
>
> Comments? Thanks.
>
You haven't really discussed your network architecture, but Spread isn't
really designed to help centralized client/server systems, assuming you
have a good, switched network. For you (unless you need higher
service types than FIFO), all you gain over TCP is ease of
implementation, which may not help if you have trouble learning to use
Spread. Spread's design would show more of its potential in a system
with multiple databases acting as peers, rather than this 1 master/many
slaves design. For a description of an eager many-to-many replication
algorithm using the services provided by Spread, see the paper "From
Total Order to Database Replication" from the CNDS publications web site.
> PJDM
> --
> Peter Mayne
> Technology Consultant
> Spherion Technology Solutions
> Level 1, 243 Northbourne Avenue, Lyneham, ACT, 2602
> T: 61 2 62689727 F: 61 2 62689777
--
Ryan W. Caudy
Center for Networking and Distributed Systems
Department of Computer Science
Johns Hopkins University