[Spread-users] chunking messages?

John Schultz jschultz at d-fusion.net
Thu May 23 15:39:10 EDT 2002


Yair, Jonathan: Jacob and I wrote a Java subclass to SpreadConnection 
that transparently allows arbitrarily sized messages while maintaining 
Spread's safety properties. We'd like to include this in the Spread 
distribution if you agree. How does the IP work for code like this?

Once I hear from the Spread guys, I'll post the code to this list.


Quentin,

My company ran into a problem similar to yours, so we wrote a small 
(Java) extension to SpreadConnection which transparently supports 
arbitrarily sized msgs (I think it is restricted by Java arrays to a max 
of 2^31 - 1 bytes). In your code just use MMSpreadConnection instead of 
SpreadConnection. Be sure all processes in the same group use one or the 
other, or you will be on the bullet train to Bugsville :)  If you aren't 
using Java, then you could use the code as a basis to write a C/C++, 
Perl, Python or whatever floats your boat, version.

This code is a recent (today) rewrite of a previous version and so it 
hasn't seen much testing yet (just what is in the main() fcn). I'll be 
happy to fix any problems you or we run into.

Here are the semantics:

Messages larger than 80K are broken into smaller msgs of 80K or less. 
The first msg is sent with the service type (FIFO, CAUSAL, etc.) 
requested by the user; the rest of the chunked msgs are sent in order as 
FIFO msgs until the entire data payload is sent. NOTE: No flow control 
is performed as the msg is sent, so if you send REALLY big messages (10MB 
or larger) you will flood Spread and potentially cause disconnects 
of your receivers for not reading from Spread fast enough!
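
To make the sending side concrete, here is a minimal sketch of the 
splitting step only. It is not the MMSpreadConnection code: the class, 
enum and method names are hypothetical, the exact byte value of "80K" 
(80 * 1024) is an assumption, and nothing is actually sent to Spread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the chunking rule described above.
final class ChunkSketch {
    // Assumption: "80K" is taken to mean 80 * 1024 bytes.
    static final int MAX_CHUNK = 80 * 1024;

    // Stand-in for Spread's service types; not the real Spread constants.
    enum Service { UNRELIABLE, RELIABLE, FIFO, CAUSAL, AGREED, SAFE }

    // One piece of the payload plus the service type it would be sent with.
    static final class Chunk {
        final Service service;
        final byte[] data;
        Chunk(Service service, byte[] data) {
            this.service = service;
            this.data = data;
        }
    }

    // Splits payload into pieces of at most maxChunk bytes. The first piece
    // keeps the requested service type; every later piece is marked FIFO.
    static List<Chunk> split(byte[] payload, Service requested, int maxChunk) {
        if (maxChunk <= 0) {
            throw new IllegalArgumentException("maxChunk must be positive");
        }
        List<Chunk> chunks = new ArrayList<>();
        int offset = 0;
        do {
            int len = Math.min(maxChunk, payload.length - offset);
            Service s = (offset == 0) ? requested : Service.FIFO;
            chunks.add(new Chunk(s, Arrays.copyOfRange(payload, offset, offset + len)));
            offset += len;
        } while (offset < payload.length);
        return chunks;
    }
}
```

With these assumptions, a 200K payload requested as SAFE comes out as 
three pieces (80K, 80K, 40K), the first marked SAFE and the other two 
FIFO. A real sender would also need some header on each piece so the 
receiver can tell where a multi-msg starts and ends.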

On the receiving end, msgs are put into a queue as they are received. If 
a multi-msg is at the head of the queue, then all other messages are 
queued and not delivered until that multi-msg has been received in its 
entirety and delivered.  If the sender of a multi-msg leaves before its 
entire msg is received, then that MMsg is dropped.  If a receiver joins 
in the middle of a multi-msg (and so doesn't get the first msg), it just 
drops all of those chunks. I'm pretty sure these heuristics maintain all 
of Spread's safety properties.
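
The receive-side rules can be sketched roughly as below. Again, this is 
an illustration and not the MMSpreadConnection code: all names are 
hypothetical, and it assumes the first chunk of every message carries 
the total payload length and that chunks from one sender arrive in FIFO 
order. How the real class marks chunks is not stated in this post.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the receive-side queueing rules.
final class ReassemblySketch {
    // A queued message; complete once buf holds `total` bytes.
    static final class Pending {
        final int total;
        final ByteArrayOutputStream buf = new ByteArrayOutputStream();
        Pending(int total) { this.total = total; }
        boolean complete() { return buf.size() >= total; }
    }

    private final Deque<Pending> queue = new ArrayDeque<>();     // arrival order
    private final Map<String, Pending> open = new HashMap<>();   // sender -> unfinished msg

    // Handles one chunk. `first` marks the chunk that starts a message;
    // `total` (assumed header field) is only read when first is true.
    // Returns the payloads that became deliverable, in queue order.
    List<byte[]> onChunk(String sender, boolean first, int total, byte[] data) {
        if (first) {
            Pending started = new Pending(total);
            queue.addLast(started);
            open.put(sender, started);
        }
        Pending p = open.get(sender);
        if (p != null) {
            p.buf.write(data, 0, data.length);
            if (p.complete()) {
                open.remove(sender);
            }
        }
        // p == null: we never saw the first chunk (joined mid-message), so drop it.
        return drain();
    }

    // The sender left the group: drop its unfinished multi-msg, if any.
    List<byte[]> onLeave(String sender) {
        Pending p = open.remove(sender);
        if (p != null) {
            queue.remove(p);
        }
        return drain();
    }

    // Deliver from the head only while the head is complete, so an unfinished
    // multi-msg at the head holds back everything queued behind it.
    private List<byte[]> drain() {
        List<byte[]> out = new ArrayList<>();
        while (!queue.isEmpty() && queue.peekFirst().complete()) {
            out.add(queue.pollFirst().buf.toByteArray());
        }
        return out;
    }
}
```

In this sketch an ordinary small message is just a "first" chunk whose 
data already has the full length, so it is delivered immediately unless 
an unfinished multi-msg is ahead of it in the queue.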

MMSpreadConnection is a transparent replacement for SpreadConnection. 
You don't have to do anything special, except be sure that all processes 
in a given group use MMSpreadConnection and that every process is a 
member of any group to which it sends.

PS - I used this class with 3.16.0 and haven't tried it with 3.16.2 yet, 
but I don't imagine any difficulties.

-- 
John Schultz
Co-Founder, Lead Engineer
D-Fusion, Inc. (http://www.d-fusion.net)
Phn: 443-838-2200 Fax: 707-885-1055


Apologies if this is a FAQ or is a newbie question...

I've just discovered Spread, and am considering using it as a way to
achieve reliable, batch oriented DB synchronization over a standard
mechanism (smtp, http, or ftp).   I don't necessarily need scalable
group services or multicast, but ordered messages and reliable delivery
would help.

I saw this in the FAQ:

 > How big a message can I send to Spread?
 >    Spread currently supports application messages up to
 >    around 100Kbytes. Currently the exact size is not exported
 >    as part of the API.

and was wondering if Spread has a mechanism for "chunking" messages
down to 100K, or would I have to do that myself?   The system will
probably have message lengths on the order of 1M.

Thanks in advance


-- 
Quentin Neill                qneill at racemi.com
Racemi                       http://www.racemi.com
work: 404-892-5850 x111      cell: 678-557-5408





