[Spread-users] is spread the right choice ?

John Lane Schultz jschultz at spreadconcepts.com
Mon Apr 2 12:07:19 EDT 2007


Spread's design is optimized for communication among *groups* of processes, 
hence the name "group communication."  It can certainly be used for two-party 
communication as well, but two-party communication can incur unexpected overhead.

For example, currently all data traffic is sent to and processed by *all* Spread 
daemons.  This communication pattern can cause unexpected network overhead. 
Within a segment (LAN), all data traffic is broad/multicast to all the daemons 
in that segment.  This can incur a lot more traffic overhead than point-to-point 
traffic because your switch/router/hub has to put the traffic on every 
(interested) port.  If you have multiple segments, then Spread also sends the 
data to the leader of each segment (i.e., n segments usually cause n-1 
unicasts and 1 broad/multicast per SP_multicast, ignoring packing).  So, if you 
have two applications communicating with each other, even through the same 
daemon, then their conversation is actually sent to and processed by *all* the 
daemons; all the other daemons simply drop the data on the floor after 
extracting the necessary meta-data.  Spread could be optimized to better handle 
daemon interest in data (i.e., send-side filtering), but that has not yet been 
done.
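
To put rough numbers on the above, here is a back-of-the-envelope sketch in 
Python.  It only encodes the rule of thumb from this message (n-1 unicasts plus 
1 broad/multicast per message, and every daemon processing every message); the 
function names are made up for illustration and are not part of Spread, and 
packing is ignored.

```python
def sends_per_message(n_segments: int) -> dict:
    """Rule-of-thumb count of daemon-level sends for one application message.

    Assumption taken from the text: n segments usually cause n-1 unicasts
    (to the other segment leaders) and 1 broad/multicast, ignoring packing.
    """
    if n_segments < 1:
        raise ValueError("need at least one segment")
    return {"unicasts": n_segments - 1, "multicasts": 1}


def daemons_processing_message(daemons_per_segment: list) -> int:
    """Every daemon in every segment receives and processes the message."""
    return sum(daemons_per_segment)


if __name__ == "__main__":
    # Hypothetical layout: 3 segments with 4, 4 and 2 daemons.
    print(sends_per_message(3))
    print(daemons_processing_message([4, 4, 2]))
```

A direct point-to-point connection would involve only the two endpoints, which 
is the contrast the paragraph above is drawing.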

On the flip side, Spread provides a pretty simple API for message passing and 
provides powerful ordering and reliability semantics for n-party communication, 
all of which can ease development of distributed applications.

I can't make the call for you, but if you really are only doing point-to-point 
communication and don't expect to move to n-party communication, then the good 
old BSD socket client-server interface, along with select(), might be your best bet.
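
For a feel of what the socket-plus-select approach looks like, here is a 
minimal sketch using Python's standard socket and select modules, which wrap 
the BSD calls.  The function name echo_loop and the echo behaviour are 
placeholders of mine standing in for real message handling; a real application 
would also need message framing for large payloads, which this does not attempt.

```python
import select
import socket


def echo_loop(listener: socket.socket, max_messages: int,
              timeout: float = 1.0) -> int:
    """Serve clients of `listener` with one select() loop.

    Echoes each received chunk back to its sender (a stand-in for real
    handling).  Stops after at least `max_messages` chunks have been echoed
    or when select() times out.  Returns the number of chunks handled.
    """
    clients = []
    handled = 0
    while handled < max_messages:
        readable, _, _ = select.select([listener] + clients, [], [], timeout)
        if not readable:
            break  # nothing became readable within the timeout
        for sock in readable:
            if sock is listener:
                conn, _addr = listener.accept()  # new peer connected
                clients.append(conn)
            else:
                data = sock.recv(4096)
                if data:
                    sock.sendall(data)  # placeholder: echo back to sender
                    handled += 1
                else:
                    clients.remove(sock)  # peer closed its end
                    sock.close()
    for conn in clients:
        conn.close()
    return handled
```

Each process that needs to talk to another would hold one connected socket per 
peer and multiplex them through the same select() call.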

Cheers!

Sami M wrote:
> OK Folks. Maybe I am on the wrong track here. We need a message passing 
> library.
>  
> It's for a large distributed application that needs to scale on a Linux 
> cluster with possibly tens of nodes and 7 or 8 different types of processes 
> [don't ask... we might be building the next Google :)]. Different 
> processes running on separate (or the same) machines need to 
> communicate. There is currently no need for multicast messages, 
> although any process may exchange a message with any other process 
> (n-way connectivity). Doing that point-to-point would be overly complex, 
> I think. They need to be able to send messages that may go upwards 
> of 100M. There is a need for Java/C interoperability as some processes 
> are in Java. 
>  
> I have tried MPICH2 and PVM so far, and those libraries proved not to be 
> well suited for this application for various reasons. Spread seemed well 
> suited initially but has its own set of constraints with regard to 
> message sizes and flow control. So far I have implemented message chunking 
> to overcome the 100K limit. It seems I will need a fix for this flow 
> control issue.
>  
> Is Spread the right choice for this kind of application with regard to 
> scalability, performance, and feature requirements? I'd prefer 
> using an off-the-shelf solution for this. What other open and/or 
> commercial libraries can I try? I am using a wrapper interface to the 
> message passing library, so I can try different solutions relatively easily.
>  
> BTW... I am not a comm expert, so bear with me if I am missing anything 
> obvious. Please feel free to weigh in. Any help here is appreciated.
>  
> Thanks,
>  
> Sami


-- 
John Schultz
Spread Concepts LLC
Phn: 443 838 2200
Fax: 301 560 8875



