[Spread-users] Question on Spread for high performance database application

Mon Feb 6 22:17:08 EST 2006

On Mon, 6 Feb 2006, Shilpa Lawande wrote:

> We are evaluating Spread as a replacement for MPI for a high data
> throughput database application running on a cluster of Linux boxes
> with GigE as interconnect.  Our application may need to scale anywhere
> from several hundred to 1000 nodes.

Spread works on a client-daemon peer-to-peer model.  A deployment usually 
consists of a few tens of daemons (e.g. - 5 to 30).  Each daemon can 
handle several hundred client connections.  So, depending on your exact 
system needs and capabilities, Spread may be able to support up to a 
thousand "client nodes."  However, Spread in its current form usually 
cannot scale beyond a few tens of "daemons nodes."

> b) For point-to-point communication between nodes.  The alternative
> for us is to roll out our own using vanilla TCP sockets.  Significant
> portion of our data traffic is point-to-point and we need high data
> throughput here. We are concerned by the extra hop of going through
> the Spread daemon.

Currently, Spread only emulates point-to-point communication using normal 
multicast communication.  In other words, every message is reliably 
multicast to every daemon in the system and then every daemon that is not 
hosting a client destination for the message drops it.  This is obviously 
not optimal for a system that envisions lots of point-to-point 
communication.

> a) Are there any studies on network performance of Spread vs sockets
> for large amounts of data transfer ?

Not in the particular context of your question.  There have been lots of 
studies in general of throughput and latency performance.  In general 
though, I don't think any studies have looked at CPU overhead that much.

http://www.spreadconcepts.com/publications.html
http://www.cnds.jhu.edu/pub/papers/cnds-2004-1.pdf

> b) How much overhead might Spread introduce in terms of CPU usage ?

It depends on how hard you stress your system in terms of # of msgs/sec 
and Mb/sec.  In general though, Spread is a bit more CPU hungry than your 
typical TCP/IP communications, although it is doing considerably more than 
point-to-point communication too.  In the past, we have seen that on 
gigabit networks, when system throughput goes into the *several hundred* 
Mb/s range, Spread has been CPU bound rather than I/O bound.  This of 
course heavily depends on the strength of your processor.

> c) There is a limit of 128 Spread daemons. This means we may need to have
> one daemon for a set of nodes in our cluster. What is the impact of
> the daemon running on same node as client vs. another ?

The main impact is the cost of the client-daemon communication.  When the 
connection is local the communication should stay within the bounds of the 
machine through IPC (unix pipes) or it might still traverse the IP stack 
(unix/windows).  The traffic should never get out onto the network and the 
in memory bandwidth of a computer is usually much greater than that of a 
network.

If you are connecting remotely to a daemon then the client-daemon 
communication goes through a TCP/IP channel across the network thus eating 
bandwidth.  Therefore, the bandwidth optimal deployment of Spread is to 
have a daemon on every machine with client processes.  This way a message 
will usually only traverse the network once rather than potentially many 
times if lots of clients are remote.

> d) Are there any studies comparing Spread to MPI ?

Not to my knowledge.

Feel free to ask any more specific questions either on this mailing list 
or by directly contacting us at Spread Concepts through 
info at spreadconcepts.com

Cheers!

---
John Schultz
Spread Concepts
Phn: 443 838 2200