[Spread-users] Spread Performance

Tue Apr 27 17:02:48 EDT 2004

[Please keep these emails on the spread-users list.]

Answers inlined, again.

--Ryan

William Isley wrote:

> Ryan,
> 
> Thank you so much for your help! The only piece that I still need help 
> with is how to configure a Spread Daemon for each client.
> 
> Spread_Segment xxx.yyy.zzz.255:4803 {
>  hostname1  xxx.yyy.zzz.11
>  hostname2  xxx.yyy.zzz.12
>  hostname3  xxx.yyy.zzz.13
> }
> 
> What does each "hostname1  xxx.yyy.zzz.11" value represent? A different 
> Spread Daemon client on the network? If each client has its own Spread 
> Daemon, and the network is a 192.168.1.x network with broadcast address 
> 192.168.1.255, and the local Spread Daemon client on the localhost has 
> an IP address of 192.168.1.11, then would the configuration for the 
> local host Spread Daemon look like:
> 

To clarify terminology, when I say "client" I mean any application that 
links with Spread's client library (or uses one of the other client 
interfaces people have developed).  "Localhost" is the loopback address, 
127.0.0.1 on every machine (afaik), which a machine uses to send 
information to itself.  It is not the machine's address on the network.

> Spread_Segment 192.168.1.255:4803 {
>  machine1 192.168.1.11
> }
> 

This is fine for a configuration that has only one daemon.  Generally, 
each daemon should have exactly the same spread.conf (and always the 
same segment descriptions).

> Or is a separate hostname3  xxx.yyy.zzz.13 needed for each Spread Daemon 
> on the network?
> 

If I understand your question correctly, yes, one entry of type 
"hostname ip_address" is needed for each *potential* daemon on the 
network.  So, if my cluster has characteristics similar to those you 
described, and I have exactly 6 machines that might run Spread daemons, 
the configuration would be:

Spread_Segment 192.168.1.255:4803 {
   machine1  192.168.1.11
   machine2  192.168.1.12
   machine3  192.168.1.13
   machine4  192.168.1.14
   machine5  192.168.1.15
   machine6  192.168.1.16
}
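
As a rough illustration only (this is not part of Spread), here is a small 
Python sketch that generates a segment block like the one above, so the same 
text can be copied into every daemon's spread.conf.  The function name 
render_segment, the machineN hostnames, and the .11 through .16 addresses are 
all made up for the example; it assumes a /24 using its broadcast address, 
with one distinct entry per potential daemon and no localhost entry.

```python
import ipaddress


def render_segment(network, hosts, port=4803):
    """Render a Spread_Segment block for `network` from (hostname, ip) pairs.

    Hypothetical helper for illustration; not part of the Spread distribution.
    """
    net = ipaddress.ip_network(network)
    seen_names, seen_addrs = set(), set()
    lines = [f"Spread_Segment {net.broadcast_address}:{port} {{"]
    for name, ip in hosts:
        addr = ipaddress.ip_address(ip)
        # The loopback address must not appear in a multi-daemon configuration.
        if addr.is_loopback:
            raise ValueError(f"{name}: loopback address {ip} is not allowed")
        if addr not in net:
            raise ValueError(f"{name}: {ip} is not in {network}")
        # Each potential daemon needs its own entry: no repeated names or IPs.
        if name in seen_names or addr in seen_addrs:
            raise ValueError(f"duplicate entry: {name} {ip}")
        seen_names.add(name)
        seen_addrs.add(addr)
        lines.append(f"   {name}  {addr}")
    lines.append("}")
    return "\n".join(lines)


# Six machines that might run daemons (names and addresses are assumptions).
hosts = [(f"machine{i}", f"192.168.1.{10 + i}") for i in range(1, 7)]
segment = render_segment("192.168.1.0/24", hosts)
print(segment)
```

The duplicate check is there because a repeated "hostname ip_address" line is 
an easy copy-and-paste mistake when writing these blocks by hand.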

> Note that the machines that will use Spread use DHCP to obtain an IP 
> address and can join and leave the network - though not during data 
> transmission.  It is a problem if every Spread Daemon on the network 
> must be listed here because I may not know which Spread Daemons (or 
> clients) are available beforehand. Is there a way to overcome this?
> 
> Thank you again for your help,
> 
> William Isley
> IMS Consultants
> 1250 N. Lakeview Ave, Suite A
> Anaheim, CA  92807
> Phone:  (714) 693-3505, x21
> E-mail: wmisley at imsconsultants.com
> 
> 
>> From: Ryan Caudy <caudy at jhu.edu>
>> To: William Isley <wmisley at hotmail.com>
>> Subject: Re: [Spread-users] Spread Performance
>> Date: Tue, 27 Apr 2004 14:52:13 -0400
>>
>> My comments are inlined below.
>>
>> Good luck,
>> Ryan
>>
>> William Isley wrote:
>>
>>> Hi Spread Group,
>>>
>>> I plan to use spread to reliably multicast Gigabytes of files to 40+ 
>>> machines. I have tried several configurations, and with only 5 
>>> clients receiving files (rebuilt from messages sent over Spread), the 
>>> best data throughput I can achieve is 3 MB/sec. I am required to 
>>> achieve at least 6 MB/sec.
>>>
>>
>> What is your network environment?  I'll assume a LAN with 100 Mbps 
>> ethernet.
>>
>>> I have tried running the Spread Daemon on a separate server, on the 
>>> message sender, and a message receiver. I get the best performance 
>>> with the Spread Daemon running on the message receiver. I am using 
>>> the SAFE message transport type, but have tried the UNRELIABLE 
>>> message transport type with negligible performance gain.
>>>
>>
>> I'd recommend that you run Spread daemons on all of the hosts that 
>> have Spread clients (sender and all receivers in your system).  The 
>> idea here is to maximize the amount of data that is transferred over 
>> your network as multicast/broadcast.  Otherwise, you end up with the 
>> same unicast data going over the network several times.  Multicast is 
>> only done between daemons in the same segment.
>>
>> I don't know for sure what your application's semantics are, but it 
>> sounds to me like you don't need SAFE messages.  For a one-to-many 
>> system like you describe, FIFO should be sufficient.  The application 
>> should be easier to develop with service type FIFO (or higher) than 
>> RELIABLE or UNRELIABLE.
>>
>>> I have tried scheduling Spread to run with Realtime scheduling. The 
>>> performance gain was negligible. I am running all of this software on 
>>> Windows.
>>>
>>
>> I wouldn't be surprised if you didn't get all the performance you want 
>> on Windows, although giving it high priority as you've described 
>> should certainly help.  Be careful to make sure that the clients get 
>> plenty of CPU time also, so that the sender can send fast enough, and 
>> so that the receivers can keep up.
>>
>>> I tried running the Spread Daemon on a dual Xeon processor machine. 
>>> The result is that the Spread clients lose their connection under 
>>> heavy load. The other machines in the configuration are single 1GHz 
>>> processor machines.
>>>
>>
>> Again, I'd recommend that you use many Spread daemons, not just one. 
>> You may need to tweak the flow control a bit, but I think you'll be ok 
>> with a network like I described.  Keep in mind that if your sender 
>> outpaces your receivers, Spread will eventually disconnect them.
>>
>>> I need to squeeze more out of Spread than 3 MB/sec. The website 
>>> advertises 8 MB/sec. What can I do to better this performance? Are 
>>> there any changes I can make to the Spread.conf file that will 
>>> increase the performance? Is it possible to run multiple Spread 
>>> Daemons? How do I configure this system if this is an option and what 
>>> is the benefit?
>>>
>>
>> It is not only possible, but almost always desirable to run multiple 
>> Spread daemons.  To do so, the only major change is to the 
>> Spread_Segment section of the spread.conf file.  Basically, instead of 
>> using the configuration for a single daemon (usually just a daemon on 
>> localhost using the localhost broadcast address), you use a 
>> configuration more like
>> Spread_Segment xxx.yyy.zzz.255:4803 {
>>   hostname1  xxx.yyy.zzz.11
>>   hostname2  xxx.yyy.zzz.12
>>   hostname3  xxx.yyy.zzz.13
>> }
>> This configuration assumes you have a /24, and that you're using 
>> broadcast instead of multicast.  You could replace the (/24) broadcast 
>> address xxx.yyy.zzz.255 with the appropriate one for your network, or 
>> simply use a multicast address instead if your network supports IP 
>> multicast.  Do NOT include the localhost address in your 
>> configuration.  Note that it may be necessary to specify at each host 
>> the hostname it is using (using spread -n hostname), depending on how 
>> your machines are configured.
>>
>> I've described the benefits of using multiple daemons above.
>>
>>> I have looked at TIBCO's SmartPGM, which is not viable due to cost. 
>>> JGroups advertises performance below that of Spread.
>>>
>>> Any help from anyone would be appreciated,
>>>
>>> William Isley
>>> IMS Consultants
>>> 1250 N. Lakeview Ave, Suite A
>>> Anaheim, CA  92807
>>> Phone:  (714) 693-3505, x21
>>> E-mail: wmisley at imsconsultants.com
>>>
>>> _______________________________________________
>>> Spread-users mailing list
>>> Spread-users at lists.spread.org
>>> http://lists.spread.org/mailman/listinfo/spread-users
>>>
>>
>> -- 
>> Ryan W. Caudy
>> Center for Networking and Distributed Systems
>> Department of Computer Science
>> Johns Hopkins University
> 
> 

-- 
Ryan W. Caudy
Center for Networking and Distributed Systems
Department of Computer Science
Johns Hopkins University