[Spread-users] Slow receiver

Jonathan Stanton jonathan at cnds.jhu.edu
Tue Jan 27 10:41:41 EST 2004


My first comment is I don't know if the spread.conf file you gave is 
exactly what you are using -- but if it is (and the xxx.xxx addresses are 
real IP addresses) then one problem is that you should not mix the 
internal localhost addresses and external ones. It doesn't make sense from 
Spread's point of view. 

I think your config may sort of work if you only connect from a client 
running on the same machine as the daemon, but the 'client' and 
'spread-daemon' machines will certainly not work.

If you are just running on one machine, then I'd try the config

Spread_Segment 127.0.0.255:4803 {
	localhost 127.0.0.1
}

If you are connecting clients from other machines or using multiple 
daemons I'd do a config like:

Spread_Segment xxx.yyy.zzz.255:4803 {
	client   xxx.yyy.zzz.1
	spread-daemon xxx.yyy.zzz.2
} 

(Obviously with real addresses. The key change is you should not include 
localhost in the set of machines if you are not ONLY using localhost). 
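
A quick way to catch this kind of mix before starting the daemon is a small 
script. The sketch below is only an illustration in Python, not part of 
Spread. It assumes simple "name address" host lines with no nested 
interface blocks, and the function name mixed_loopback_segments is made up 
for this example.

```python
import ipaddress
import re

# Matches "Spread_Segment <addr>[:port] { ... }" for flat segments only.
SEGMENT_RE = re.compile(r"Spread_Segment\s+([^\s:{]+)(?::\d+)?\s*\{([^}]*)\}")


def _parse(token):
    """Return an ip_address object, or None if the token is not an IP."""
    try:
        return ipaddress.ip_address(token)
    except ValueError:
        return None


def mixed_loopback_segments(conf_text):
    """Return segment addresses whose hosts mix loopback and other IPs."""
    # Drop '#' comments so commented-out segments are ignored.
    text = "\n".join(line.split("#", 1)[0] for line in conf_text.splitlines())
    flagged = []
    for segment_addr, body in SEGMENT_RE.findall(text):
        hosts = []
        for line in body.splitlines():
            fields = line.split()
            if fields:
                addr = _parse(fields[-1])  # last field is taken as the address
                if addr is not None:
                    hosts.append(addr)
        # Two distinct values means both loopback and non-loopback hosts.
        if len({addr.is_loopback for addr in hosts}) == 2:
            flagged.append(segment_addr)
    return flagged
```

Feeding it a segment that lists 127.0.0.1 next to two placeholder addresses 
such as 192.0.2.1 and 192.0.2.2 returns ["127.0.0.255"]; the 
localhost-only config above returns an empty list.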

You are correct about what data_link is doing. It retries the sends on 
error to overcome transient errors. The question is why the send is 
failing in the first place. From the error (errno 49, "Can't assign 
requested address") I'm guessing it might be because of the config file 
having both localhost and remote addresses.
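
To see how a bounded retry turns a persistent send error into a fixed delay, 
here is a rough Python illustration. It is not the code in data_link.c. The 
limit of 10 tries comes from the quoted message below; the 20 ms pause is 
only inferred from 200 ms / 10 and is an assumption, as is the name 
send_with_retries.

```python
import time


def send_with_retries(send, max_tries=10, pause_s=0.02, sleep=time.sleep):
    """Call send() until it succeeds or max_tries attempts have failed.

    Returns (succeeded, tries_used). The defaults are illustrative
    assumptions, not values read from the Spread source.
    """
    for attempt in range(1, max_tries + 1):
        try:
            send()
            return True, attempt
        except OSError:
            # Treat the error as transient: pause, then try again.
            sleep(pause_s)
    return False, max_tries
```

With a send that always raises OSError, this gives up after 10 tries and 
about 0.2 s of pauses, which is in line with the ~200 ms per message 
reported below. A send that works on the first try returns immediately.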

The best thing to do is try with one of the config files listed above and 
if that doesn't fix it, then turn on the DATA_LINK debug flag by adding 
these lines to your spread.conf:

DebugFlags = { PRINT EXIT DATA_LINK } 
EventLogFile = bad_sendmsg.log

Then run spread and your test program. This will generate a log file 
"bad_sendmsg.log" that has every send call logged and should report more 
information that might show me what the error is. 
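
To compare runs before and after the config change, the ktrace dump can also 
be tallied. The following Python sketch assumes the line layout shown in the 
quoted dump below (pid, command, elapsed time, RET, syscall, return value); 
the helper name tally_sendmsg is hypothetical.

```python
import re
from collections import Counter

# Matches dump lines such as:
#   61327 spread   0.000005 RET   sendmsg -1 errno 49 Can't assign ...
RET_SENDMSG = re.compile(r"\bRET\s+sendmsg\s+(-?\d+)(?:\s+errno\s+(\d+))?")


def tally_sendmsg(lines):
    """Count successful and failed sendmsg returns, keyed by errno."""
    counts = Counter()
    for line in lines:
        match = RET_SENDMSG.search(line)
        if match is None:
            continue  # not a sendmsg return line (CALL lines, select, ...)
        if match.group(1) == "-1":
            counts["failed errno " + (match.group(2) or "?")] += 1
        else:
            counts["ok"] += 1
    return counts
```

Passing the lines of a dump file to tally_sendmsg gives the number of failed 
sends per errno alongside the number of successful ones, so a ratio like 10 
failures per success is easy to spot.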

Cheers,

Jonathan

On Mon, Jan 26, 2004 at 09:33:35PM -0800, Anurag Gupta wrote:
> Thanks for your quick response.
> 
> Latency of each message is ~200ms. ktracing the spread daemon (we are on
> FreeBSD) led us to multiple sendmsg's failing for each one succeeding. I
> think num_try is restricted to 10 in data_link.c, which is why it stops after
> 200ms. When I do a dump from ktrace, I see 10 of these for each successful
> sendmsg:
> 
> ==============
>  61327 spread   0.000004 RET   select 0
>  61327 spread   0.000003 CALL  sendmsg(0x5,0x80b1ae4,0)
>  61327 spread   0.000005 RET   sendmsg -1 errno 49 Can't assign requested address
>  61327 spread   0.000004 CALL  select(0,0,0,0,0x806a824)
> ===============
> 
> 
> Only uncommented lines in spread.conf are:
> 
> ==========
> Spread_Segment  127.0.0.255:4803 {
>   localhost   127.0.0.1
>   client  xxx.xxx.xxx.xxx
>   spread-daemon xxx.xxx.xxx.xxx
> }
> ==========
> 
> thanks
> -anurag
> 
> -----Original Message-----
> From: spread-users-admin at lists.spread.org
> [mailto:spread-users-admin at lists.spread.org] On Behalf Of Jonathan Stanton
> Sent: Monday, January 26, 2004 8:54 PM
> To: Anurag Gupta
> Cc: spread-users
> Subject: Re: [Spread-users] Slow receiver
> 
> 
> Hi,
> 
> I don't have quite enough information to know what is going on. Generally
> the latency of a single message send in Spread should be quite low. On
> modern machines, maybe in the neighborhood of a few hundred microseconds
> of work plus scheduling delays (switching between client, daemon, back to
> client if run on one machine) of anywhere from a few milliseconds to 30
> ms.
> 
> Can you provide us with your spread.conf configuration and some more
> information about how much the receiver is lagging (what the delay is) and
> the rough computer power? How are you timing the send time vs receive
> time? Are they on different machines?
> 
> One note, although it isn't necessarily affecting your results,
> on most OS's sleeping for 1ms actually sleeps for 10+ms since 10 ms is the
> minimum scheduling delay.
> 
> Cheers,
> 
> Jonathan
> 
> On Mon, Jan 26, 2004 at 08:43:27PM -0800, Anurag Gupta wrote:
> > Hi,
> >
> > I am seeing some unusual delays in getting spread to transfer messages. I
> > have a simple flooder sleeping 1 millis after publishing each message
> > (message size 100 bytes), and a receiver receiving those messages (no
> > processing done). Receiver is lagging behind in receiving the messages
> > quite a bit.
> >
> > How do I see where the delay is? Is this expected? Or a result of some
> > misconfiguration?
> >
> > regards
> > -anurag
> >
> >
> > _______________________________________________
> > Spread-users mailing list
> > Spread-users at lists.spread.org
> > http://lists.spread.org/mailman/listinfo/spread-users
> 
> --
> -------------------------------------------------------
> Jonathan R. Stanton         jonathan at cs.jhu.edu
> Dept. of Computer Science
> Johns Hopkins University
> -------------------------------------------------------
> 
> 

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------



