[Spread-users] Delays in receiving messages

Melissa Jenkins melissa-spread at temeletry.co.uk
Tue Jan 25 15:58:02 EST 2011


I have had similar problems on a Spread network passing between 10 and 20 messages a second.

I couldn't see any packet loss on the link (no SACKs, no lost ICMP), but Spread was sending in bursts and sometimes getting messages out of order.  Also, no retransmissions were reported in spmonitor.
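
As a rough illustration of how bursts and reordering of this kind could be spotted on the receiving side, here is a minimal Python sketch. It assumes each message carries a sequence number and a send timestamp, and that the receiver notes its own arrival time. The record layout and the name find_anomalies are hypothetical and not part of Spread; the 0.1 second threshold is taken from the test described in the quoted message.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout (an assumption, not a Spread structure):
# (sequence number, send timestamp, receive timestamp), times in seconds.
Record = Tuple[int, float, float]


def find_anomalies(records: Iterable[Record], max_delay: float = 0.1) -> List[str]:
    """Report messages that arrived late or out of order."""
    anomalies = []
    highest_seq = None  # highest sequence number seen so far
    for seq, sent, received in records:
        delay = received - sent
        if delay > max_delay:
            anomalies.append(f"seq {seq}: delayed {delay:.3f}s")
        if highest_seq is not None and seq < highest_seq:
            anomalies.append(f"seq {seq}: out of order (already saw {highest_seq})")
        else:
            highest_seq = seq
    return anomalies
```

Several messages sharing nearly the same receive time, each with a large delay, would point to the bursty delivery described above. Note that comparing send and receive timestamps is only meaningful when both come from the same clock, for example when sender and receiver run on the same host.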

Although I haven't done any conclusive testing, disabling CPU idle modes and ensuring that the CPU doesn't drop frequency too quickly seems to help.
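
For anyone wanting to check the frequency-scaling side of this, the following is a minimal sketch, assuming a Linux host that exposes the cpufreq interface under /sys/devices/system/cpu. It only reads the current scaling governor for each CPU and lists those not using "performance"; it does not look at idle states and does not change any setting. The function names are hypothetical, and treating every non-"performance" governor as one that may lower the clock is a simplifying assumption.

```python
import glob
import os
from typing import Dict, List, Sequence


def read_governors(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[str, str]:
    """Map each CPU directory name (e.g. 'cpu0') to its current scaling governor."""
    governors = {}
    # cpu[0-9]* skips sibling directories such as 'cpufreq' and 'cpuidle'.
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as fh:
            governors[cpu] = fh.read().strip()
    return governors


def cpus_that_may_downclock(
    governors: Dict[str, str], steady: Sequence[str] = ("performance",)
) -> List[str]:
    """Return CPUs whose governor is not in the 'steady' list (assumption: only
    those governors keep the clock at its maximum)."""
    return [cpu for cpu, gov in governors.items() if gov not in steady]


if __name__ == "__main__":
    print(cpus_that_may_downclock(read_governors()))
```

On a machine without cpufreq support the glob matches nothing, so read_governors simply returns an empty dictionary.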

Mel


On 25 Jan 2011, at 20:31, Yair Amir wrote:

> Hi Kevin,
> 
> To me it seems you have some loss on your network and possibly a control
> message (either a token or a hurry request) is lost for some reason.
> 
> Now, 10 messages per second for Spread is equivalent to an inactive network
> for most of the time (it would take a millisecond or so to propagate the
> one message, and then there are 100 ms of quiet).
> 
> The question is why the control message is lost occasionally.
> 
> Changing the hurry timer is actually a fine solution if it solves your
> problem. You can actually eliminate the problem altogether if you
> void the slow-down feature, but that will come at the price of the token
> rotating even without new messages.
> 
> If you tell us more about what your goal is, perhaps better comments can
> be made.
> 
> Cheers,
> 
> 	:) Yair.
> 
> On 1/25/11 2:55 PM, Kevin Everets wrote:
>> Hello,
>> I'm playing with Spread here and running into an issue where there are
>> delays in receiving messages.
>> The setup is with three machines, all of which are in separate Spread
>> segments, though currently the first two are in the same network
>> subnet.  The config looks like:
>> Spread_Segment 0.0.0.0:5333 {
>>  host_a 10.1.1.1
>> }
>> Spread_Segment 0.0.0.0:5333 {
>>  host_b 10.1.1.2
>> }
>> Spread_Segment 0.0.0.0:5333 {
>>  host_c 10.1.2.1
>> }
>> Then, on host_b, there's an active sender, sending a message every 0.1
>> seconds containing the current high-res timestamp.  There's also a
>> receiver running on host_b (same host) receiving the message and
>> comparing the timestamp.  If the timestamp differs by more than 0.1
>> seconds, it prints out the message.
>> The result seems to be that every so often there's a delay in
>> reception where messages get buffered for up to 2 seconds
>> and then all of those messages are received by the receiver.  Changing
>> the Hurry_timeout in membership.c seems to indicate that it is hitting
>> the Hurry_timeout case (changing this to 10 seconds means that
>> occasionally messages are buffered for up to 10 seconds).
>> The question is, why is this happening, and how can it be prevented?
>> A quick fix seems to be to turn down the Hurry_timeout, but ideally
>> this shouldn't be necessary since it seems the network shouldn't be
>> hitting this case.  It's fairly busy with 10 messages per second, and
>> my reading of the Hurry_timeout is that it is meant for a
>> relatively inactive network.
>> Thanks in advance for any direction on this,
>> K.
>> _______________________________________________
>> Spread-users mailing list
>> Spread-users at lists.spread.org
>> http://lists.spread.org/mailman/listinfo/spread-users
> 




