[Spread-users] Strange ordering problem

Jonathan Stanton jonathan at spread.org
Tue Oct 5 16:35:44 EDT 2010


That obviously should not happen :-) Under AGREED order, all receivers should receive the 
exact same sequence of messages unless they get a membership message, and even then the messages 
they do get should be in order. 

You can check the status of the Spread daemons while they are behaving this way under load by running 
the spmonitor program that comes with Spread. If you run that and choose option "0" to activate status 
and monitor 'all' the daemons, you will get a printout on the screen every few seconds showing the 
internal status of each daemon. 

If you see the "state" or "gstate" variables change from 1 during the test, then some membership 
activity is going on. You should also check the 'retrans' fields, which indicate retransmissions (if 
there are a lot, then something is wrong with the network). You can also check the ARU and 
Deliver_Mess fields to see if they are increasing smoothly (and not pausing sometimes).
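
One rough way to judge "increasing smoothly" is to note the ARU (or Deliver_Mess) value from 
several consecutive status printouts and look for runs where it does not move. The sketch below 
assumes you have copied those values by hand into an array; the function name longest_stall is 
made up for illustration and is not part of Spread or spmonitor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Spread: given counter values sampled
 * from successive status printouts, return the longest run of consecutive
 * samples in which the counter failed to increase.
 */
static size_t longest_stall(const uint64_t *samples, size_t n)
{
    size_t longest = 0;   /* longest run of non-increasing samples so far */
    size_t current = 0;   /* length of the run currently in progress */

    assert(samples != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        if (samples[i] > samples[i - 1]) {
            current = 0;              /* counter advanced, run ends */
        } else {
            current++;                /* counter stayed flat (or went back) */
            if (current > longest)
                longest = current;
        }
    }
    return longest;
}
```

A result of 0 means the counter advanced on every printout. A larger value means it sat still 
for that many printouts in a row, which is worth lining up against the times the out-of-order 
chunks appear.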

Another thing to check is whether you can duplicate the problem with client programs using the core 
C API (instead of Perl). You could use the spuser program to inject bursts of AGREED messages when the 
network is under load and see if they are in order. Or, for a high-rate test, you can modify the 
spflooder program to use AGREED instead of RELIABLE messages and see if it can duplicate the problem.
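
To make "see if they are in order" checkable at a high rate, the receiver can track a counter 
carried in each message. The sketch below is only the bookkeeping, with no Spread calls. It 
assumes the single sender stamps every payload with a sequence number that starts at 1 and goes 
up by 1. That stamping, and the names order_check, order_check_init and order_check_observe, are 
assumptions for illustration and not something spflooder or the Spread API provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver-side bookkeeping for one sender's message stream. */
typedef struct {
    uint64_t highest_seen;  /* largest sequence number delivered so far */
    uint64_t delivered;     /* total messages observed */
    uint64_t out_of_order;  /* messages delivered after a higher number */
    uint64_t gaps;          /* messages that skipped ahead of the next expected number */
} order_check;

static void order_check_init(order_check *c)
{
    assert(c != NULL);
    c->highest_seen = 0;
    c->delivered = 0;
    c->out_of_order = 0;
    c->gaps = 0;
}

/*
 * Record one delivered sequence number.
 * Returns  0 if it is exactly the next expected number,
 *         -1 if it is not above the highest seen (late or duplicate),
 *          1 if it skipped ahead, leaving a gap.
 */
static int order_check_observe(order_check *c, uint64_t seq)
{
    assert(c != NULL);
    c->delivered++;

    if (seq == c->highest_seen + 1) {
        c->highest_seen = seq;
        return 0;
    }
    if (seq <= c->highest_seen) {
        c->out_of_order++;
        return -1;
    }
    c->gaps++;
    c->highest_seen = seq;
    return 1;
}
```

Calling order_check_observe once per received message and logging every non-zero result (with 
a timestamp) would show whether the C client sees the same reversed chunks as the Perl one.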

If the C API works correctly, then my guess is that the Perl library may be batching up the messages, 
or not dequeuing (or enqueuing) them in the right order when under load. 

Cheers,

Jonathan

On Tue, Oct 05, 2010 at 08:30:29PM +0100, Melissa Jenkins wrote:
> Just wondering if somebody could point me in the right direction on this...
> 
> I've got a setup with two Spread servers in a single broadcast segment.
> 
> I have a writer on the first server, and a process on each of the second servers that listen for messages and process them.
> 
> What I've noticed is that although I'm using AGREED_MESS the reader on the second normally gets the messages out of order, and the one on the first occasionally gets them out of order. (I started with FIFO_MESS as I only have one source, but both seem to behave the same way.)
> 
> This only happens under very high load - and seems to happen in 'chunks', where a series of messages is delivered out of order (possibly in reverse if one run is indicative) and then normal ordering returns.
> 
> I'm using Spread 4.1.0, and Spread-3.17.4.4 perl module.
> 
> There doesn't appear to be any change in membership during this process.  The two machines are connected to the same switch and the port is not overloaded.  
> The readers are not being kicked off and are keeping up with the messages.  The writer has a queue to send to Spread which seems to back up a little, though messages are not stuck for a measurable period of time.
> 
> The traffic appears to burst in a similar fashion - every now and then it just pauses - though I'm not sure that is related.
> 
> I'm at a bit of a loss as to what to look at - all suggestions of where to start would be hugely gratefully received!
> 
> Thanks,
> Mel
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users

-- 
-------------------------------------------------------
Jonathan Stanton         jonathan at spread.org
Spread Group Messaging   www.spread.org
Spread Concepts LLC      www.spreadconcepts.com
-------------------------------------------------------



