[Spread-users] Message sequence counter wrap bug fixed

M S martin4321234 at googlemail.com
Tue Nov 11 05:44:14 EST 2008


Hi Jonathan,

Thank you for your fix for the message sequence counter wrap bug.

I've tested this patch for protocol.c with our spread-3.17.4
installation and found it useful. The counter reset works as expected
before the total malicious blocking limit is reached. There is no
message loss.
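
As an illustration of the idea only, here is a minimal sketch of how a
32-bit sequence counter can be compared against a reset threshold
before it wraps. The names SEQ_RESET_THRESHOLD and seq_needs_reset, and
the threshold value itself, are hypothetical and are not taken from
protocol.c.

```c
#include <stdint.h>

/* Hypothetical threshold: ask for a reset once the counter is within
 * 2^20 values of the 32-bit maximum. This is an assumed value, not the
 * one used in protocol.c. */
#define SEQ_RESET_THRESHOLD (UINT32_MAX - (1u << 20))

/* Returns 1 when the counter is close enough to wrapping that a reset
 * should be triggered, 0 otherwise. */
static int seq_needs_reset(uint32_t seq)
{
    return seq >= SEQ_RESET_THRESHOLD;
}
```

Leaving a margin below the maximum gives the daemons time to complete
the reset while the counter can still advance.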

The only thing I find a little annoying is the very long
pause in daemon throughput during the reset process. I always observed
a 12 (twelve!) second pause in throughput. I assume this is twice the
Token_timeout value of 5 seconds plus something else (our spread
configuration is not wide_network). For wide_networks it would be
longer.

Of course it would be better if this temporary blocking could be
avoided, but for me this solution is currently sufficient.

A minor recommendation: The logging classification of that new reset
event should be Alarm(PRINT,"...Token seq number approaching...reset
it") instead of Alarm(PROTOCOL,...) because I want to see that event
in the spread.log without turning on other voluminous logging.
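
To show why the category matters, here is a minimal sketch of
mask-based log filtering. It is a stand-in, not Spread's Alarm()
implementation: the names sk_alarm, sk_enabled_mask, SK_PRINT and
SK_PROTOCOL and the bit values are hypothetical, and it assumes a
configuration in which only the PRINT category is enabled.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical category bits; the real values are defined by Spread. */
#define SK_PRINT    0x00000001u
#define SK_PROTOCOL 0x00000002u

/* Assumed configuration: only the PRINT category is enabled. */
static uint32_t sk_enabled_mask = SK_PRINT;

/* Writes the message only if its category is enabled.
 * Returns 1 if the message was written, 0 if it was filtered out. */
static int sk_alarm(uint32_t category, const char *fmt, ...)
{
    va_list ap;

    if ((category & sk_enabled_mask) == 0)
        return 0;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return 1;
}
```

Under that assumption, a reset message tagged SK_PROTOCOL is dropped,
while the same message tagged SK_PRINT reaches the log without
enabling any other category.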

Regards,
Martin

On Sat, Oct 11, 2008 at 4:13 PM, Jonathan Stanton <jonathan at spread.org> wrote:
> Hi,
>
> I've committed a fix to svn trunk for the problem where sequence numbers
> used by the daemons cause a hung daemon when they reach 2^32 (the max
> value of the counter). This fix works in my tests, but I would be very
> interested in anyone who has had this problem verifying that it also
> solves the problem for them. If you have not used the svn trunk before,
> you can find instructions at www.spread.org/devel.html
>
> This fix does not change the packet formats of Spread and so is fully
> compatible with Spread 4.0 systems. However, because it does not increase
> the counter size, what it does do is trigger a spurious membership
> change among the daemons when the counter gets close to wrapping, which
> resets it back to 0. This membership will NOT be seen by any of your
> client applications, but will cause a short (few second) pause in the
> daemon throughput of messages.
>
> Let me know what you think of this.
>
> Cheers,
>
> Jonathan
> --
> -------------------------------------------------------
> Jonathan Stanton         jonathan at spread.org
> Spread Group Messaging   www.spread.org
> Spread Concepts LLC      www.spreadconcepts.com
> -------------------------------------------------------