[Spread-users] Answer_retrans: retrans of 1 requested while Aru is 14

Matt Garman matthew.garman at gmail.com
Mon Feb 11 16:03:22 EST 2008


On Mon, Feb 11, 2008 at 02:58:35PM -0500, John Lane Schultz wrote:
> > I can't tell you why that occurred but I can say what the error
> > means.  The daemon that crashed believed that all the other
> > daemons had acknowledged receiving up through message #14.  But
> > then it looks like one of the daemons requested a resend of
> > message #1.  This shouldn't happen because the requesting daemon
> > had (allegedly) already acknowledged receiving up through
> > message #14.
> 
> > This could be some kind of wrap around problem with the message
> > counter?  The fact that the message numbers were so small (1,
> > 14) around the time of this failure strikes me as suspicious ...
>
> Following up on my last point, did this occur in a long running
> system or had you just started up the daemon(s)?

I'd say long-running: non-stop for a whole month.  And this daemon
gets a lot of use, too.
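
For anyone reading along, here is a minimal sketch of the invariant
described in the quoted explanation, and of the wrap-around guess. It is
written in Python for brevity and is not taken from the Spread source:
the function names and the 32-bit counter width are assumptions made
purely for illustration. "Aru" is read here as "all received up to",
the highest message number that every daemon is believed to have
acknowledged.

```python
# Hypothetical illustration only; none of these names come from the Spread source.

SEQ_BITS = 32                 # assumed counter width, chosen for illustration
SEQ_MOD = 1 << SEQ_BITS


def is_valid_retrans_request(requested_seq: int, aru: int) -> bool:
    """A resend request is consistent only for messages above the ARU.

    Everything at or below `aru` has supposedly been acknowledged by all
    daemons, so nobody should need it again.
    """
    return requested_seq > aru


def is_newer_with_wraparound(a: int, b: int) -> bool:
    """Serial-number comparison: True if `a` is newer than `b` modulo SEQ_MOD."""
    diff = (a - b) % SEQ_MOD
    return 0 < diff < SEQ_MOD // 2


# The situation from the error message: retrans of 1 requested while Aru is 14.
print(is_valid_retrans_request(1, 14))    # False: 1 was already acknowledged
print(is_valid_retrans_request(15, 14))   # True: 15 has not been acknowledged

# If the counter had wrapped, a plain comparison misorders sequence numbers,
# while a modular comparison still treats the post-wrap value as newer.
last_before_wrap = SEQ_MOD - 1
first_after_wrap = 0
print(first_after_wrap > last_before_wrap)                           # False
print(is_newer_with_wraparound(first_after_wrap, last_before_wrap))  # True
```

Whether the daemon's counter actually wrapped after a month of heavy
use is still only a hypothesis; the sketch just shows why small numbers
like 1 and 14 would be consistent with one.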




