[Spread-users] Answer_retrans: retrans of 1 requested while Aru is 14

John Lane Schultz jschultz at spreadconcepts.com
Mon Feb 11 14:56:48 EST 2008


I can't tell you why that occurred, but I can say what the error means.  The
daemon that crashed believed that all the other daemons had
acknowledged receiving up through message #14 (the Aru value in the log).
But then it looks like one of the daemons requested a resend of message #1.
This shouldn't happen, because the requesting daemon had (allegedly)
already acknowledged receiving up through message #14.
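
To make the check concrete, here is a minimal sketch of the invariant described above.  It is not the code from daemon/protocol.c: the function name, the exception type, and the exact comparison (<=) are assumptions for illustration only.

```python
class ProtocolViolation(Exception):
    """Hypothetical stand-in for the daemon's fatal Alarm(EXIT) path."""


def answer_retrans(requested_seq: int, aru: int) -> int:
    """Return the sequence number to resend.

    Assumption: everything up through `aru` has been acknowledged by all
    daemons, so a request for a message at or below `aru` contradicts
    those acknowledgements and is treated as fatal.
    """
    if requested_seq <= aru:
        raise ProtocolViolation(
            f"retrans of {requested_seq} requested while Aru is {aru}"
        )
    return requested_seq
```

With aru = 14, a request for message 15 passes the check, while a request for message 1 raises the error with the same wording as the log line.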

Could this be some kind of wraparound problem with the message
counter?  The fact that the message numbers were so small (1, 14) around the
time of this failure strikes me as suspicious ...
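
Purely as an illustration of what a wraparound could look like (I have not checked the counter's width or how the daemon compares sequence numbers, so the 32-bit mask and all names below are assumptions), here is a sketch of how a plain comparison misorders numbers once a fixed-width counter wraps:

```python
MASK = 0xFFFFFFFF  # assumed 32-bit counter; the real width is not confirmed


def next_seq(seq: int) -> int:
    """Advance the counter, wrapping to 0 after the maximum value."""
    return (seq + 1) & MASK


def naive_newer(a: int, b: int) -> bool:
    """Plain comparison: gives the wrong answer across a wrap."""
    return a > b


def wrap_aware_newer(a: int, b: int) -> bool:
    """True if a is ahead of b modulo 2**32 (serial-number style)."""
    return a != b and ((a - b) & MASK) < 0x80000000
```

Here next_seq(0xFFFFFFFF) yields 0, and a message numbered 1 sent after the wrap looks older than 0xFFFFFFFF to naive_newer but newer to wrap_aware_newer.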

Cheers!
John

---
John Lane Schultz
Spread Concepts LLC
Phn: 443 838 2200 
Fax: 301 560 8875

Monday, February 11, 2008, 2:37:37 PM, you wrote:

> Today we experienced a spread 4.0.0 daemon crash.

> The log had this:

> [Mon 11 Feb 2008 13:21:00] Answer_retrans: retrans of 1 requested while Aru is 14
> Exit caused by Alarm(EXIT)

> Looks like this message was generated in daemon/protocol.c:788

> But I'm not familiar enough with the source to know exactly what's
> going on at this point.  Overflow, maybe?

> Has anyone seen this before?

> This is the first daemon crash we're aware of since upgrading from
> 4.0.0rc2.

> Thanks!
> Matt






