Re: [Spread-users] Spread daemon seems to "forget" its name
Schroeder, Heiko, ADBM62
heiko.schroeder at eads.com
Tue Nov 9 06:09:28 EST 2004
Hi,
I think I found the problem:
Temp_buf (in sess_body.h) seems to be too small in our
case; it overflows in G_mess_to_groups. And the linker
chose to place My after this buffer.
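
To illustrate the failure mode described above, here is a minimal sketch, not Spread's actual code. The struct `demo_layout`, the sizes, and both helper functions are hypothetical; the struct merely stands in for two globals that the linker happened to place back to back, so that the "overflow" stays inside one object and the demo remains well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_LEN  16  /* hypothetical, deliberately tiny scratch buffer */
#define DEMO_NAME_LEN 8   /* hypothetical size of the daemon name field */

/* Stand-in for a scratch buffer followed directly by the daemon's name. */
struct demo_layout {
    char temp_buf[DEMO_BUF_LEN];
    char name[DEMO_NAME_LEN];
};

/* Copies len bytes to the start of temp_buf WITHOUT checking DEMO_BUF_LEN.
 * The write goes through a pointer to the whole struct, so any len up to
 * sizeof(struct demo_layout) is defined behaviour; bytes past DEMO_BUF_LEN
 * land in name, just as an overflow would hit an adjacent global. */
void demo_unchecked_copy(struct demo_layout *mem, const char *src, size_t len)
{
    assert(len <= sizeof *mem);
    memcpy((char *)mem, src, len);
}

/* Returns 1 if the name now reads as an empty string, 0 otherwise. */
int demo_name_is_empty(const struct demo_layout *mem)
{
    return mem->name[0] == '\0';
}
```

If `name` holds "mfc2" and 20 zero bytes are copied, the last four bytes spill into `name` and it becomes an empty string, which would match the symptom of an empty My.name after a large message is processed.
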
I don't fully understand the code yet, but shouldn't
this buffer be able to hold at least MAX_MESSAGE_BODY_LEN
bytes, which would be about 144k?
Anyway, increasing the buffer to this size solved
our problem here.
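
A hedged sketch of what such a fix could look like: size the scratch buffer from the maximum message body length and refuse oversized copies instead of overflowing. The `DEMO_` constants below are assumptions chosen to reproduce the "about 144k" figure (100 scatter elements of 1472-byte packets minus a 32-byte header); check them against your own Spread headers. `demo_temp_buf` and `demo_copy_to_temp` are hypothetical names, not Spread's API.

```c
#include <stddef.h>
#include <string.h>

/* Assumed values, not taken from this thread; verify in your Spread tree. */
#define DEMO_MAX_SCATTER_ELEMENTS 100
#define DEMO_MAX_PACKET_SIZE      1472
#define DEMO_PACKET_HEADER_SIZE   32
#define DEMO_MAX_MESSAGE_BODY_LEN \
    (DEMO_MAX_SCATTER_ELEMENTS * \
     (DEMO_MAX_PACKET_SIZE - DEMO_PACKET_HEADER_SIZE))

/* Scratch buffer large enough for the largest message body. */
static char demo_temp_buf[DEMO_MAX_MESSAGE_BODY_LEN];

/* Capacity of the scratch buffer in bytes. */
size_t demo_temp_capacity(void)
{
    return sizeof demo_temp_buf;
}

/* Bounds-checked copy: returns 0 on success, -1 if len does not fit.
 * Nothing is written when the copy is refused. */
int demo_copy_to_temp(const char *src, size_t len)
{
    if (len > sizeof demo_temp_buf) {
        return -1;
    }
    memcpy(demo_temp_buf, src, len);
    return 0;
}
```

Returning an error on an oversized copy means a wrong size assumption shows up as a visible failure rather than as silent corruption of whatever the linker placed next to the buffer.
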
CU
Heiko
> -----Original Message-----
> From: Ryan Caudy [mailto:rcaudy at gmail.com]
> Sent: Tuesday, 9 November 2004 03:44
> To: Schroeder, Heiko, ADBM62
> Cc: spread-users at lists.spread.org
> Subject: Re: [Spread-users] Spread daemon seems to "forget" its name
>
> Hi,
>
> What OS are you using? What kinds of things are your clients doing?
> This isn't something that has turned up in ordinary testing, although
> I haven't put 3.17.3 through its paces the way I have with a slightly
> hacked 3.17.2, or a precursor to the current CVS head based on 3.17.2.
> In order to reproduce the problem, it would help to know any
> descriptive information you can think of.
>
> Cheers,
> Ryan
>
>
> On Mon, 8 Nov 2004 16:51:56 +0100, Schroeder, Heiko, ADBM62
> <heiko.schroeder at eads.com> wrote:
> > Hi,
> >
> > We just came across a problem which (I think) points to a
> > memory-management-related bug in Spread. This is
> > with version 3.17.3.
> >
> > We have a system of 12 hosts, each hosting several processes
> > that communicate using Spread. When switching one of the
> > hosts off and on again, sometimes (in about 30-50% of all
> > cases!), the whole system breaks down. At first, the crash
> > was because of an "illegal private name to kill" message.
> > I changed this message into a warning to see how the
> > system would react and switched SESSION debugging
> > on.
> >
> > The following output comes from one of the hosts that
> > were not switched off (the others produce output that is
> > very similar):
> >
> > [Mon 08 Nov 2004 13:57:18] Sess_read: queueing message of type 4 with len 0 to the protocol
> > [Mon 08 Nov 2004 13:57:19] Sess_read: Message has type field 0x80000084
> > [Mon 08 Nov 2004 13:57:19] Sess_read: queueing message of type 4 with len 0 to the protocol
> > Membership id is ( 176161537, 1099915040)
> > [Mon 08 Nov 2004 13:57:19] --------------------
> > [Mon 08 Nov 2004 13:57:19] Configuration at mfc2 is:
> > [Mon 08 Nov 2004 13:57:19] Num Segments 1
> > [Mon 08 Nov 2004 13:57:19] 12 10.128.255.255 4803
> > [Mon 08 Nov 2004 13:57:19] mfc1 10.128.3.1
> > [Mon 08 Nov 2004 13:57:19] mfc2 10.128.3.2
> > [Mon 08 Nov 2004 13:57:19] mfc3 10.128.3.3
> > [Mon 08 Nov 2004 13:57:19] mfc5 10.128.3.5
> > [Mon 08 Nov 2004 13:57:19] mfc6 10.128.3.6
> > [Mon 08 Nov 2004 13:57:19] siu1 10.128.2.1
> > [Mon 08 Nov 2004 13:57:19] siu2 10.128.2.2
> > [Mon 08 Nov 2004 13:57:19] siu3 10.128.2.3
> > [Mon 08 Nov 2004 13:57:19] siu5 10.128.2.5
> > [Mon 08 Nov 2004 13:57:19] gpcu1 10.128.1.1
> > [Mon 08 Nov 2004 13:57:19] gpcu2 10.128.1.2
> > [Mon 08 Nov 2004 13:57:19] gpcu3 10.128.1.3
> > [Mon 08 Nov 2004 13:57:19] ====================
> > [Mon 08 Nov 2004 13:57:19] Sess_read: Message has type field 0x80000084
> > [Mon 08 Nov 2004 13:57:19] Sess_validate_read_header: proc name mfc2 is not my name
> > [Mon 08 Nov 2004 13:57:19] Sess_kill: killing session P3636 ( mailbox 24 )
> > [Mon 08 Nov 2004 13:57:19] Sess_read: Message has type field 0x80000084
> > [Mon 08 Nov 2004 13:57:19] Sess_validate_read_header: proc name mfc2 is not my name
> > [Mon 08 Nov 2004 13:57:19] Sess_kill: killing session P3669 ( mailbox 27 )
> > [Mon 08 Nov 2004 13:57:19] Sess_read: Message has type field 0x80000084
> > [Mon 08 Nov 2004 13:57:19] Sess_validate_read_header: proc name mfc2 is not my name
> > [Mon 08 Nov 2004 13:57:19] Sess_kill: killing session P3637 ( mailbox 22 )
> > [Mon 08 Nov 2004 13:57:19] Sess_handle_kill: Illegal private name to kill #P3636#
> > [Mon 08 Nov 2004 13:57:19] Sess_handle_kill: Illegal private name to kill #P3669#
> > [Mon 08 Nov 2004 13:57:19] Sess_handle_kill: Illegal private name to kill #P1274#
> > [Mon 08 Nov 2004 13:57:19] Sess_handle_kill: Illegal private name to kill #P2135#
> >
> > Just before the new configuration message is output, everything seems
> > to be fine. But after this, the "My.name" is suddenly empty. All of the 11
> > "remaining" hosts showed the same problem; the one that "came back"
> > did not (might be by chance, though).
> >
> > I'll try to investigate this further, but I'd be very happy if someone who
> > really understands the code could help here... ;-)
> >
> > CU
> >
> > Heiko
> >
> > --
> > Heiko Schröder
> > EADS Deutschland GmbH
> > Defence and Communication Systems
> > Naval Combat Systems (ADBM62)
> > Bontekai 55
> > 26382 Wilhelmshaven - Germany
> > Tel: +49 44 21.15 43-230
> > Fax: +49 44 21.15 43-111
> > e-Fax: +49 731.392-20 91 11
> > heiko.schroeder at eads.com
> >
> > www.eads.com
> >
> > _______________________________________________
> > Spread-users mailing list
> > Spread-users at lists.spread.org
> > http://lists.spread.org/mailman/listinfo/spread-users
> >
>
>
> --
> ---------------------------------------------------------------------
> Ryan W. Caudy
> <rcaudy at gmail.com>
> ---------------------------------------------------------------------
> Bloomberg L.P.
> <rcaudy1 at bloomberg.net>
> ---------------------------------------------------------------------
> [Alumnus]
> <caudy at cnds.jhu.edu>
> Center for Networking and Distributed Systems
> Department of Computer Science
> Johns Hopkins University
> ---------------------------------------------------------------------
>