[Spread-users] Sess_validate_read_header: Message has illegal type field 0x80000080

Jonathan Stanton jonathan at cnds.jhu.edu
Thu Apr 21 21:36:22 EDT 2005


Hi,

No, the correct behavior I was referring to is that SP_multicast should 
either receive an error from the OS (if the socket is non-blocking) or 
block (if the socket is in blocking mode) when it tries to write() to 
a socket whose send buffer is full. So SP_multicast will not write 
data into the socket until there is room. 
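
To make the two cases concrete, here is a minimal sketch of a write loop 
over a plain POSIX socket. It is only an illustration: ex_write_all is a 
hypothetical helper, not part of the Spread API, and it is not meant to 
show how SP_multicast is actually implemented. In blocking mode write() 
simply sleeps until there is room; in non-blocking mode it fails with 
EAGAIN/EWOULDBLOCK, and the sketch waits with poll() before retrying.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a Spread function): write exactly len bytes
 * to fd. Returns 0 on success, -1 on error. */
static int ex_write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(fd >= 0 && buf != NULL);

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n > 0) {
            /* Possibly a partial write: advance and keep going. */
            done += (size_t)n;
        } else if (n < 0 && errno == EINTR) {
            /* Interrupted by a signal before any data was written. */
            continue;
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Non-blocking socket with a full send buffer:
             * wait until the kernel reports room, then retry. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
        } else {
            /* Real error (or an unexpected zero-length write). */
            return -1;
        }
    }
    return 0;
}
```

Either way, no bytes are dropped or overwritten on the sending side: a 
full buffer only delays the write or reports an error.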

The data corruption you observed in Sess_read is definitely 
a bug in either Spread or the OS.

I'm looking at the code and trying to see how such a corruption could 
occur and what debugging to add. I'll email back as soon as I have 
something.
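
As one possible debugging aid, here is a small sketch that decodes the 
type field values from the log quoted below. The bit values are inferred 
from this thread only (0x800000c4 being the endian bits plus 
FIFO_MESS | SELF_DISCARD, and 0x80000080 being the endian bits alone); 
the EX_ names are made up for the example and should be checked against 
the real definitions in the Spread headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from the log in this thread. */
#define EX_ENDIAN_BITS   0x80000080u  /* "only the endian bit" */
#define EX_SELF_DISCARD  0x00000040u
#define EX_FIFO_MESS     0x00000004u  /* logged as "type 4" */

/* Return the type field with the endian marker bits cleared. */
static uint32_t ex_type_bits(uint32_t type_field)
{
    return type_field & ~(uint32_t)EX_ENDIAN_BITS;
}

/* Non-zero if the header carries any type bits besides the endian marker. */
static int ex_has_type_bits(uint32_t type_field)
{
    return ex_type_bits(type_field) != 0;
}

/* Sanity checks against the two values seen in the log. */
static void ex_self_check(void)
{
    /* Normal message: FIFO_MESS | SELF_DISCARD survive the masking. */
    assert(ex_type_bits(0x800000c4u) == (EX_FIFO_MESS | EX_SELF_DISCARD));
    /* Rejected message: nothing is left once the endian bits are removed. */
    assert(!ex_has_type_bits(0x80000080u));
}
```

Logging the masked value next to the raw header at the point where the 
header is read would show whether the type bits are already missing 
there or are lost later.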

Cheers,

Jonathan

On Thu, Apr 21, 2005 at 08:15:41PM -0400, Scott Barvick wrote:
> Jonathan,
>  
> You are correct that I was talking about the C code this time.  The C
> code does push the TCP socket code up to the system maximum, but as
> you note it is still possible to overflow that.  When you say that the
> overflow condition is handled correctly, do you mean in the way that
> we observe: Sess_read gets invalid header data and then closes the
> connection?  Do you think there is anything we could do on the
> SP_multicast side to detect that the length would overflow the TCP
> buffer and then possibly wait?  Are there ioctl calls that could do
> this before a send?
>  
> Thanks,
> Scott
> 
> ________________________________
> 
> From: Jonathan Stanton [mailto:jonathan at cnds.jhu.edu]
> Sent: Thu 4/21/2005 5:48 PM
> To: Scott Barvick
> Cc: spread-users at lists.spread.org
> Subject: Re: [Spread-users] Sess_validate_read_header: Message has illegal type field 0x80000080
> 
> 
> 
> Mentioning this reminded me of something. I remember in the past there
> was a problem someone had that went away if they increased the tcp
> socket_buffers on the client-server tcp connections. If you are using
> Java, the code that increases the socket buffers is commented out in the
> library we distribute because it wasn't compatible with old JVMs, but
> if you are running anything current you can uncomment it and increase
> the buffer sizes.
> 
> Scott, I thought you were using the C library, not Java? Is that
> correct? The C library does increase the socket buffer size, although it
> can still be filled, and that should be handled correctly.
> 
> Jonathan
> 
> On Thu, Apr 21, 2005 at 05:32:41PM -0400, Scott Barvick wrote:
> > After more experimentation and upgrading to 3.17.3 (which still showed
> > the problem), I believe this is the same as the Java problems - we are
> > trying to send more data into the TCP socket than it can hold.  With big
> > sends and high rates, a temporary increase in load on the daemon will
> > cause the clients to overrun the TCP buffer.  This resulted in the
> > Sess_read reading in all 0s for the header and then appropriately
> > calling it invalid.
> >
> > Is there any way for the SP_multicast() to monitor the TCP queue and
> > block on a send until there is enough room?
> >
> > Thanks,
> > Scott
> >
> > On Tue, 2005-04-19 at 20:18, Ryan Caudy wrote:
> > > The interaction there seems fairly normal -- if I had to guess, I
> > > would say that this points to a memory-corrupting bug.  I assume this
> > > is version 3.17.2 or 3.17.3?
> > >
> > > Cheers,
> > > Ryan
> > >
> > > On 4/19/05, Scott Barvick <sbarvick at revasystems.com> wrote:
> > > > Greetings,
> > > >
> > > > I'm getting the following error when running with a few test systems,
> > > > and I'm curious if others have seen anything similar.  I believe we are
> > > > hitting it medium hard with sends between 2 systems.  When I disconnect
> > > > one system with a hard stop of the app, the other system sees this,
> > > > processes the membership changes, but then a short time later kills the
> > > > client session when it receives a type field that has no type bits set
> > > > (only the endian bit - 0x80000080).
> > > >
> > > > I turned on SESSION and GROUP debug logging and included the output
> > > > below.  I was looking through the code to see how a message can get
> > > > through without the (FIFO_MESS | SELF_DISCARD) bits set as we send them
> > > > with the SP_multicast() call.  It probably is significant that the group
> > > > just dropped from 2 members to 1 member (the sender), but this works
> > > > fine in the steady state operation, even with only one member.
> > > >
> > > > Any similar experience or thoughts?
> > > >
> > > > Thanks,
> > > > Scott
> > > >
> > > > -------------------
> > > >
> > > > [...] lots more where this came from
> > > >
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > *****  Other system goes down ******
> > > > Send_join: State is 4
> > > > Send_join: State is 4
> > > > Memb_handle_token: handling form2 token
> > > > Handle_form2 in FORM
> > > > Memb_transitional
> > > > G_handle_trans_memb:
> > > > G_handle_trans_memb in GOP
> > > > G_handle_trans_memb: Received trans memb id of: {proc_id: -1408236782
> > > > time: 1113940766}
> > > > Memb_regular
> > > > Membership id is ( -1408236782, 1113940767)
> > > > --------------------
> > > > Configuration at testsys8 is:
> > > > Num Segments 1
> > > >         1       239.16.3.18       4803
> > > >                 testsys8                   172.16.3.18
> > > > ====================
> > > > G_handle_reg_memb:  with (172.16.3.18, 1113940767) id
> > > > G_handle_reg_memb in GTRANS
> > > > G_handle_reg_memb: skipping state transfer for group RTestGroup.
> > > > G_handle_reg_memb: skipping state transfer for group TTestGroup.
> > > > G_handle_reg_memb: skipping state transfer for group GTestGroup.
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > ******** start to receive membership messages ******
> > > > received TRANSITIONAL membership for group RTestGroup
> > > > Received REGULAR membership for group RTestGroup with 1 members, where I
> > > > am member 0:
> > > >         #RTEST0#testsys8
> > > > grp id is -1408236782 1113940767 1
> > > > Due to NETWORK change. VS set has 1 members:
> > > >         #RTEST0#testsys8
> > > > received TRANSITIONAL membership for group TTestGroup
> > > > received TRANSITIONAL membership for group GTestGroup
> > > > Received REGULAR membership for group TTestGroup with 1 members, where I
> > > > am member 0:
> > > >         #TTEST0#testsys8
> > > > grp id is -1408236782 1113940767 1
> > > > Due to NETWORK change. VS set has 1 members:
> > > >         #TTEST0#testsys8
> > > > Received REGULAR membership for group GTestGroup with 1 members, where I
> > > > am member 0:
> > > >         #TTEST0#testsys8
> > > > grp id is -1408236782 1113940767 1
> > > > Due to NETWORK change. VS set has 1 members:
> > > >         #TTEST0#testsys8
> > > > ***** we thought things were ok *******
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > >
> > > > ******  Something isn't right ************
> > > > Sess_read: Message has type field 0x80000080
> > > > Sess_validate_read_header: Message has illegal type field 0x80000080
> > > > SP_error: (-8) Connection closed by spread
> > > > Sess_kill: killing session RTEST0 ( mailbox 14 )
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > > Sess_read: Message has type field 0x800000c4
> > > > Sess_read: queueing message of type 4 with len 0 to the protocol
> > > >
> > > > _______________________________________________
> > > > Spread-users mailing list
> > > > Spread-users at lists.spread.org
> > > > http://lists.spread.org/mailman/listinfo/spread-users
> > > >
> >
> >
> 
> --
> -------------------------------------------------------
> Jonathan R. Stanton         jonathan at cs.jhu.edu
> Dept. of Computer Science  
> Johns Hopkins University   
> -------------------------------------------------------
> 
> 



-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------



