[Spread-users] write(): java.net.SocketException

Shawn Bradford shawnb at mojix.com
Sun Nov 11 19:47:27 EST 2012


We have finally got to the bottom of the issue.  Fundamentally, a
write-only client does not exist.  Once you are connected to Spread, any
client can send a message directly to any other client (non-multicast), so
a client that never reads can still accumulate incoming messages.  This
also explains the proffered remedies:
  - Adding a process to read messages (Melissa)
  - Increasing the message buffer size (Marcelo) would only make the
disconnect less frequent.
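
To illustrate why a larger buffer only postpones the disconnect, here is a
toy model of a per-client buffer with a hard cap.  It is a sketch under
assumptions, not Spread's actual code: the class name, its methods, and the
"drop the session once the cap is exceeded" rule are all hypothetical
stand-ins for what the daemon appears to do with MaxSessionMessages.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a capped per-client message buffer (hypothetical, not Spread code). */
final class SessionBufferModel {
    private final int maxSessionMessages;              // cap on unread messages
    private final Deque<String> pending = new ArrayDeque<>();
    private boolean connected = true;

    SessionBufferModel(int maxSessionMessages) {
        this.maxSessionMessages = maxSessionMessages;
    }

    /** A message arrives for this client; overflowing the cap drops the session. */
    void deliver(String message) {
        if (!connected) {
            return;
        }
        if (pending.size() >= maxSessionMessages) {
            connected = false;                         // modelled disconnect
            pending.clear();
            return;
        }
        pending.addLast(message);
    }

    /** The client reads (and discards) one pending message, if there is one. */
    void readOne() {
        pending.pollFirst();
    }

    boolean isConnected() {
        return connected;
    }

    /** Counts deliveries to a never-reading client, including the one that disconnects it. */
    static int messagesUntilDisconnect(int cap) {
        SessionBufferModel session = new SessionBufferModel(cap);
        int delivered = 0;
        while (session.isConnected()) {
            session.deliver("m" + delivered);
            delivered++;
        }
        return delivered;
    }

    /** A client that reads after every delivery never reaches the cap. */
    static boolean survivesWithReader(int cap, int incoming) {
        SessionBufferModel session = new SessionBufferModel(cap);
        for (int i = 0; i < incoming; i++) {
            session.deliver("m" + i);
            session.readOne();
        }
        return session.isConnected();
    }

    public static void main(String[] args) {
        System.out.println(messagesUntilDisconnect(1000));       // 1001
        System.out.println(messagesUntilDisconnect(10000));      // 10001
        System.out.println(survivesWithReader(1000, 1_000_000)); // true
    }
}
```

In this model, raising the cap from 1000 to 10000 moves the disconnect
roughly ten times further out but does not remove it, while a client that
reads stays connected regardless of the cap.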

If there is a way to configure a client to ignore all incoming messages, I
would be interested.  Otherwise our solution is in line with Melissa's:
we have added a process to read any messages.
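
For anyone hitting the same problem, the reader can be as simple as a
background thread that blocks on receive and throws the result away.  The
sketch below is an assumption-laden illustration rather than our production
code: MessageSource is a hypothetical stand-in for the connection's blocking
receive call, and the demo in main uses a plain queue instead of a real
Spread connection.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Background thread that reads and discards every incoming message. */
final class DrainingReader {
    /** Hypothetical stand-in for a connection's blocking receive call. */
    interface MessageSource {
        Object receive() throws InterruptedException;
    }

    private final MessageSource source;
    private final AtomicLong discarded = new AtomicLong();
    private final Thread thread;

    DrainingReader(MessageSource source) {
        this.source = source;
        this.thread = new Thread(this::drainLoop, "drain-reader");
        this.thread.setDaemon(true);                  // do not keep the JVM alive
    }

    private void drainLoop() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                source.receive();                     // block until a message arrives
                discarded.incrementAndGet();          // then drop it
            }
        } catch (InterruptedException e) {
            // stop() was called; fall through and let the thread end
        }
    }

    void start() {
        thread.start();
    }

    void stop() {
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();       // preserve the caller's interrupt
        }
    }

    long discardedCount() {
        return discarded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Object> inbox = new LinkedBlockingQueue<>();
        DrainingReader reader = new DrainingReader(inbox::take);
        reader.start();
        for (int i = 0; i < 5; i++) {
            inbox.put("unsolicited-" + i);
        }
        while (reader.discardedCount() < 5) {
            Thread.sleep(1);
        }
        reader.stop();
        System.out.println("discarded " + reader.discardedCount());
    }
}
```

With the Spread Java library the source would presumably wrap the
connection's own receive call (or a registered message listener), but that
wiring is not shown here and should be checked against the library's
documentation.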

Many thanks to all for the assistance.  It has provided greater insight as
well as fixed a nasty bug in our software.
##Shawn



On Wed, Nov 7, 2012 at 3:58 PM, Marcelo San-Martin <
Marcelo.San-Martin at harmonicinc.com> wrote:

> Hi,
> I used to have a similar problem. In my case I fixed it by increasing
> MaxSessionMessages in the configuration file. The default value was 1000; I
> increased it to 10000 and the problem went away.
>
> Cheers,
> Marcelo
>
>
> -----Original Message-----
> From: spread-users-request at lists.spread.org [mailto:
> spread-users-request at lists.spread.org]
> Sent: Wednesday, November 07, 2012 2:02 PM
> To: spread-users at lists.spread.org
> Subject: Spread-users Digest, Vol 91, Issue 4
>
> Today's Topics:
>
>    1. Re: write(): java.net.SocketException (Shawn Bradford)
>    2. Re: write(): java.net.SocketException (Ed Holyat)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 7 Nov 2012 11:20:23 -0800
> From: Shawn Bradford <shawnb at mojix.com>
> Subject: Re: [Spread-users] write(): java.net.SocketException
> To: Jonathan Stanton <jonathan at spreadconcepts.com>
> Cc: spread-users at lists.spread.org
> Message-ID: <CADTONkdQ4GbQc_nD5oTB4jcpJK5uAbWr7WjRaOrShtdp5W4JVw at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Here is an update on the status of this issue:
>   - We tried adding a 1 ms delay between transmissions (still fails)
>   - We tried upgrading to Spread 4.2.0 (still fails)
>
> We will try Melissa's suggestion and do some reading.
>
> Thanks,
> ##Shawn
>
>
> On Mon, Nov 5, 2012 at 3:33 PM, Jonathan Stanton <
> jonathan at spreadconcepts.com> wrote:
>
> > Hello Shawn,
> >
> > Since you are using Spread 4.1, this may be a fixed problem. The Spread
> > 4.2 release that came out in June has a number of fixes (especially to
> > the Java API) which solved a number of deadlock, disconnection and crash
> > bugs. If you can try the 4.2 release and see if that resolves the problem,
> > or look at the changes to the Java API between 4.1 and 4.2 and merge
> > them into the version of the Java library that you use that could help.
> >
> > I've included the summary release notes below.
> >
> > Cheers,
> >
> > Jonathan
> >
> > The main new features of this release are:
> >
> > 1) Added Keepalive support to client-server TCP connections. Requires
> >    correct operating system values set for keepalives in order to be
> >    useful.
> > 2) Switch internal code to use MONOTONIC clocks when available and
> >    appropriate to remove chance of system clock changes (from the clock
> >    being set) from affecting message processing
> > 3) Break out events, memory, data_link and alarm code into separate
> >    libspread-util package. This package also has a number of
> >    improvements in the functionality of those code files which are
> >    listed in the internal package release notes.
> >
> > It also includes a number of important bug fixes. The most significant
> > include:
> >
> > 1) Fix bug with structure size on 64 bit platforms causing crash.
> > 2) Fix several deadlock, crashes and race conditions in java Listener
> >    code.
> > 3) Fix 100 ms timeout in java socket handling code so it does not corrupt
> >    messages that take a long time to arrive.
> > 4) Fix java disconnect bug that prevented client from reconnecting
> >    until restarted.
> > 5) Remove cause of slow message delivery when a client is receiving a
> >    lot of messages and gets into a badger state.
> > 6) Improve help output and error messages in utility programs.
> > 7) Fix token hurry bug that caused messages to have a 2 second latency in
> >    specific circumstances.
> > 8) Fix crash bug when new daemon configuration files are loaded while the
> >    system is running.
> >
> >
> > -------------------------------------------------------------------------------
> > Jonathan Stanton                jonathan at spreadconcepts.com
> > Spread Group Messaging  www.spread.org
> > Spread Concepts LLC     www.spreadconcepts.com
> >
> > -------------------------------------------------------------------------------
> >
> >
> > On Nov 5, 2012, at 3:03 PM, Shawn Bradford wrote:
> >
> > > Hello,
> > >
> > > We are currently using Spread and have found this error occurring
> > > quite frequently. Unfortunately there is little information on
> > > write() errors to be found on the net (many more read() errors).
> > >
> > > *spread.SpreadException: write(): java.net.SocketException:
> > > Connection reset*
> > >
> > > Would someone be able to describe what would be a potential issue
> > > causing this?  I am looking for some guidance as to the source of
> > > the error (maybe from a developer) to assist in debugging the error.
> > >
> > > We have tried to write several test apps to replicate the bug but
> > > have been unsuccessful.  Our system is quite large with many moving
> > > parts and it is unclear what sequence of events is causing the
> > > errors.
> > >
> > > We are using Spread 4.1 on 64-bit CentOS 5.5.
> > >
> > > Thanks in advance,
> > > ##Shawn
> > >
> > > --
> > > Director Software | Mojix Inc.
> > > phone : +1.562.221.3474
> > > email : shawn.bradford at mojix.com
> > > web : www.mojix.com
> > > _______________________________________________
> > > Spread-users mailing list
> > > Spread-users at lists.spread.org
> > > http://lists.spread.org/mailman/listinfo/spread-users
> >
> >
> ------------------------------
>
> Message: 2
> Date: Wed, 7 Nov 2012 16:44:35 -0500
> From: Ed Holyat <Ed.Holyat at openlink.com>
> Subject: Re: [Spread-users] write(): java.net.SocketException
> To: "Shawn.Bradford at mojix.com" <Shawn.Bradford at mojix.com>, Jonathan
>         Stanton <jonathan at spreadconcepts.com>
> Cc: "spread-users at lists.spread.org" <spread-users at lists.spread.org>
> Message-ID: <648AFB5742D6394FB956DC60697556054491EB0883 at OLFANDEXCH01.andover.olf.com>
>
> Content-Type: text/plain; charset="us-ascii"
>
> I have not used the Java version of Spread, but usually a connection
> reset means that the connection terminated hard and the other side did not
> close it. Have you verified that the Spread daemon closed the connection on
> purpose? You can enable debugging on the Spread daemon to determine whether
> Spread closed the connection because of a slow consumer.
>
> Here are some other scenarios I have seen:
>
> Antivirus software delaying the packets and one side getting a
> Sockettimeoutconnection which wasn't handled correctly; this produced a
> connection reset on the other side.  Try disabling any antivirus software.
>
> This can also occur if a client terminates before a socket is flushed of
> all its packets.  This can happen on a system with high memory or CPU usage
> or just sending large packets.  Monitor resources and check that the MTU is
> the same on both sides of the connection.
>
> And there is always the possibility of hardware issues.  You can try
> duplicating the problem outside of Spread by executing ping with a large
> buffer size (ping -t -l 1350) and looking for packet loss.  This should be
> performed from client host to daemon host and vice versa.
>
> ------------------------------
>
> End of Spread-users Digest, Vol 91, Issue 4
> *******************************************