[Spread-users] write(): java.net.SocketException

Ed Holyat Ed.Holyat at openlink.com
Wed Nov 7 16:44:35 EST 2012

I have not used the Java version of Spread, but a connection reset usually means that the connection was terminated abruptly rather than closed cleanly by the other side. Have you verified whether the Spread daemon closed the connection on purpose? You can enable debugging on the daemon to determine whether Spread dropped the connection because of a slow consumer.

Here are some other scenarios I have seen:

Antivirus software delaying packets, so that one side got a socket timeout (e.g. a SocketTimeoutException) that wasn't handled correctly; this produced a connection reset on the other side. Try disabling any antivirus software.

A client terminating before its socket has been flushed of all its packets. This can happen on a system with high memory or CPU usage, or simply when sending large packets. Monitor resources and check that the MTU is the same on both sides of the connection.

And there is always the possibility of hardware issues. You can try reproducing the problem outside of Spread by running ping with a large buffer size (Windows syntax, followed by the target host):

ping -t -l 1350

and looking for packet loss. This should be done from the client host to the daemon host and vice versa.
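
As an illustration of the timeout scenario above, here is a minimal sketch in plain java.net (not the Spread Java API) of a reader that treats a read timeout as "no data yet" instead of abandoning the socket. The class name, the helper `readFully`, the 100 ms timeout and the retry limit are all assumptions made for the example; the point is only that a `SocketTimeoutException` leaves the socket usable and the bytes already read intact.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

final class TimeoutTolerantRead {

    /** Fills buf completely; a read timeout is retried instead of aborting the message. */
    static int readFully(InputStream in, byte[] buf, int maxTimeouts) throws IOException {
        int off = 0;
        int timeouts = 0;
        while (off < buf.length) {
            try {
                int n = in.read(buf, off, buf.length - off);
                if (n < 0) {
                    throw new EOFException("peer closed after " + off + " bytes");
                }
                off += n;
            } catch (SocketTimeoutException e) {
                // The socket is still valid and bytes already in buf are kept.
                if (++timeouts > maxTimeouts) {
                    throw e;
                }
            }
        }
        return timeouts;
    }

    /** Loopback demo: the sender pauses mid-message for longer than the read timeout. */
    static String demo() {
        byte[] payload = "hello spread".getBytes(StandardCharsets.US_ASCII);
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept(); OutputStream out = s.getOutputStream()) {
                    out.write(payload, 0, 5);
                    out.flush();
                    Thread.sleep(300); // simulate a delayed second half
                    out.write(payload, 5, payload.length - 5);
                    out.flush();
                } catch (IOException | InterruptedException e) {
                    // Demo only: the reader will then fail with EOF or a timeout.
                }
            });
            sender.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(100); // hypothetical value, shorter than the sender's pause
                byte[] buf = new byte[payload.length];
                readFully(client.getInputStream(), buf, 50);
                sender.join();
                return new String(buf, StandardCharsets.US_ASCII);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A reader that instead discards its partial buffer or closes the socket on the first timeout is the kind of mishandling that can surface as a reset on the peer.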

From: Shawn Bradford [mailto:shawnb at mojix.com]
Sent: Wednesday, November 07, 2012 2:20 PM
To: Jonathan Stanton
Cc: spread-users at lists.spread.org
Subject: Re: [Spread-users] write(): java.net.SocketException

Here is an update on the status of this issue :
  - We tried adding 1ms delay between transmissions (still fails)
  - We tried upgrading to spread 4.2.0 (still fails)

We will try Melissa's suggestion and do some reading.


On Mon, Nov 5, 2012 at 3:33 PM, Jonathan Stanton <jonathan at spreadconcepts.com> wrote:
Hello Shawn,

Since you are using Spread 4.1, this may be a problem that has already been fixed. The Spread 4.2 release that came out in June has a number of fixes (especially to the Java API) that resolved several deadlock, disconnection and crash bugs. Please try the 4.2 release and see whether it resolves the problem. Alternatively, look at the changes to the Java API between 4.1 and 4.2 and merge them into the version of the Java library that you use; that could help.

I've included the summary release notes below.



The main new features of this release are:

1) Added Keepalive support to client-server TCP connections. Requires correct
   operating system values set for keepalives in order to be useful.
2) Switch internal code to use MONOTONIC clocks when available and appropriate
   to remove chance of system clock changes (from the clock being set) from affecting
   message processing
3) Break out events, memory, data_link and alarm code into separate
   libspread-util package. This package also has a number of improvements in
   the functionality of those code files which are listed in the internal
   package release notes.
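
For readers unfamiliar with TCP keepalive (feature 1 above), the following is a small sketch of what enabling it looks like on an ordinary Java socket. It uses only the standard `java.net.Socket` API and does not show how Spread itself enables keepalives; the class and method names are made up for the example. As the release note says, the option is only useful when the operating system's keepalive timers are set sensibly: on Linux, for instance, `net.ipv4.tcp_keepalive_time` defaults to two hours.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class KeepAliveSketch {

    /** Opens a TCP connection and turns on SO_KEEPALIVE for it. */
    static Socket connectWithKeepAlive(InetAddress host, int port) throws IOException {
        Socket s = new Socket(host, port);
        try {
            s.setKeepAlive(true); // probe timing is controlled by the OS, not by this call
            return s;
        } catch (IOException e) {
            s.close();
            throw e;
        }
    }

    /** Loopback demo: returns whether the option is reported as enabled. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = connectWithKeepAlive(server.getInetAddress(), server.getLocalPort())) {
            return client.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE enabled: " + demo());
    }
}
```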

It also includes a number of important bug fixes. The most significant include:

1) Fix bug with structure size on 64 bit platforms causing crash.
2) Fix several deadlock, crashes and race conditions in java Listener code.
3) Fix 100 ms timeout in java socket handling code so it does not corrupt
   messages that take a long time to arrive.
4) Fix java disconnect bug that prevented client from reconnecting until restarted.
5) Remove cause of slow message delivery when a client is receiving a lot of
   messages and gets into a badger state.
6) Improve help output and error messages in utility programs.
7) Fix token hurry bug that caused messages to have a 2 second latency in
   specific circumstances.
8) Fix crash bug when new daemon configuration files are loaded while the
   system is running.
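
Independently of which release is in use, an application that sees write() fail with "Connection reset" needs some way to reconnect and retry. The sketch below shows one possible shape for that logic. The `Link` interface is a hypothetical stand-in for a messaging connection and is not part of the Spread API; the attempt limit and backoff are arbitrary example values.

```java
import java.io.IOException;

final class ReconnectingSender {

    /** Hypothetical stand-in for a messaging connection; NOT part of the Spread API. */
    interface Link {
        void connect() throws IOException;
        void send(String msg) throws IOException;
        void close();
    }

    /** Sends msg, reconnecting after each failure; returns the attempt number that succeeded. */
    static int sendWithReconnect(Link link, String msg, int maxAttempts, long backoffMillis) {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                link.send(msg);
                return attempt;
            } catch (IOException e) {
                last = e;
                link.close(); // drop the broken connection before trying again
                try {
                    Thread.sleep(backoffMillis * attempt); // simple linear backoff
                    link.connect();
                } catch (IOException reconnectFailure) {
                    last = reconnectFailure; // the next send will most likely fail and retry
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }

    /** Demo with a fake link whose first two sends fail with a reset. */
    static int demo() {
        Link flaky = new Link() {
            private int failuresLeft = 2;
            public void connect() { }
            public void send(String msg) throws IOException {
                if (failuresLeft-- > 0) {
                    throw new IOException("Connection reset");
                }
            }
            public void close() { }
        };
        return sendWithReconnect(flaky, "ping", 5, 1);
    }

    public static void main(String[] args) {
        System.out.println("succeeded on attempt " + demo());
    }
}
```

With the fake link above, the send succeeds on the third attempt. Note that blindly retrying a send can deliver a message twice if the failed write had already reached the daemon, so a real application would need to decide whether duplicates are acceptable.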

Jonathan Stanton                jonathan at spreadconcepts.com
Spread Group Messaging  www.spread.org
Spread Concepts LLC     www.spreadconcepts.com

On Nov 5, 2012, at 3:03 PM, Shawn Bradford wrote:
> Hello,
> We are currently using spread and have found this error occurring quite
> frequently. Unfortunately there is little information on write() errors to
> be found on the net (many more read() errors).
> spread.SpreadException: write(): java.net.SocketException: Connection reset
> Would someone be able to describe what would be a potential issue causing
> this?  I am looking for some guidance as to the source of the error (maybe
> from a developer) to assist in debugging the error.
> We have tried to write several test apps to replicate the bug but have been
> unsuccessful.  Our system is quite large with many moving parts and it is
> unclear what sequence of events is causing the errors.
> We are using Spread 4.1 on 64-bit CentOS 5.5.
> Thanks in advance,
> ##Shawn
> --
> Director Software | Mojix Inc.
>  phone : +1.562.221.3474
>  email : shawn.bradford at mojix.com
>  web : www.mojix.com
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
