[Spread-users] Send_new_packets: created packet 203 already exist 2

Matt Garman matthew.garman at gmail.com
Fri Mar 2 22:16:08 EST 2012


On Wed, Feb 22, 2012 at 1:04 PM, John Schultz
<jschultz at spreadconcepts.com> wrote:
> We are currently wrapping up a release candidate for the next version of Spread (4.2).  After that release is officially out, we intend to turn our attention to this issue to try to nail down what is happening here.
>
> Since you are regularly running into this issue, and it has proven very hard for us to cause in our test environment, it might be helpful if we could deploy test versions into your environment.  Would you be open to that kind of arrangement?

Hi John,

Sorry for the delayed response.  I'd be happy to help, but
unfortunately, I also cannot duplicate it in any kind of test or
development environment.  It only happens in production.  And I can't
afford to run test versions in a production environment.

I have "solved" the problem by distributing the load.  We have a
spread daemon network across four machines.  The systems that talk to
this spread network were heavily biased towards one server in the
segment, and that is the exact daemon that kept crashing.  I did a
"netstat | grep <spread_channel>" on the server with the issues and
saw that there were over 500 connections to that daemon.  The next
highest connection count was less than half that.  The other two
servers both had fewer than 100 connections.  So I just reconfigured a
few systems to point to one of the lightly loaded servers, and we
haven't had any crashes since.
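
For anyone who wants to repeat the per-daemon connection count described above, here is a minimal sketch in Python. It assumes Linux-style "netstat -tn" output (columns: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and that the daemon listens on a known TCP port. The function name count_connections and the port 4803 in the usage note are illustrative assumptions, not something taken from the setup described in this thread, and clients attached over a Unix domain socket would not show up in this count.

```python
def count_connections(netstat_output: str, port: int) -> int:
    """Count ESTABLISHED TCP connections whose local address ends in :port.

    Assumes Linux-style `netstat -tn` columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_addr, state = fields[3], fields[5]
        # Take the text after the last ':' so IPv6 addresses also work.
        local_port = local_addr.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            count += 1
    return count
```

To use it, capture the output on each daemon host, for example with subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout, and call count_connections(output, 4803), substituting whatever port your daemon is actually configured to use. Comparing the totals across hosts shows whether clients are concentrated on one daemon.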

I don't know if this helps you or not; it is probably just more of
what you already know: it seems to be load dependent.  But I will say,
we have one kind of spread-based program that doesn't create a lot of
individual connections, but generates a crazy amount of network
traffic.  The first thing we did was to deploy a new spread segment
specifically for this high-traffic program.  But the problem
persisted.  So at least in our case, the number of connections looks
like a more likely culprit than high network load.

Let me know if I can be of any more help,
Thanks again,
Matt


