[Spread-users] Java - connection closed while reading data

Jonathan Stanton jonathan at cnds.jhu.edu
Wed Feb 6 19:33:57 EST 2008


This error most commonly occurs when the client is not reading the messages fast enough from the server, 
so the server queues up messages (up to a limit) and then will close down the connection to the client 
since it has nowhere else to buffer messages for that particular client. 

You can easily check whether this is occurring. Edit spread.conf to add the SESSION type to the 
DebugFlags field (for example, DebugFlags = { PRINT EXIT SESSION }) so that session errors are 
printed. Then look in the daemon output for messages like:

Sess_write: killing mbox %d for not reading

If you see any of those, then the receiving application was not keeping up with the sending rate and at 
least 1000 messages had been queued up for that receiver. 
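
Since SESSION debugging can produce a lot of output, a small filter can pick out just that line. The following is a minimal sketch, not part of Spread: the class name and the assumption that the daemon output was saved to a file passed as the first argument are mine. The pattern is the message quoted above with %d replaced by a mailbox number.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Hypothetical helper: finds slow-reader disconnects in saved daemon output. */
class SlowReaderLogScan {
    // The daemon message quoted above, with %d filled in by a mailbox number.
    private static final Pattern KILLED =
        Pattern.compile("Sess_write: killing mbox (\\d+) for not reading");

    /** Returns the mailbox number from every matching log line, in order. */
    static List<Integer> droppedMailboxes(Stream<String> logLines) {
        List<Integer> mboxes = new ArrayList<>();
        logLines.forEach(line -> {
            Matcher m = KILLED.matcher(line);
            if (m.find()) {
                mboxes.add(Integer.parseInt(m.group(1)));
            }
        });
        return mboxes;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the file the daemon output was redirected to (assumed).
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            List<Integer> dropped = droppedMailboxes(lines);
            System.out.println(dropped.isEmpty()
                ? "No slow-reader disconnects found"
                : "Daemon dropped mailboxes: " + dropped);
        }
    }
}
```

If this reports nothing while the client still sees "Connection closed while reading data", the slow-reader explanation probably does not apply.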

If you don't see that, then there might be another cause and I can suggest some more detailed event 
tracing to try and determine it. 

This issue has been discussed extensively on the mailing list with lots of examples and explanation, but
the bottom line is there are only two solutions:

1) If the burstiness is limited, but greater than the current max buffering, you can raise the limit 
defined for MAX_SESSION_MESSAGES in spread_params.h (daemon source code) and recompile.

2) You add flow control to your application so it doesn't send faster than the receivers can handle.
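
As one illustration of option 2, here is a minimal sketch of a credit-based (windowed) sender. It is only one possible scheme, not something Spread provides: the class name, the transport hook, and the idea that receivers send acknowledgements back (for example as ordinary messages to the sender) are all assumptions. The sender stops once a fixed number of messages are unacknowledged and resumes when acknowledgements arrive.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Sketch of sender-side flow control: at most `window` unacknowledged messages. */
class WindowedSender {
    private final Semaphore credits;
    // Hypothetical hook: wraps whatever call actually multicasts the payload.
    private final Consumer<byte[]> transport;

    WindowedSender(int window, Consumer<byte[]> transport) {
        this.credits = new Semaphore(window);
        this.transport = transport;
    }

    /** Waits up to timeoutMs for a free credit; returns false if none arrived. */
    boolean send(byte[] payload, long timeoutMs) {
        try {
            if (!credits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // window is full: the receiver is behind
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        transport.accept(payload);
        return true;
    }

    /** Call when a receiver reports that it has consumed `count` messages. */
    void onAck(int count) {
        credits.release(count);
    }

    public static void main(String[] args) {
        int[] sent = {0};
        WindowedSender sender = new WindowedSender(2, payload -> sent[0]++);
        System.out.println(sender.send(new byte[0], 10)); // true
        System.out.println(sender.send(new byte[0], 10)); // true
        System.out.println(sender.send(new byte[0], 10)); // false: window full
        sender.onAck(1);
        System.out.println(sender.send(new byte[0], 10)); // true
    }
}
```

For this to prevent the disconnect, the window would need to stay below the daemon's per-session limit, and with several receivers the credits would have to follow the slowest one.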

Given your description, it sounds to me like this might be the problem: exactly when the 1000-message 
queue fills up depends on how fast each program is running (scheduling delays and such have an impact 
at these rates), so it could vary somewhat from run to run. 
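
To make the timing argument concrete, here is a rough back-of-the-envelope sketch. It assumes constant rates, which real runs will not have. The 200 messages per second is the upper figure from the original report; the 150 messages per second drain rate is purely illustrative, not a measurement.

```java
/** Hypothetical estimate of how long a steady rate mismatch takes to fill the queue. */
class QueueFillEstimate {
    /**
     * Seconds until `limit` messages are backlogged for one receiver,
     * or positive infinity if the receiver keeps up with the sender.
     */
    static double secondsToOverflow(int limit, double sendPerSec, double recvPerSec) {
        double growthPerSec = sendPerSec - recvPerSec;
        return growthPerSec <= 0 ? Double.POSITIVE_INFINITY : limit / growthPerSec;
    }

    public static void main(String[] args) {
        // Backlog grows by 50 messages per second, so 1000 / 50 = 20 seconds.
        System.out.println(secondsToOverflow(1000, 200, 150)); // prints 20.0
    }
}
```

Because the result depends on the difference between the two rates, a small change in either one (a scheduling delay, for instance) shifts the failure point noticeably.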

Hope that helps, but if you think it's something else, get back to us with more details and, if possible, 
some daemon logs showing errors. 

Cheers,

Jonathan

On Tue, Feb 05, 2008 at 11:48:42PM +0000, Christopher Browne wrote:
> "John Lister" <john.lister-spread at kickstone.com> writes:
> > Hi, i have a simple spread network consisting of 2 daemons and a pair of applications talking to them (via localhost).
> >
> > One application sends messages to the network while the other (an application server) receives them and places them into a durable message queue to be processed
> > later.
> >
> > I'm just doing some performance tests and at about 100-200 messages a second, the receiver dies with an exception
> >
> > "Connection closed while reading data".
> >
> > I've noticed some similar messages on the forum from a few years ago... Is this still an issue or has it been resolved? Is there any way of finding out what is
> > causing the connection to die, and why? I've tried to turn on debugging messages in the daemon, but it is like looking for a pin in a haystack because of the amount of log
> > messages generated.
> >
> > The app server is able to "process" the messages from spread fast enough, so i don't think that is the issue.
> >
> > Any hints?
> 
> Hmm.  We're seeing much the same condition here...
> 
> Scenario is similar, albeit with somewhat higher numbers...
> 
>  - Created a client that is encoding messages in ASN.1 format;
> 
>  - It throws those messages at a Spread group (actually, where there is just 1 spread daemon which is running on a *remote* box)
> 
>  - A "message processor" then subscribes to the Spread group, grabs the messages, decodes them (from ASN.1), and then spits out messages.
> 
> It *appears* like a load issue; when the client is set up to throw
> 5000 messages, as fast as it can, the message processor tends to fall
> over, with "Connection closed while reading data."
> 
> It's not happening at any fixed location in the message stream; when
> it happens varies quite a bit.  Oh, and occasionally, it happens in a
> different spot: "Connection closed while reading header".
> 
> I'm guessing that some resource is getting overrun.  Not at all sure
> what.
> 
> In a way, it's an unfair case; we're throwing messages at the bus as
> hard as we can, at this stage of development.  This may be how things
> break down when a reader can't handle the message volume.  It would be
> good to know that for sure, though.
> -- 
> "cbbrowne","@","linuxfinances.info"
> http://cbbrowne.com/info/
> Rules  of the  Evil Overlord  #227.  "I will  never bait  a trap  with
> genuine bait." <http://www.eviloverlord.com/>
> 

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------



