[Spread-users] NET_ERROR_ON_SESSION

Miguel Araújo miguelaraujo at lsd.di.uminho.pt
Fri Feb 18 12:21:34 EST 2011


Thank you for the clear explanation of the problem.

I have just solved it by setting the SELF_DISCARD flag on the publisher. With this flag, messages are not delivered back to the publisher's own connection, which avoids the delivery queue problem.
In my use case I don't need the publisher to receive the messages it sends. So this solution fits my needs.
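
For readers of the archive, here is a minimal sketch of why the flag helps. It is a toy model in plain C, not Spread code: the names client_model, model_multicast and sends_before_disconnect are hypothetical, and QUEUE_LIMIT uses the "~1000" figure from John's explanation below, so the exact value is an assumption. In the real C API, as far as I know, the flag is OR-ed into the service type passed to SP_multicast (for example FIFO_MESS | SELF_DISCARD).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; the real threshold is "~1000" per the thread (assumed value). */
enum { QUEUE_LIMIT = 1000 };

typedef struct {
    size_t queued;     /* messages waiting in this client's delivery queue */
    bool   joined;     /* client is a member of the destination group */
    bool   connected;  /* false once the modeled daemon drops the client */
} client_model;

/* Model one multicast by a sender that never reads its mailbox.
 * Returns false once the modeled daemon has closed the connection. */
bool model_multicast(client_model *sender, bool self_discard)
{
    if (!sender->connected)
        return false;
    if (sender->joined && !self_discard) {
        /* The message is delivered back to the sender and is never read. */
        sender->queued++;
        if (sender->queued > QUEUE_LIMIT) {
            sender->connected = false;
            return false;
        }
    }
    return true;
}

/* Count how many of n attempted sends succeed for a joined, non-reading sender. */
size_t sends_before_disconnect(size_t n, bool self_discard)
{
    client_model sender = { 0, true, true };
    size_t ok = 0;
    for (size_t i = 0; i < n; i++) {
        if (!model_multicast(&sender, self_discard))
            break;
        ok++;
    }
    return ok;
}
```

In this model, sends_before_disconnect(n, false) stops once the queue limit is exceeded, while sends_before_disconnect(n, true) completes all n sends because nothing is queued for the sender.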


Best regards,
Miguel Araújo 


On Feb 18, 2011, at 4:49 PM, John Schultz wrote:

> The problem is that your sender joins the group, so Spread delivers the messages back to it as well.  However, the publisher never reads them.  Spread gets angry at him because he is using up the daemon's memory in his delivery queue.  Spread has a threshold: once ~1000 messages have built up in a client's message delivery queue, it will kill that receiver -- and your publisher is also a receiver because it joined the group.
> 
> So, you have two options: (1) don't have your publisher join the group or (2) have your publisher also read messages.  The problem with (1) is that you have no feedback about who is in the group or how fast you can send -- you basically just have to guess and pray.
> 
> For (2), a simple model for your sender would be something like: join the group, send 100 messages, then until done { receive 1 msg; send 1 msg; }
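
The simple model above can be sketched as a small simulation. This is plain C with no Spread calls: delivery_queue, model_send, model_receive and run_windowed_sender are hypothetical names, and the model assumes every sent message is delivered back to the sender immediately and that each receive consumes exactly one of them. A real client would also see membership messages on the same mailbox, which this sketch ignores.

```c
#include <stddef.h>

/* Window size taken from the "send 100 messages" step in the model above. */
enum { WINDOW = 100 };

typedef struct {
    size_t queued;  /* self-delivered messages not yet read */
    size_t peak;    /* largest queue depth observed so far */
} delivery_queue;

/* Model a send: the message comes back to the sender's own queue. */
void model_send(delivery_queue *q)
{
    q->queued++;
    if (q->queued > q->peak)
        q->peak = q->queued;
}

/* Model a receive: one queued message is consumed, if any. */
void model_receive(delivery_queue *q)
{
    if (q->queued > 0)
        q->queued--;
}

/* Send `total` messages using the windowed pattern and
 * return the peak depth of the sender's delivery queue. */
size_t run_windowed_sender(size_t total)
{
    delivery_queue q = { 0, 0 };
    size_t sent = 0;

    /* Prime the window: send up to WINDOW messages without reading. */
    while (sent < total && sent < WINDOW) {
        model_send(&q);
        sent++;
    }
    /* Steady state: receive one message, then send one message. */
    while (sent < total) {
        model_receive(&q);
        model_send(&q);
        sent++;
    }
    /* Drain whatever is still queued before leaving. */
    while (q.queued > 0)
        model_receive(&q);

    return q.peak;
}
```

Under these assumptions the queue never grows beyond the window, so it stays well below the ~1000 message threshold however many messages are sent.
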
> 
> If you have multiple publishers sending to the same group, then you will have to somehow coordinate their sending rates so that you don't overwhelm any of the receivers, which is a bit of an advanced topic.  Hopefully, the simple model will suffice for you for now.
> 
> Cheers!
> 
> -----
> John Lane Schultz
> Spread Concepts LLC
> Phn: 301 830 8100
> Cell: 443 838 2200
> 
> On Feb 18, 2011, at 9:09 AM, Miguel Araújo wrote:
> 
> Hi. 
> 
> Thank you for your fast reply.
> 
> Ok, that's true. I have a sending app that joins a group and never reads; it just sends messages.
> 
> I've checked my code and I am checking all return codes. I forgot to mention that before I get the NET_ERROR_ON_SESSION I get a couple of CONNECTION_CLOSED errors. What I do in my code is reconnect and send the messages again. I also get the REJECTED_NOT_UNIQUE error. And inevitably I end up getting the NET_ERROR_ON_SESSION.
> 
> The app which is dying is the sender app. All the apps (sender and receivers) are in the same group and the messages are sent with the type FIFO.
> 
> So, if the connection is closed after ~1000 msgs are not read from the mailbox, could it be that my receiver app is not efficient enough at reading the messages? Or is it that Spread cannot handle a high message throughput? I use libevent in my receiver app to catch the events on the mailbox and then use the Spread API to receive the messages with SP_receive.
> 
> What can I do? 
> 
> Thanks!
> 
> Regards,
> Miguel Araújo
> 
> On Feb 17, 2011, at 6:17 PM, John Schultz wrote:
> 
>> Most likely, the daemon is closing the connection and your sender thread is seeing a permanent error recorded on the mailbox and is returning a generic error.  The daemon is most likely closing your connection due to you not reading from the mailbox fast enough.  For example, if you have a sending app. that joins a group and then never reads, then its connection will be closed after ~1000 msgs are not read from the mailbox.  Most likely, a previous call already failed with a CONNECTION_CLOSED error.
>> 
>> If you can describe a bit better which app. is dying (i.e. - sender or receiver) and whether they are joined to any groups that would be helpful.  Also, check your code and make sure you are checking all return codes, as almost surely the NET_ERROR_ON_SESSION is not the first time an error is returned.
>> 
>> Cheers!
>> 
>> -----
>> John Lane Schultz
>> Spread Concepts LLC
>> Phn: 301 830 8100
>> Cell: 443 838 2200
>> 
>> On Feb 17, 2011, at 12:42 PM, Miguel Araújo wrote:
>> 
>> Hello.
>> 
>> I am using Spread on a group of machines connected through a LAN. One of the machines is sending messages to the others at a fairly low rate.
>> After some minutes I receive the NET_ERROR_ON_SESSION message:
>> 
>> "SP_error: (-18) The network socket experienced an error. This Spread mailbox will no longer work until the connection is disconnected and then reconnected"
>> 
>> Even at low sending rates, Spread cannot sustain it for long and ends with the same error.
>> 
>> So, what is happening here? What are the limits of Spread in this case? Throughput, network, etc.
>> 
>> I would appreciate any help and information, and I am available to provide more information if necessary.
>> 
>> 
>> Thanks and kind regards,
>> 
>> --
>> Miguel Araújo
>> | miguelaraujo at lsd.di.uminho.pt |
>> 
>> 
>> 
>> 
>> _______________________________________________
>> Spread-users mailing list
>> Spread-users at lists.spread.org
>> http://lists.spread.org/mailman/listinfo/spread-users
>> 
> 
> 
> --
> Miguel Araújo
> | miguelaraujo at lsd.di.uminho.pt |


--
Miguel Araújo
| miguelaraujo at lsd.di.uminho.pt |