[Spread-users] Is there a limit on total traffic sent by Spread?

Tim Peters tim at zope.com
Fri Jan 18 23:47:06 EST 2002


[Guido van Rossum]
> ...
> Once I was logging all SESSION messages from the Spread daemon to a
> file; the log file abruptly ended with these messages:
>
> Sess_read: failed receiving message on session 9, ret is -1:
>            error: Resource temporarily unavailable
> Sess_read: failed recv msg more info: len read: 102192,
>            remain:  240, to_read: 240, pkt_index: 71, b_index: 0,
>            scat_nums: 72
> Sess_kill: killing session test1 ( mailbox 9 )
>
> If it's not the 4Gbyte limit, are there other reasons why a prolific
> *sender* can be disconnected?

Note that before we added our own batching, Barry got near the end of the 4.5GB test once too, also with SESSION logging.  He
reported an identical (modulo session name and mbox number) death:

Sess_read: failed receiving message on session 11, ret is -1:
           error: Resource temporarily unavailable
Sess_read: failed recv msg more info: len read: 102192,
           remain: 240, to_read: 240, pkt_index: 71, b_index: 0,
           scat_nums: 72
Sess_kill: killing session test1.zope ( mailbox 11 )
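
For reference, on Linux "Resource temporarily unavailable" is the strerror text for EAGAIN, the error a read on a non-blocking socket gives when no data is waiting.  Whether that is what the daemon ran into here is an assumption on my part.  A minimal Python sketch of the condition follows; the socketpair is only a stand-in and has nothing to do with Spread itself:

```python
import errno
import os
import socket

# A connected pair of local sockets; purely a stand-in, not a Spread session.
a, b = socket.socketpair()
a.setblocking(False)

err = None
try:
    # Nothing has been written to b, so there is nothing to read on a.
    a.recv(4096)
except BlockingIOError as exc:
    # Non-blocking read with no data available: EAGAIN / EWOULDBLOCK.
    err = exc.errno
    print(os.strerror(err))
finally:
    a.close()
    b.close()
```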

> Also, this *is* a multithreaded app; there's one thread receiving and
> one thread sending.  But the receiving thread is mostly idle: there
> isn't any traffic going in that direction.

Absolutely none.  The mailbox receive call is guarded by a select with a 1-second timeout, and we don't even try to do an mbox
receive unless the select says its descriptor is ready for reading.  But it should never be ready for reading.
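
The guard amounts to something like the sketch below.  It uses a plain socket in place of the mailbox, and guarded_receive is a made-up name for illustration, not our actual code or anything from the Spread API:

```python
import select
import socket


def guarded_receive(sock, timeout=1.0, bufsize=4096):
    """Return received bytes if sock is readable within timeout, else None."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        # select timed out: the descriptor is not ready, so don't read at all.
        return None
    return sock.recv(bufsize)


# socketpair stands in for the mailbox connection (an assumption for the demo).
a, b = socket.socketpair()

idle = guarded_receive(a, timeout=0.1)   # no traffic, so select times out

b.sendall(b"membership")                 # simulate one unexpected message
got = guarded_receive(a, timeout=1.0)    # now the descriptor is readable

a.close()
b.close()
```

With no inbound traffic the function just returns None once per timeout; it only reads when select reports the descriptor ready.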

> Yet, it *appears* that it gets the error first (before the sending
> thread gets the same error).

I can only guess this is a membership message generated by Spread, since our app isn't sending anything to any group this mbox has
joined.  It may be the self-leave message!

> The threads are sharing an mbox.  Could it be that the Linux network
> drivers somehow get overloaded and make the receive fail?  (This *is*
> a stress test -- before we upgraded all our kernels to the latest,
> 2.4.17, we had regular kernel crashes in an earlier test using the
> same 4.5 Gbyte database.)

Fun, isn't it <wink>?





