[Spread-users] Error
Tim Peters
tim at zope.com
Sat Jan 19 06:41:32 EST 2002
[Jonathan Stanton]
> The problem I remember did have to do with thread behavior when a
> disconnect occurred on a socket. The problem is that there are races
> when one thread gets a socket error and closes a socket (in the
> libsp code) and other threads are also trying to use that socket. I
> think it actually only happened when the socket was immediately
> reconnected and the socket number (fd) got reused. We know how to
> fix it, and I just don't recall if we have already integrated the
> fix or not.
I'm pretty sure we've seen this happen under 3.16.1, so I don't think a fix
has been released yet.
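
To make the descriptor-reuse hazard concrete, here is a minimal sketch that uses plain Python sockets rather than Spread itself. The function name demo_fd_reuse is made up for illustration. On POSIX systems a new descriptor gets the lowest free number, so a socket opened right after a close usually receives the number that was just released; other platforms may behave differently.

```python
import socket

def demo_fd_reuse():
    """Close one socket, open another, and return both descriptor numbers.

    Illustration only: this uses ordinary TCP sockets, not Spread mailboxes.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    stale_fd = first.fileno()   # the number another thread might still hold
    first.close()

    # Nothing else is opened in between, so on POSIX the kernel is free
    # to hand the same number straight back.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    fresh_fd = second.fileno()
    second.close()
    return stale_fd, fresh_fd

if __name__ == "__main__":
    stale, fresh = demo_fd_reuse()
    print("closed fd:", stale, "new fd:", fresh)
```

A thread that still holds the old number after the close would now be talking to the new connection without knowing it.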
[Guido van Rossum]
> I find it kind of strange that Spread closes the socket file
> descriptor; it would have been safer for the user if it just marked
> that mbox as "bad" without actually closing it (the reason being the
> file descriptor reuse case you describe). I had to put a bandaid
> around this problem in the Python wrapper (this bandaid isn't on the
> distribution on the web yet).
"The bandaid" is to set our own mbox wrapper object's disconnected flag to
true upon seeing CONNECTION_CLOSED or ILLEGAL_SESSION come back from Spread,
right? Alas, that doesn't really solve it, just makes it more unlikely:
because we release the global interpreter lock around the Spread API calls,
this (for example) is possible:
Thread A                            Thread B
--------                            --------
call Python mbox.receive()
passes self->disconnected check
releases GIL
                                    calls Python mbox.multicast()
                                    passes self->disconnected check
calls Spread SP_receive()
                                    releases GIL
reacquires GIL
sees CONNECTION_CLOSED
sets self->disconnected
        *** an arbitrarily long time can pass here ***
                                    calls Spread SP_multicast, now with
                                    a recycled mbox descriptor
That's what I was groping at when I said earlier we'd have to put our
Python-level mbox calls under protection of a mutex. Alternatively, and
until this problem is fixed in Spread, it would be better if we stopped
releasing the GIL in our wrapper (then the sequence of checking
self->disconnected, making a Spread call, and possibly setting
self->disconnected upon error, would be indivisible).