[Spread-users] [BUG][PATCH] fd to session troubles

Tim Peters tim at zope.com
Fri Jan 25 14:11:46 EST 2002


[Marc Zyngier]
> I'd like to report a problem we've been facing on a multiprocessor NT
> system, as well as proposing a possible fix.
>
> The problem is the way the spread daemon finds a session given a
> mailbox.
>
> From sess_body.h :
>
> ext    int             Session_index[FD_SETSIZE]; /* converting from fd to index in Sessions */
>
> The Session_index array is a fd indexed lookup table for sessions. As
> noted in events.c, fd can be larger than FD_SETSIZE :
>
> /* Windows bug: Reports FD_SETSIZE of 64 but select works on all
>  * fd's even ones with numbers greater then 64.
>  */

Good catch!  The "Windows bug" comment is incorrect, because sockets simply
aren't fds on Windows (or BeOS) -- they're a distinct datatype.  The handle
that represents a socket on Windows comes out of a handle space whose upper
bound cannot be predicted in advance.  FD_SETSIZE on Windows is simply the
maximum number *of* socket handles you can pass to the Windows select.

Spread could also boost FD_SETSIZE on Windows: it's OK to #define it
yourself before #include'ing winsock.h (in Python, we boost it to 512 by
default).  Even so, there is no relation between FD_SETSIZE and the
numerical magnitude of Windows socket handles.  An fd_set on Windows is a
contiguous vector of socket handles, not a bit map as on Unix:

/* This is from winsock.h */
typedef struct fd_set {
        u_int   fd_count;               /* how many are SET? */
        SOCKET  fd_array[FD_SETSIZE];   /* an array of SOCKETs */
} fd_set;
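
To make the consequence of that layout concrete, here is a minimal sketch
that models it with stand-in types instead of the real winsock.h ones.  The
names my_socket_t, my_fd_set, my_fd_isset, my_fd_set_add and MY_FD_SETSIZE
are made up for illustration and are not part of Winsock or Spread.  The
point is only that membership is a scan over stored handle values, so a
handle such as 88 fits in a set whose capacity is 64:

```c
#include <assert.h>
#include <stddef.h>

#define MY_FD_SETSIZE 64            /* capacity: how many handles, not the largest handle value */

typedef unsigned long my_socket_t;  /* hypothetical stand-in for SOCKET */

typedef struct {
    unsigned int fd_count;                   /* how many handles are stored */
    my_socket_t  fd_array[MY_FD_SETSIZE];    /* the handles themselves */
} my_fd_set;

/* Return 1 if handle s is stored in the set, else 0 (linear scan). */
static int my_fd_isset(my_socket_t s, const my_fd_set *set)
{
    unsigned int i;

    assert(set != NULL);
    for (i = 0; i < set->fd_count; i++) {
        if (set->fd_array[i] == s)
            return 1;
    }
    return 0;
}

/* Add handle s; return 1 on success (or if already present), 0 if full. */
static int my_fd_set_add(my_socket_t s, my_fd_set *set)
{
    assert(set != NULL);
    if (my_fd_isset(s, set))
        return 1;
    if (set->fd_count >= MY_FD_SETSIZE)
        return 0;
    set->fd_array[set->fd_count++] = s;
    return 1;
}
```

Nothing here ever uses the handle as an array index, which is why the
magnitude of the handle is irrelevant to the set's capacity.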

> For example, on our test box, the first session gets fd number
> 88. Given that FD_SETSIZE is still 64, we are corrupting spread's
> internal structures, and spread eventually crashes at some later
> time ...

I suspect this explains a lot of flaky problems I've seen.  For example, the
distributed spread.exe doesn't work at all on Win98: attempts to connect
crash the daemon every time.  It "works fine" if I compile it myself, but
only in an MSVC Debug build.  It all "smells like" memory corruption.
I'll try your patch over the weekend.
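
For illustration only, here is a minimal sketch of a mailbox-to-session
lookup that never indexes an array by handle value.  This is not the patch
under discussion, and none of the names (MAX_SESSIONS, mailbox_t, session_t,
session_find, session_add, session_remove) come from the Spread sources;
they are assumptions made up for the example.  It trades the O(1) table
lookup for a linear search over the session table:

```c
#include <assert.h>

#define MAX_SESSIONS 128            /* hypothetical session table size */

typedef unsigned long mailbox_t;    /* hypothetical stand-in for a socket handle */

typedef struct {
    mailbox_t mbox;                 /* handle owned by this session */
    int       in_use;               /* nonzero if the slot is occupied */
} session_t;

static session_t sessions[MAX_SESSIONS];   /* zero-initialized: all slots free */

/* Return the index of the session owning mbox, or -1 if there is none. */
static int session_find(mailbox_t mbox)
{
    int i;

    for (i = 0; i < MAX_SESSIONS; i++) {
        if (sessions[i].in_use && sessions[i].mbox == mbox)
            return i;
    }
    return -1;
}

/* Claim the first free slot for mbox; return its index, or -1 if full. */
static int session_add(mailbox_t mbox)
{
    int i;

    for (i = 0; i < MAX_SESSIONS; i++) {
        if (!sessions[i].in_use) {
            sessions[i].mbox = mbox;
            sessions[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/* Free the slot at index idx. */
static void session_remove(int idx)
{
    assert(idx >= 0 && idx < MAX_SESSIONS);
    sessions[idx].in_use = 0;
}
```

With this shape, a first session whose handle is 88 (or any larger value)
simply lands in slot 0, and no write can fall outside the table.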





