[Spread-users] SP_connect_timeout not really using a timeout

Shlomi Yaakobovich Shlomi at exanet.com
Tue Sep 21 12:18:23 EDT 2004


Hi all,

While I was playing with the network interfaces a lot, one of the Spread daemons got stuck for some reason. That was bad enough, but the Spread client, which uses SP_connect_timeout to connect to the local daemon, got stuck too. I looked into the code a bit, and it seems that the timeout is not used at all before the connect attempt, nor do the socket's flags indicate that the connect should be non-blocking.

I have a quick fix: just call

	fcntl(s, F_SETFL, O_NONBLOCK);

before trying to connect. Otherwise, the connect can block. This fix solved my problem. Below is a patch (diff -u).

Next time this happens, I will try to investigate why Spread got stuck in the first place.


Shlomi





--- sp.c        2004-09-21 19:15:01.000000000 +0300
+++ /patch/spread/sp.c    2004-09-21 19:13:35.000000000 +0300
@@ -465,6 +465,7 @@
                  * connect completes or fails.  This is a while loop but it is never
                  * done more then once. The while is so we can use 'break'
                  */
+            fcntl(s, F_SETFL, O_NONBLOCK);
                while( ((ret = connect( s, (struct sockaddr *)&inet_addr, sizeof(inet_addr) ) ) == -1)
                        && ((sock_errno == EINTR) || (sock_errno == EAGAIN) || (sock_errno == EWOULDBLOCK)) )
                 {
@@ -552,6 +553,7 @@

                unix_addr.sun_family = AF_UNIX;
                sprintf( unix_addr.sun_path, "/tmp/%d", port );
+        fcntl(s, F_SETFL, O_NONBLOCK);
                while( ((ret = connect( s, (struct sockaddr *)&unix_addr, sizeof(unix_addr) )) == -1)
                        && ((sock_errno == EINTR) || (sock_errno == EAGAIN) || (sock_errno == EWOULDBLOCK)) )
                 {



