[Spread-users] SP_connect_timeout not really using a timeout
Shlomi Yaakobovich
Shlomi at exanet.com
Tue Sep 21 12:18:23 EDT 2004
Hi all,
While experimenting heavily with the network interfaces, one of the Spread daemons got stuck for some reason. That was bad enough, but the Spread client, which uses SP_connect_timeout to connect to the local daemon, got stuck too. I looked into the code a bit, and it seems that the timeout is not used before the connect attempt, nor do the socket's flags indicate that the connect should be non-blocking.
I have a quick fix: just call
fcntl(s, F_SETFL, O_NONBLOCK);
before trying to connect. Otherwise, the connect could block. This fix solved my problem. Below is a patch (diff -u).
The next time this happens, I will try to investigate why Spread got stuck in the first place.
Shlomi
--- sp.c 2004-09-21 19:15:01.000000000 +0300
+++ /patch/spread/sp.c 2004-09-21 19:13:35.000000000 +0300
@@ -465,6 +465,7 @@
* connect completes or fails. This is a while loop but it is never
* done more then once. The while is so we can use 'break'
*/
+ fcntl(s, F_SETFL, O_NONBLOCK);
while( ((ret = connect( s, (struct sockaddr *)&inet_addr, sizeof(inet_addr) ) ) == -1)
&& ((sock_errno == EINTR) || (sock_errno == EAGAIN) || (sock_errno == EWOULDBLOCK)) )
{
@@ -552,6 +553,7 @@
unix_addr.sun_family = AF_UNIX;
sprintf( unix_addr.sun_path, "/tmp/%d", port );
+ fcntl(s, F_SETFL, O_NONBLOCK);
while( ((ret = connect( s, (struct sockaddr *)&unix_addr, sizeof(unix_addr) )) == -1)
&& ((sock_errno == EINTR) || (sock_errno == EAGAIN) || (sock_errno == EWOULDBLOCK)) )
{