[Spread-users] after fork: session 'xxx' trying to make session 'yyy' do something

James Rauser j.rauser at science-factory.com
Mon Oct 7 09:21:09 EDT 2002


Yair Amir wrote:
> 
> Hi,
> 
> I would ask you to try one last thing that I suspect will work.
> In the child, immediately after the fork, call SP_kill( parent_mbox );
> instead of your current close( parent_mbox ); in the non-working version.
> And then do the SP_connect for the new fd.
> 
> SP_kill is not a function we make available in the interface of the
> spread library. It exists though, and I just want to check if that
> solution works in your case (I am pretty sure it will). If you want to
> avoid warnings - you might need to add it to sp_func.h, but I think this is
> not necessary.
> 
> I am not sure that a long-term solution for the problem is to make
> SP_kill available in the spread interface, but let's see if that works
> to help figure out the situation.
> 

I tried SP_kill today, and I am reasonably certain that in a
single-threaded program it would do the trick; the Session struct
gets correctly re-used in the child.

Unfortunately, my system is multi-threaded, and at the time of the
fork() call, there is usually a thread in the parent which is sitting
in SP_receive and thus holds a mailbox mutex in the spread client.
Since (POSIX) fork() only duplicates the calling thread, using SP_kill
in the child lets SP_join and SP_multicast succeed, but the
first SP_receive in the child deadlocks: the inherited
mbox_mutex is held by a thread which no longer exists :(.
To really fix this, the threaded spread library would have to use
pthread_atfork() to ensure that all of its mutexes are put into
a known state in the child process -- yuck.
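
For readers unfamiliar with pthread_atfork(), the following is a minimal
sketch of the usual pattern, not code from the spread library. The mutex
and all function names are hypothetical stand-ins. The prepare handler
takes the lock before fork(), and both the parent and child handlers
release it afterwards, so the child never inherits a mutex owned by a
thread that does not exist there.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a library-internal per-mailbox mutex. */
static pthread_mutex_t mbox_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread just before fork(): take the lock so that
 * no other thread can be holding it at the moment of the fork. */
static void before_fork(void)       { pthread_mutex_lock(&mbox_mutex); }

/* Runs in the parent after fork(): release the lock again. */
static void after_fork_parent(void) { pthread_mutex_unlock(&mbox_mutex); }

/* Runs in the child after fork(): the child's only thread is the copy
 * of the forking thread, which owns the lock, so it may release it. */
static void after_fork_child(void)  { pthread_mutex_unlock(&mbox_mutex); }

/* Register the handlers once, e.g. at library initialisation.
 * Returns 0 on success, an error number otherwise. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Demo: fork, and let the child check that the mutex is usable.
 * Returns the child's exit status (0 if the mutex was free), or -1. */
int fork_and_check_mutex(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&mbox_mutex);
        if (rc == 0)
            pthread_mutex_unlock(&mbox_mutex);
        _exit(rc == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

One caveat with this pattern: the prepare handler blocks until the mutex
is free, so if another thread holds it across a blocking receive, fork()
itself would stall until that receive returns. A real fix would have to
deal with that as well.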

My workaround at the moment is to dup2() a dummy file handle opened to
/dev/null onto the parent's mbox in the child (thus closing the child's
copy of the connection to the spread daemon), and just leave the
inherited session data alone (i.e., *not* call SP_kill).  This wastes
one FD per child, but I can live with that.

Jim

-- 
------------------------------------------------------------------------
Jim Rauser                                          Science Factory GmbH
mailto:j.rauser at science-factory.com                       Unter Käster 1
Tel: +49 221 277 399 204                          50667 Cologne, Germany
