[Spread-users] after fork: session 'xxx' trying to make session 'yyy' do something
Yair Amir
yairamir at cnds.jhu.edu
Tue Oct 1 07:53:39 EDT 2002
I am missing some information. For example, I need to see the EXACT
SP_multicast call you make (including all of its parameters).
:) Yair.
James> Hi,
James> I'm a Spread beginner. We're using Spread 3.16.2 in a system which
James> distributes work from a shared queue to a number of distributed
James> compute servers. The servers each join spread groups with the same
James> name as the queue, and I use the group ordering to designate one as
James> the "leader". The leader is responsible for monitoring the queue
James> and dispatching jobs to the remaining servers.
James> When a job is dispatched, the server which is running it forks a
James> new process in which to execute it. The FD for the spread connection
James> is (of course) inherited across the fork. The child needs its own
James> connection to the spread daemon, so it closes the parent's mbox
James> (with close(2)), and then calls SP_connect to reconnect.
James> Schematically:
James>   parent_mbox = SP_connect("xxx");
James>   SP_join(parent_mbox, "queue");
James>   while ( ... ) {
James>       SP_receive();
James>       if ( fork() == 0 ) {
James>           // child process for a new job
James>           close(parent_mbox);
James>           child_mbox = SP_connect("yyy");
James>           SP_join(child_mbox, "queue");
James>           SP_multicast()
James>       }
James>   }
James> If parent_mbox and child_mbox happen to be assigned the same (numerical)
James> FD, then the spread daemon produces the error message quoted in the
James> subject line, where "xxx" and "yyy" are the private names of the parent
James> and child, respectively, and the child's first attempt to SP_multicast()
James> dies with error -8: "connection closed by spread". I can work around
James> the bug by doing a dummy open() or pipe() call between the close()
James> and the SP_connect() in the child.
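
The descriptor reuse described here follows from the POSIX rule that a
newly allocated descriptor gets the lowest free number. Below is a minimal
sketch of that behaviour and of the placeholder workaround, without Spread
itself. The names open_connection() and reconnect_reuses_fd() are
hypothetical, and opening /dev/null is only an assumed stand-in for
SP_connect(); the sketch shows the numbering effect, not what the daemon
or sp.c does with it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for SP_connect(): any call that allocates a
 * new file descriptor behaves the same way for this purpose. */
static int open_connection(void)
{
    return open("/dev/null", O_RDWR);
}

/*
 * Simulates the child's close-then-reconnect step.
 * hold_placeholder != 0 opens a dummy descriptor between the close()
 * and the reconnect, as in the workaround.
 * Returns 1 if the new descriptor has the same number as the old one,
 * 0 if the numbers differ, -1 on error.
 */
int reconnect_reuses_fd(int hold_placeholder)
{
    int placeholder = -1;
    int old_fd = open_connection();      /* plays the role of parent_mbox */
    if (old_fd < 0)
        return -1;

    close(old_fd);                       /* child drops the inherited fd */

    if (hold_placeholder) {
        /* The dummy open() takes over the number that was just freed. */
        placeholder = open("/dev/null", O_RDONLY);
        if (placeholder < 0)
            return -1;
    }

    int new_fd = open_connection();      /* plays the role of child_mbox */
    if (new_fd < 0) {
        if (placeholder >= 0)
            close(placeholder);
        return -1;
    }

    int same = (new_fd == old_fd);

    close(new_fd);
    if (placeholder >= 0)
        close(placeholder);
    return same;
}
```

In a single-threaded process on a POSIX system, reconnect_reuses_fd(0)
should report that the number was reused, and reconnect_reuses_fd(1)
should report a different number, which matches the effect of the dummy
open() or pipe() call.
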
James> After a quick examination of sp.c, it appears that I don't want to do
James> an SP_disconnect() in the child, because that actually updates the
James> membership at the daemon and the parent is not actually disconnecting.
James> The internal function SP_kill() looks like it does what I want (close
James> the mbox and update the client-side session table), but it's static and
James> undocumented.
James> Suggestions? Any help would be appreciated.
James> Thanks, Jim