[Spread-users] shutting down spread daemon cleanly

Thu Oct 28 12:13:48 EDT 2004

You can use spmonitor to shut it down, but that alone will not fix the 
TIME_WAIT state issue. 

The daemon is shutting down cleanly -- the TIME_WAIT state exists
because of how TCP/IP shuts down network connections. You can find a
detailed discussion of the issue in the Stevens Unix Network Programming
book or the sockets FAQ on the Internet, as this is a standard issue with
any network server and occurs when the server shuts down before its
clients do. 

The most reliable and safe thing to do is just wait the 2 minutes or so
(the exact interval depends on the OS) until the TIME_WAIT state expires.
However, that is often impractical for servers, so you can turn on a
socket option in the spread code that will allow the spread daemon to
reuse the same port/address for binding without waiting. The option is
in the spread.conf file and is called SocketPortReuse. I've included the
section of the sample.spread.conf file below which explains the setting
and some of the consequences. 

If you search the spread-users archive, a year or two ago we did a 
detailed examination of the security risks of allowing multiple binds 
and produced a table of which OSes were safe to do this on and which 
were not. Most of the up-to-date Unix platforms were safe -- but I don't 
remember the exact list. 

#Set handling of SO_REUSEADDR socket option for the daemon's TCP
# listener.  This is useful for facilitating quick daemon restarts (OSes
# often hold onto the interface/port combination for a short period of time
# after daemon shut down).
#
# AUTO - Active when bound to specific interfaces (default).
# ON   - Always active, regardless of interface.
#        SECURITY RISK FOR ANY OS WHICH ALLOW DOUBLE BINDS BY DIFFERENT USERS
# OFF  - Always off.

#SocketPortReuse = AUTO
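
As a rough illustration of what this setting controls at the socket
level, here is a minimal Python sketch of a TCP listener that sets
SO_REUSEADDR before binding. This is not Spread's own code (the daemon
is written in C); the function name make_listener and its parameters are
hypothetical, and port 0 is used only so the OS picks a free port for
the demonstration.

```python
import socket


def make_listener(host="127.0.0.1", port=0, reuse=True):
    """Create a TCP listening socket, optionally with SO_REUSEADDR set.

    Hypothetical helper for illustration only; not part of Spread.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        # The option must be set before bind() to affect the bind, which
        # is what lets a restarted server rebind while old connections
        # on the same address/port are still in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


if __name__ == "__main__":
    listener = make_listener()
    enabled = listener.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
    print("bound to", listener.getsockname(), "reuse enabled:", bool(enabled))
    listener.close()
```

Whether setting the option is safe depends on how the OS treats a second
bind to the same port by a different user, which is the risk the ON
setting above warns about.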

Cheers,

Jonathan

On Thu, Oct 28, 2004 at 10:08:20AM -0400, Ryan Caudy wrote:
> Yes.  Use spmonitor to instruct your daemons to exit.  We also have a
> stand-alone tool made by stripping out some of the monitor code that
> can be used to send a single command of that sort, suitable for
> scripting, but I don't think it's public anywhere.  If you need it,
> let me know, and I'll try to get you the source code.
> 
> Cheers,
> Ryan
> 
> On Thu, 28 Oct 2004 10:03:01 -0400, Paul Rubel <prubel at bbn.com> wrote:
> > Good morning,
> > 
> > We are using spread and have some automated tests that start and
> > stop spread daemons. We would like to stop the daemon from outside the
> > process, i.e. with kill. However, we have had some problems when using
> > kill, especially on Solaris.
> > 
> > When the daemon is killed on Solaris it often leaves its socket in the
> > TIME_WAIT state. When we go to start up a new daemon it fails because
> > the socket isn't available. Is there a way to ensure a clean shutdown
> > of the daemon?
> > 
> >     thank you for your help,
> >       Paul
> > 
> > _______________________________________________
> > Spread-users mailing list
> > Spread-users at lists.spread.org
> > http://lists.spread.org/mailman/listinfo/spread-users
> > 
> 
> 
> -- 
> ---------------------------------------------------------------------
> Ryan W. Caudy
> <rcaudy at gmail.com>
> ---------------------------------------------------------------------
> Bloomberg L.P.
> <rcaudy1 at bloomberg.net>
> ---------------------------------------------------------------------
> [Alumnus]
> <caudy at cnds.jhu.edu>         
> Center for Networking and Distributed Systems
> Department of Computer Science
> Johns Hopkins University          
> ---------------------------------------------------------------------
> 

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------