[Spread-users] Memory leak? FD leak? Other?

Jonathan Stanton jonathan at cnds.jhu.edu
Fri Aug 20 00:51:27 EDT 2004


I think in the case David is describing, the messages are queued in the
Session-level per-mbox queues, not in the global ordering/safety queues in
membership.c. So once a destination client disconnects from the daemon, all
of the messages in its session queue can be safely deleted, as Spread never
guarantees that every client processed a message, just that it reached the
daemon.

View it as if, instead of Spread queuing the messages, the kernel had
allocated a really, really big TCP/socket buffer and buffered all of the
messages as a byte stream to the client. If the client then crashed, the
kernel would just free the whole stored (but undelivered) byte stream.
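
To make the idea concrete, here is a rough sketch in C of a per-client
queue that is simply thrown away when the client goes away. The names
(session, queued_msg, session_enqueue, session_disconnect) are made up for
illustration and are not Spread's actual structures or functions; the real
session code may well be organized differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; NOT Spread's real structures. */
typedef struct queued_msg {
    struct queued_msg *next;
    size_t len;
    char *body;
} queued_msg;

typedef struct session {
    int mbox;            /* client connection descriptor, -1 once closed */
    size_t num_queued;   /* messages still waiting for this client */
    queued_msg *head;
    queued_msg *tail;
} session;

/* Queue a private copy of a message for one client.
 * Returns 0 on success, -1 if memory could not be allocated. */
int session_enqueue(session *s, const char *data, size_t len)
{
    queued_msg *m = malloc(sizeof *m);
    if (m == NULL)
        return -1;
    m->body = malloc(len ? len : 1);
    if (m->body == NULL) {
        free(m);
        return -1;
    }
    if (len)
        memcpy(m->body, data, len);
    m->len = len;
    m->next = NULL;
    if (s->tail)
        s->tail->next = m;
    else
        s->head = m;
    s->tail = m;
    s->num_queued++;
    return 0;
}

/* The client is gone, so nothing queued for it can ever be delivered:
 * free every pending message. Returns the number of body bytes released. */
size_t session_disconnect(session *s)
{
    size_t freed = 0;
    queued_msg *m = s->head;
    while (m != NULL) {
        queued_msg *next = m->next;
        freed += m->len;
        free(m->body);
        free(m);
        m = next;
    }
    s->head = s->tail = NULL;
    s->num_queued = 0;
    s->mbox = -1;   /* a real daemon would also close() the descriptor here */
    return freed;
}
```

The point of the sketch is only that the cleanup touches the one client's
queue and nothing else, which is why it does not interfere with the
ordering or safety guarantees handled elsewhere.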

Jonathan


On Thu, Aug 19, 2004 at 10:40:03PM -0400, Ryan Caudy wrote:
> This is because (in the normal case), there is more than one process
> sending messages through Spread, and more than one Spread daemon. 
> Those messages are needed, once sent, to ensure that ordering and
> delivery guarantees are met.  It might be feasible to add some sort of
> garbage collection to the message queues, (i.e. replace a message with
> a smaller dummy if it has no targets left), but this isn't
> implemented.
> 
> Cheers,
> Ryan
> 
> 
> On Thu, 19 Aug 2004 18:57:07 -0400, David Shaw <dshaw at archivas.com> wrote:
> > I understand that my example is not realistic for a dozen reasons.
> > The main point is in my last paragraph: If I kill the process
> > generating all these messages, it seems spread does not notice.  It
> > does not release any of the memory it has taken up and since the
> > destination mbox has disconnected, there is no way the messages could
> > ever be delivered.  Spread doesn't even close all of the file
> > descriptors it had open to talk to the process.
> > 
> > David
> > 
> > On Thu, Aug 19, 2004 at 06:47:58PM -0400, Ryan Caudy wrote:
> > > Do any of the threads get disconnected?  Spread won't disconnect the
> > > entire process (it doesn't know about processes), but it should
> > > disconnect each of the individual worker threads once it manages to
> > > accumulate MAX_SESSION_MESSAGES (1000 by default) sent by that thread
> > > (since they are only sending to themselves, and not using
> > > SELF_DISCARD).
> > >
> > > What matters more for your purposes is that (a) the messages you are
> > > sending are huge, and (b) Spread limits messages based on number of
> > > messages, not total bytes.  So, what happens is that the daemon will
> > > queue up to 500 messages (about 64 MB, in your case) that haven't been
> > > sent, before it stops reading client messages.  After that, it will
> > > have up to 8192 messages outstanding that can't yet be delivered for
> > > whatever reason.  So, that's a lot of memory that it could be using in
> > > a case like yours.  I'm not sure what the solution is, except changing
> > > the parameters that dictate Spread's queueing behavior in these
> > > abusive cases.
> > >
> > > Maybe there's some way to make Spread more cautious, but I'm not sure
> > > what to suggest.
> > >
> > > Cheers,
> > > Ryan
> > >
> > > On Thu, 19 Aug 2004 12:07:26 -0400, David Shaw <dshaw at archivas.com> wrote:
> > > > I've been seeing odd behavior with spread 3.17.2 recently.  Basically,
> > > > the memory it uses grows steadily and never goes back down again.  To
> > > > be sure, Spread does some memory management internally so it may not
> > > > wish to give back memory when I expect it to, but the behavior I am
> > > > seeing is pretty far out of line.
> > > >
> > > > I've attached a simple program (sabuse) that spawns many threads, and
> > > > each stuffs large messages into spread without reading them back.
> > > > Obviously this is going to cause spread to grow since it must store
> > > > the messages.  However, there is no limit on the growing - spread will
> > > > happily grow until the system runs out of swap, rendering that machine
> > > > useless.  Limiting the memory via ulimit does not work since spread
> > > > will exit with a "Message_add_scat_element: Failed to allocate a new
> > > > PACKET_BODY" if it cannot get enough memory.
> > > >
> > > > Of course, under normal circumstances nobody would do such a thing.
> > > > > However, when I kill the sabuse program, I would expect spread to
> > > > give back some memory and it doesn't.  I know spread keeps some memory
> > > > around for performance reasons, but at this point it owns most of the
> > > > memory and swap on the system.  Similarly, I would expect spread to
> > > > close all the file descriptors it has open to talk to the sabuse
> > > > program and it doesn't (this is visible in /proc).
> > > >
> > > > This behavior is on linux 2.4.25, and glibc 2.3.2.
> > > >
> > > > David
> > 
> > _______________________________________________
> > Spread-users mailing list
> > Spread-users at lists.spread.org
> > http://lists.spread.org/mailman/listinfo/spread-users
> > 
> 
> 
> -- 
> ---------------------------------------------------------------------
> Ryan W. Caudy
> <rcaudy at gmail.com>
> ---------------------------------------------------------------------
> Bloomberg L.P.
> <rcaudy1 at bloomberg.net>
> ---------------------------------------------------------------------
> [Alumnus]
> <caudy at cnds.jhu.edu>         
> Center for Networking and Distributed Systems
> Department of Computer Science
> Johns Hopkins University          
> ---------------------------------------------------------------------
> 

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------



