[Spread-users] Memory leak? FD leak? Other?
jonathan at cnds.jhu.edu
Fri Aug 20 00:47:51 EDT 2004
I understand what you are getting at. I have seen something I think is
related, but I have to look back at my bug archives to find it. I'll look
into it tomorrow. What I think might be happening is that the Spread
daemon is not noticing that the client program has quit, so it still has
the fd's active (as you noticed) and thus still has all of the messages
stored (since it still thinks the client is alive).
Usually Spread notices a client death immediately because of an error on
the tcp/unix socket, but I sort of remember a case where it didn't and
that's what I want to check.
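To illustrate the usual detection path (an error or end-of-file on the client's socket), here is a minimal Python sketch. It is not Spread's code: peer_has_closed is a hypothetical helper, and a local socket pair stands in for the daemon/client connection.

```python
import socket


def peer_has_closed(sock):
    """Return True if a non-blocking peek reports EOF or a connection error."""
    sock.setblocking(False)
    try:
        data = sock.recv(1, socket.MSG_PEEK)  # peek so no data is consumed
    except BlockingIOError:
        return False  # nothing to read yet, but the peer is still connected
    except ConnectionError:
        return True   # reset or similar error: treat the peer as gone
    return data == b""  # an empty read on a stream socket means EOF


# Stand-ins for the daemon's and the client's ends of one connection.
daemon_side, client_side = socket.socketpair()

alive_before = peer_has_closed(daemon_side)  # client end still open
client_side.close()                          # simulate the client dying
closed_after = peer_has_closed(daemon_side)  # daemon end now sees EOF

daemon_side.close()
```

When a process is killed, the kernel closes its sockets, so the other end normally sees exactly this EOF. The case worth checking is one where the daemon never gets to read the socket again.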
If the fd's are released (which is a good proxy for the daemon noticing
the client death) and the memory is not released, then that indicates a
different sort of case. Have you triggered that case, or only the one
where both the fds and memory are not reclaimed?
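One way to tell the two cases apart on Linux is to count the daemon's socket fds in /proc before and after killing the client. The sketch below is only an illustration: open_fds and socket_fd_count are hypothetical helper names, and the self-check at the end runs against the script's own process rather than a real Spread daemon.

```python
import os
import socket


def open_fds(pid):
    """Map each open fd number of a process to its /proc link target (Linux only)."""
    fd_dir = "/proc/%d/fd" % pid
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed between listdir() and readlink()
    return targets


def socket_fd_count(pid):
    """Count fds whose link target looks like 'socket:[inode]'."""
    return sum(1 for t in open_fds(pid).values() if t.startswith("socket:"))


# Self-check on our own process: a socket pair adds two socket fds,
# and closing both ends removes them again.
me = os.getpid()
baseline = socket_fd_count(me)
left, right = socket.socketpair()
with_pair = socket_fd_count(me)
left.close()
right.close()
after_close = socket_fd_count(me)
```

Against a daemon, you would call socket_fd_count with the daemon's pid (with sufficient permissions) before starting the client, while it runs, and after killing it. If the count returns to its starting value but memory stays high, that points to the second case.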
On Thu, Aug 19, 2004 at 06:57:07PM -0400, David Shaw wrote:
> I understand that my example is not realistic for a dozen reasons.
> The main point is in my last paragraph: If I kill the process
> generating all these messages, it seems spread does not notice. It
> does not release any of the memory it has taken up and since the
> destination mbox has disconnected, there is no way the messages could
> ever be delivered. Spread doesn't even close all of the file
> descriptors it had open to talk to the process.
> On Thu, Aug 19, 2004 at 06:47:58PM -0400, Ryan Caudy wrote:
> > Do any of the threads get disconnected? Spread won't disconnect the
> > entire process (it doesn't know about processes), but it should
> > disconnect each of the individual worker threads once it manages to
> > accumulate MAX_SESSION_MESSAGES (1000 by default) sent by that thread
> > (since they are only sending to themselves, and not using
> > SELF_DISCARD).
> > What matters more for your purposes is that (a) the messages you are
> > sending are huge, and (b) Spread limits messages based on number of
> > messages, not total bytes. So, what happens is that the daemon will
> > queue up to 500 messages (about 64 MB, in your case) that haven't been
> > sent, before it stops reading client messages. After that, it will
> > have up to 8192 messages outstanding that can't yet be delivered for
> > whatever reason. So, that's a lot of memory that it could be using in
> > a case like yours. I'm not sure what the solution is, except changing
> > the parameters that dictate Spread's queueing behavior in these
> > abusive cases.
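As a rough worked example of the numbers above, the sketch below estimates the memory involved. It rests on assumptions not stated in the thread: every message is the same size, "64 MB" means 64 MiB, and the 500 unsent and 8192 outstanding messages add up rather than overlap. The variable names are illustrative, not Spread parameters.

```python
MIB = 1024 * 1024

unsent_limit = 500         # messages queued before the daemon stops reading
unsent_bytes = 64 * MIB    # "about 64 MB" for those 500 messages
outstanding_limit = 8192   # messages that may be outstanding, undelivered

# Implied size of one message, assuming all messages are equal in size.
per_message_bytes = unsent_bytes / unsent_limit

# Memory held by the outstanding messages at that size.
outstanding_bytes = outstanding_limit * per_message_bytes

# Assumed worst case: both pools full at the same time.
total_bytes = unsent_bytes + outstanding_bytes

print("per message: %.1f KiB" % (per_message_bytes / 1024))
print("outstanding: %.1f MiB" % (outstanding_bytes / MIB))
print("total:       %.1f MiB" % (total_bytes / MIB))
```

Under those assumptions the outstanding messages alone come to roughly 1 GiB, which is why a limit counted in messages rather than bytes offers little protection when each message is large.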
> > Maybe there's some way to make Spread more cautious, but I'm not sure
> > what to suggest.
> > Cheers,
> > Ryan
> > On Thu, 19 Aug 2004 12:07:26 -0400, David Shaw <dshaw at archivas.com> wrote:
> > > I've been seeing odd behavior with spread 3.17.2 recently. Basically,
> > > the memory it uses grows steadily and never goes back down again. To
> > > be sure, Spread does some memory management internally so it may not
> > > wish to give back memory when I expect it to, but the behavior I am
> > > seeing is pretty far out of line.
> > >
> > > I've attached a simple program (sabuse) that spawns many threads, and
> > > each stuffs large messages into spread without reading them back.
> > > Obviously this is going to cause spread to grow since it must store
> > > the messages. However, there is no limit on the growing - spread will
> > > happily grow until the system runs out of swap, rendering that machine
> > > useless. Limiting the memory via ulimit does not work since spread
> > > will exit with a "Message_add_scat_element: Failed to allocate a new
> > > PACKET_BODY" if it cannot get enough memory.
> > >
> > > Of course, under normal circumstances nobody would do such a thing.
> > > However, when I kill the sabuse program, I would expect spread to
> > > give back some memory and it doesn't. I know spread keeps some memory
> > > around for performance reasons, but at this point it owns most of the
> > > memory and swap on the system. Similarly, I would expect spread to
> > > close all the file descriptors it has open to talk to the sabuse
> > > program and it doesn't (this is visible in /proc).
> > >
> > > This behavior is on linux 2.4.25, and glibc 2.3.2.
> > >
> > > David
Jonathan R. Stanton jonathan at cs.jhu.edu
Dept. of Computer Science
Johns Hopkins University