[Spread-users] RE: Spread on Linux question
mayer.crystal at gs.com
Thu Jul 7 11:45:54 EDT 2005
Thanks for the response. Unfortunately, it doesn't seem to be that
easy. Using spmonitor and even taking down and bringing up the various
spread daemons doesn't seem to force the error to occur in my dev
environment. I'll keep on trying and let you know if I can reproduce it.
From: Theo Schlossnagle [mailto:jesus at omniti.com]
Sent: Wednesday, July 06, 2005 9:29 PM
To: Crystal, Mayer
Cc: Theo Schlossnagle; 'spread-users at lists.spread.org'
Subject: Re: [Spread-users] RE: Spread on Linux question
On Jul 6, 2005, at 8:33 PM, Crystal, Mayer wrote:
> OK, it took a little while to generate (still not sure what is the
> root cause yet), but I ran my setup under valgrind and received the
> following errors in the middle of the execution (the daemons are still
> running, but I have a feeling that this is not the desired behavior).
> Sorry for the long post, but I hope more information will be helpful.
> Has anyone seen anything like this? Is this intended, known and/or is
> there a patch if this is not intended?
clearly not intended. I've learned not to argue with valgrind -- you always
It appears that the members lists are used even after they are destroyed.
This might be fixed by carefully resetting the skiplist structures to a
post-init state after calling sl_destruct. The second error is sl_destruct
not cleaning up its endnodes.
Can you reproduce the error by using spmonitor to fake a partition and then
undoing that partition? It seems that you should be able to.
so, without trying to repeat your error, give this a whirl:
--- skiplist.c.old 2004-04-16 12:50:34.000000000 -0400
+++ skiplist.c 2005-07-06 21:25:20.000000000 -0400
@@ -552,6 +552,7 @@
m = p;
sl->top = sl->bottom = NULL;
+ sl->topend = sl->bottomend = NULL;
sl->height = 0;
sl->size = 0;
@@ -563,6 +564,7 @@
+ sl->index = NULL;
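
To illustrate the idea behind the patch above (reset every pointer to a post-init state once the nodes are freed), here is a minimal sketch. It is not Spread's skiplist.c: the demo_list type and the demo_* functions are hypothetical, and a plain singly linked list stands in for the skiplist. Only the field names top and bottom echo the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a skiplist. */
typedef struct demo_node {
    struct demo_node *next;
    int value;
} demo_node;

typedef struct demo_list {
    demo_node *top;     /* first node, NULL when empty */
    demo_node *bottom;  /* last node, NULL when empty */
    size_t size;
} demo_list;

/* Put the list into its post-init (empty) state. */
void demo_init(demo_list *l) {
    l->top = l->bottom = NULL;
    l->size = 0;
}

/* True when every field matches what demo_init() would set. */
int demo_is_post_init(const demo_list *l) {
    return l->top == NULL && l->bottom == NULL && l->size == 0;
}

/* Append a value; returns 0 on success, -1 if allocation fails. */
int demo_append(demo_list *l, int value) {
    demo_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;
    if (l->bottom != NULL)
        l->bottom->next = n;
    else
        l->top = n;
    l->bottom = n;
    l->size++;
    return 0;
}

/* Free every node, then reset ALL fields. Leaving even one pointer
 * behind would let later code walk freed memory, which is the kind
 * of access valgrind reports as an invalid read. */
void demo_destruct(demo_list *l) {
    demo_node *p = l->top;
    while (p != NULL) {
        demo_node *next = p->next;  /* save before freeing p */
        free(p);
        p = next;
    }
    demo_init(l);
    assert(demo_is_post_init(l));
}
```

With this shape, calling demo_destruct twice in a row, or appending to the list again after destroying it, operates on an empty list rather than on dangling pointers.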