[Spread-users] Spread daemon crashing
Jonathan Stanton
jonathan at cnds.jhu.edu
Thu Feb 14 02:02:46 EST 2008
The first change (to MAX_SCATTER_ELEMENTS) is one of the known ways of supporting large
messages and has been used in the past by a number of people, so I wouldn't expect it to
cause problems.
Increasing the size of private names is also something people have done in the past, but
I've had fewer reports of success with it.
The actual crash data now makes sense: you are on a 64-bit machine using large scatters,
so a 'SCATTER' is now 8 bytes (1 long) plus 100,000 * 16 bytes (pointer and long), or
1,600,008 bytes, which is exactly what the block_len field says.
So this isn't corruption of the memory header.
However, bytes_allocated should always be a multiple of the basic block_len size, as
memory is allocated and freed as entire structures. Why it is less than the size of one
scatter is unclear. I did a quick read through the code and I don't see anywhere it can be
modified except by a multiple of the structure size. So this field could be corrupted in some way.
I'm a bit curious: are you running 3 separate daemons on this one large 8-core box? Or do
you mean you have 3 of these big boxes, each with one Spread daemon?
Cheers,
Jonathan
On Wed, Feb 13, 2008 at 10:53:37PM +0100, Witold Kręcicki wrote:
> Using spread daemon with :
> -#define MAX_SCATTER_ELEMENTS 100
> +#define MAX_SCATTER_ELEMENTS 100000
>
> -#define MAX_PRIVATE_NAME 10 /* largest possible size of private_name field of SP_connect() */
> +#define MAX_PRIVATE_NAME 32 /* largest possible size of private_name field of SP_connect() */
> these changes, strange things occur under heavy load:
>
> <quote>
>
> Mem[obj_type].bytes_allocated : 1097144 mem_header_ptr(object)->block_len
> 1600008 sizeof(mem_header) 16 obj_type 20
> spread: memory.c:612: dispose: Assertion `0' failed.
> </quote>
> This assertion has been changed for debugging purposes, originally it looks
> like:
>
> assert(Mem[obj_type].bytes_allocated >= mem_header_ptr(object)->block_len +
> sizeof(mem_header));
>
> Machine is dual quad-core Opteron, 16GB RAM. "Heavy load" is 3 spread daemons,
> 10 groups, in each 4-5 members, ~500 msgs/sec total
>
> --
> Witold Kręcicki
>
> o2.pl Spółka z o.o., ul. Jutrzenki 177, 02-231 Warszawa,
> KRS 0000140518, Sąd Rejonowy dla m.st. Warszawy, Kapitał zakładowy 308.250,00
> zł., NIP 521-31-11-513
>
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
--
-------------------------------------------------------
Jonathan R. Stanton jonathan at cs.jhu.edu
Dept. of Computer Science
Johns Hopkins University
-------------------------------------------------------