[Spread-users] Spread daemon crashing

Jonathan Stanton jonathan at cnds.jhu.edu
Thu Feb 14 02:02:46 EST 2008


The first change (to MAX_SCATTER_ELEMENTS) is one of the known ways of supporting large 
messages and has been used in the past by a number of people, so I wouldn't expect it to 
cause problems. 

Increasing the size of private names is also something people have used in the past, but 
I've had fewer reports of success. 

The actual crash data now makes sense: you are on a 64-bit machine using large scatters, 
so a 'SCATTER' is now 8 bytes (1 long) plus 100,000 * 16 bytes (pointer and long), or 
1,600,008 bytes -- exactly what the block_len field says. 
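
To make that arithmetic easy to check on your own box, here is a minimal sketch. The 
type names (example_scat_element, example_scatter) and the helper functions are 
hypothetical stand-ins that only mirror the layout described above (one long, then an 
array of pointer-plus-long pairs); they are not copied from the Spread headers, so treat 
the layout as an assumption.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the daemon's scatter types. They follow the
 * layout described in the message: one long, then an array of
 * (pointer, long) pairs. They are not the real Spread declarations. */
#define EXAMPLE_MAX_SCATTER_ELEMENTS 100000

typedef struct {
    char *buf;   /* pointer to the data of one element */
    long  len;   /* length of that element */
} example_scat_element;

typedef struct {
    long                 num_elements;
    example_scat_element elements[EXAMPLE_MAX_SCATTER_ELEMENTS];
} example_scatter;

/* The arithmetic from the message: header_bytes + n * element_bytes. */
static size_t scatter_bytes(size_t header_bytes, size_t element_bytes, size_t n)
{
    return header_bytes + n * element_bytes;
}

/* The size the compiler actually assigns on the build platform. */
static size_t example_scatter_sizeof(void)
{
    return sizeof(example_scatter);
}
```

With 8-byte longs and pointers, scatter_bytes(8, 16, 100000) gives 1,600,008, and 
example_scatter_sizeof() should agree with it on such a machine. On a 32-bit build the 
same structure would be roughly half that size.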

So this isn't corruption of the memory header. 

However, bytes_allocated should always be a multiple of the basic block_len size, as 
memory is allocated and freed as entire structures. Why it is less than 1 scatter is 
unclear. I did a quick read through the code and I don't see anywhere it can be modified 
except in multiples of the structure size. So this field could be corrupted in some way. 
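
A small sketch of the consistency check I have in mind, in case it helps with debugging. 
The function name accounting_is_plausible is hypothetical, and I am assuming that the 
per-object accounting unit is block_len plus sizeof(mem_header), which is what the 
assertion quoted below suggests; this is not code from memory.c.

```c
#include <stddef.h>

/* Returns 1 if bytes_allocated could be the result of allocating and
 * freeing only whole objects of this type, 0 otherwise.
 * Assumption: every live object accounts for block_len + header_size bytes. */
static int accounting_is_plausible(size_t bytes_allocated,
                                   size_t block_len,
                                   size_t header_size)
{
    size_t unit = block_len + header_size;

    if (unit == 0)
        return 0;               /* nothing sensible to check against */
    if (bytes_allocated < unit)
        return 0;               /* the condition the original assert guards */
    return bytes_allocated % unit == 0;
}
```

For the values in the report, accounting_is_plausible(1097144, 1600008, 16) returns 0: 
the counter is smaller than a single object and is not a whole number of objects either.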

I'm a bit curious, are you running 3 separate daemons on this one large 8 core box? Or do 
you mean you have 3 of these big boxes each with one Spread daemon?

Cheers,

Jonathan

On Wed, Feb 13, 2008 at 10:53:37PM +0100, Witold Kręcicki wrote:
> Using spread daemon with :
> -#define        MAX_SCATTER_ELEMENTS    100
> +#define        MAX_SCATTER_ELEMENTS    100000
> 
> -#define         MAX_PRIVATE_NAME        10 /* largest possible size of private_name field of SP_connect() */
> +#define         MAX_PRIVATE_NAME        32 /* largest possible size of private_name field of SP_connect() */
> these changes, strange things occur under heavy load:
> 
> <quote>
> 
> Mem[obj_type].bytes_allocated : 1097144    mem_header_ptr(object)->block_len   
> 1600008       sizeof(mem_header) 16      obj_type 20
> spread: memory.c:612: dispose: Assertion `0' failed.
> </quote>
> This assertion has been changed for debugging purposes, originally it looks 
> like:
> 
> assert(Mem[obj_type].bytes_allocated >= mem_header_ptr(object)->block_len + 
> sizeof(mem_header));
> 
> Machine is dual quad-core Opteron, 16GB RAM. "Heavy load" is 3 spread daemons, 
> 10 groups, in each 4-5 members, ~500 msgs/sec total
> 
> -- 
> Witold Kręcicki
> 
> o2.pl Spółka z o.o., ul. Jutrzenki 177, 02-231 Warszawa,
> KRS 0000140518, Sąd Rejonowy dla m.st. Warszawy, Kapitał zakładowy 308.250,00 
> zł., NIP 521-31-11-513
> 
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------



