[Spread-users] crash bug report

Wed Feb 11 21:03:23 EST 2004

[Jonathan Stanton]
> I'm a bit concerned about this patch as it seems to be working around
> a compiler bug by making the code less clear (char instead of int
> when the value is an integer) -- and it isn't guaranteed to work if
> gcc changes the algorithm by which it removes memcpys. I'd be
> happier with a flag that forced gcc not to remove the memcpys if
> there is no better solution.
>
> I'll admit I do not have the C standard memorized. Does anyone else
> know if this is valid compiler behavior on platforms where the memcpy
> is required for alignment reasons?

I was a language lawyer in a previous life (15 years in optimizing compiler
development).  There are two issues here:

1. The effect of the assignment:

	int32		*num_vs_ptr; /* num members in vs set */
	num_vs_ptr = &Mess_buf[ num_bytes ];

is undefined by C unless the address is properly aligned for an int32.  So
in the original code, it's not really the memcpy that's the problem, it's
that the input to memcpy has no defined semantics (unless the address is
already int32-aligned).
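To make point 1 concrete, here is a minimal sketch of checking whether an
address is suitably aligned before treating it as an int32 pointer.  The
helper name is_int32_aligned is hypothetical (it is not part of Spread), I
use int32_t from <stdint.h> in place of Spread's int32 typedef, and the check
assumes the usual mapping from pointers to integers, which C leaves
implementation-defined.  It also needs a C11 compiler for _Alignof.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Spread: returns nonzero when p is
 * suitably aligned to be used as an int32_t pointer.  Relies on the
 * implementation-defined pointer-to-integer conversion behaving in the
 * usual flat-address way. */
static int is_int32_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(int32_t)) == 0;
}
```

If this returns zero for &Mess_buf[ num_bytes ], the int32 pointer assignment
in the original code has no defined meaning, whatever memcpy does later.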

2. In the rewritten code:

	char		*num_vs_ptr; /* num members in vs set */
	num_vs_ptr = &Mess_buf[ num_bytes ];
	num_bytes += sizeof( int32 );
	temp = 1;
	memcpy(num_vs_ptr, &temp, sizeof(int32));

the assignment to num_vs_ptr is defined, and I believe gcc's memcpy is in
error if it doesn't work as intended.  The C standard defines memcpy in
terms of copying "characters", one at a time, and imposes no alignment
requirements.  When I wrote highly optimized memcpy implementations in
previous lives implementing C, it was universally accepted among C compiler
writers that you could not optimize memcpy at the expense of violating
platform alignment restrictions.

So you lose either way <wink>.  Here are some gcc developers debating the
same thing:

    http://gcc.gnu.org/ml/gcc-bugs/2000-03/msg00155.html

It *sounds* to me like the main developer is determined to be unreasonable
on this point, but I don't really know.

The practical thing to do is to reserve use of memcpy for purely
known-to-be-aligned cases, and write your own memcpy-workalike one-liner for
other cases.  You don't want to turn off the builtin memcpy if you ever use
it to move large chunks of (properly aligned) memory, because a good
compiler *can* do that much faster than one-byte-at-a-time.  Unaligned block
transfers are much harder to speed up, and the overhead of trying to do so
costs more than it saves unless "a lot" of bytes are getting moved -- so
it's usually no loss to run your own byte-at-a-time function in such cases.
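
A memcpy-workalike of the kind I mean could look roughly like the sketch
below.  The name byte_copy is hypothetical, and this is only an illustration,
not a patch against Spread.  It touches memory strictly through unsigned char
pointers, so it makes no assumption about the alignment of either argument.
Like memcpy, it is not meant for overlapping regions.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Spread: copies n bytes from src to dst
 * one byte at a time.  Both pointers are accessed only as unsigned char,
 * so neither needs any particular alignment.  Returns dst, like memcpy. */
static void *byte_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}
```

In the rewritten code above, the call would then become
byte_copy(num_vs_ptr, &temp, sizeof(int32)).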

If Spread doesn't memcpy large chunks of memory, screw it -- disable gcc's
inlined replacement and sleep well.