<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

  <title></title>

</head>

<body text="#000000" bgcolor="#ffffff">

<br>

<br>

Tim Peters wrote:<br>

<blockquote type="cite"

 cite="midLNBBLJKPBEHFEDALKOLCEEEKIOAB.tim@zope.com">

  <pre wrap="">[Jonathan Stanton]

  </pre>

  <blockquote type="cite">

    <pre wrap="">I'm a bit concerned about this patch as it seems to be working around

a compiler bug by making the code less clear (char instead of int

when the value is an integer) -- and it isn't guaranteed to work if

gcc changes how the algorithm by which it removes memcpys. I'd be

happier with a flag that forced gcc not to remove the memcpys if

there is no better solution.

I'll admit I do not have the C standard memorized. Does anyone else

know if this is valid compiler behavior on platforms where the memcpy

is required for alignment reasons?

    </pre>

  </blockquote>

  <pre wrap=""><!---->

I was a language lawyer in a previous life (15 years in optimizing compiler

development).  There are two issues here:

1. The effect of the assignment:

        int32                *num_vs_ptr; /* num members in

        num_vs_ptr = &amp;Mess_buf[ num_bytes ];

is undefined by C unless the address is properly aligned for an int32.  So

  </pre>

</blockquote>

This is not exactly what was in the code before the patch. The code was:<br>

<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; int32&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; *num_vs_ptr;<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; int32u&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; temp;<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; ....<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp; num_vs_ptr = (int32 *)&amp;Mess_buf[ num_bytes ];<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; temp = 1;<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp; memcpy(num_vs_ptr, &amp;temp, sizeof(int32));<br>

<br>

In this case the assignment to num_vs_ptr is perfectly defined in C: it

is simply a pointer to some place inside the buffer Mess_buf. What is

undefined is a store through that pointer, such as *num_vs_ptr = 1,<br>

if the pointer is not properly aligned.<br>
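<br>
As a rough illustration of what "properly aligned" means here, the following sketch checks whether an address could safely be used for a direct int32 store.<br>
The helper name is_int32_aligned is hypothetical and not part of Spread, int32_t stands in for Spread's int32, and the sketch assumes a C11 compiler (for _Alignof) on a platform with a flat address space.<br>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Spread.
 * Returns 1 if p is suitably aligned for an int32_t, 0 otherwise.
 * Converting a pointer to uintptr_t is implementation-defined, but on
 * common flat-address platforms the result is the numeric address. */
int is_int32_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(int32_t)) == 0;
}
```

A store such as *num_vs_ptr = 1 is only safe when a check like this would succeed for &amp;Mess_buf[ num_bytes ]; for any other value of num_bytes the bytes have to be copied instead.<br>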

<br>

<blockquote type="cite"

 cite="midLNBBLJKPBEHFEDALKOLCEEEKIOAB.tim@zope.com">

  <pre wrap="">in the original code, it's not really the memcpy that's the problem, it's

that the input to memcpy has no defined semantics (unless the address is

already int32-aligned).</pre>

</blockquote>

The input to memcpy is also perfectly defined because according to

memcpy(3) the <br>

prototype is:<br>

<br>

&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;&nbsp; void *memcpy(void *dest, const void *src, size_t n);<br>

<br>

Both src and dest are _void_ pointers, i.e. pointers to any kind of object, and

memcpy copies _bytes_,<br>

not ints, doubles, etc.<br>

What happens here is that gcc is clever enough to take the fact that

both source and<br>

destination are pointers to four-byte variables, and that the

number of bytes to copy is<br>

also four, as a _hint_ from the programmer, and so it optimizes the call to

memcpy away.<br>
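<br>
To make the "hint" concrete, here is a minimal sketch of the two variants side by side.<br>
The function names put_typed and put_bytes are hypothetical, int32_t stands in for Spread's int32, and the comments describe what an optimizer may do, not what any particular gcc version is guaranteed to do.<br>

```c
#include <stdint.h>
#include <string.h>

/* Variant A (like the code before the patch): the destination is typed
 * as int32_t *. The pointer type suggests the address is int32-aligned,
 * so an optimizer may replace the memcpy with one aligned 4-byte store.
 * Only call this with a pointer that really is aligned. */
void put_typed(int32_t *dest, int32_t value)
{
    memcpy(dest, &value, sizeof value);
}

/* Variant B (like the patched code): the destination is a plain char *.
 * Nothing in the type promises alignment, so the copy has to work for
 * any byte address. */
void put_bytes(char *dest, int32_t value)
{
    memcpy(dest, &value, sizeof value);
}
```

The two bodies are identical; only the declared pointer type differs, and that type is the only alignment information the compiler gets.<br>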

<br>

<blockquote type="cite"

 cite="midLNBBLJKPBEHFEDALKOLCEEEKIOAB.tim@zope.com">

  <pre wrap="">

2. In the rewritten code:

        char                *num_vs_ptr; /* num members in vs set */

        num_vs_ptr = &amp;Mess_buf[ num_bytes ];

        num_bytes += sizeof( int32 );

      temp = 1;

      memcpy(num_vs_ptr, &amp;temp, sizeof(int32));

the assignment to num_vs_ptr is defined, and I believe gcc's memcpy is in

error if it doesn't work as intended.  The C standard defines memcpy in</pre>

</blockquote>

This works as expected. There is no problem with gcc in this case.<br>

<blockquote type="cite"

 cite="midLNBBLJKPBEHFEDALKOLCEEEKIOAB.tim@zope.com">

  <pre wrap="">

terms of copying "characters", one at a time, and imposes no alignment

requirements.  When I wrote highly optimized memcpy implementations in

previous lives implementing C, it was universally accepted among C compiler

writers that you could not optimize memcpy at the expense of violating

platform alignment restrictions.

So you lose either way &lt;wink&gt;.  Here are some gcc developers debating the

same thing:

    <a class="moz-txt-link-freetext" href="http://gcc.gnu.org/ml/gcc-bugs/2000-03/msg00155.html">http://gcc.gnu.org/ml/gcc-bugs/2000-03/msg00155.html</a>

  </pre>

</blockquote>

Exactly. The problem is not in memcpy but in the compiler optimization.<br>

The compiler trusts the programmer too much ;)<br>

<blockquote type="cite"

 cite="midLNBBLJKPBEHFEDALKOLCEEEKIOAB.tim@zope.com">

  <pre wrap="">

It *sounds* to me like the main developer is determined to be unreasonable

on this point, but I don't really know.

The practical thing to do is to reserve use of memcpy for purely

known-to-be-aligned cases, and write your own memcpy-workalike one-liner for

other cases.  You don't want to turn off the builtin memcpy if you ever use

it to move large chunks of (properly aligned) memory, because a good

compiler *can* do that much faster than one-byte-at-a-time.  Unaligned block

transfers are much harder to speed, and the overhead of trying to speed them

costs more than it saves unless "a lot" of bytes are getting moved -- so

it's usually no loss to run your own byte-at-a-time function in such cases.

If Spread doesn't memcpy large chunks of memory, screw it -- disable gcc's

inlined replacement and sleep well.

  </pre>

</blockquote>

There is no need to write a home-grown memcpy; just don't give the compiler

too many <br>

unfounded hints :)<br>
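<br>
For comparison, the byte-at-a-time memcpy workalike suggested in the quoted text might look roughly like this.<br>
The name copy_bytes is hypothetical; this is only a sketch of the idea, and a compiler is still free to recognise the loop and optimize it.<br>

```c
#include <stddef.h>

/* Hypothetical memcpy workalike: copies n bytes one at a time through
 * unsigned char pointers, so no alignment is assumed for either side.
 * Like memcpy, the source and destination must not overlap. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;
    }
    return dest;
}
```

With a char * destination the standard memcpy already behaves this way, which is why the patched code does not need such a helper.<br>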

<br>

</body>

</html>