[Spread-users] crash bug report
Jonathan Stanton
jonathan at cnds.jhu.edu
Wed Feb 11 17:30:17 EST 2004
I'm a bit concerned about this patch as it seems to be working around a
compiler bug by making the code less clear (char instead of int when the
value is an integer) -- and it isn't guaranteed to work if gcc changes how
the algorithm by which it removes memcpys. I'd be happier with a flag that
forced gcc not to remove the memcpys if there is no better solution.
I'll admit I do not have the C standard memorized. Does anyone else know
if this is valid compiler behavior on platforms where the memcpy is
required for alignment reasons?
On platforms without the alignment requirements, I'd be happy to hear if
gcc could optimize away unnecessary memcpys, but it seems like that
behavior on systems with alignment requirements is wrong as the memcpy is
necessary.
Thanks,
Jonathan
On Wed, Feb 11, 2004 at 03:07:20PM -0500, Mikhail Terekhov wrote:
> It still gives a bus error. Attached patch fixes it for me.
> The explanation is that gcc takes definition "int32 *num_vs_ptr"
> as a hint and eliminates call to memcpy.
>
> Regards,
> Mikhail
>
>
> Jonathan Stanton wrote:
>
> >I have fixed all of this type of bus-error causing bugs in the groups code
> >that I can find. The fixes have been committed to CVS. If you want to try
> >it out, since I don't really have any Sparcs or Alphas I can test on, I'd
> >appreciate it.
> >
> >If you want to wait for a release to test, I hope to package up a test
> >version of 3.17.2 soon, and that will include this fix. I just have a bit
> >more I want to do first.
> >
> >Cheers,
> >
> >Jonathan
> >
> >On Wed, Dec 24, 2003 at 12:34:04PM -0500, Mikhail Terekhov wrote:
> >
> >
> >>The same thing could happen on lines 866,1121,1314,1512,1829.
> >>Am I correct?
> >>
> >>Mikhail
> >>
> >>Theo Schlossnagle wrote:
> >>
> >>
> >>
> >>>On Tue, 2003-12-23 at 17:04, Greg Shebert wrote:
> >>>
> >>>
> >>>
> >>>
> >>>>line 797:
> >>>>
> >>>>num_vs_ptr = (int32 *)&Mess_buf[ num_bytes ];
> >>>>num_bytes += sizeof( int32 );
> >>>>*num_vs_ptr = 1;
> >>>>
> >>>>the last statement (line 799) can cause a bus error... basically, if
> >>>>num_bytes is not a multiple of the host systems word size then the last
> >>>>statement tries to write a word length value into a location that is not
> >>>>word aligned...
> >>>>
> >>>>the resulting bus error brings down the daemon :(
> >>>>
> >>>>i corrected this using a memcpy instead of the assignment and this seems
> >>>>to correct the problem...
> >>>>
> >>>>i experienced the problem on solaris2.8
> >>>>
> >>>>
> >>>>
> >>>>
> >>>You are not mislead. That will certainly cause a bus error on any
> >>>architecture that requires word aligned copies like that. It'll likely
> >>>bus error on alpha as well.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>_______________________________________________
> >>Spread-users mailing list
> >>Spread-users at lists.spread.org
> >>http://lists.spread.org/mailman/listinfo/spread-users
> >>
> >>
> >
> >
> >
> Copyright (C) 2003 Mikhail Terekhov
>
> Permission is hereby granted, free of charge, to any person obtaining
> a copy of this software and associated documentation files (the
> "Software"), to deal in the Software without restriction, including
> without limitation the rights to use, copy, modify, merge, publish,
> distribute, sublicense, and/or sell copies of the Software, and to
> permit persons to whom the Software is furnished to do so, subject to
> the following conditions:
>
> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> NONINFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR
> ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
> CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
> WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>
>
> Index: groups.c
> ===================================================================
> RCS file: /storage/cvsroot/spread/daemon/groups.c,v
> retrieving revision 1.13
> diff -C3 -r1.13 groups.c
> *** groups.c 8 Feb 2004 15:07:23 -0000 1.13
> --- groups.c 11 Feb 2004 19:42:44 -0000
> ***************
> *** 684,690 ****
> group *grp, *new_grp;
> member *mbr, *new_mbr;
> int needed;
> ! int32 *num_vs_ptr; /* num members in virtual-synchrony/failure-atomicity set */
> int num_bytes;
> char proc_name[MAX_PROC_NAME];
> char private_name[MAX_PRIVATE_NAME+1];
> --- 684,690 ----
> group *grp, *new_grp;
> member *mbr, *new_mbr;
> int needed;
> ! char *num_vs_ptr; /* num members in virtual-synchrony/failure-atomicity set */
> int num_bytes;
> char proc_name[MAX_PROC_NAME];
> char private_name[MAX_PRIVATE_NAME+1];
> ***************
> *** 824,830 ****
> head_ptr->type |= CAUSED_BY_JOIN ;
>
> /* notify all local members */
> ! num_vs_ptr = (int32 *)&Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> --- 824,830 ----
> head_ptr->type |= CAUSED_BY_JOIN ;
>
> /* notify all local members */
> ! num_vs_ptr = &Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> ***************
> *** 892,898 ****
> head_ptr = Message_get_message_header(joiner_msg);
> head_ptr->type |= CAUSED_BY_NETWORK ;
> /* build a self vs set */
> ! num_vs_ptr = (int32 *)&Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> --- 892,898 ----
> head_ptr = Message_get_message_header(joiner_msg);
> head_ptr->type |= CAUSED_BY_NETWORK ;
> /* build a self vs set */
> ! num_vs_ptr = &Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> ***************
> *** 1022,1028 ****
> proc p, p1;
> group *grp;
> member *mbr;
> ! int32 *num_vs_ptr; /* num members in vs set */
> char *vs_ptr; /* the virtual synchrony set */
> message_link *mess_link;
> message_header *head_ptr;
> --- 1022,1028 ----
> proc p, p1;
> group *grp;
> member *mbr;
> ! char *num_vs_ptr; /* num members in vs set */
> char *vs_ptr; /* the virtual synchrony set */
> message_link *mess_link;
> message_header *head_ptr;
> ***************
> *** 1147,1153 ****
> head_ptr->type |= CAUSED_BY_LEAVE ;
>
> /* notify all local members */
> ! num_vs_ptr = (int32 *)&Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> --- 1147,1153 ----
> head_ptr->type |= CAUSED_BY_LEAVE ;
>
> /* notify all local members */
> ! num_vs_ptr = &Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> ***************
> *** 1219,1225 ****
> proc p, p1;
> group *grp, *nextgroup;
> member *mbr;
> ! int32 *num_vs_ptr; /* num members in vs set */
> char *vs_ptr; /* the virtual synchrony set */
> message_link *mess_link;
> message_header *head_ptr;
> --- 1219,1225 ----
> proc p, p1;
> group *grp, *nextgroup;
> member *mbr;
> ! char *num_vs_ptr; /* num members in vs set */
> char *vs_ptr; /* the virtual synchrony set */
> message_link *mess_link;
> message_header *head_ptr;
> ***************
> *** 1341,1347 ****
>
> head_ptr->type |= CAUSED_BY_DISCONNECT ;
>
> ! num_vs_ptr = (int32 *)&Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> --- 1341,1347 ----
>
> head_ptr->type |= CAUSED_BY_DISCONNECT ;
>
> ! num_vs_ptr = &Mess_buf[ num_bytes ];
> num_bytes += sizeof( int32 );
> temp = 1;
> memcpy( num_vs_ptr, &temp, sizeof( int32 ) ); /* *num_vs_ptr = 1; */
> ***************
> *** 1484,1490 ****
> int changed;
> int ret;
> int vs_bytes;
> ! int32 *num_vs_ptr; /* num members in virtual-synchrony/failure-atomicity set */
> int32 num_vs;
> int num_exist;
> struct worklist *indices[MAX_PROCS_RING];
> --- 1484,1490 ----
> int changed;
> int ret;
> int vs_bytes;
> ! char *num_vs_ptr; /* num members in virtual-synchrony/failure-atomicity set */
> int32 num_vs;
> int num_exist;
> struct worklist *indices[MAX_PROCS_RING];
> ***************
> *** 1544,1550 ****
> group *this_group;
> /* prepare vs set */
> vs_bytes = 0;
> ! num_vs_ptr = (int32 *)&Temp_buf[0];
> vs_bytes+= sizeof( int32 );
> num_vs = 0;
>
> --- 1544,1550 ----
> group *this_group;
> /* prepare vs set */
> vs_bytes = 0;
> ! num_vs_ptr = &Temp_buf[0];
> vs_bytes+= sizeof( int32 );
> num_vs = 0;
>
> ***************
> *** 1859,1865 ****
>
> int num_bytes;
> message_header *head_ptr;
> ! int32 *num_vs_ptr; /* num members in virtual-synchrony/failure-atomicity set */
> struct skiplistnode *iter;
> member *mbr;
> char *membs_ptr;
> --- 1859,1865 ----
>
> int num_bytes;
> message_header *head_ptr;
> ! char *num_vs_ptr; /* num members in virtual-synchrony/failure-atomicity set */
> struct skiplistnode *iter;
> member *mbr;
> char *membs_ptr;
> ***************
> *** 1870,1876 ****
>
> head_ptr->type = head_ptr->type | caused;
>
> ! num_vs_ptr = (int32 *)&buf[num_bytes];
> num_bytes += sizeof( int32 );
> head_ptr->data_len += sizeof( int32 );
> num_vs = 0;
> --- 1870,1876 ----
>
> head_ptr->type = head_ptr->type | caused;
>
> ! num_vs_ptr = &buf[num_bytes];
> num_bytes += sizeof( int32 );
> head_ptr->data_len += sizeof( int32 );
> num_vs = 0;
> ***************
> *** 1925,1931 ****
> char *gid_ptr;
> member *mbr;
> struct skiplistnode *giter, *iter;
> ! int16 *num_memb_ptr;
> int16 num_memb;
> char *memb_ptr;
> int size_for_this_group;
> --- 1925,1931 ----
> char *gid_ptr;
> member *mbr;
> struct skiplistnode *giter, *iter;
> ! char *num_memb_ptr;
> int16 num_memb;
> char *memb_ptr;
> int size_for_this_group;
> ***************
> *** 1955,1961 ****
> num_bytes += sizeof( group_id );
> memcpy( gid_ptr, &grp->grp_id, sizeof(group_id) );
>
> ! num_memb_ptr = (int16 *)&buf[num_bytes];
> num_bytes += sizeof( int16 );
> num_memb = 0;
>
> --- 1955,1961 ----
> num_bytes += sizeof( group_id );
> memcpy( gid_ptr, &grp->grp_id, sizeof(group_id) );
>
> ! num_memb_ptr = &buf[num_bytes];
> num_bytes += sizeof( int16 );
> num_memb = 0;
>
> ***************
> *** 1992,1998 ****
> group *grp;
> char *gid_ptr;
> member *mbr;
> ! int16 *num_memb_ptr;
> int16 num_memb;
> int i;
>
> --- 1992,1998 ----
> group *grp;
> char *gid_ptr;
> member *mbr;
> ! char *num_memb_ptr;
> int16 num_memb;
> int i;
>
> ***************
> *** 2041,2047 ****
> num_bytes += sizeof( group_id );
> memcpy( &grp->grp_id, gid_ptr, sizeof(group_id) );
>
> ! num_memb_ptr = (int16 *)&Temp_buf[num_bytes];
> num_bytes += sizeof( int16 );
> memcpy( &num_memb, num_memb_ptr, sizeof( int16 ) );
>
> --- 2041,2047 ----
> num_bytes += sizeof( group_id );
> memcpy( &grp->grp_id, gid_ptr, sizeof(group_id) );
>
> ! num_memb_ptr = &Temp_buf[num_bytes];
> num_bytes += sizeof( int16 );
> memcpy( &num_memb, num_memb_ptr, sizeof( int16 ) );
>
--
-------------------------------------------------------
Jonathan R. Stanton jonathan at cs.jhu.edu
Dept. of Computer Science
Johns Hopkins University
-------------------------------------------------------
More information about the Spread-users
mailing list