<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {color:purple;
        text-decoration:underline;}
pre
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:Arial;
        color:windowtext;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
        {page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Hi all:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>I have noticed that after long running test spread 4.0
crashes with corrupted stack in the following spot:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Membership.c:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>static void Fill_form1(
sys_scatter *scat )<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>{<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>....<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
/* New ring_info will fit, so create it */<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
for( index = Last_discarded+1; index <= Highest_seq; index++ )<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
{<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
pack_entry = index & PACKET_MASK;<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
if( ! Packets[pack_entry].exist )<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
{<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'> *new_holes_procs_ptr
= index;<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'> Alarm(
MEMB , "INSERT HOLE 2 IS %d\n",index);
<<<<<<<<<<<<<<<<<<<<<<<<<
CRASH HERE<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'> new_holes_procs_ptr++;<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'> num_bytes
+= sizeof(int32);<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'> new_rg_info->num_holes++;<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
}<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>
}<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>}<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>The first stack trace I got was this:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<pre><font size=2 face="Courier New"><span lang=DE style='font-size:10.0pt'>(gdb) where<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span lang=DE style='font-size:10.0pt'>#0 0xb7ebe9da in getenv () from /lib/libc.so.6<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#1 0xb7f10d19 in tzset_internal () from /lib/libc.so.6<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#2 0xb7f11a28 in __tz_convert () from /lib/libc.so.6<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#3 0xb7f0fca0 in localtime () from /lib/libc.so.6<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#4 0x08053498 in Alarm (mask=512, message=0x806c171 "INSERT HOLE 2 IS %d\n")<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> at alarm.c:146<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#5 0x0805864e in Fill_form1 (scat=Variable "scat" is not available.<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>) at membership.c:1837<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#6 0x00000f79 in ?? ()<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#7 0x00000f7a in ?? ()<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#8 0x00000f7b in ?? ()<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>#9 0x00000f7c in ?? ()<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>This obviously shows the stack was corrupted and since the only part of the code I could suspect of causing this corruption was the following buffer:<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> char rg_info_buf[sizeof(token_body)];<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>I added asserts to track exactly at what point we overrun the buffer and found the following:<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>....<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> rg_info_buf_end = rg_info_buf + sizeof(token_body);<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>....<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> /* New ring_info will fit, so create it */<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> for( index = Last_discarded+1; index <= Highest_seq; index++ )<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> {<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> pack_entry = index & PACKET_MASK;<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> if( ! Packets[pack_entry].exist )<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> {<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> assert((rg_info_buf + num_bytes + sizeof(int32)) <= rg_info_buf_end); <<<<< TRIGERRED ASSERT<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> *new_holes_procs_ptr = index;<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> Alarm( MEMB , "INSERT HOLE 2 IS %d\n",index);<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> new_holes_procs_ptr++;<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> num_bytes += sizeof(int32);<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> new_rg_info->num_holes++;<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> }<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'> }<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>So it looks like we overrun the rg_info_buf under some conditions and I am wondering if snybody has seen this problem or whether there have been patches issued for this issue? This was observed with 8 node cluster....<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'>Regards, Juan<o:p></o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre><pre><font
size=2 face="Courier New"><span style='font-size:10.0pt'><o:p> </o:p></span></font></pre>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
</FIELDSET>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
</div>
</body>
</html>