[Spread-users] fix for spread-4.4.0 crash in mixed endian environment
martin.sc11111 at gmail.com
Wed Sep 10 08:52:45 EDT 2014
We have some SPARC and some Intel hosts running spread-4.1. Everything was
working fine with spread-4.1 in that mixed endian environment.
After upgrading from spread-4.1 to spread-4.4.0 we observed problems when
spread-4.4 daemons are started on both SPARC (Solaris) and Intel (RHEL).
Everything is working fine as long as the spread-4.4 daemons are started
*only* on SPARC hosts or *only* on Intel hosts.
We first start all spread daemons on Intel. When the first daemon of
the “other endian” architecture is started, problems begin. We found the
following alarm message in a spread logfile, and a spread daemon
crashed with a core dump:
2014-09-08 17:43:59 GMT Prot_handle_bcast: invalid packet with seq
-1062731160 from 21, processed bytes not equal data_len 16905 1395
The program code emitting that alarm is new since spread-4.3. It is only
relevant in mixed endian environments.
Obviously something is wrong with the length calculation. The length
reported in the alarm message (16905) is far from the expected value;
the same message reports a data_len of 1395.
We believe there is a bug in the pointer calculation of frag_ptr in
protocol.c. A proposed patch for that is below.
Can you confirm this?
We would be glad if that could be included in a future release of spread.
--- daemon/protocol.c.orig 2014-05-15 17:04:35.000000000 +0200
+++ daemon/protocol.c 2014-09-09 14:37:40.845240310 +0200
@@ -409,7 +409,7 @@
pack_ptr->seq, processed_bytes, pack_ptr->data_len );
- frag_ptr = (fragment_header *)
+ frag_ptr = (fragment_header *)
(((char*)pack_body_ptr) + processed_bytes);
processed_bytes += sizeof(fragment_header) +