[Spread-users] BUFFER_TOO_SHORT && endian_mismatch >= 0

Tim Peters tim at zope.com
Thu Jul 4 00:24:53 EDT 2002


[Jonathan Stanton, on our Spread wrapper's

 		if (size >= 0) {
 			if (num_groups < 0) {
 				/* XXX This really happens!
 				   Don't dare retry the receive, since we
 				   didn't get an error.  The extra names
 				   are forever lost. */
 				num_groups = max_groups;
 			}
]

I was tempted to blow this off as a distraction because I didn't write this
part of the code, the fellow who did is on vacation and unreachable, this
app never sends to more than one group, we reserve space for 10 groups on
the receiving end by default, and we're not having a problem with this part
of the code anyway ... but <wink>:

> The only way I can see this happening is if the DROP_RECV flag is passed
> in the service type field. This flag enables the "old" behavior of
> Spread prior to version 3.12. This behavior is still useful sometimes
> when you do not want to retry and you just want to make progress (or the
> buffer space is limited).

Ouch.  We don't want DROP_RECV behavior in this app, but I do believe our
code is treating service_type as a pure output parameter.  It's declared as
an auto variable, like so:

	service svc_type;

and we never initialize it before passing its address to SP_receive.  So its
value on *entry* to SP_receive is whatever trash happened to be sitting on
the stack.  The SP_receive man page says:

     Service_type is a pointer to a variable of type 'service' which will
     be set to the message type of the message just received.  This
     will be either a REG_MESSAGE or MEMBERSHIP_MESS, and the specific
     type.

I'm not sure any of us understood that it's also an input parameter.  Our
code certainly isn't treating it like one.  I assume we should initialize it
to 0 instead (we never want DROP_RECV behavior in this app).
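
A minimal sketch of the in/out hazard, using stand-ins only: demo_receive,
DEMO_DROP_RECV and DEMO_REGULAR are hypothetical names invented for this
illustration, not Spread's API or its real constants.  The point is just that
a parameter read on entry must be initialized by the caller, even if the
callee also overwrites it on exit.

```c
#include <assert.h>
#include <stddef.h>

typedef int service;               /* stand-in for Spread's 'service' type */
#define DEMO_DROP_RECV 0x01000000  /* stand-in flag bit, chosen for this demo */
#define DEMO_REGULAR   0x00000002  /* stand-in "message type" written on exit */

/* Hypothetical stand-in for how a receive call might treat service_type:
   it inspects the caller's value on entry, then overwrites it on exit.
   Returns 1 if drop semantics were requested on entry, else 0. */
int demo_receive(service *svc_type)
{
    int drop_requested;

    assert(svc_type != NULL);
    drop_requested = (*svc_type & DEMO_DROP_RECV) != 0;  /* used as input  */
    *svc_type = DEMO_REGULAR;                            /* used as output */
    return drop_requested;
}

/* Caller-side fix: give the variable a known value before each call,
   so stack trash can never switch on the drop flag by accident. */
int demo_receive_without_drop(void)
{
    service svc_type = 0;   /* explicit: no flags requested */
    return demo_receive(&svc_type);
}
```

If the call sits in a loop, the reset to 0 has to happen before every call,
since the previous call's output is left in the variable.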

> if DROP_RECV is enabled, then a 'successful' SP_receive call can have a
> negative num_groups. This indicates that some of the groups were dropped
> but the message was received.

It looks like we can enable DROP_RECV purely by accident.  Can specifying
DROP_RECV in service_type, or any other flag (if there are any others -- not
clear), on entry make it possible for SP_receive to return BUFFER_TOO_SHORT
and set endian_mismatch >= 0 too?  That would explain our user's symptom if
so; else that part remains a mystery.

> GROUPS_TOO_SHORT should be returned whenever DROP_RECV is NOT set and
> the groups array is too small. If the buffers are also too small then
> GROUPS_TOO_SHORT will be set but endian_mismatch will also be negative
> so you know how big a buffer is needed.
>
> BUFFER_TOO_SHORT is only returned if the groups array IS big enough

The groups array is always big enough in this specific app.

> and DROP_RECV is not set. OR if DROP_RECV is set then it will be returned
> whenever the buffer is too short, even if groups is also short.

In that case (DROP_RECV specified unwittingly, and buffer is too short, and
BUFFER_TOO_SHORT returned), would endian_mismatch still contain the negation
of the buffer size needed had DROP_RECV not been specified?  Or might it
contain a value >= 0 then?
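
For reference, here is my reading of the return-code rules quoted above,
written as a small decision function.  It is a sketch of the description in
this thread, not of Spread's source; the enum labels and the function name
are made up for illustration and are not the numeric codes from sp.h.  It
says nothing about what endian_mismatch holds in the DROP_RECV case, which is
exactly the open question.

```c
#include <assert.h>

/* Illustrative outcome labels; NOT the numeric codes from sp.h. */
enum demo_outcome {
    DEMO_OK,                 /* message delivered, nothing dropped         */
    DEMO_OK_GROUPS_DROPPED,  /* "successful" call with negative num_groups */
    DEMO_GROUPS_TOO_SHORT,
    DEMO_BUFFER_TOO_SHORT
};

/* Each argument is a 0/1 flag:
   drop_recv    - DROP_RECV was set in service_type on entry
   groups_short - the groups array is too small for the message
   buffer_short - the message buffer is too small for the message */
enum demo_outcome demo_expected_outcome(int drop_recv, int groups_short,
                                        int buffer_short)
{
    assert(drop_recv == 0 || drop_recv == 1);

    if (drop_recv) {
        /* Old (pre-3.12) semantics: a short buffer wins, even if the
           groups array is short too. */
        if (buffer_short)
            return DEMO_BUFFER_TOO_SHORT;
        /* Otherwise the call succeeds; extra group names are dropped. */
        return groups_short ? DEMO_OK_GROUPS_DROPPED : DEMO_OK;
    }

    /* Default semantics: a short groups array is reported first,
       whether or not the buffer is also short. */
    if (groups_short)
        return DEMO_GROUPS_TOO_SHORT;
    return buffer_short ? DEMO_BUFFER_TOO_SHORT : DEMO_OK;
}
```

In our app groups_short is always 0, so the only way to reach
DEMO_BUFFER_TOO_SHORT is a short buffer, with or without the flag.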

> I agree this is a slight discrepancy -- it results because originally
> only BUFFER_TOO_SHORT existed, GROUPS_TOO_SHORT was added when the
> semantics were changed from DROP_RECV to NOT DROP_RECV by default.

That's OK, I wasn't complaining about inconsistency.  This piece of the
Spread API may be too complicated for human use, but that's why we wrote a
wrapper <wink>.




