[Spread-users] Concurrency issues with spread 4

John Schultz jschultz at spreadconcepts.com
Wed Sep 22 12:48:57 EDT 2010


If steps 2 and 3 are performed by the same thread, then the receiver thread typically (depending on the message's service type) should first get the join message for the group (assuming membership tracking is enabled on the connection) and then the subsequent message to that group.

If steps 2 and 3 are performed by different threads that are not coordinated to guarantee serial execution, then there is a race condition between the two threads accessing the server through the client library, and the series of events you observe would certainly be possible.
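To illustrate what I mean by coordinating the two threads, here is a minimal sketch in Python that uses only the standard library and does not talk to Spread at all. The functions join_group and send_message are hypothetical stand-ins for the real join and send calls, and the events list merely records the order in which requests would be issued. The sender waits on an event that the joining thread sets after the join has been issued, so the send can never be issued first.

```python
import threading

events = []                      # order in which requests are issued (simulated)
events_lock = threading.Lock()
joined = threading.Event()       # set once the join request has been issued


def join_group(group):
    """Hypothetical stand-in for the real join call."""
    with events_lock:
        events.append(("join", group))


def send_message(group, payload):
    """Hypothetical stand-in for the real send call."""
    with events_lock:
        events.append(("send", group, payload))


def joiner():
    join_group("demo")
    joined.set()                 # announce that the join has been issued


def sender():
    joined.wait()                # never send before the join was issued
    send_message("demo", "hello")


def run_once():
    events.clear()
    joined.clear()
    # Start the sender first on purpose: it still has to wait for the join.
    threads = [threading.Thread(target=sender), threading.Thread(target=joiner)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(events)
```

This only fixes the order in which the two requests are issued through the client library. What the receiver then observes still depends on the service type of the message, as noted above.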

So, my first question is which of the above scenarios correctly describes your threading model and usage?

My second question is: which message service type are you using to send the subsequent message (e.g., AGREED_MESS, UNRELIABLE_MESS, FIFO_MESS, etc.)?

My third question is: what series of events do other, previously established members of the group observe?  Do they see the join followed by the message, or the message followed by the join?

To your question about closing the file descriptor while another thread is blocked in a read: that behavior is OS-dependent.  I have personally observed behavior similar to what you describe on certain OSes.  The safest way to disconnect such a receiver thread is to send an application message to the private group of the connection indicating that you want the reader to stop running (and to call SP_close, clean up, etc.).  The receiver thread will get this message, process it, and realize it should stop blocking on receives, etc.
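The stop-message idea can be sketched as follows. This is a Python simulation using only the standard library, not Spread code: a queue.Queue stands in for the connection's mailbox, the STOP marker is a hypothetical application-level message, and setting the closed event stands in for the receiver thread closing its own connection and cleaning up.

```python
import queue
import threading

STOP = "__stop__"   # hypothetical application-level stop marker


def receiver(mailbox, handled, closed):
    while True:
        msg = mailbox.get()      # blocks, like a blocking receive
        if msg == STOP:
            break                # the receiver decides to stop on its own
        handled.append(msg)      # normal message processing
    closed.set()                 # stand-in for closing the connection and cleanup


def demo():
    mailbox = queue.Queue()
    handled = []
    closed = threading.Event()
    t = threading.Thread(target=receiver, args=(mailbox, handled, closed))
    t.start()
    mailbox.put("a")
    mailbox.put("b")
    mailbox.put(STOP)            # another thread asks the receiver to stop
    t.join(timeout=5)
    return handled, closed.is_set(), t.is_alive()
```

The point of the design is that only the receiver thread ever tears down the connection it is reading from, so no thread closes a descriptor out from under a blocked read.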


John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200

On Sep 22, 2010, at 11:49 AM, Johannes Wienke wrote:

Dear all,

I'm currently struggling with some issues probably related to
concurrency and would be glad to get some insight.

We have one concurrently operating thread that constantly receives
messages on an mbox and processes them. The memberships and lifecycle
of the mbox are controlled from threads other than the receiver.

Now I noticed a problem with the following sequence:

1. [receiver thread] blocks in SP_receive
2. [other thread] joins mbox of receiving thread to a group
3. [other thread] directly after the join call sends a message to this group

Now the receiving thread does not receive the message sent by the other
thread. If I add a short sleep between 2. and 3., the receiver thread
gets the message.

So, is there no guarantee that a concurrently operating receiver thread
receives every message directly sent after a join call from another
thread? Or what is the reason for this behavior?

A second problem I encountered is how to interrupt the blocking call to
SP_receive. Currently we're trying to interrupt SP_receive by
disconnecting the mbox from another thread. Unfortunately this does not
work in some cases and SP_receive endlessly hangs on the mbox. So again,
is there no guarantee that SP_receive exits with an error if the
underlying mbox is closed?

Thanks in advance for your help,

