[Spread-users] Concurrency issues with spread 4

John Schultz jschultz at spreadconcepts.com
Sun Sep 26 10:11:02 EDT 2010


I believe the problem still exists, and that it is caused by sending the subsequent message as RELIABLE_MESS. RELIABLE_MESS has no ordering guarantees with respect to other messages, so it can be delivered before or after the join membership message. Indeed, different members of the group may see the same message on different sides of the join: some before the membership change and others after it.

If you want to ensure that the message is ordered with respect to all membership changes, then you should use AGREED_MESS for the service type. Join/leave/disconnect memberships are essentially AGREED_MESS messages themselves.
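
To make the difference concrete, here is a toy model of the two ordering behaviours described above. It is only a sketch and does not use or reproduce Spread itself: the names `delivery_order`, `simulate`, `JOIN` and `MSG` are made up for illustration, and the random shuffle simply stands in for "no ordering guarantee relative to the membership event".

```python
import random

JOIN = "JOIN"  # hypothetical label for the join membership event
MSG = "MSG"    # hypothetical label for the message sent right after joining


def delivery_order(service, rng):
    """Order in which one receiver sees JOIN and MSG in this toy model."""
    if service == "AGREED":
        # Ordered with respect to membership changes: the send happened
        # after the join, so every receiver sees the join first.
        return [JOIN, MSG]
    if service == "RELIABLE":
        # Delivered, but with no ordering relative to the membership
        # event: either order is possible at each receiver.
        order = [JOIN, MSG]
        rng.shuffle(order)
        return order
    raise ValueError(f"unknown service type: {service}")


def simulate(service, receivers, seed):
    """Return {receiver: delivery order} for one simulated run."""
    rng = random.Random(seed)
    return {r: delivery_order(service, rng) for r in receivers}


if __name__ == "__main__":
    for service in ("RELIABLE", "AGREED"):
        print(service, simulate(service, ["a", "b", "c"], seed=0))
```

In the RELIABLE case, individual receivers in the same run can disagree about whether the message came before or after the join. That is the situation in which a message appears to be "lost" by a receiver that only starts processing once it has seen its own join.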

I hope that helps!

Cheers!

-----
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200

On Sep 26, 2010, at 7:51 AM, Johannes Wienke wrote:

On 09/22/2010 06:48 PM, John Schultz wrote:
> If steps 2+3 are performed by the same thread, then the receiver thread typically (depending on the msg's service type) should first get the join msg for the group (assuming membership tracking is enabled on the connection) and then the subsequent msg to that group.
> 
> If steps 2+3 are performed by different threads and not coordinated to surely execute in a serial manner, then there is a race condition between the two threads accessing the server through the client library and the series of events you observe would certainly be possible.
> 
> So, my first question is which of the above scenarios correctly describes your threading model and usage?

Joining and sending the message are performed by the same thread, so the first of the scenarios you described is the correct one.

> My second question is what is the msg service type that you are using for sending the subsequent msg (e.g. - AGREED_MESS, UNRELIABLE_MESS, FIFO_MESS, etc.)?

RELIABLE_MESS.

> My third question is what do other, previously established members of the group observe as the series of events?  Do they see the join followed by the msg or vice versa, or what?

I did not check this so far.

> To your question about closing the file descriptor while another thread is blocked in a read, the behavior of that is OS dependent.  I have personally observed similar behavior to what you describe on certain OSes.  The safest way to disconnect such a receiver thread would be to send an application msg to the private group of the connection indicating that you want the reader to stop running (and call SP_close, clean up, etc.).  The receiver thread will get this message, process it and realize it should stop blocking on receives, etc.

OK, this is exactly what I've done now, and for reasons I don't understand it also seems to have solved the first problem. I can no longer reproduce the lost messages, even though this was the only change I made.
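
For reference, the stop-message shutdown pattern discussed above can be sketched roughly as follows. This is a minimal stand-in and not Spread code: a `queue.Queue` plays the role of the connection's private group, `STOP` is a hypothetical sentinel for the application-level "please stop" message, and a real receiver would call SP_close and clean up where the comment indicates.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel standing in for the "stop" message


def receiver_loop(inbox, handled):
    """Block on receives until the stop message arrives, then return."""
    while True:
        msg = inbox.get()  # blocks, like a blocking receive call
        if msg is STOP:
            # A real receiver would call SP_close and clean up here.
            break
        handled.append(msg)


def run_demo():
    inbox = queue.Queue()  # stands in for the connection's private group
    handled = []
    reader = threading.Thread(target=receiver_loop, args=(inbox, handled))
    reader.start()
    inbox.put("hello")
    inbox.put("world")
    inbox.put(STOP)  # ask the reader to stop instead of closing its descriptor
    reader.join(timeout=5)
    return handled, reader.is_alive()


if __name__ == "__main__":
    print(run_demo())
```

The point of the design is that the reader thread exits on its own after an ordinary receive, so no other thread has to close a descriptor that is currently blocked in a read.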

Thanks for your help.
Johannes
