[Spread-users] Optimizing inter-segment communication to reduce packets

Sat Feb 9 17:01:14 EST 2008

Jonathan Stanton wrote:
> I hope to have a more detailed discussion on Friday, as this 
> gets complicated, but John definitely covered some of the basics.
>
> It is possible to do some optimizations on the message dissemination,
> and we have tried out some ideas before, but it's hard to keep the 
> guaranteed services and performance while not sending some of the messages to
> everyone.
>
> A few comments below:
>
> On Sun, Feb 03, 2008 at 10:44:32AM -0800, Jim Kleckner wrote:
>   
>> I'm creating a new thread to hold discussion about the
>> optimization discussed in two other threads.  I would like to
>> explore how to make the smallest change that gets the most "bang
>> for the buck" and doesn't violate messaging guarantees.
>>
>> Here are some questions/ideas on the topic:
>>
>> - Would it simplify the problem if we limit the optimization to
>> omit messages only if they are "payload" messages and require
>> all membership messages to be transmitted globally?
>> In mainstream use, the payload messages ought to be significantly
>> more numerous than membership messages.
>>     
>
> Membership messages are generally small (and less frequent), but currently they use the
> same ordering service as other AGREED data messages so they appear
> like normal 'data' messages to the lower level send/routing code. This could be changed.
>
> The tricky part isn't really knowing when we can try to optimize away the content, but
> making sure that in failure cases we can get the message bodies when we need them to correctly
> deliver the messages with the appropriate guarantees.
>
>   
>> - Having a reliable and complete view of all memberships might
>> permit you to know definitively whether there are no members
>> of a particular group in a given segment and allow you to
>> elide sending non-membership messages to that segment.
>>     
>
> Spread currently has a 2-level model internally -- groups are implemented at the
> top level and the lower level message-sending and routing code doesn't know anything about them. To it 
> there is only one group -- that of currently connected daemons. 
>   

This seems to be the crux of the issue.
The upper level might be able to compute a "list of segments" that have 
members for each group and pass it down.
The lower level might then ignore that list, depending on the selected guarantee?
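
To make that concrete, here is a minimal Python sketch of what the upper
level might compute. Everything in it is hypothetical: the function names,
the membership and daemon-to-segment dictionaries, and the plain service
strings are made up for illustration and are not Spread's API or internals.
It assumes that only the weakest service is pruned and that everything else
still goes to every segment as it does today.

```python
def segments_for_groups(memberships, daemon_segment):
    """Map each group to the set of segments that host at least one member.

    memberships:    dict of group -> iterable of (member, daemon) pairs
    daemon_segment: dict of daemon -> segment id
    Both inputs are hypothetical stand-ins for state the upper level holds.
    """
    result = {}
    for group, members in memberships.items():
        segs = result.setdefault(group, set())
        for _member, daemon in members:
            segs.add(daemon_segment[daemon])
    return result


def target_segments(group, service, group_segments, all_segments):
    """Pick the segments a message should be sent to.

    Assumption: only "UNRELIABLE" traffic is pruned; any other service
    ignores the per-group list and is sent to every segment.
    """
    if service == "UNRELIABLE":
        return set(group_segments.get(group, set()))
    return set(all_segments)
```

With two daemons in one segment and a third in another, an unreliable
message to a group whose members all sit in the first segment would be sent
only there, while an AGREED message would still go everywhere.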

>> - Would it simplify the optimization if we limited it to classes
>> of message with lesser guarantees?  What level of guarantee
>> causes the optimization to get very difficult?
>>     
>
> Good question. This is possible. 
>   

Is the UNRELIABLE_MESS case simple enough to implement?
UNRELIABLE_MESS could handle some useful applications.

One could structure an application manually to organize groups with
different levels of guarantee and even send the same message twice,
one unreliable and the other reliable.
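
A rough Python sketch of the "send it twice" idea follows. The `send`
callable, the class names, and the service strings are all hypothetical
placeholders, not the Spread client API. The assumption is that the sender
tags each message with a sequence number so that a receiver subscribed to
both groups can drop whichever copy arrives second.

```python
import itertools


class DualSender:
    """Publishes each message twice: once unreliably, once reliably."""

    def __init__(self, send):
        # send(service, group, payload) is a hypothetical transport hook.
        self._send = send
        self._seq = itertools.count(1)

    def publish(self, fast_group, safe_group, body):
        seq = next(self._seq)
        payload = (seq, body)
        self._send("UNRELIABLE", fast_group, payload)
        self._send("RELIABLE", safe_group, payload)
        return seq


class DedupReceiver:
    """Delivers the first copy of each sequence number, drops the rest."""

    def __init__(self):
        # Grows without bound in this sketch; a real receiver would trim it.
        self._seen = set()

    def on_message(self, payload):
        seq, body = payload
        if seq in self._seen:
            return None
        self._seen.add(seq)
        return body
```

A receiver that only cares about low latency would join just the fast
group; one that needs every message would join both and rely on the
sequence numbers to discard duplicates.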

>> - Is there some strategic "weakening" of some type of guarantee
>> that would greatly simplify implementing the optimization?
>>     
>
> Good topic for a research paper or thesis :-)
>   

;-)

Another question is whether one could compose this at a higher
application level.
What barriers exist to a single program connecting to two Spread networks?

One could build an "active bridge" program that notices the memberships of
groups on both sides, computes the group-to-network/segment mapping,
and copies messages from one side to the other.  Each side works as today.
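
As a very rough Python sketch of the bridge logic: `FakeNetwork` and
`Bridge` are hypothetical in-memory stand-ins, not Spread connections, and
the sketch assumes the bridge is told which side each message came from.
It makes no attempt to preserve ordering or delivery guarantees across the
two sides.

```python
class FakeNetwork:
    """In-memory stand-in for one network; not the Spread API."""

    def __init__(self, name):
        self.name = name
        self.members = {}    # group -> set of local member names
        self.delivered = []  # (group, payload) pairs injected by the bridge

    def join(self, group, member):
        self.members.setdefault(group, set()).add(member)

    def leave(self, group, member):
        self.members.get(group, set()).discard(member)

    def inject(self, group, payload):
        self.delivered.append((group, payload))


class Bridge:
    """Copies a message across only if the far side has members."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def forward(self, origin, group, payload):
        # Called once per message observed on `origin`. A real bridge would
        # also have to recognize its own injected copies to avoid loops.
        other = self.right if origin is self.left else self.left
        if other.members.get(group):
            other.inject(group, payload)
            return True
        return False
```

Messages for a group with no members on the far side are simply not
copied, which is where the packet savings would come from.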

This doesn't address guarantees at all, though perhaps the server logic for
accepting a message as delivered could be coupled to the remote side having
accepted it.  This would probably be inefficient at best
and get hairy for the more stringent guarantees.

Just some thoughts.

Jim