[Spread-users] Multipath Spread #2
Marc.Zyngier at evidian.com
Mon Oct 22 07:46:52 EDT 2001
>>>>> "JS" == Jonathan Stanton <jonathan at cnds.jhu.edu> writes:
JS> Have you tried this with multicast addresses and not broadcast?
JS> Multicast has some 'not always useful' behavior with multi-homed
JS> machines and I was curious if this approach worked there also. I
JS> think it should be fine once the routes are set right. (Some OSes
JS> only send multicasts out one interface in a multi-homed machine --
JS> a way around this is to bind the source socket to the interface to
JS> force a send out of that interface as well)
We haven't tested multicast yet. This will need at least one send
socket per network, bound to the right interface, instead of a global
send socket. This is the next (planned) step...
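A minimal sketch of the "one send socket per network" idea, assuming
POSIX sockets and IPv4. The helper name open_send_socket is
hypothetical and is not part of Spread. It only pins the source address
with bind(); many stacks also offer the IP_MULTICAST_IF socket option
to pick the outgoing interface for multicast, which is not shown here.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of Spread): create a UDP socket whose
 * source address is pinned to one local interface address.
 * Returns the descriptor, or -1 on error. */
int open_send_socket(const char *local_ip)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);           /* let the kernel pick a port */
    if (inet_pton(AF_INET, local_ip, &addr.sin_addr) != 1)
        return -1;                      /* not a dotted-quad address */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Binding to the interface address fixes the source of every send. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would keep one such descriptor per configured network and
choose the descriptor matching the network it wants to send on.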
JS> It turns out after a bit of poking at it that the cleanest way to
JS> fix this is to bind sockets that should receive broadcasts to the
JS> 'Broadcast' address of the interface as well as the unicast
JS> address. So for network with broadcast 10.0.1.255 and interface
JS> 10.0.1.5, we bind a socket to 10.0.1.5 and a second socket to
JS> 10.0.1.255. The second socket will receive broadcasts and the
JS> first will receive unicasts. It appears that the same approach
JS> works for multicasts based on my local tests here.
Nice idea. It doesn't work on Windows (on W2K, bind fails on a
non-local address), but we can work around this.
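A sketch of the two-socket receive setup described above, assuming
POSIX sockets and IPv4. The helper names are hypothetical, and the
fallback to INADDR_ANY when the broadcast bind is refused is only one
possible workaround, chosen here for illustration; it is not
necessarily what the patch will do.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: bind a UDP socket to the given address
 * (network byte order) and port. Returns the descriptor or -1. */
static int bind_udp(in_addr_t ip_net_order, unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    /* Let the unicast and broadcast sockets share the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = ip_net_order;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Open the socket meant to receive broadcasts. Tries the broadcast
 * address first; if the platform refuses that bind, falls back to
 * INADDR_ANY (an assumed workaround) and sets *used_fallback to 1. */
int open_broadcast_socket(const char *bcast_ip, unsigned short port,
                          int *used_fallback)
{
    struct in_addr a;
    int fd;

    *used_fallback = 0;
    if (inet_pton(AF_INET, bcast_ip, &a) != 1)
        return -1;

    fd = bind_udp(a.s_addr, port);
    if (fd >= 0)
        return fd;

    *used_fallback = 1;
    return bind_udp(htonl(INADDR_ANY), port);
}
```

With the example addresses from the quote, the unicast socket would be
bound to 10.0.1.5 and open_broadcast_socket would be given 10.0.1.255.
A socket that ends up on INADDR_ANY may also see other traffic for the
port, so the caller would have to filter on the destination.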
JS> By tightening up you mean this line from the protocol.c patch?
JS> @@ -466,7 +469,7 @@
JS> - if( Get_arq(Token->type) == Get_arq(Last_token->type) )
JS> + if( Get_arq(Token->type) != ((Get_arq(Last_token->type)+1)%0x100) )
JS> if( Get_retrans(Token->type) > Get_retrans(Last_token->type) )
JS> So you drop into the resend or swallow case if the new token is
JS> not the 'expected' one instead of if it is the same as the one we
JS> already received? Was the problem that tokens 'older' than the
JS> previous one were appearing?
Yes, exactly. If you have multiple networks, they can be very out of
sync (congestion, for example). In such a situation, you can easily
receive a token with ARQ=8 on an interface, and an old token with
ARQ=1 on another. This Is Bad(tm).
This stricter test, together with a wider ARQ field, gave us a system
that stays up under very high load instead of crashing after a few
thousand messages.
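The stricter test can be isolated as a small predicate. This is a
sketch only: the helper name arq_is_next is hypothetical, and the
8-bit counter width is an assumption taken from the % 0x100 in the
quoted patch line.

```c
/* Assumption: the ARQ counter wraps at 0x100, as in the quoted patch. */
#define ARQ_MODULUS 0x100

/* Hypothetical helper: returns 1 when new_arq is exactly the successor
 * of last_arq (with wraparound), 0 otherwise. Any token that fails
 * this check would go to the resend-or-swallow path. */
int arq_is_next(unsigned int last_arq, unsigned int new_arq)
{
    return new_arq == (last_arq + 1) % ARQ_MODULUS;
}
```

With the example above, a last ARQ of 8 followed by a stale token with
ARQ=1 fails the check, whereas the old equality test (1 == 8 is false)
would have let it through as a new token.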
JS> You added or changed a number of uses of IP macros to inet_ntoa. I
JS> think this is not the right direction for a few reasons. The
JS> reason we do not use inet_ntoa() in Alarms or other places is
JS> because it uses a static buffer to store the returned string, so
JS> if you use it multiple times in one alarm, then all of the printed
JS> values are the same and equal whoever was last. So even though
JS> your uses of it are only a single use per Alarm, it is easy to
JS> forget the problem and someone later adds another IP address to be
JS> printed and then you do not get correct values. So it always
JS> seemed too dangerous to use this way. I definitely think there
JS> probably is a better way than the IP1 macros, but I don't think
JS> inet_ntoa is the way.
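One way to sidestep the static-buffer problem is to format into
caller-supplied buffers with inet_ntop(). A minimal sketch, assuming a
POSIX platform that provides inet_ntop; the helper name format_ip is
hypothetical and not part of Spread.

```c
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: write the dotted-quad form of addr into the
 * caller's buffer and return that buffer. Because each call uses its
 * own buffer, several addresses can appear in one printf-style call. */
const char *format_ip(struct in_addr addr, char *buf, size_t len)
{
    if (inet_ntop(AF_INET, &addr, buf, (socklen_t)len) == NULL)
        snprintf(buf, len, "?");    /* buffer too small or bad input */
    return buf;
}
```

Two addresses in a single message would then each get a separate
buffer of INET_ADDRSTRLEN bytes, so neither overwrites the other.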
Fair enough. Part of this was because of the int/in_addr switch (see
JS> Why did you change the interface to DL_send to use in_addr instead
JS> of directly passing the address? This also requires more includes
JS> (data_link.h) and a more complex interface for apps using
JS> data_link (more than Spread uses it). I know that using in_addr is
JS> more direct sockets programming, but part of the idea of the DL
JS> layer was to expose a slightly simpler and abstracted interface
JS> that did not require using native sockets types.
If data_link was supposed to be transport agnostic, why use IP
addresses at this level? Or am I misunderstanding something?
At least having a specific type for an address would be nice
(net_addr_t, or anything like that). This would help a lot if someone
wants to switch to another transport (say IPv6 for example).
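To make the suggestion concrete, here is one possible shape for such a
type. Only the name net_addr_t comes from the text above; the layout
and the helper net_addr_from_string are assumptions for illustration,
not an existing data_link interface.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Assumed layout: a tagged union that callers treat as opaque. */
typedef struct {
    int family;                     /* AF_INET or AF_INET6 */
    union {
        struct in_addr  v4;
        struct in6_addr v6;
    } u;
} net_addr_t;

/* Hypothetical helper: parse a textual address into a net_addr_t.
 * Returns 0 on success, -1 if the text is neither IPv4 nor IPv6. */
int net_addr_from_string(const char *text, net_addr_t *out)
{
    memset(out, 0, sizeof(*out));
    if (inet_pton(AF_INET, text, &out->u.v4) == 1) {
        out->family = AF_INET;
        return 0;
    }
    if (inet_pton(AF_INET6, text, &out->u.v6) == 1) {
        out->family = AF_INET6;
        return 0;
    }
    return -1;
}
```

Code above data_link would pass net_addr_t values around without
looking inside, so adding a transport would only touch this type and
the functions that consume it.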
JS> Why did monitor need to use the mp send? Shouldn't sending the
JS> monitor commands on one network be sufficient, maybe with a choice
JS> of which one to use? Then if one fails you can pick the other, but
JS> you do not duplicate all of the monitor processing
JS> twice. (generating status messages twice, doing commands twice,
Yep, that would be nice. But the question still remains: how do you
detect that sending has failed?
JS> I hope this helps, and if you have any comments on the changes I
JS> applied to fix stuff in CVS, just tell me.
I'll try to make an updated patch tomorrow, and will keep you posted.
Thanks a lot for your comments.
Evidian - SafeKit Project
And don't forget you'll never get a dog to walk upright
Just 'cause you've got the power, that don't mean you've got the right.