[Spread-users] Retransmission and Connection Problems

Jonathan Stanton jonathan at cnds.jhu.edu
Fri Oct 3 01:35:56 EDT 2003


This is only a quick reply now as it's 1:30am and I'm really crashing. 

I have seen a number of issues with multi-homed machines. Sometimes they 
work just fine, sometimes they work only once they are configured 
correctly (either in the Spread configuration or in the networking setup), 
and sometimes they have problems because of OS issues (usually Linux -- 
hopefully FreeBSD doesn't have the same problems). 

My experience has been that multicast with multi-homed machines can be 
tricky, because often only one interface is assigned to route multicast 
packets. If it isn't the right one (meaning the one the Spread 
configuration is using), then nothing works.
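
To illustrate the interface issue: a program that joins a multicast group 
can name the local interface address to join on, and if it leaves that 
unspecified the OS picks an interface on its own. The following is only a 
minimal Python sketch of an explicit join, not Spread's code. The function 
names are made up for illustration, and the addresses in the comments are 
the ones from the quoted message below.

```python
import socket
import struct


def build_mreq(group, iface_addr):
    """Pack an ip_mreq: the multicast group, then the local interface address."""
    return struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(iface_addr))


def join_group(sock, group, iface_addr):
    """Join `group` on the interface that owns `iface_addr`.

    The kernel should then send its IGMP membership report on that
    interface rather than on whichever one it would pick by default.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, iface_addr))


# Hypothetical usage on the machine that owns 10.1.4.1:
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   join_group(s, "225.0.1.1", "10.1.4.1")
```

Sniffing the intended interface while something like this runs is one way 
to see whether the IGMP join goes out where you expect it to.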

Broadcast usually works in these cases though, because the broadcast 
address is specific to one network, so it gets routed out the right 
interface. 
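
As a quick cross-check of that point, the broadcast address follows 
directly from an interface's address and netmask. This is a small Python 
sketch (the helper name is hypothetical) that takes the netmask in the hex 
form 'ifconfig' prints on FreeBSD, using the values from the quoted 
message below:

```python
import ipaddress


def broadcast_for(addr, hex_netmask):
    """Return the directed broadcast address for `addr` under `hex_netmask`."""
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{addr}/{mask}", strict=False)
    return network.broadcast_address


# Addresses taken from the ifconfig output quoted below.
for host in ("10.1.4.1", "10.1.4.2", "10.1.255.253"):
    print(host, "->", broadcast_for(host, "0xffff0000"))
```

All three addresses fall in 10.1.0.0/16, so each should report the same 
broadcast address as the 'ifconfig' output does.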

The D/C syntax in the Spread network configuration was designed to deal 
with multi-homed machines, so it could help specify the bindings more 
precisely, but if the machines are all on the same network it should not 
be necessary.

I'd run 'netstat -tan' and 'netstat -rn' while the Spread daemons are up. 
Check which interfaces they are bound to and listening on, check the 
routing table, and make sure it all matches up. If you can post the 
output I'll take a look tomorrow and see if I notice anything.

The 'spsend' and 'sprecv' programs that are part of the Spread source (but 
not built by default -- run 'make testprog' in a ./configured tree) send 
the same kind of unicast, broadcast, and multicast UDP packets that Spread 
itself does, so they can verify whether the networking is working without 
running all of Spread.
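
If building the test programs is inconvenient, the same idea can be 
approximated with a few lines of Python. This is only a rough sketch under 
stated assumptions: it is not what 'spsend' or 'sprecv' actually do, the 
function names are hypothetical, and it covers unicast and broadcast only 
(a multicast receiver would also have to join the group).

```python
import socket


def open_receiver(bind_addr, port, timeout=2.0):
    """Open a UDP socket bound to (bind_addr, port); port 0 picks a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_addr, port))
    sock.settimeout(timeout)
    return sock


def send_probe(dest_addr, port, payload=b"probe", broadcast=False):
    """Send one UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if broadcast:
            # Sending to a broadcast address requires SO_BROADCAST.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        return sock.sendto(payload, (dest_addr, port))


def wait_for_probe(sock):
    """Return (payload, sender_ip), or None if nothing arrives before the timeout."""
    try:
        data, sender = sock.recvfrom(2048)
    except socket.timeout:
        return None
    return data, sender[0]
```

One possible use: run open_receiver("", port) followed by wait_for_probe() 
on two of the machines, then call send_probe("10.1.255.255", port, 
broadcast=True) on the third and see which receivers get the datagram. The 
port is whatever free UDP port you choose.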

Hope this helps some,

Jonathan


On Thu, Oct 02, 2003 at 04:35:26PM -0700, Jeremy McDermond wrote:
> 
> On Thursday, October 2, 2003, at 03:46 PM, Yair Amir wrote:
> 
> >Hi Jeremy,
> >
> >It seems that your broadcast address does not work correctly on all of
> >the machines. This is why I am sure it works with any two machines and
> >will not work for you with three or more machines.
> >
> 
> I've looked to make sure the switch doesn't do any broadcast 
> limitations, and I've checked each of the three machines to make sure 
> the broadcast address is set correctly on the appropriate ethernet 
> interface.  FreeBSD should not be doing broadcast limiting either.  
> Interestingly when I change the ring address to 225.0.1.1 for 
> multicast, none of it seems to get through at all.  I've sniffed the 
> wire from one of them, and there's no IGMP requests sent out the 
> interface to join multicast groups.  If I use the mtest utility, I see 
> the IGMP traffic getting sent to start up group traffic.  These 
> machines are also multi-homed.  There are 3 ethernet interfaces on each 
> one, and the 10.1.0.0/16 network is the one that should be handling the 
> spread traffic.  Is this something that I need to use the D/C syntax to 
> make multicast work correctly?  It should know the interface to enable 
> by the address already on the config line, right?
> 
> a.www.peak.org [ /usr/local/etc ] # ifconfig bge0
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
>         inet 10.1.4.1 netmask 0xffff0000 broadcast 10.1.255.255
>         inet6 fe80::206:5bff:feef:713c%bge0 prefixlen 64 scopeid 0x1
>         ether 00:06:5b:ef:71:3c
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> 
> b.www.peak.org [ ~/spread-src-3.17.1 ] # ifconfig bge0
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
>         inet 10.1.4.2 netmask 0xffff0000 broadcast 10.1.255.255
>         inet6 fe80::206:5bff:feef:7168%bge0 prefixlen 64 scopeid 0x1
>         ether 00:06:5b:ef:71:68
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> 
> a.monitor.peak.org [ /usr/local/etc ] # ifconfig vlan20
> vlan20: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1496
>         inet 10.1.255.253 netmask 0xffff0000 broadcast 10.1.255.255
>         ether 00:06:5b:ec:5c:95
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
>         vlan: 20 parent interface: em0
> 
> Our switch could be flakey -- but I'm seeing no errors on the ethernet 
> interfaces themselves.
> 
> >You could use a multicast address instead of the 10.1.255.255 (which
> >does not work for you) - such as 225.10.1.4. That would probably
> >work for you. Otherwise, make sure all of the machines are set
> >correctly with the broadcast address.
> >
> 
> Like I say, with multicast it seems like it reverts completely to 
> unicast to try to make the ring work.
> 
> >There were several similar cases that were discussed on the mailing
> >list in the past.
> >
> >     :) Yair.
> 
> Thanks so much for your reply Yair -- I really appreciate the 
> assistance.
> 
> --
> Jeremy C. McDermond                    mcdermj at peak.org
> Lead Engineer
> Peak Internet, LLC                     (541) 738-4921
> 
> 

-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------



