[Spread-users] RPMs for spread/wackamole/etc and wackamole issues

Jordan Mendler jmendler at ucla.edu
Thu Sep 13 18:25:35 EDT 2007


Hi all,

I want to preface by letting everyone know that I have built some RPMs for
spread, wackamole, mod_log_spread and spreadlogd. They are available from
http://biopackages.net/ for CentOS4 and some Fedora distributions. They will
eventually be built for every CentOS and Fedora release when we expand our
repository. We also have some updated RPMs in our testing repository that
should be pushed out in a few days. The SRPMs are available as well if you
would like to build your own.


Now onto the problem...

In configuring Wackamole, I have been having some issues, so I am hoping
that someone will be able to help me get this working. My setup is CentOS4
x86_64 (Linux 2.6.x), if that matters. Wackamole and spread build and install
fine, and spread works fine. Wackamole starts fine and seems to think it
works, but in reality it does not, so I am not sure whether this is an
issue with Wackamole's interaction with the OS or something else. I have
been following Theo's "Scalable Internet Architectures" book in an attempt
to get wackamole set up. This is in a testing environment (10.x.x.x IPs);
we also tried it with real IPs and had the same issue.

What happens:
-Start spread on both systems; everything works fine. It is configured to
use 10.67.183.121 and .122, respectively, which are both set up on eth1 and
independent of the wackamole IPs. The same thing happens if we set up spread
on the same IP as wackamole.
-Start wackamole on both systems and:
   (1) eth0, which is configured with 10.67.183.116 on one and 10.67.183.117
on the other, is taken down by wackamole such that an 'ifconfig' only shows
eth1 up
   (2) wackatrl -l appears to be working properly, showing the following on
each system:
   Owner: 10.67.183.116
           *    eth0:10.67.183.116/32
   Owner: 10.67.183.117
           *    eth0:10.67.183.117/32

Despite #2, neither machine brings up .116 or .117. There is obviously
something going on, because from another machine I can still ping/ssh into
.116 and .117, which may be a result of ARP caching.

At this point if I kill spread on one of the 2 machines (say the one with
.117), wackatrl -l shows what appears to be correct:
Owner: 10.67.183.116
        *    eth0:10.67.183.116/32
Owner: 10.67.183.116
        *    eth0:10.67.183.117/32

Despite what wackatrl thinks, I am now able to ping/ssh into only one of the
IPs, and the IP of the machine that was taken down is not brought up on the
other machine. The whole thing seems to be acting weird.
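
To compare what wackatrl reports with what is actually configured on a node, something like the following could help. It is only a minimal sketch: it assumes the `wackatrl -l` output always has the "Owner:" / "*" layout pasted above, and the list of locally configured addresses is hard-coded here rather than read from ifconfig. The function names are made up for illustration.

```python
import re

# Matches an owner line such as "Owner: 10.67.183.116".
OWNER_RE = re.compile(r"^\s*Owner:\s*(\S+)\s*$")
# Matches a VIP line such as "*    eth0:10.67.183.117/32".
VIP_RE = re.compile(r"^\s*\*\s+(\w+):(\d+\.\d+\.\d+\.\d+)/(\d+)\s*$")


def parse_wackatrl(text):
    """Return {vip: owner} parsed from `wackatrl -l` style output."""
    owners = {}
    current = None
    for line in text.splitlines():
        m = OWNER_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = VIP_RE.match(line)
        if m and current is not None:
            owners[m.group(2)] = current
    return owners


def missing_vips(owners, node, configured):
    """VIPs that wackatrl assigns to `node` but that are not in `configured`."""
    claimed = {vip for vip, owner in owners.items() if owner == node}
    return sorted(claimed - set(configured))


# Output copied from the case where spread was killed on the .117 machine.
SAMPLE = """\
Owner: 10.67.183.116
        *    eth0:10.67.183.116/32
Owner: 10.67.183.116
        *    eth0:10.67.183.117/32
"""

if __name__ == "__main__":
    owners = parse_wackatrl(SAMPLE)
    # Assumption: only the spread address on eth1 is actually configured.
    print(missing_vips(owners, "10.67.183.116", ["10.67.183.121"]))
```

Any address it prints is one that wackatrl believes the node owns but that is not actually up on that node.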


The only indication I can find is in /var/log/messages, which shows the
following on both machines. I am not sure if this is a case of the 2.6
kernel not being supported (hopefully not, because I would really like to
get wackamole working):
Sep 13 07:44:50 JMM1 wackamole[26151]: connecting to 4803
Sep 13 07:44:50 JMM1 wackamole: wackamole startup succeeded
Sep 13 07:44:50 JMM1 wackamole[26151]: DOWN: eth0:10.67.183.116/255.255.255.0
Sep 13 07:44:50 JMM1 wackamole[26151]: 953 No such interface
Sep 13 07:45:02 JMM1 wackamole[26151]: 911 No such interface
Sep 13 07:45:02 JMM1 wackamole[26151]: Re-queued arp spoof notifier for virtual entry.
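
For pulling just these errors out of a larger /var/log/messages, a rough sketch along these lines may be useful. It assumes the syslog layout shown above ("wackamole[PID]: message"); the helper name is hypothetical, and the sample lines are copied from this post.

```python
import re

# Captures the PID and message from lines tagged "wackamole[PID]:".
LOG_RE = re.compile(r"wackamole\[(\d+)\]:\s*(.*)$")


def wackamole_errors(lines, needle="No such interface"):
    """Return (pid, message) pairs for wackamole lines containing `needle`."""
    hits = []
    for line in lines:
        m = LOG_RE.search(line)
        if m and needle in m.group(2):
            hits.append((int(m.group(1)), m.group(2).strip()))
    return hits


SAMPLE_LOG = """\
Sep 13 07:44:50 JMM1 wackamole[26151]: connecting to 4803
Sep 13 07:44:50 JMM1 wackamole: wackamole startup succeeded
Sep 13 07:44:50 JMM1 wackamole[26151]: DOWN: eth0:10.67.183.116/255.255.255.0
Sep 13 07:44:50 JMM1 wackamole[26151]: 953 No such interface
Sep 13 07:45:02 JMM1 wackamole[26151]: 911 No such interface
"""

if __name__ == "__main__":
    for pid, message in wackamole_errors(SAMPLE_LOG.splitlines()):
        print(pid, message)
```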

Also, when I try a wackamole.conf with 4 IPs, wackatrl shows:
   Owner: 10.67.183.116
           *    eth0:10.67.183.116/32
   Owner: 10.67.183.117
           *    eth0:10.67.183.117/32
   Owner: 10.67.183.124
           *    eth0:10.67.183.124/32
   Owner: 10.67.183.125
           *    eth0:10.67.183.125/32


My configuration looks as follows (it is the same on both machines), though
I have tried many other configurations, such as changing /32 to /24, trying
other IPs and so on:
[root@JMM2 etc]# cat /etc/wackamole.conf
Spread = 4803
SpreadRetryInterval = 5s
Group = wack1
Control = /var/run/wack.it

Prefer None

VirtualInterfaces {
        { eth0:10.67.183.116/32 }
        { eth0:10.67.183.117/32 }
}

Arp-Cache = 10s
mature = 5s

Notify {
        eth0:10.67.183.1/32
        arp-cache
}

Balance {
        AcquisitionsPerRound = all
        interval = 4s
}
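
Since the config above uses /32 while the log line reports 255.255.255.0, it may help to see which netmask each prefix length corresponds to. This is only a sketch using Python's standard ipaddress module; the `describe` helper is hypothetical, and it says nothing about how wackamole itself interprets the prefix.

```python
import ipaddress


def describe(spec):
    """Split 'eth0:10.67.183.116/32' into (interface, address, netmask)."""
    ifname, _, cidr = spec.partition(":")
    iface = ipaddress.ip_interface(cidr)
    return ifname, str(iface.ip), str(iface.netmask)


if __name__ == "__main__":
    # The two prefix lengths mentioned in this post.
    for spec in ("eth0:10.67.183.116/32", "eth0:10.67.183.116/24"):
        print(describe(spec))
```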

The important part of spread.conf is:
Spread_Segment  10.255.255.255:4803 {
        JMM1    10.67.183.121
        JMM2    10.67.183.122
}
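
As a sanity check on the segment address, the sketch below shows which broadcast address a given prefix length would imply for the spread IP. The netmask actually configured on eth1 is not stated in this post, so both prefix lengths here are assumptions, and `broadcast_for` is a made-up helper name.

```python
import ipaddress


def broadcast_for(address, prefix_len):
    """Return the broadcast address of the network containing `address`."""
    # strict=False allows a host address rather than a network address.
    net = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(net.broadcast_address)


if __name__ == "__main__":
    # Assumed prefix lengths; the real one depends on how eth1 is configured.
    for prefix_len in (8, 24):
        print(prefix_len, broadcast_for("10.67.183.121", prefix_len))
```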

Any assistance is greatly appreciated.

Thanks so much,
Jordan