[Spread-users] Fw: [Wackamole-users] some answers and other questions...

Sumeet Pannu sumeetp at hotmail.com
Thu Jan 30 05:33:33 EST 2003


----- Original Message -----
From: "Sumeet Pannu" <sumeetp at hotmail.com>
To: <wackamole-users at lists.backhand.org>;
<spread-users-request at lists.spread.org>
Sent: Thursday, January 30, 2003 1:33 AM
Subject: Re: [Wackamole-users] some answers and other questions...


> well, this is a pretty quiet town. I've found out further that it is in
> fact wackamole dying. Running wackamole -d on both hosts gets me an
> interface up and running. Then doing a wackatrl -f drops the interface on
> the first host and on the second gives me
> wackamole: wackamole.c:672: Send_state_message: Assertion 'ret ==
> My.num_allocated' failed."
> Aborted (core dumped)
>
> I already have a VIP of x.x.x.x/32 as mentioned in some FreeBSD fix, so
> that isn't the problem. Some other workarounds were mentioned but I'm not
> sure what the person is trying to say. Any suggestions would be appreciated.
>
> So i guess wackamole is pretty broken in two respects: obviously the above
> thing shouldn't happen, and secondly, when wackamole on machine A realizes
> that the wackamole process on machine B has died, the correct behaviour
> would be for it to snatch the ip right back, even though this may cause a
> core dump on the original host as well. On top of this, spread still
> experiences a core dump when i try to start it as a second interface. Maybe
> these problems are related.
> Thanks, sumeet.
>  Sample testlog.out:
> handle_events: select with timeout (1, 999948)
> E_handle_events: next event
> E_handle_events: exec time event
> new: reusing pointer 0x813ee58 to object type 35 named time_event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> DL_send: sent a message of 28 bytes to (192.168.0.2,4804) on channel 5
> Prot_token_hurry: retransmiting token 9 1
> dispose: disposing pointer 0x813ee80 to object type 35 named time_event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999891)
> E_handle_events: exec handler for fd 4, fd_type 0, priority 1
> DL_recv: received 28 bytes on channel 4
> Received Token
> new: reusing pointer 0x813ee80 to object type 35 named time_event
> dispose: disposing pointer 0x813ee58 to object type 35 named time_event
> E_queue: dequeued a (first) simillar event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> new: reusing pointer 0x813ee58 to object type 35 named time_event
> dispose: disposing pointer 0x81851b8 to object type 35 named time_event
> E_queue: dequeued a simillar event
> E_queue: (last) event queued func 0x8054a04 code 0 data 0x0 in future (5:0)
> dispose: disposing pointer 0x813edd8 to object type 8 named token_head_obj
> new: reusing pointer 0x813edd8 to object type 8 named token_head_obj
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999946)
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (0, 587)
> E_handle_events: next event
> E_handle_events: exec time event
> new: reusing pointer 0x81851b8 to object type 35 named time_event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> DL_send: sent a message of 28 bytes to (192.168.0.2,4804) on channel 5
> Prot_token_hurry: retransmiting token 10 1
> dispose: disposing pointer 0x813ee80 to object type 35 named time_event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999893)
> E_handle_events: exec handler for fd 4, fd_type 0, priority 1
> DL_recv: received 28 bytes on channel 4
> Received Token
> new: reusing pointer 0x813ee80 to object type 35 named time_event
> dispose: disposing pointer 0x81851b8 to object type 35 named time_event
> E_queue: dequeued a (first) simillar event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> new: reusing pointer 0x81851b8 to object type 35 named time_event
> dispose: disposing pointer 0x813ee58 to object type 35 named time_event
> E_queue: dequeued a simillar event
> E_queue: (last) event queued func 0x8054a04 code 0 data 0x0 in future (5:0)
> dispose: disposing pointer 0x813ee00 to object type 8 named token_head_obj
> new: reusing pointer 0x813ee00 to object type 8 named token_head_obj
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999948)
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (0, 41)
> E_handle_events: next event
> E_handle_events: exec time event
> new: reusing pointer 0x813ee58 to object type 35 named time_event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> DL_send: sent a message of 28 bytes to (192.168.0.2,4804) on channel 5
> Prot_token_hurry: retransmiting token 11 1
> dispose: disposing pointer 0x813ee80 to object type 35 named time_event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999881)
> E_handle_events: exec handler for fd 4, fd_type 0, priority 1
> DL_recv: received 28 bytes on channel 4
> Received Token
> new: reusing pointer 0x813ee80 to object type 35 named time_event
> dispose: disposing pointer 0x813ee58 to object type 35 named time_event
> E_queue: dequeued a (first) simillar event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> new: reusing pointer 0x813ee58 to object type 35 named time_event
> dispose: disposing pointer 0x81851b8 to object type 35 named time_event
> E_queue: dequeued a simillar event
> E_queue: (last) event queued func 0x8054a04 code 0 data 0x0 in future (5:0)
> dispose: disposing pointer 0x813edd8 to object type 8 named token_head_obj
> new: reusing pointer 0x813edd8 to object type 8 named token_head_obj
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999947)
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (0, 325)
> E_handle_events: next event
> E_handle_events: exec time event
> new: reusing pointer 0x81851b8 to object type 35 named time_event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> DL_send: sent a message of 28 bytes to (192.168.0.2,4804) on channel 5
> Prot_token_hurry: retransmiting token 12 1
> dispose: disposing pointer 0x813ee80 to object type 35 named time_event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999896)
> E_handle_events: exec handler for fd 4, fd_type 0, priority 1
> DL_recv: received 28 bytes on channel 4
> Received Token
> new: reusing pointer 0x813ee80 to object type 35 named time_event
> dispose: disposing pointer 0x81851b8 to object type 35 named time_event
> E_queue: dequeued a (first) simillar event
> E_queue: (first) event queued func 0x804b498 code 0 data 0x0 in future (2:0)
> new: reusing pointer 0x81851b8 to object type 35 named time_event
> dispose: disposing pointer 0x813ee58 to object type 35 named time_event
> E_queue: dequeued a simillar event
> E_queue: (last) event queued func 0x8054a04 code 0 data 0x0 in future (5:0)
> dispose: disposing pointer 0x813ee00 to object type 8 named token_head_obj
> new: reusing pointer 0x813ee00 to object type 8 named token_head_obj
> E_handle_events: next event
> E_handle_events: poll select
> E_handle_events: select with timeout (1, 999947)
> E_handle_events: next event
> ----- Original Message -----
> From: "Sumeet" <sumeetp at hotmail.com>
> To: <wackamole-users at lists.backhand.org>
> Sent: Tuesday, January 28, 2003 4:12 PM
> Subject: Re: [Wackamole-users] some answers and other questions...
>
>
> > thanks for the response.
> > >fewer Spread daemons than Wackamole hosts
> > Does this imply that i have started another wackamole session on the
> > same machine w/o killing another one? I can buy that.
> > However if i have only one interface i still get an error. I don't know
> > if that has anything to do with the -11 and -6 spread errors.
> > I guess i can live with that as long as spread/wackamole works in a
> > predictable manner.
> > thanks again for your help.
> > ps. should i be cross posting this stuff?
> >
> > wackamole.conf
> > Spread = 4803
> > SpreadRetryInterval = 5s
> > Group = wack1
> > Control = /var/run/wack.it
> > Prefer None
> > VirtualInterfaces {
> >         { eth2:10.1.1.2/32 }
> > }
> > Arp-Cache = 90s
> > Notify {
> >         eth0:10.1.1.6/32
> >         eth0:10.1.1.4/32
> >         eth0:10.1.1.5.2/32
> >         eth2:192.168.0.0/24 throttle 128
> >         arp-cache
> > }
> > balance {
> >         AcquisitionsPerRound = all
> >         interval = 4s
> > }
> > mature = 5s
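
One Notify entry in the config above, eth0:10.1.1.5.2/32, has five octets, which may be a typo. A minimal sketch of a standalone sanity check for "interface:address/prefix" entries follows. It uses Python's standard ipaddress module and does not reproduce wackamole's own parser, which may accept or reject entries differently. The names NOTIFY_ENTRIES, check_entry and invalid_entries are made up for this illustration.

```python
import ipaddress

# Notify entries copied from the wackamole.conf quoted above
# (the "throttle 128" suffix is left out of the last one).
NOTIFY_ENTRIES = [
    "eth0:10.1.1.6/32",
    "eth0:10.1.1.4/32",
    "eth0:10.1.1.5.2/32",
    "eth2:192.168.0.0/24",
]


def check_entry(entry):
    """Return None if 'iface:address/prefix' parses, else an error string."""
    iface, sep, cidr = entry.partition(":")
    if not sep or not iface:
        return "missing interface name"
    try:
        # strict=False allows host bits to be set, e.g. 10.1.1.2/24.
        ipaddress.ip_network(cidr, strict=False)
    except ValueError as exc:
        return str(exc)
    return None


def invalid_entries(entries):
    """Map each entry that fails to parse to its error message."""
    problems = {}
    for entry in entries:
        error = check_entry(entry)
        if error is not None:
            problems[entry] = error
    return problems


if __name__ == "__main__":
    for entry, error in invalid_entries(NOTIFY_ENTRIES).items():
        print(f"{entry}: {error}")
```

Whether this entry has anything to do with the core dump is not established anywhere in the thread; the check only flags addresses that are not well-formed.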
> >
> > spread.conf:
> > Spread_Segment  192.168.0.255:4803 {
> >
> >         spokanea        192.168.0.1
> >         spokaneb        192.168.0.2
> > }
> > DaemonUser = spread
> > DaemonGroup = spread
> >
> >
> >
> >
> > ----- Original Message -----
> > From: "Ryan Caudy" <caudy at jhu.edu>
> > To: <wackamole-users at lists.backhand.org>
> > Sent: Tuesday, January 28, 2003 1:34 PM
> > Subject: Re: [Wackamole-users] some answers and other questions...
> >
> >
> > > I guess you've found out some of this stuff by yourself already.
> > >
> > > First, the change between 1.2.0 and 2.0.0 for configuration files
> > > basically consists of a more intuitive, flexible syntax.  "vip" on the
> > > old configuration was for a preferred address, and "of" was for the
> > > virtual interfaces to be managed.  In 2.0.0, you have "Prefer" and
> > > "VirtualInterfaces" declarations, and you can explicitly prefer None,
> > > rather than the dummy address 0.0.0.0.
> > >
> > > If the virtual interface lists between two daemons are not the same, I
> > > don't think you can expect things to work correctly, as this was
> > > originally an assumption in the system.  In your system, it sounds
> > > like you want the main webserver to prefer the virtual address being
> > > managed, the backup to Prefer None, and both to have the same list of
> > > virtual addresses (i.e. eth0:10.1.1.2/32).
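
The requirement that every daemon carries the same virtual interface list can be checked before starting wackamole. What follows is a rough sketch, assuming the 2.0.0 "VirtualInterfaces { ... }" syntax shown in this thread. The brace scan and the regular expression are a simplification, not wackamole's real grammar, and virtual_interfaces, mismatch, CONF_A and CONF_B are names invented here. The two sample texts are the eth0 and eth2 variants posted in the thread, used only as inputs.

```python
import re

# Matches entries such as "eth0:10.1.1.2/32" (simplified on purpose).
ENTRY = re.compile(r"[\w.]+:[0-9.]+/\d+")

CONF_A = """
Prefer None
VirtualInterfaces {
        { eth0:10.1.1.2/32 }
}
Arp-Cache = 90s
"""

CONF_B = """
Prefer None
VirtualInterfaces {
        { eth2:10.1.1.2/32 }
}
Arp-Cache = 90s
"""


def virtual_interfaces(conf_text):
    """Return the set of entries inside the VirtualInterfaces block."""
    start = conf_text.find("VirtualInterfaces")
    if start < 0:
        return set()
    open_brace = conf_text.find("{", start)
    if open_brace < 0:
        return set()
    depth = 0
    # Walk forward to the brace that closes the block, honouring nesting.
    for pos in range(open_brace, len(conf_text)):
        ch = conf_text[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return set(ENTRY.findall(conf_text[open_brace:pos]))
    return set()  # unbalanced braces


def mismatch(conf_a, conf_b):
    """Entries present in one configuration but not in the other."""
    return virtual_interfaces(conf_a) ^ virtual_interfaces(conf_b)


if __name__ == "__main__":
    print(sorted(mismatch(CONF_A, CONF_B)))
```

An empty result means the two lists agree. The Prefer line is deliberately ignored, since the advice above is that only the preference should differ between the main server and the backup.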
> > >
> > > Connect failed and illegal session are complaints from the Spread
> > > library routines used to connect to Spread and join the appropriate
> > > group.  Could you send your current wackamole.conf and spread.conf
> > > files?
> > >
> > > The -6 return code translates to REJECT_NOT_UNIQUE.  This may happen
> > > if the Wackamole daemon died without properly disconnecting from
> > > Spread... after a short interval Spread should accept the same private
> > > group name again, however.  The other possible cause for this is trying
> > > to use fewer Spread daemons than Wackamole hosts.
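
To make the numeric codes in the syslog output easier to read, a small sketch like the one below can pull them out and attach names. Only the two codes that appear in this thread are listed: -6 is named above, and the name ILLEGAL_SESSION for -11 is an assumption based on the "Illegal session was supplied" message, so check both against the sp.h that ships with your Spread release. The function names and SAMPLE_LOG are made up for this illustration.

```python
import re

# Only the two codes seen in this thread; extend from your own sp.h.
SPREAD_ERRORS = {
    -6: "REJECT_NOT_UNIQUE",
    -11: "ILLEGAL_SESSION",  # assumed name, verify in sp.h
}

# The two message shapes that appear in the wackamole syslog lines.
CODE_PATTERN = re.compile(r"connect failed \[(-?\d+)\]|SP_error: \((-?\d+)\)")

SAMPLE_LOG = """\
Jan 27 23:57:26 spokane wackamole[1147]: Spread connect failed [-6].
Jan 27 23:57:29 spokane wackamole[1147]: SP_error: (-11) Illegal session was supplied
Jan 27 23:57:31 spokane wackamole[1147]: Spread connect failed [-6].
"""


def spread_codes(log_text):
    """Return the Spread return codes found in the log, in order."""
    codes = []
    for match in CODE_PATTERN.finditer(log_text):
        codes.append(int(match.group(1) or match.group(2)))
    return codes


def describe(code):
    """Name a code if it is in the table, otherwise say so."""
    return SPREAD_ERRORS.get(code, "unknown (check sp.h)")


if __name__ == "__main__":
    for code in spread_codes(SAMPLE_LOG):
        print(code, describe(code))
```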
> > >
> > > --Ryan
> > >
> > > Sumeet wrote:
> > > > So, i guess to answer some of my own questions, yes, i should have
> > > > the same wackamole.conf for each box with the VIP the same for both.
> > > > The Changelog reference in 1.2.0 was referring to having more real
> > > > machines than VIPs by setting VIP to 0.0.0.0 on some of them. Must
> > > > have been changed in 2.0.0. Oh, well. The core dump is still
> > > > predictably happening. I am working around it as mentioned below. So,
> > > > my other question is that on one of the machines the failover is not
> > > > working too well. It keeps getting these messages in /var/log/messages.
> > > > I assume the connect failed and Illegal session are bad. I'm going to
> > > > be a total jerk and cross post since it is awfully quiet around here:
> > > >
> > > > Jan 27 23:57:24 spokane wackamole[1147]: No such interface
> > > > Jan 27 23:57:26 spokane wackamole[1147]: connecting to 4803
> > > > Jan 27 23:57:26 spokane wackamole[1147]: Dequeued arp spoof notifier.
> > > > Jan 27 23:57:26 spokane wackamole[1147]: No such interface
> > > > Jan 27 23:57:26 spokane wackamole[1147]: Spread connect failed [-6].
> > > > Jan 27 23:57:29 spokane wackamole[1147]: SP_error: (-11) Illegal session was supplied
> > > > Jan 27 23:57:29 spokane wackamole[1147]: connecting to 4803
> > > > Jan 27 23:57:29 spokane wackamole[1147]: Dequeued arp spoof notifier.
> > > > Jan 27 23:57:29 spokane wackamole[1147]: No such interface
> > > > Jan 27 23:57:31 spokane wackamole[1147]: connecting to 4803
> > > > Jan 27 23:57:31 spokane wackamole[1147]: Dequeued arp spoof notifier.
> > > > Jan 27 23:57:31 spokane wackamole[1147]: No such interface
> > > > Jan 27 23:57:31 spokane wackamole[1147]: Spread connect failed [-6].
> > > >
> > > >     ----- Original Message -----
> > > >     *From:* Sumeet Pannu <mailto:sumeetp at hotmail.com>
> > > >     *To:* wackamole-users at lists.backhand.org
> > > >     <mailto:wackamole-users at lists.backhand.org>
> > > >     *Sent:* Sunday, January 26, 2003 1:43 AM
> > > >     *Subject:* [Wackamole-users] wackamole core dump
> > > >
> > > >     i would like to set up wackamole as a simple failover mechanism for
> > > >     a stateless web server running linux 2.2.16.
> > > >     the web servers have IP addresses 192.168.0.1 and 192.168.0.2
> > > >     respectively
> > > >     i set up the following wackamole.conf on each:
> > > >
> > > >     Spread = 4803
> > > >     SpreadRetryInterval = 5s
> > > >     Group = wack1
> > > >     Control = /var/run/wack.it
> > > >     Prefer None
> > > >     VirtualInterfaces {
> > > >             { eth0:10.1.1.2/32 }
> > > >     }
> > > >     Arp-Cache = 90s
> > > >     Notify {
> > > >             eth0:10.1.1.5/32
> > > >             eth0:10.1.1.4/32
> > > >             eth0:10.1.1.6/32
> > > >             eth0:192.168.0.0/24 throttle 128
> > > >             arp-cache
> > > >     }
> > > >     balance {
> > > >              AcquisitionsPerRound = all
> > > >              interval = 4s
> > > >     }
> > > >     mature = 5s
> > > >
> > > >     My first question is -- would that suffice for a failover
> > > >     scenario, or should the .2 backup web server have a virtual
> > > >     interface of eth0:0.0.0.0/32 (as per a change log i read)?
> > > >     This of course is all theoretical since my real problem seems to
> > > >     be that wackamole seg faults when i try to start it. If i create a
> > > >     secondary interface (eth2 for eg) and change references to eth2
> > > >     for the virtual interface it does not core dump. I can attach the
> > > >     core if you are interested, but it is 7.6megs...
> > > >     spread seems to run fine (according to spmonitor, and spuser
> > > >     passes messages back and forth), although i do start it with a
> > > >     spread -n hostname.
> > > >     thx for your time.
> > > >
> > >
> > >
> > >
> > > _______________________________________________
> > > wackamole-users mailing list
> > > wackamole-users at lists.backhand.org
> > > http://lists.backhand.org/mailman/listinfo/wackamole-users
> > >



