[Spread-users] Need debugging advice
David Avraamides
David.Avraamides at SevernRiverCapital.com
Fri Oct 29 11:47:56 EDT 2004
Is there a win32 build of spmonitor? I don't see it in the 3.17.1
distribution.
-----Original Message-----
From: Jonathan Stanton [mailto:jonathan at cnds.jhu.edu]
Sent: Friday, October 29, 2004 11:10 AM
To: David Avraamides
Cc: spread-users at lists.spread.org
Subject: Re: [Spread-users] Need debugging advice
Hi,
I would first run the spmonitor program. Select the option to display
status information and select a few of your machines. That will report a
table of information about how the Spread daemons are functioning.
You should see the number of messages and packets that are being
sent. You can also check the "state" and "gstate" of the daemons. If
they are not 1 and 1 then the daemons are in a membership change. If
that lasts for longer than 10-20 seconds (under normal load) then that's
a problem. It will also report how many daemons are in the current
membership, and that should total the same as the number that you think
are running. If not, then they might have partitioned because of a
network problem.
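
As a rough illustration of that membership check, here is a small Python
sketch that compares the daemons listed in the quoted spread.conf against
the daemons listed under "Configuration at ct-dev-01" in the quoted
startup log. It assumes that block reflects the daemon's current
membership view; the function name missing_daemons is made up for this
example and is not part of Spread.

```python
# Daemons listed in the two Spread_Segment blocks of the quoted spread.conf.
configured = {
    "10.10.1.255:4803": ["ct-srvwebin-01", "ct-srvmon-01",
                         "ct-srvapp-06", "ct-devbuild-01"],
    "10.10.2.255:4803": ["ct-dev-01", "ct-dev-02", "ct-dev-04"],
}

# Daemons listed under "Configuration at ct-dev-01" in the quoted log.
# Assumption: this block is the daemon's current membership view.
current = {
    "10.10.1.255:4803": ["ct-srvwebin-01", "ct-srvmon-01",
                         "ct-srvapp-06", "ct-devbuild-01"],
    "10.10.2.255:4803": ["ct-dev-01"],
}


def missing_daemons(configured, current):
    """Return {segment: [names]} for configured daemons absent from the membership."""
    missing = {}
    for segment, names in configured.items():
        present = set(current.get(segment, []))
        absent = [name for name in names if name not in present]
        if absent:
            missing[segment] = absent
    return missing


if __name__ == "__main__":
    for segment, names in missing_daemons(configured, current).items():
        print(f"{segment}: not in current membership: {', '.join(names)}")
```

With the data from this thread, the sketch flags ct-dev-02 and ct-dev-04.
That only means they were not in the membership seen by ct-dev-01 at
startup. They may simply not be running.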
The other useful information is in the logs that Spread generates. If
you have selected to log to a file in the spread.conf, you can turn on
more DebugFlags in the spread.conf and see more detailed information.
For example, adding DATA_LINK will print a record for every packet sent
or received (which will be a lot of log records if the load is at all
high, but might show you an error that is occurring if you just run it
for a brief time).
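
A spread.conf fragment along these lines is one way to do that. This is a
sketch only: the log file name spread.log is a placeholder, and you should
check the comments in the sample spread.conf shipped with your version for
the exact list of flag names it supports.

```
# Keep the default PRINT and EXIT flags and add DATA_LINK temporarily.
DebugFlags = { PRINT EXIT DATA_LINK }

# Send log output to a file instead of the console (placeholder name).
EventLogFile = spread.log
```

Remove DATA_LINK again once you have captured a short sample, since it
logs every packet.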
Jonathan
On Fri, Oct 29, 2004 at 10:08:51AM -0400, David Avraamides wrote:
> I'm trying to diagnose a problem that just came up on our spread-based
> messaging layer. For months we have had applications running fine in
> production and yesterday I noticed some problems. It seems I can only
> see messaging traffic when the client and server are both running on
> the same box. This was never a problem before and I can't think of
> anything that changed (no new software, no new config file, etc.).
> I've written my own application-level "sniff" tool, but it's not
> helpful since it's not seeing any cross-machine traffic. I was
> wondering if there are any spread-level sniffing/debugging tools that
> could help me understand what might be wrong.
>
> Thanks,
> -Dave
>
> --
>
> The relevant part of the config file I use is:
>
> Spread_Segment 10.10.1.255:4803 {
> ct-srvwebin-01
> ct-srvmon-01
> ct-srvapp-06
> ct-devbuild-01
> }
>
> Spread_Segment 10.10.2.255:4803 {
> ct-dev-01
> ct-dev-02
> ct-dev-04
> }
>
> And here is the log when I start up a daemon:
>
> ip_init: using file: spread.access_ip
> Conf_init: using file: spread.conf
> Successfully configured Segment 0 [10.10.1.255:4803] with 4 procs:
> ct-srvwebin-01: 10.10.1.28
> ct-srvmon-01: 10.10.1.37
> ct-srvapp-06: 10.10.1.117
> ct-devbuild-01: 10.10.1.110
> Successfully configured Segment 1 [10.10.2.255:4803] with 3 procs:
> ct-dev-01: 10.10.2.20
> ct-dev-02: 10.10.2.41
> ct-dev-04: 10.10.2.50
> Finished configuration file.
> Conf_init: My name: ct-dev-01, id: 10.10.2.20, port: 4803
> Membership id is ( 168427804, 1099058653)
> --------------------
> Configuration at ct-dev-01 is:
> Num Segments 2
> 4 10.10.1.255 4803
> ct-srvwebin-01 10.10.1.28
> ct-srvmon-01 10.10.1.37
> ct-srvapp-06 10.10.1.117
> ct-devbuild-01 10.10.1.110
> 1 10.10.2.255 4803
> ct-dev-01 10.10.2.20
> ====================
> ++++++++++++++++++++++
> Num of groups: 3
> [1] group data with 4 members:
> [1] #r5694-216#ct-devbuild-01
> [2] #r7467-132#ct-srvwebin-01
> [3] #r8958-1920#ct-srvmon-01
> [4] #r9140-144#ct-srvwebin-01
> ----------------------
> [2] group mail with 3 members:
> [1] #r5694-216#ct-devbuild-01
> [2] #r8958-1920#ct-srvmon-01
> [3] #r9140-144#ct-srvwebin-01
> ----------------------
> [3] group xbtest with 3 members:
> [1] #r5694-216#ct-devbuild-01
> [2] #r8958-1920#ct-srvmon-01
> [3] #r9140-144#ct-srvwebin-01
> ----------------------
>
>
--
-------------------------------------------------------
Jonathan R. Stanton jonathan at cs.jhu.edu
Dept. of Computer Science
Johns Hopkins University
-------------------------------------------------------