[Spread-users] high cpu load
George Schlossnagle
george at omniti.com
Fri Aug 31 11:17:48 EDT 2001
Also, a really really hackish way around your problem would be to run a
single spread daemon on every host, in its own private ring, then have
spreadlogd connect to each of those daemons directly (so you don't really
have any shared rings). Extremely hackish, but it may be worth a shot until
these problems get nailed down.
George
----- Original Message -----
From: "Dirk Vleugels" <dvl at 2scale.net>
To: "Yair Amir" <yairamir at cnds.jhu.edu>
Cc: <spread-users at lists.spread.org>
Sent: Friday, August 31, 2001 11:00 AM
Subject: Re: [Spread-users] high cpu load
> Hi,
>
> On Fri, Aug 31, 2001 at 10:29:13AM -0400, Yair Amir wrote:
> > On a one segment network, unicast retrans ALWAYS go only to the
> > daemon immediately before (in a circular fashion) the one that reports
> > sending the u retrans. That is why I asked to see all of the reports
> > and not only from one machine.
>
> Ok, these are status messages from all cluster members, two samples each
> with a delta of 10 seconds. cluster2 has a very low 'u retrans' count, I
> have no clue why.
>
> cluster1:
>
> Status at cluster1 V 3.16. 0 (state 1, gstate 1) after 1109946 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1037983201 tok_hurry : 2980781 memb change: 6
> sent pack: 10469332 recv pack : 32696435 retrans : 10040412
> u retrans: 9774822 s retrans : 265590 b retrans : 0
> My_aru : 440354 Aru : 440354 Highest seq: 440354
> Sessions : 183 Groups : 3 Window : 60
> Deliver M: 44489748 Deliver Pk: 44640025 Pers Window: 15
> Delta Mes: 44489748 Delta Pack: 440354 Delta sec : 1109946
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster1 V 3.16. 0 (state 1, gstate 1) after 1109956 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1037995305 tok_hurry : 2980794 memb change: 6
> sent pack: 10469537 recv pack : 32696830 retrans : 10040622
> u retrans: 9775032 s retrans : 265590 b retrans : 0
> My_aru : 440951 Aru : 440950 Highest seq: 440951
> Sessions : 183 Groups : 3 Window : 60
> Deliver M: 44490343 Deliver Pk: 44640622 Pers Window: 15
> Delta Mes: 595 Delta Pack: 596 Delta sec : 10
>
>
> cluster2:
>
> Status at cluster2 V 3.16. 0 (state 1, gstate 1) after 1110277 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1038077390 tok_hurry : 2890797 memb change: 8
> sent pack: 10008960 recv pack : 42422848 retrans : 1203859
> u retrans: 416 s retrans : 1203443 b retrans : 0
> My_aru : 447204 Aru : 447204 Highest seq: 447204
> Sessions : 144 Groups : 3 Window : 60
> Deliver M: 44496594 Deliver Pk: 44646887 Pers Window: 15
> Delta Mes: 44496594 Delta Pack: 447204 Delta sec : 1110277
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster2 V 3.16. 0 (state 1, gstate 1) after 1110287 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1038090613 tok_hurry : 2890806 memb change: 8
> sent pack: 10009046 recv pack : 42423589 retrans : 1203949
> u retrans: 416 s retrans : 1203533 b retrans : 0
> My_aru : 447919 Aru : 447919 Highest seq: 447919
> Sessions : 144 Groups : 3 Window : 60
> Deliver M: 44497306 Deliver Pk: 44647602 Pers Window: 15
> Delta Mes: 712 Delta Pack: 715 Delta sec : 10
>
> cluster3:
>
> Status at cluster3 V 3.16. 0 (state 1, gstate 1) after 1110543 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1038118263 tok_hurry : 2890932 memb change: 10
> sent pack: 10166703 recv pack : 42931735 retrans : 10321326
> u retrans: 10197701 s retrans : 123625 b retrans : 0
> My_aru : 449597 Aru : 449597 Highest seq: 449597
> Sessions : 128 Groups : 3 Window : 60
> Deliver M: 44498985 Deliver Pk: 44649290 Pers Window: 15
> Delta Mes: 44498985 Delta Pack: 449597 Delta sec : 1110543
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster3 V 3.16. 0 (state 1, gstate 1) after 1110553 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1038130650 tok_hurry : 2890935 memb change: 10
> sent pack: 10166712 recv pack : 42932830 retrans : 10321499
> u retrans: 10197868 s retrans : 123631 b retrans : 0
> My_aru : 450564 Aru : 450563 Highest seq: 450564
> Sessions : 128 Groups : 3 Window : 60
> Deliver M: 44499950 Deliver Pk: 44650257 Pers Window: 15
> Delta Mes: 965 Delta Pack: 966 Delta sec : 10
>
> cluster4:
>
> Status at cluster4 V 3.16. 0 (state 1, gstate 1) after 1110701 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1038160146 tok_hurry : 2890996 memb change: 12
> sent pack: 10426772 recv pack : 41442543 retrans : 10214651
> u retrans: 9781646 s retrans : 433005 b retrans : 0
> My_aru : 452960 Aru : 452960 Highest seq: 452960
> Sessions : 237 Groups : 3 Window : 60
> Deliver M: 44502348 Deliver Pk: 44652679 Pers Window: 15
> Delta Mes: 44502348 Delta Pack: 452960 Delta sec : 1110701
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster4 V 3.16. 0 (state 1, gstate 1) after 1110711 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1038171603 tok_hurry : 2890997 memb change: 12
> sent pack: 10427453 recv pack : 41442982 retrans : 10214751
> u retrans: 9781746 s retrans : 433005 b retrans : 0
> My_aru : 454007 Aru : 454007 Highest seq: 454007
> Sessions : 237 Groups : 3 Window : 60
> Deliver M: 44503387 Deliver Pk: 44653726 Pers Window: 15
> Delta Mes: 1039 Delta Pack: 1047 Delta sec : 10
>
> loghost:
>
> Status at loghost V 3.16. 0 (state 1, gstate 1) after 980920 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1016056425 tok_hurry : 2670049 memb change: 4
> sent pack: 2 recv pack : 52600503 retrans : 9147503
> u retrans: 8805205 s retrans : 342298 b retrans : 0
> My_aru : 455683 Aru : 455683 Highest seq: 455683
> Sessions : 1 Groups : 3 Window : 60
> Deliver M: 44280203 Deliver Pk: 44429302 Pers Window: 15
> Delta Mes: 44280203 Delta Pack: 455683 Delta sec : 980920
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at loghost V 3.16. 0 (state 1, gstate 1) after 980930 seconds :
> Membership : 5 procs in 1 segments, leader is cluster1
> rounds : 1016068733 tok_hurry : 2670050 memb change: 4
> sent pack: 2 recv pack : 52601860 retrans : 9147508
> u retrans: 8805206 s retrans : 342302 b retrans : 0
> My_aru : 456573 Aru : 456573 Highest seq: 456573
> Sessions : 1 Groups : 3 Window : 60
> Deliver M: 44281092 Deliver Pk: 44430192 Pers Window: 15
> Delta Mes: 889 Delta Pack: 890 Delta sec : 10
>
> Flow control:
>
> Flow Control Parameters:
> ------------------------
>
> Window size: 0
>
> cluster1 0
> cluster2 0
> cluster3 0
> cluster4 0
> loghost 0
>
> The system is in production, so we can't debug spread to our liking ...
>
> > The way I see it: it is either one network card there is bad / lacking
> > buffers or because of the very high CPU usage (which I don't know why
> > it happens) the daemon just misses messages.
>
> I tried to check s - r communication from any host to the loghost, the
> loss ratio is very low (seldom 0.1 - 0.2 %, mostly 0%).
>
> > An option is to run flooder on a clean system there and see what
> > happens. Now, flow control can be tuned to not lose messages even in a
> > busy system. The flow control parameters in the vanilla version should
> > usually be ok (e.g. conservative). I assume Spread was not changed. Do
> > you run our build or did you build it yourself?
>
> Which flow-control settings should be tried?
>
> Cheers,
> Dirk
>
>
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
>
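For reference, the per-daemon unicast retransmission rates can be read off the two quoted samples by subtracting the first 'u retrans' counter from the second and dividing by the 10-second interval. The sketch below does only that arithmetic; the counter values are copied by hand from the status reports above, and the names U_RETRANS, rate_per_second and u_retrans_rates are made up for this illustration (they are not part of Spread or its monitor).

```python
# 'u retrans' counters copied by hand from the quoted status reports:
# (first sample, second sample), taken 10 seconds apart on each daemon.
U_RETRANS = {
    "cluster1": (9774822, 9775032),
    "cluster2": (416, 416),
    "cluster3": (10197701, 10197868),
    "cluster4": (9781646, 9781746),
    "loghost": (8805205, 8805206),
}

INTERVAL_SEC = 10  # "Delta sec" reported in each second sample


def rate_per_second(first, second, interval=INTERVAL_SEC):
    """Average counter increase per second between two samples."""
    return (second - first) / interval


def u_retrans_rates(samples=U_RETRANS):
    """Map each daemon name to its average 'u retrans' rate per second."""
    return {host: rate_per_second(a, b) for host, (a, b) in samples.items()}


if __name__ == "__main__":
    for host, rate in u_retrans_rates().items():
        print(f"{host:9s} {rate:5.1f} u retrans/s")
```

Over this one interval cluster1, cluster3 and cluster4 each report on the order of 10 to 20 unicast retransmissions per second, while cluster2 and loghost report essentially none. A single 10-second window is a small sample, so these figures are only a rough indication.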