[Spread-users] high cpu load

George Schlossnagle george at omniti.com
Fri Aug 31 11:17:48 EDT 2001


Also, a really hackish way around your problem would be to run a
single spread daemon on every host, in its own private ring, then have
spreadlogd connect to each of those daemons directly (so you don't really
have any rings).  Extremely hackish, but it may be worth a shot until these
problems get nailed down.
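
A rough, untested sketch of that setup, in Python just to make the idea
concrete. It assumes the usual spread.conf layout (one Spread_Segment block
listing a single host) and the usual port@host daemon names. The IP and
broadcast addresses are made up for illustration, and how spreadlogd is
told about several daemons is left out on purpose.

```python
# Sketch only: render a one-daemon spread.conf per host, plus the daemon
# names a logger would connect to. All addresses below are hypothetical.

def single_daemon_conf(hostname, ip, broadcast, port=4803):
    """Return spread.conf text for a segment containing only this host."""
    return (
        f"Spread_Segment {broadcast}:{port} {{\n"
        f"\t{hostname}\t{ip}\n"
        f"}}\n"
    )

def daemon_names(hostnames, port=4803):
    """Return 'port@host' names, one per private daemon."""
    return [f"{port}@{name}" for name in hostnames]

# Host names come from the thread; the addresses are invented.
hosts = {
    "cluster1": "10.0.0.1",
    "cluster2": "10.0.0.2",
    "cluster3": "10.0.0.3",
    "cluster4": "10.0.0.4",
}

for name, ip in hosts.items():
    print(f"# spread.conf for {name}")
    print(single_daemon_conf(name, ip, broadcast="10.0.0.255"))

print(daemon_names(hosts))
```

Each daemon then forms a membership of one, so there is no token ring
between hosts at all; the logger pays for that with one connection per
host.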

George

----- Original Message -----
From: "Dirk Vleugels" <dvl at 2scale.net>
To: "Yair Amir" <yairamir at cnds.jhu.edu>
Cc: <spread-users at lists.spread.org>
Sent: Friday, August 31, 2001 11:00 AM
Subject: Re: [Spread-users] high cpu load


> Hi,
>
> On Fri, Aug 31, 2001 at 10:29:13AM -0400, Yair Amir wrote:
> > On a one-segment network, unicast retrans ALWAYS go only to the
> > daemon immediately before (in a circular fashion) the one that reports
> > sending the u retrans. That is why I asked to see all of the reports
> > and not only from one machine.
>
> Ok, these are status messages from all cluster members, two samples with
> a delta of 10 seconds. cluster2 has a very low 'u retrans' count; I have
> no clue why.
>
> cluster1:
>
> Status at cluster1 V 3.16. 0 (state 1, gstate 1) after 1109946 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1037983201   tok_hurry : 2980781     memb change:       6
> sent pack: 10469332     recv pack : 32696435    retrans    : 10040412
> u retrans: 9774822      s retrans :  265590     b retrans  :       0
> My_aru   :  440354      Aru       :  440354     Highest seq:  440354
> Sessions :     183      Groups    :       3     Window     :      60
> Deliver M: 44489748     Deliver Pk: 44640025    Pers Window:      15
> Delta Mes: 44489748     Delta Pack:  440354     Delta sec  : 1109946
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster1 V 3.16. 0 (state 1, gstate 1) after 1109956 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1037995305   tok_hurry : 2980794     memb change:       6
> sent pack: 10469537     recv pack : 32696830    retrans    : 10040622
> u retrans: 9775032      s retrans :  265590     b retrans  :       0
> My_aru   :  440951      Aru       :  440950     Highest seq:  440951
> Sessions :     183      Groups    :       3     Window     :      60
> Deliver M: 44490343     Deliver Pk: 44640622    Pers Window:      15
> Delta Mes:     595      Delta Pack:     596     Delta sec  :      10
>
>
> cluster2:
>
> Status at cluster2 V 3.16. 0 (state 1, gstate 1) after 1110277 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1038077390   tok_hurry : 2890797     memb change:       8
> sent pack: 10008960     recv pack : 42422848    retrans    : 1203859
> u retrans:     416      s retrans : 1203443     b retrans  :       0
> My_aru   :  447204      Aru       :  447204     Highest seq:  447204
> Sessions :     144      Groups    :       3     Window     :      60
> Deliver M: 44496594     Deliver Pk: 44646887    Pers Window:      15
> Delta Mes: 44496594     Delta Pack:  447204     Delta sec  : 1110277
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster2 V 3.16. 0 (state 1, gstate 1) after 1110287 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1038090613   tok_hurry : 2890806     memb change:       8
> sent pack: 10009046     recv pack : 42423589    retrans    : 1203949
> u retrans:     416      s retrans : 1203533     b retrans  :       0
> My_aru   :  447919      Aru       :  447919     Highest seq:  447919
> Sessions :     144      Groups    :       3     Window     :      60
> Deliver M: 44497306     Deliver Pk: 44647602    Pers Window:      15
> Delta Mes:     712      Delta Pack:     715     Delta sec  :      10
>
> cluster3:
>
> Status at cluster3 V 3.16. 0 (state 1, gstate 1) after 1110543 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1038118263   tok_hurry : 2890932     memb change:      10
> sent pack: 10166703     recv pack : 42931735    retrans    : 10321326
> u retrans: 10197701     s retrans :  123625     b retrans  :       0
> My_aru   :  449597      Aru       :  449597     Highest seq:  449597
> Sessions :     128      Groups    :       3     Window     :      60
> Deliver M: 44498985     Deliver Pk: 44649290    Pers Window:      15
> Delta Mes: 44498985     Delta Pack:  449597     Delta sec  : 1110543
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster3 V 3.16. 0 (state 1, gstate 1) after 1110553 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1038130650   tok_hurry : 2890935     memb change:      10
> sent pack: 10166712     recv pack : 42932830    retrans    : 10321499
> u retrans: 10197868     s retrans :  123631     b retrans  :       0
> My_aru   :  450564      Aru       :  450563     Highest seq:  450564
> Sessions :     128      Groups    :       3     Window     :      60
> Deliver M: 44499950     Deliver Pk: 44650257    Pers Window:      15
> Delta Mes:     965      Delta Pack:     966     Delta sec  :      10
>
> cluster4:
>
> Status at cluster4 V 3.16. 0 (state 1, gstate 1) after 1110701 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1038160146   tok_hurry : 2890996     memb change:      12
> sent pack: 10426772     recv pack : 41442543    retrans    : 10214651
> u retrans: 9781646      s retrans :  433005     b retrans  :       0
> My_aru   :  452960      Aru       :  452960     Highest seq:  452960
> Sessions :     237      Groups    :       3     Window     :      60
> Deliver M: 44502348     Deliver Pk: 44652679    Pers Window:      15
> Delta Mes: 44502348     Delta Pack:  452960     Delta sec  : 1110701
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at cluster4 V 3.16. 0 (state 1, gstate 1) after 1110711 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1038171603   tok_hurry : 2890997     memb change:      12
> sent pack: 10427453     recv pack : 41442982    retrans    : 10214751
> u retrans: 9781746      s retrans :  433005     b retrans  :       0
> My_aru   :  454007      Aru       :  454007     Highest seq:  454007
> Sessions :     237      Groups    :       3     Window     :      60
> Deliver M: 44503387     Deliver Pk: 44653726    Pers Window:      15
> Delta Mes:    1039      Delta Pack:    1047     Delta sec  :      10
>
> loghost:
>
> Status at loghost V 3.16. 0 (state 1, gstate 1) after 980920 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1016056425   tok_hurry : 2670049     memb change:       4
> sent pack:       2      recv pack : 52600503    retrans    : 9147503
> u retrans: 8805205      s retrans :  342298     b retrans  :       0
> My_aru   :  455683      Aru       :  455683     Highest seq:  455683
> Sessions :       1      Groups    :       3     Window     :      60
> Deliver M: 44280203     Deliver Pk: 44429302    Pers Window:      15
> Delta Mes: 44280203     Delta Pack:  455683     Delta sec  :  980920
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at loghost V 3.16. 0 (state 1, gstate 1) after 980930 seconds :
> Membership  :  5  procs in 1 segments, leader is cluster1
> rounds   : 1016068733   tok_hurry : 2670050     memb change:       4
> sent pack:       2      recv pack : 52601860    retrans    : 9147508
> u retrans: 8805206      s retrans :  342302     b retrans  :       0
> My_aru   :  456573      Aru       :  456573     Highest seq:  456573
> Sessions :       1      Groups    :       3     Window     :      60
> Deliver M: 44281092     Deliver Pk: 44430192    Pers Window:      15
> Delta Mes:     889      Delta Pack:     890     Delta sec  :      10
>
> Flow control:
>
> Flow Control Parameters:
> ------------------------
>
> Window size:  0
>
>         cluster1        0
>         cluster2        0
>         cluster3        0
>         cluster4        0
>         loghost 0
>
> The system is in production, so we can't debug spread to our liking ...
>
> > The way I see it: it is either one network card there is bad / lacking
> > buffers or because of the very high CPU usage (which I don't know why it
> > happens) the daemon just misses messages.
>
> I tried to check s - r communication from any host to the loghost, the
> loss ratio is very low (seldom 0.1 - 0.2 %, mostly 0%).
>
> > An option is to run flooder on a clean system there and see what happens.
> > Now, flow control can be tuned to not lose messages even in a busy system.
> > The flow control parameters in the vanilla version should usually be ok
> > (e.g. conservative). I assume Spread was not changed. Do you run
> > our own build or did you build it yourself?
>
> Which flow-control settings should be tried?
>
> Cheers,
> Dirk
>
>
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
>
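
For comparing the paired status dumps quoted above, the per-counter change
over the 10-second window is easier to read than the raw totals. The
following is a minimal sketch, assuming the monitor output keeps the
"name : number" layout shown in the thread; the function names are made up
for illustration, and the sample text is two lines from each of the
cluster1 dumps.

```python
import re

# One "name : integer" pair, e.g. "u retrans: 9774822" or "recv pack : 32696435".
FIELD = re.compile(r"([A-Za-z_][A-Za-z_ ]*?)[ \t]*:[ \t]*(\d+)")

def parse_status(text):
    """Collect every 'name : integer' pair from a status dump into a dict."""
    counters = {}
    for line in text.splitlines():
        for name, value in FIELD.findall(line):
            counters[name] = int(value)
    return counters

def delta(before, after):
    """Return after - before for every counter present in both samples."""
    return {k: after[k] - before[k] for k in before if k in after}

# Two lines from each cluster1 sample in the thread (quote prefix kept).
before = """
> sent pack: 10469332     recv pack : 32696435    retrans    : 10040412
> u retrans: 9774822      s retrans :  265590     b retrans  :       0
"""
after = """
> sent pack: 10469537     recv pack : 32696830    retrans    : 10040622
> u retrans: 9775032      s retrans :  265590     b retrans  :       0
"""

changes = delta(parse_status(before), parse_status(after))
for name, change in changes.items():
    print(f"{name:12s} {change:+d}")
```

On these two samples cluster1 sent 205 packets and logged 210 new
'u retrans' in the same 10 seconds, while 's retrans' did not move.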






