[Spread-users] high cpu load
Dirk Vleugels
dvl at 2scale.net
Fri Aug 31 11:00:49 EDT 2001
Hi,
On Fri, Aug 31, 2001 at 10:29:13AM -0400, Yair Amir wrote:
> On a one-segment network, unicast retrans ALWAYS go only to the
> daemon immediately before (in a circular fashion) the one that reports
> sending the u retrans. That is why I asked to see all of the reports
> and not only from one machine.
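Side note: the rule quoted above can be sketched in a few lines. This is only an illustration; the ring order below is an assumption taken from the order the daemons appear in the reports, not something the monitor output confirms.

```python
# Hypothetical sketch of the quoted rule: on a one-segment network, a
# daemon's unicast retransmissions go to its immediate predecessor in the
# circular token ring. Ring order here is ASSUMED from the report listing.
ring = ["cluster1", "cluster2", "cluster3", "cluster4", "loghost"]

def implicated_daemon(reporter):
    """Return the daemon whose receive path is suspect when
    `reporter` shows a high 'u retrans' count."""
    i = ring.index(reporter)
    return ring[(i - 1) % len(ring)]
```

Under that assumed order, cluster1's u retrans would point at loghost, and cluster3's at cluster2.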
Ok, these are status messages from all cluster members, two samples with
a delta of 10 seconds. cluster2 has a very low 'u retrans' count; I have
no clue why.
cluster1:
Status at cluster1 V 3.16. 0 (state 1, gstate 1) after 1109946 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1037983201 tok_hurry : 2980781 memb change: 6
sent pack: 10469332 recv pack : 32696435 retrans : 10040412
u retrans: 9774822 s retrans : 265590 b retrans : 0
My_aru : 440354 Aru : 440354 Highest seq: 440354
Sessions : 183 Groups : 3 Window : 60
Deliver M: 44489748 Deliver Pk: 44640025 Pers Window: 15
Delta Mes: 44489748 Delta Pack: 440354 Delta sec : 1109946
==================================
Monitor> Monitor: send status query
============================
Status at cluster1 V 3.16. 0 (state 1, gstate 1) after 1109956 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1037995305 tok_hurry : 2980794 memb change: 6
sent pack: 10469537 recv pack : 32696830 retrans : 10040622
u retrans: 9775032 s retrans : 265590 b retrans : 0
My_aru : 440951 Aru : 440950 Highest seq: 440951
Sessions : 183 Groups : 3 Window : 60
Deliver M: 44490343 Deliver Pk: 44640622 Pers Window: 15
Delta Mes: 595 Delta Pack: 596 Delta sec : 10
cluster2:
Status at cluster2 V 3.16. 0 (state 1, gstate 1) after 1110277 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1038077390 tok_hurry : 2890797 memb change: 8
sent pack: 10008960 recv pack : 42422848 retrans : 1203859
u retrans: 416 s retrans : 1203443 b retrans : 0
My_aru : 447204 Aru : 447204 Highest seq: 447204
Sessions : 144 Groups : 3 Window : 60
Deliver M: 44496594 Deliver Pk: 44646887 Pers Window: 15
Delta Mes: 44496594 Delta Pack: 447204 Delta sec : 1110277
==================================
Monitor> Monitor: send status query
============================
Status at cluster2 V 3.16. 0 (state 1, gstate 1) after 1110287 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1038090613 tok_hurry : 2890806 memb change: 8
sent pack: 10009046 recv pack : 42423589 retrans : 1203949
u retrans: 416 s retrans : 1203533 b retrans : 0
My_aru : 447919 Aru : 447919 Highest seq: 447919
Sessions : 144 Groups : 3 Window : 60
Deliver M: 44497306 Deliver Pk: 44647602 Pers Window: 15
Delta Mes: 712 Delta Pack: 715 Delta sec : 10
cluster3:
Status at cluster3 V 3.16. 0 (state 1, gstate 1) after 1110543 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1038118263 tok_hurry : 2890932 memb change: 10
sent pack: 10166703 recv pack : 42931735 retrans : 10321326
u retrans: 10197701 s retrans : 123625 b retrans : 0
My_aru : 449597 Aru : 449597 Highest seq: 449597
Sessions : 128 Groups : 3 Window : 60
Deliver M: 44498985 Deliver Pk: 44649290 Pers Window: 15
Delta Mes: 44498985 Delta Pack: 449597 Delta sec : 1110543
==================================
Monitor> Monitor: send status query
============================
Status at cluster3 V 3.16. 0 (state 1, gstate 1) after 1110553 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1038130650 tok_hurry : 2890935 memb change: 10
sent pack: 10166712 recv pack : 42932830 retrans : 10321499
u retrans: 10197868 s retrans : 123631 b retrans : 0
My_aru : 450564 Aru : 450563 Highest seq: 450564
Sessions : 128 Groups : 3 Window : 60
Deliver M: 44499950 Deliver Pk: 44650257 Pers Window: 15
Delta Mes: 965 Delta Pack: 966 Delta sec : 10
cluster4:
Status at cluster4 V 3.16. 0 (state 1, gstate 1) after 1110701 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1038160146 tok_hurry : 2890996 memb change: 12
sent pack: 10426772 recv pack : 41442543 retrans : 10214651
u retrans: 9781646 s retrans : 433005 b retrans : 0
My_aru : 452960 Aru : 452960 Highest seq: 452960
Sessions : 237 Groups : 3 Window : 60
Deliver M: 44502348 Deliver Pk: 44652679 Pers Window: 15
Delta Mes: 44502348 Delta Pack: 452960 Delta sec : 1110701
==================================
Monitor> Monitor: send status query
============================
Status at cluster4 V 3.16. 0 (state 1, gstate 1) after 1110711 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1038171603 tok_hurry : 2890997 memb change: 12
sent pack: 10427453 recv pack : 41442982 retrans : 10214751
u retrans: 9781746 s retrans : 433005 b retrans : 0
My_aru : 454007 Aru : 454007 Highest seq: 454007
Sessions : 237 Groups : 3 Window : 60
Deliver M: 44503387 Deliver Pk: 44653726 Pers Window: 15
Delta Mes: 1039 Delta Pack: 1047 Delta sec : 10
loghost:
Status at loghost V 3.16. 0 (state 1, gstate 1) after 980920 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1016056425 tok_hurry : 2670049 memb change: 4
sent pack: 2 recv pack : 52600503 retrans : 9147503
u retrans: 8805205 s retrans : 342298 b retrans : 0
My_aru : 455683 Aru : 455683 Highest seq: 455683
Sessions : 1 Groups : 3 Window : 60
Deliver M: 44280203 Deliver Pk: 44429302 Pers Window: 15
Delta Mes: 44280203 Delta Pack: 455683 Delta sec : 980920
==================================
Monitor> Monitor: send status query
============================
Status at loghost V 3.16. 0 (state 1, gstate 1) after 980930 seconds :
Membership : 5 procs in 1 segments, leader is cluster1
rounds : 1016068733 tok_hurry : 2670050 memb change: 4
sent pack: 2 recv pack : 52601860 retrans : 9147508
u retrans: 8805206 s retrans : 342302 b retrans : 0
My_aru : 456573 Aru : 456573 Highest seq: 456573
Sessions : 1 Groups : 3 Window : 60
Deliver M: 44281092 Deliver Pk: 44430192 Pers Window: 15
Delta Mes: 889 Delta Pack: 890 Delta sec : 10
Flow control:
Flow Control Parameters:
------------------------
Window size: 0
cluster1 0
cluster2 0
cluster3 0
cluster4 0
loghost 0
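For what it's worth, the monitor only prints deltas for messages and packets, so the short-interval retransmission rates have to be computed by differencing two samples by hand. A minimal sketch, with the counter values copied from the two cluster1 reports above:

```python
# Difference two monitor samples (10 s apart) to get per-second rates.
# Counter values are taken verbatim from the cluster1 reports above.
sample1 = {"u_retrans": 9774822, "s_retrans": 265590, "sent_pack": 10469332}
sample2 = {"u_retrans": 9775032, "s_retrans": 265590, "sent_pack": 10469537}
delta_sec = 10

rates = {k: (sample2[k] - sample1[k]) / delta_sec for k in sample1}
# cluster1 resent about 21 unicast packets per second in this interval,
# roughly the same as its fresh-send rate of ~20.5 packets per second.
```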
The system is in production, so we can't debug Spread to our liking ...
> The way I see it: either one network card there is bad or short on
> buffers, or because of the very high CPU usage (which I don't know why
> it happens) the daemon just misses messages.
I tried to check s - r communication from each host to the loghost; the
loss ratio is very low (seldom 0.1 - 0.2%, mostly 0%).
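As an aside, a point-to-point loss check like that can be done with a trivial UDP probe. This is only a sketch of the idea, not the tool actually used above; it assumes a simple echo peer is listening on the given host and port.

```python
# Minimal UDP loss probe (illustration only; assumes an echo peer at
# host:port). Sends `count` datagrams and returns the fraction that
# never come back within `timeout` seconds each.
import socket

def loss_ratio(host, port, count=1000, timeout=0.2):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    lost = 0
    for i in range(count):
        s.sendto(str(i).encode(), (host, port))
        try:
            s.recvfrom(64)
        except socket.timeout:
            lost += 1
    s.close()
    return lost / count
```

On a healthy one-segment LAN this should report something close to the 0% Dirk measured.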
> An option is to run flooder on a clean system there and see what happens.
> Now, flow control can be tuned to not lose messages even in a busy system.
> The flow control parameters in the vanilla version should usually be ok
> (i.e. conservative). I assume Spread was not changed. Do you run
> our build or did you build it yourself?
Which flow-control settings should be tried?
Cheers,
Dirk