[Spread-users] More Performance Issues/Questions
Yair Amir
yairamir at cnds.jhu.edu
Mon Dec 20 21:39:36 EST 2004
Mike,
From the logs you sent I see that your machine frln22 is
losing a lot of messages (~15%). You probably have something
wrong with the network card, network connection, or operating system
on this machine.
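A figure of this size can be cross-checked against the spmonitor counters quoted below. The following is only a sketch: it assumes that frln22 and gamma saw the same original traffic, so that any surplus in frln22's "recv pack" counter is packets it had to receive again. The function name and this reading of the counters are illustrative assumptions, not part of Spread.

```python
def estimated_loss_fraction(recv_lossy: int, recv_healthy: int) -> float:
    """Fraction of original packets the lossy receiver had to get again.

    Assumes both daemons saw the same original traffic, so any surplus
    in the lossy daemon's "recv pack" counter is retransmitted packets.
    """
    if recv_healthy <= 0:
        raise ValueError("recv_healthy must be positive")
    extra = recv_lossy - recv_healthy
    return max(extra, 0) / recv_healthy


# "recv pack" values from the original-timeout run quoted below.
frln22_recv = 16860
gamma_recv = 14537

loss = estimated_loss_fraction(frln22_recv, gamma_recv)
print(f"extra packets: {frln22_recv - gamma_recv}, estimated loss: {loss:.1%}")
```

The surplus of 2323 packets equals the "retrans" counter reported by wango in the same run, and the resulting fraction is in the same range as the ~15% mentioned above.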
This is the reason for the skewed latency. Using the fast timeouts
does not solve the problem; it just masks it, so that
you get .2 sec latencies instead of 2 sec latencies.
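To make the skew visible, the timing sample quoted below can be recomputed with a short script. This is a sketch under two assumptions: the clocks on the two machines are synchronized well enough for the difference to be meaningful, and a cutoff of 0.1 sec (a hypothetical value chosen here) separates normal deliveries from stalled ones. Decimal is used so the microsecond digits stay exact.

```python
from decimal import Decimal

# Server and client gettimeofday() timestamps from the sample quoted below.
SAMPLE = """\
1103560465.254283,1103560465.254709
1103560465.254544,1103560465.254739
1103560465.254571,1103560465.254760
1103560465.259511,1103560467.269183
1103560465.272042,1103560467.269263
1103560465.272302,1103560467.269290
1103560465.273088,1103560467.269317
"""

# Hypothetical cutoff separating normal deliveries from stalled ones.
STALL_THRESHOLD = Decimal("0.1")


def parse(text):
    """Return (server_ts, client_ts, latency) tuples, one per line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        server, client = (Decimal(field) for field in line.split(","))
        rows.append((server, client, client - server))
    return rows


rows = parse(SAMPLE)
stalled = [row for row in rows if row[2] > STALL_THRESHOLD]
arrivals = [client for _, client, _ in stalled]
arrival_spread = max(arrivals) - min(arrivals)

print(f"{len(stalled)} of {len(rows)} messages stalled")
print(f"stalled messages arrived within {arrival_spread} s of each other")
```

In this sample the stalled messages were sent over roughly 14 ms but reached the client within a fraction of a millisecond of each other, which would be consistent with them being held back together until a lost packet was recovered.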
In my opinion, using the "fast settings", as you call them, is not
a good thing in general (unless you have some special needs, and
you don't). If they help you in a regular setting, it means
something is wrong with your basic network setup or that Spread's
flow control is way too greedy, although out of the box it should be
very conservative.
I would expect a problem like that if some of the senders are on
1 Gbit/s while some receivers (e.g. frln22) are on 100 Mbit/s
and the switch is not doing its job well. Just a guess.
Take frln22 out and see what happens.
Cheers,
:) Yair.
Mike Perik wrote:
> I've resolved most of my issues and I'm seeing decent
> latency times.
>
> Occasionally, I'll see the receive function take 2
> seconds to receive something. The way I'm seeing this
> is that I'm calling gettimeofday() in the client
> before and after the receive call and diff'ing the
> times.
>
> Also, I'm putting the gettimeofday() data into the
> packet I'm sending to the client and on the client
> side I'm printing out both the server gettimeofday
> info and the client's gettimeofday info.
>
> A sample of the timings follows,
> The first column is the server's gettimeofday
> timestamp, the 2nd column is the client's gettimeofday
> timestamp and the third is the difference or latency.
>
> Has anyone seen this behavior before?
>
> 1103560465.254283,1103560465.254709,0.000425
> 1103560465.254544,1103560465.254739,0.000195
> 1103560465.254571,1103560465.254760,0.000189
> 1103560465.259511,1103560467.269183,2.009672
> 1103560465.272042,1103560467.269263,1.997220
> 1103560465.272302,1103560467.269290,1.996988
> 1103560465.273088,1103560467.269317,1.996229
>
> I reniced the client process to see if it was a
> scheduling problem but that did not help the
> situation. I started two more clients and they all
> experience this delay at the same time.
>
> I changed the membership.c timeouts in accordance
> with what was mentioned back in 07/2004, called the
> "Very Fast" settings. The 2 second latencies stopped,
> although I still got a handful of .2 sec latencies.
> I then restored the timeouts back to what was
> originally there and the 2 second latencies came back.
>
> Here is output from the spmonitor utility during the
> test runs.
>
> ******************
> STATS w/ORIGINAL TIMEOUTS:
> ******************
>
> Monitor>
> ============================
> Status at wango V 3.17. 3 (state 1, gstate 1) after 1501 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 25902 tok_hurry : 5234 memb change: 2
> sent pack: 10 recv pack : 14537 retrans : 2323
> u retrans: 2323 s retrans : 0 b retrans : 0
> My_aru : 56638 Aru : 56638 Highest seq: 56638
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 56527 Deliver Pk: 56641 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : 7
> ==================================
>
> Monitor>
> ============================
> Status at gamma V 3.17. 3 (state 1, gstate 1) after 1499 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 25902 tok_hurry : 5234 memb change: 2
> sent pack: 10 recv pack : 14537 retrans : 0
> u retrans: 0 s retrans : 0 b retrans : 0
> My_aru : 56638 Aru : 56638 Highest seq: 56638
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 56527 Deliver Pk: 56641 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : -2
> ==================================
>
> Monitor>
> ============================
> Status at frln22 V 3.17. 3 (state 1, gstate 1) after 1503 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 25904 tok_hurry : 5627 memb change: 2
> sent pack: 10 recv pack : 16860 retrans : 0
> u retrans: 0 s retrans : 0 b retrans : 0
> My_aru : 56638 Aru : 56638 Highest seq: 56638
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 56527 Deliver Pk: 56641 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : 4
> ==================================
>
> Monitor> Monitor: send status query
>
> ============================
> Status at nero V 3.17. 3 (state 1, gstate 1) after 1504 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 25973 tok_hurry : 5305 memb change: 2
> sent pack: 14518 recv pack : 30 retrans : 0
> u retrans: 0 s retrans : 0 b retrans : 0
> My_aru : 4 Aru : 4 Highest seq: 4
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 56527 Deliver Pk: 56642 Pers Window: 15
> Delta Mes: 0 Delta Pack: -56634 Delta sec : 1
> ==================================
>
>
> *******************************************
> Stats using the "Very Fast" timeouts:
> *******************************************
>
> Monitor> Monitor: send status query
>
> ============================
> Status at nero V 3.17. 3 (state 1, gstate 1) after 1678 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 120838 tok_hurry : 43947 memb change: 3
> sent pack: 37677 recv pack : 29 retrans : 0
> u retrans: 0 s retrans : 0 b retrans : 0
> My_aru : 88568 Aru : 88568 Highest seq: 88568
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 88457 Deliver Pk: 88573 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : 11
> ==================================
>
> Monitor>
> ============================
> Status at frln22 V 3.17. 3 (state 1, gstate 1) after 1685 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 120994 tok_hurry : 46051 memb change: 7
> sent pack: 12 recv pack : 44592 retrans : 0
> u retrans: 0 s retrans : 0 b retrans : 0
> My_aru : 88568 Aru : 88568 Highest seq: 88568
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 88457 Deliver Pk: 88577 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : 7
> ==================================
>
> Monitor>
> ============================
> Status at gamma V 3.17. 3 (state 1, gstate 1) after 1677 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 120812 tok_hurry : 43927 memb change: 2
> sent pack: 10 recv pack : 37694 retrans : 0
> u retrans: 0 s retrans : 0 b retrans : 0
> My_aru : 88568 Aru : 88568 Highest seq: 88568
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 88457 Deliver Pk: 88571 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : -8
> ==================================
>
> Monitor>
> ============================
> Status at wango V 3.17. 3 (state 1, gstate 1) after 1675 seconds :
> Membership : 4 procs in 1 segments, leader is frln22
> rounds : 120777 tok_hurry : 43898 memb change: 1
> sent pack: 8 recv pack : 37693 retrans : 6894
> u retrans: 6894 s retrans : 0 b retrans : 0
> My_aru : 88568 Aru : 88568 Highest seq: 88568
> Sessions : 0 Groups : 0 Window : 60
> Deliver M: 88457 Deliver Pk: 88568 Pers Window: 15
> Delta Mes: 0 Delta Pack: 0 Delta sec : -2
> ==================================
>
>
> Is it safe to use the "Very Fast" settings? The
> machines I'm using are on a mixed 100/1000 Mbit
> network and the machines are all 2+ GHz machines. In
> the above tests I have only 4 machines in the segment
> all on the same network.
>
>
> Spread_Segment 225.0.1.1:5003 {
> frln22 10.0.103.100
> wango 10.0.103.102
> gamma 10.0.103.101
> nero 10.0.103.141
> }
>
> Additional info,
> I'm running both the server & client(s) on Redhat
> Linux machines with 2.4.x kernels.
>
>
> What is causing this 2+ sec. delay?
> Why does it go away with the "Very Fast" settings?
>
> Thanks,
> Mike