[Spread-users] Retransmission and Connection Problems

Yair Amir yairamir at cnds.jhu.edu
Thu Oct 2 18:46:51 EDT 2003


Hi Jeremy,

It seems that your broadcast address does not work correctly on all of
the machines. That is why I am sure it works with any two machines but
will not work for you with three or more.

Instead of 10.1.255.255 (which does not work for you), you could use a
multicast address such as 225.10.1.4. That would probably work for you.
Otherwise, make sure the broadcast address is set correctly on all of
the machines.
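
As a rough sketch of the multicast option, the segment from your
spread.conf would look something like the fragment below. Only the
address on the Spread_Segment line changes; the port and the host
entries are copied from your message. This assumes your network
actually passes multicast traffic between these hosts, which I have
not verified, and the same change would need to go into the
spread.conf on every machine.

```
# Sketch only: 225.10.1.4 replaces the 10.1.255.255 broadcast address.
# Port 3333 and the host list are unchanged from the original file.
Spread_Segment 225.10.1.4:3333 {
   a.monitor.peak.org  10.1.255.253
   a.www.peak.org      10.1.4.1
   b.www.peak.org      10.1.4.2
   sysadmin01.peak.org 10.1.255.254
}
```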

Several similar cases have been discussed on the mailing list in the
past.

     :) Yair.
     

On Thursday, October 02, 2003 6:29 PM
Jeremy McDermond mcdermj at peak.org wrote:

Jeremy> I feel kinda like an idiot, but I've been fighting these problems for a 
Jeremy> few days now, and feel like I'm at the end of my rope.  I am attempting 
Jeremy> to use spread for apache logging using mod_log_spread and spreadlogd.  
Jeremy> I have 3 machines in the spread ring, two webservers, and a log server. 
Jeremy>   My spread.conf on all 3 machines looks like:

Jeremy> DebugFlags = { PRINT EXIT }
Jeremy> EventLogFile = /var/log/spread.log
Jeremy> EventTimeStamp
Jeremy> DangerousMonitor = false
Jeremy> SocketPortReuse = AUTO
Jeremy> RuntimeDir = /var/run/spread
Jeremy> DaemonUser = spread
Jeremy> DaemonGroup = spread

Jeremy> RequiredAuthMethods = "NULL"
Jeremy> AllowedAuthMethods = "NULL"

Jeremy> #Set the current access control policy.
Jeremy> # This is only needed if you want to establish a customized policy.
Jeremy> # The default policy is to allow any actions by authenticated clients.
Jeremy> AccessControlPolicy = "PERMIT"

Jeremy> Spread_Segment 10.1.255.255:3333 {
Jeremy>    a.monitor.peak.org  10.1.255.253
Jeremy>    a.www.peak.org      10.1.4.1
Jeremy>    b.www.peak.org      10.1.4.2
Jeremy>    sysadmin01.peak.org 10.1.255.254
Jeremy> }

Jeremy> There's an extra machine in there that we use for test log catching 
Jeremy> every once in a while, but I'm not putting it in the ring right now.  
Jeremy> The first issue I have is that the apache servers seem to not be able 
Jeremy> to contact their local spread daemon on a UNIX socket.  I get 
Jeremy> "connection refused" errors in the error log:

Jeremy> [Thu Oct  2 22:16:53 2003] [error] (61)Connection refused: Could not 
Jeremy> connect to spread 3333 with private_name ap15031. Error -2

Jeremy> The socket (/tmp/3333) seems to be there and available, and I can 
Jeremy> connect with 'spuser -r -s 3333' to the ring.

Jeremy> The second issue is that, on the off chance I can get apache to 
Jeremy> connect, I get tons of retransmissions on the ring as measured using 
Jeremy> spmonitor.  It also seems to impact the timely delivery of the log 
Jeremy> records to the log writers.  Even with a completely idle ring, I'm 
Jeremy> seeing retransmissions of 601, when there's 258 sent packets:

Jeremy> ============================
Jeremy> Status at b.www.peak.org V 3.17. 1 (state 1, gstate 1) after 830 seconds :
Jeremy> Membership  :  3  procs in 1 segments, leader is 167903229
Jeremy> rounds   :  327621      tok_hurry :     714     memb change:       3
Jeremy> sent pack:     260      recv pack :     346     retrans    :     609
Jeremy> u retrans:     268      s retrans :     341     b retrans  :       0
Jeremy> My_aru   :     612      Aru       :     612     Highest seq:     612
Jeremy> Sessions :      68      Groups    :       0     Window     :      60
Jeremy> Deliver M:     616      Deliver Pk:     621     Pers Window:      15
Jeremy> Delta Mes:       7      Delta Pack:       0     Delta sec  :     114
Jeremy> ==================================

Jeremy> I'm kinda at the end of my rope, I've checked the archives of this 
Jeremy> list, and haven't found anything really useful.  I'm running FreeBSD 
Jeremy> 5.1R on Intel Xeon procs.  Anyone have any ideas?

Jeremy> --
Jeremy> Jeremy C. McDermond    mcdermj at peak.org
Jeremy> Lead Engineer
Jeremy> Peak Internet, LLC     (541) 738-4921







