[Spread-users] Spread daemon ceases to work on mac, stuck processes

Tue May 7 12:29:16 EDT 2013

Hi,

On 07.05.13 18:15 schrieb John Schultz:
> Please post the config file you are actually using and how you are running spread from the command line.

The config is attached, apart from that I removed the logging again.
This is the default file generated from the source-based installation.

I am launching spread with "spread -n localhost"

> I waded through your log file.  It looks like what is happening is that the daemon is running through its normal membership algorithm, but when it sends the token to itself it never receives it back.  So, it thinks it has lost the token and then restarts its membership algorithm.  But because it will always fail to get its token back, these membership attempts will always fail and loop like this forever.
> 
> It seems like the problem is that when the daemon tries to send to itself at 127.0.1.1 at port 4804 that it never receives it back.  Are you sure your addresses are correct?  Are you sure there isn't a firewall or something similar interfering?
> 
> I noticed that your id is 127.0.1.1 rather than 127.0.0.1.   I'm wondering how that happened?  Normally, I would think you would need an entry in /etc/hosts that maps "localhost" to 127.0.1.1 and change Spread's config file for it to start up happily.  Is that what you did?  Or is something else going on?  Are you running Spread by forcing it to use a particular name in the config (e.g. - spread -n localhost)?

Now I see this, too. My IP for localhost is definitely 127.0.0.1. So
this probably explains the problem.

> Are you running on a 64 bit machine?  There is a known bug in 4.1.0 on such machines that can exhibit behavior like this.  You should probably try 4.2.0 or even the 4.3.0 release candidate and see if you have the same problem.

Yes, Mountain Lion is 64 bit. However, after changing the IP problem
everything seems to work correctly.

So it just seems to me that the default config is generated in a wrong
way. Is that the known bug?

> As to why it seems to work a little bit, Spread accepts messages from users even when it can't send them immediately (e.g. - during a membership algorithm), but eventually will refuse more messages (e.g. - after 500) if you send it a lot.  So, while Spread accepts the messages from your load application for a while it eventually refuses any more, so he seems to lock up.
> 
> If you have your receive client join before the load client does it ever receive anything?  My hunch is that it would not.

I never received anything.

So thanks for pointing the IP address issue out!

Cheers,
Johannes
-------------- next part --------------
# Blank lines are permitted in this file.
# spread.conf sample file
# 
# questions to spread at spread.org
#

#MINIMAL REQUIRED FILE
#
# Spread should work fine on one machine with just the uncommented 
# lines below. The rest of the file documents all the options and
# more complex network setups.
#
# This configures one spread daemon running on port 4803 on localhost.

Spread_Segment  127.0.0.255:4803 {

	localhost		127.0.0.1
}

# Spread options
#---------------------------------------------------------------------------
#---------------------------------------------------------------------------
#Set what internal Spread events are logged to the screen or file 
# (see EventLogFile).
# Default setting is to enable PRINT and EXIT events only. 
#The PRINT and EXIT types should always be enabled. The names of others are:
#    	EXIT PRINT DEBUG DATA_LINK NETWORK PROTOCOL SESSION 
#	CONFIGURATION MEMBERSHIP FLOW_CONTROL STATUS EVENTS 
#	GROUPS MEMORY SKIPLIST ALL NONE	
#    ALL and NONE are special and represent either enabling every type 
#                                           or enabling none of them.
#    You can also use a "!" sign to negate a type, 
#        so { ALL !DATA_LINK } means log all events except data_link ones.

#DebugFlags = { PRINT EXIT }

# Set priority level of events to output to log file or screen
# The possible levels are: 
#	pDEBUG INFO WARNING ERROR CRITICAL FATAL
# Once selected all events tagged with that priority or higher will
# be output. FATAL events are always output and cause the daemon to 
# shut down. Some Events are tagged with a priority of PRINT which
# causes them to print out no matter what priority level is set. 
#
# The default level used if nothing is set is INFO.

#EventPriority =  INFO

#Set whether to log to a file as opposed to stdout/stderr and what 
# file to log to.
# Default is to log to stdout.
#
#If option is not set then logging is to stdout.
#If option is set then logging is to the filename specified.
# The filename can include a %h or %H escape that will be replaced at runtime
# by the hostname of the machine upon which the daemon is running.
# For example "EventLogFile = spreadlog_%h.log" with 2 machines 
# running Spread (machine1.mydomain.com and machine2.mydomain.com) will
# cause the daemons to log to "spreadlog_machine1.mydomain.com.log" and
# "spreadlog_machine2.mydomain.com.log" respectively.

#EventLogFile = testlog.out

#Set whether to add a timestamp in front of all logged events or not.
# Default is no timestamps. Default format is "[%a %d %b %Y %H:%M:%S]".
#If option is commented out then no timestamp is added.
#If option is enabled then a timestamp is added with the default format
#If option is enabled and set equal to a string, then that string is used
#   as the format string for the timestamp. The string must be a valid time
#   format string as used by the strftime() function.

#EventTimeStamp
# or
#EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"

#Set whether to add a precise (microsecond) resolution timestamp to all logged
# events or not. This option requires that EventTimeStamp is also enabled. 
# If the option is commented out then the microsecond timestamp is not added
# If the option is uncommented then a microsecond time will print in addition
#  to the H:M:S resolution timestamp provided by EventTimeStamp. 

#EventPreciseTimeStamp

# Set to initialize daemon sequence numbers to a 'large' number for testing
# this is purely a debugging capability and should never be enabled on
# production systems (note one side effect of enabling this is that 
# your system will experience an extra daemon membership every few messages
# so you REALLY do not want this turned on)
# If you want to change the initial value the sequence number is set to
# you need to edit the #define INITIAL_SEQUENCE_NEAR_WRAP at the top
# of configuration.h

#DebugInitialSequence

#Set whether to allow dangerous monitor commands 
# like "partition, flow_control, or kill"
# Default setting is FALSE.
#If option is set to false then only "safe" monitor commands are allowed 
#    (such as requesting a status update).
#If option is set to true then all monitor commands are enabled. 
#   THIS IS A SECURTIY RISK IF YOUR NETWORK IS NOT PROTECTED!

#DangerousMonitor = false

#Set handling of SO_REUSEADDR socket option for the daemon's TCP
# listener.  This is useful for facilitating quick daemon restarts (OSes
# often hold onto the interface/port combination for a short period of time
# after daemon shut down).
#
# AUTO - Active when bound to specific interfaces (default).
# ON   - Always active, regardless of interface.
#        SECURITY RISK FOR ANY OS WHICH ALLOW DOUBLE BINDS BY DIFFERENT USERS
# OFF  - Always off.

#SocketPortReuse = AUTO

#Set what the maximum per-session queue should be for messages before disconnecting
# a session. Spread will buffer upto that number of messages that are destined to the 
# session, but that can not be delivered currently because the session is not reading fast enough. 
# The compiled in default is usually 1000 if you havn't changed it in the spread_params.h file. 

#MaxSessionMessages = 5000

#Sets the runtime directory used when the Spread daemon is run as root
# as the directory to chroot to.  Defaults to the value of the
# compile-time preprocessor define SP_RUNTIME_DIR, which is generally
# "/var/run/spread".

#RuntimeDir = /var/run/spread

#Sets the unix user that the Spread daemon runs as (when launched as
# the "root" user).  Not effective on a Windows system.  Defaults to
# the user and group "spread".

#DaemonUser = spread
#DaemonGroup = spread

#Set the list of authentication methods that the daemon will allow
# and those which are required in all cases.
# All of the methods listed in "RequiredAuthMethods" will be checked,
# irregardless of what methods the client chooses.
# Of the methods listed is "AllowedAuthMethods" the client is
# permitted to choose one or more, and all the ones the client chooses
# will also be checked.
#
# To support older clients, if NULL is enabled, then older clients can
# connect without any authentication. Any methods which do not require
# any interaction with the client (such as IP) can also be enabled
# for older clients. If you enable methods that require interaction,
# then essentially all older clients will be locked out.
#
#The current choices are:
#	NULL for default, allow anyone authentication
#	IP for IP based checks using the spread.access_ip file

#RequiredAuthMethods = "   "
#AllowedAuthMethods = "NULL"

#Set the current access control policy.
# This is only needed if you want to establish a customized policy.
# The default policy is to allow any actions by authenticated clients.
#AccessControlPolicy = "PERMIT"

# network description line.
# Spread_Segment <multicast address for subnet> <port> {
# port is optional, if not specified the default 4803 port is used.

#Spread_Segment  127.0.0.255:4803 {

# either a name or IP address.  If both are given, than the name is taken 
# as-is, and the IP address is used for that name.

#	localhost		127.0.0.1
#}
# repeat for next sub-network

#Spread_Segment x.2.2.255 {

#	other1			128.2.2.10
#				128.2.2.11
#	other3.my.com
#}
# Spread will feel free to use broadcast messages within a sub-network.
# if you do not want this to happen, you should specify your machines on
# different logical sub-networks.

# IP-Multicast addresses can also be used as the multicast address for
# the logical sub-network as in this example. If IP-multicast is supported
# by the operating system, then the messages will only be received
# by those machines who are in the group and not by all others in the same
# sub-network as happens with broadcast addresses

#Spread_Segment 225.0.1.1:3333 {
#	mcast1			1.2.3.4
#	mcast2			1.2.3.6
#}

# Multi-homed host setup
#
# If you run Spread on hosts with multiple interfaces you may want to 
# control which interfaces Spread uses for client connections and for
# the daemon-to-daemon (and monitor control) messages. This can be done
# by adding an extra stanza to each configured machine. 
#
#Sample:
#
#Spread_Segment 225.0.1.1 {
# 	multihomed1		1.2.3.4 {
#		D 192.168.0.4
#		C 1.2.3.4 }
#	multihomed2		1.2.3.5 {
#		D 192.168.0.5
#		C 1.2.3.5
#		C 127.0.0.1 }
#	multihomed3		1.2.3.6 {
#		192.168.0.6
#		1.2.3.6 }
#}
# This configuration sets up three multihomed machines into a Spread segment.
# The first host has a 'main' IP address of 1.2.3.4 and listens for client
# connections only on that interface. All daemon-to-daemon UDP multicasts and
# the tokens and any monitor messages must use the 192.168.0.4 interface.
# The second host multihomed2 has a similar setup, except it also listens for
# client connections on the localhost interface as well as the 1.2.3.5 interface.
# If you make any use of the extra interface stanza ( a { } block ) then you must
# explicitly configure ALL interfaces you want as Spread removes all defaults when
# you use the explicit notation.
# The third multihomed3 host uses a shorthand form of omitting the D or C option and
# just listening for all types of traffic and events on both the 192.168.0 and 1.2.3 
# networks. If no letter is listed before the interface address then ALL types of 
# events are handled on that interface.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://lists.spread.org/pipermail/spread-users/attachments/20130507/c7723f22/attachment.bin