[Spread-users] Circular token over spread: 2 seconds lap time?

Gautam H. Thaker gthaker at atl.lmco.com
Tue Jul 27 11:19:26 EDT 2004


The "2 second" value is a result of the default Spread timing
parameters, which are:

Default Spread parameters:

Token_timeout.sec  =   5; Token_timeout.usec  = 0;
Hurry_timeout.sec  =   2; Hurry_timeout.usec  = 0;
Alive_timeout.sec  =   1; Alive_timeout.usec  = 0;
Join_timeout.sec   =   1; Join_timeout.usec   = 0;
Rep_timeout.sec    =   2; Rep_timeout.usec    = 500000;
Seg_timeout.sec    =   2; Seg_timeout.usec    = 0;
Gather_timeout.sec =   5; Gather_timeout.usec = 0;
Form_timeout.sec   =   5; Form_timeout.usec   = 0;
Lookup_timeout.sec =  60; Lookup_timeout.usec = 0;

In my tests I have noted that these values result in Spread
communications suffering a maximum latency of 2 seconds. When I change
these parameters to the values below, the maximum latencies I observe are
much lower.

"Very Fast" Spread parameters:

Token_timeout.sec  =   0; Token_timeout.usec  = 100000;
Hurry_timeout.sec  =   0; Hurry_timeout.usec  =  40000;
Alive_timeout.sec  =   0; Alive_timeout.usec  =  20000;
Join_timeout.sec   =   0; Join_timeout.usec   =  20000;
Rep_timeout.sec    =   0; Rep_timeout.usec    =  60000;
Seg_timeout.sec    =   0; Seg_timeout.usec    =  40000;
Gather_timeout.sec =   0; Gather_timeout.usec = 100000;
Form_timeout.sec   =   0; Form_timeout.usec   = 100000;
Lookup_timeout.sec =   1; Lookup_timeout.usec = 200000;
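
To make the two tables easier to compare, here is a minimal sketch in C that
converts each sec/usec pair to milliseconds and prints how much lower the
"very fast" value is. The names timeout_t, ROWS, to_msec and
print_timeout_table are made up for this illustration and are not part of
Spread; timeout_t is only a stand-in for a seconds-plus-microseconds pair.
The numbers are copied from the two tables above. The sketch does not say
which timer produces the 2 second latency.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a seconds + microseconds timeout value. */
typedef struct { long sec; long usec; } timeout_t;

typedef struct { const char *name; timeout_t def; timeout_t fast; } timeout_row;

/* Values copied from the "default" and "very fast" tables above. */
static const timeout_row ROWS[] = {
    { "Token",  {  5, 0 },      { 0, 100000 } },
    { "Hurry",  {  2, 0 },      { 0,  40000 } },
    { "Alive",  {  1, 0 },      { 0,  20000 } },
    { "Join",   {  1, 0 },      { 0,  20000 } },
    { "Rep",    {  2, 500000 }, { 0,  60000 } },
    { "Seg",    {  2, 0 },      { 0,  40000 } },
    { "Gather", {  5, 0 },      { 0, 100000 } },
    { "Form",   {  5, 0 },      { 0, 100000 } },
    { "Lookup", { 60, 0 },      { 1, 200000 } },
};
enum { NUM_ROWS = sizeof(ROWS) / sizeof(ROWS[0]) };

/* Convert a sec/usec pair to whole milliseconds. */
long to_msec(timeout_t t)
{
    assert(t.sec >= 0 && t.usec >= 0 && t.usec < 1000000);
    return t.sec * 1000L + t.usec / 1000L;
}

/* Print each timeout in msec for both sets, plus the reduction factor. */
void print_timeout_table(void)
{
    int i;
    for (i = 0; i < NUM_ROWS; i++) {
        long d = to_msec(ROWS[i].def);
        long f = to_msec(ROWS[i].fast);
        printf("%-6s default %6ld ms  very fast %5ld ms  (%.1fx lower)\n",
               ROWS[i].name, d, f, (double)d / (double)f);
    }
}
```

Calling print_timeout_table() from a main() prints one line per timeout.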


The latency ranges observed for a variety of message sizes with these
two parameter sets are shown in the attached graphic. All our test
results are also available online at:

http://www.atl.external.lmco.com/projects/QoS/compare/cgi-bin/left2_part1.cgi?filter=emulab.*%28spread%7Ctcp%29

I was wondering if anyone has pushed the Spread parameters even lower
than the "very fast" values. Certainly on a Linux 2.6 kernel or on Solaris,
both of which have 1000 HZ clocks, the lowest parameter value should
be settable at about 2 msec (rather than the 20 msec in "very fast" above).
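
The arithmetic behind the 2 msec figure can be sketched as follows. This
assumes that a timeout can only expire on a kernel tick boundary, so it is
effectively rounded up to a multiple of 1/HZ seconds; that is a
simplification, and real timer behaviour depends on the OS. The 100 HZ case
is included only for comparison and is my assumption, not something stated
above. All function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Length of one kernel timer tick in microseconds for a given HZ value. */
long tick_usec(long hz)
{
    assert(hz > 0);
    return 1000000L / hz;
}

/* Number of whole ticks needed to cover a timeout (rounded up). */
long ticks_needed(long timeout_usec, long hz)
{
    long tick = tick_usec(hz);
    return (timeout_usec + tick - 1) / tick;
}

/* Effective timeout if timers only expire on tick boundaries. */
long round_up_to_tick(long timeout_usec, long hz)
{
    return ticks_needed(timeout_usec, hz) * tick_usec(hz);
}

/* Print the effective value of a requested timeout for one HZ setting. */
void print_effective(long timeout_usec, long hz)
{
    printf("HZ=%ld: requested %ld usec -> %ld tick(s), effective %ld usec\n",
           hz, timeout_usec, ticks_needed(timeout_usec, hz),
           round_up_to_tick(timeout_usec, hz));
}
```

Under this model a 2 msec timeout at 1000 HZ spans two ticks, the same tick
count as 20 msec at 100 HZ, while a 2 msec request at 100 HZ would
effectively become 10 msec.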

Gautam

Andreu Moreno i Vendrell wrote:
> Hello,
> 
> We have 2 seconds lap time in a circular token over spread. Do you know what's 
> wrong?
> 
> Test description:
> 
> a) 3 computers in an isolated LAN: Machine 1, Machine 2 and Machine 3.
> b) Spread 3.17.2 version installed in every machine.
> c) RedHat 8.0 Linux installed in every machine.
> d) Machine 1: runs a program that joins group "1" and on reception of a 
> message it sends a message to group "2".
> e) Machine 2: runs a program that joins group "2" and on reception of a 
> message it sends a message to group "3".
> f) Machine 3: runs a program that joins group "3" and on reception of a 
> message it sends a message to group "3". This program is the last to be 
> executed and also sends a message to group "1" to start the token to 
> circulate.
> 
> Results:
> 
> The lap time is about 2 seconds?????
> 
> Thanks,
> 
> Andreu
> 
> *********************************
> The source code of the program is:
> 
> /**
> * Class tgcs. Group Communication System for dFSM
> */
> class tSpread
> {
> public:
> 
> 	/**
> 	* Configure Group Communication System
> 	*
> 	* @return True if ok
> 	*/
> 	bool init(int next, int event);
> 
> 	/**
> 	* Join group 
> 	*
> 	*
> 	* @param group_id Group id.
> 	* @return true if OK
> 	*/
> 	bool join(int group_id);
> 
> 	/**
> 	* Leave group 
> 	*
> 	*
> 	* @param group_id Group id.
> 	* @return true if OK
> 	*/
> 	bool leave(int group_id);
> 
> 	/**
> 	* Send a message to other machines using Group Communication System
> 	*
> 	* @param group_id Group id.
> 	* @param data Extra data
> 	* @param len Length data
> 	* @return Bytes sent
> 	*/
> 	static int send(int group_id, const char *data=0,int len=0);
> 
> 	/**
> 	* Receive message callback
> 	*
> 	* @param fd fd.
> 	* @param code code
> 	* @param data data
> 	*/
> 	static void	receive(int fd, int code, void *data);
> 	
> 	/**
> 	* Get connection identification
> 	*
> 	* @param id identification
> 	*/
> 	void	getid(char *id);
> 
> 	/**
> 	* Main loop
> 	* Handle events
> 	*
> 	* */
> 	void mainloop(void);
> 	
> private:
> 	static mailbox Mbox; /**< Spread Id connection   */
> 	char	User[80]; /**< Private name connection  */
> 	char    Spread_name[80]; /**< Spread name daemon   */
> 	char    Private_group[MAX_GROUP_NAME]; /**< Private Group Name   */
> 	static int 	next;
> 	static char message[64];
> };
> 
> /**
> * Configure Group Communication System
> *
> * 
> * @return True if ok
> */
> bool tSpread::init(int _next, int event)
> {
> 	int	ret;
> 	sp_time timeout;
> 
> 	sprintf(message, "Hello, (%d)", event);
> 	
> 	next= _next;
> 	
>  	timeout.sec = CONNECT_TIMEOUT;
>         timeout.usec = 0;
> 
> 	sprintf( User, "" );
> 	sprintf( Spread_name, "4803@localhost");
> 
> 	ret = SP_connect_timeout( Spread_name, User, 0, 1, &Mbox, Private_group, timeout );
> 	if( ret != ACCEPT_SESSION )
> 	{
> 		SP_error( ret );
> 		SP_disconnect( Mbox );
> 		return false;
> 	}
> 
> 	E_init();
> 	E_attach_fd( Mbox, READ_FD, &tSpread::receive, 0, NULL, HIGH_PRIORITY );
> 	
> 	return true;
> }
> 
> /**
> * Join group 
> *
> *
> * @param group_id Group id.
> * @return true if OK
> */
> bool  tSpread::join(int group_id)
> {
> 	int	ret;
> 	char	group[80];
> 
> 	sprintf(group, "%i", group_id);
> 
> 	ret = SP_join( Mbox, group );
> 
> 	if( ret < 0 )
> 	{
> 		SP_error( ret );
> 		return false;
> 	}else
> 	{
> 		return true;
> 	}
> }
> 
> 
> /**
> * Leave group 
> *
> *
> * @param group_id Group id.
> * @return true if OK
> */
> bool  tSpread::leave(int group_id)
> {
> 	int	ret;
> 	char	group[80];
> 
> 	sprintf(group, "%i", group_id);
> 
> 	ret = SP_leave( Mbox, group );
> 
> 	if( ret < 0 )
> 	{
> 		SP_error( ret );
> 		return false;
> 	}else
> 	{
> 		return true;
> 	}
> }
> /**
> * Send a message to other machines using Group Communication System
> *
> * @param group_id Message id.
> * @param data Extra data
> * @param len Length data
> * @return Bytes sent
> */
> int tSpread::send(int group_id, const char *data, int len)
> {
> 	int	ret;
> 	char	group[80];
> 
> 	sprintf(group, "%i", group_id);
> 
> 	// SAFE_MESS: total order and safe message delivery are needed to
> 	// achieve global consistency
> 	ret= SP_multicast( Mbox, SAFE_MESS, group, 1, len, data );
> 	if( ret < 0 )
> 	{
> 		SP_error( ret );
> 		return -1;
> 	}else
> 	{
> 		return ret;
> 	}	
> }
> 
> 
> /**
> * Receive message callback
> *
> */
> 
> 
> void	tSpread::receive(int fd, int code, void *data)
> {
> 
> static	char		mess[102400];
> 	char		sender[MAX_GROUP_NAME];
> 	char		target_groups[100][MAX_GROUP_NAME];
> 	int		num_groups;
> 	int		service_type;
> 	int16		mess_type;
> 	int		endian_mismatch;
> 	int		ret;
> 
> 	service_type = 0;
> 
> 	ret = SP_receive( tSpread::Mbox, &service_type, sender, 100, &num_groups,
> 		target_groups, &mess_type, &endian_mismatch, sizeof(mess), mess );
> 
> 	printf("\n========Message received=======: service_type: %x\n", service_type);
> 	printf("\n========Message received=======: target_groups: %s\n", target_groups[0]);
> 
> 	if( ret < 0 )
> 	{
>                 if ( (ret == GROUPS_TOO_SHORT) || (ret == BUFFER_TOO_SHORT) )
>                 {
>                         printf("\n========Buffers or Groups too Short=======\n");
>                 }
>         }
>         if (ret < 0 )
>         {
> 		SP_error( ret );
> 	}
> 	if( Is_regular_mess( service_type ) )
> 	{
> 		if (ret > 0)
> 		{
> 			struct timeval tv;
> 			gettimeofday(&tv, NULL);
> 			printf("%ld:%ld\n", tv.tv_sec, tv.tv_usec);
> 			printf("message from %s, of type %d, (endian %d) to %d groups \n(%d bytes): %s\n", sender, mess_type, endian_mismatch, num_groups, ret, mess );
> 			
> 			send(next, message, strlen(message));
> 		}
> 	}
> }
> 
> /**
> * Get connection identification
> *
> * @param id identification
> **/
> void     tSpread::getid(char *id)
> {
> 	/* Copy into the caller's buffer; assigning to the local pointer had no effect. */
> 	strcpy( id, Private_group );
> }
> 
> 	
> 
> 
> 
> /**
> * Main loop
> * Handle events
> * 
> */
> void tSpread::mainloop(void)
> {
> 	E_handle_events();
> }
> 
> void print_help()
> {
> 	printf("\n");
> 	
> 	printf("****************\n");
> 	printf("Token\n");
> 	printf("****************\n");
> 	printf("Help:\n");
> 	printf(" -h : Print this help\n");
> 	printf(" -e : Event to trigger transition\n");
> 	printf(" -n : Event to send in transition action\n");
> 	printf(" -f : Event to send in first transition\n");
> }
> 
> mailbox tSpread::Mbox;
> char tSpread::message[64];
> int 	tSpread::next;
> 
> 
> int main (int argc, char* argv[])
> {
> 	int event=-1, next=-1, first=-1;
> 	int i=0;
> 	tSpread sp;
> 	char message[64];
> 
> 	// Parse command line parameters
> 
> 	for(i=1; i < argc; i++)
> 	{
> 		if (argv[i][0] == '-')
> 		{
> 			if (argv[i][1] == 'h')
> 			{
> 				print_help();
> 				return 0;
> 			}
> 			if (argv[i][1] == 'e')
> 			{
> 				sscanf(argv[i+1], "%d", &event);
> 				i++;
> 				continue;
> 			}
> 
> 			if (argv[i][1] == 'n')
> 			{
> 				sscanf(argv[i+1], "%d", &next);
> 				i++;
> 				continue;
> 			}
> 
> 			if (argv[i][1] == 'f')
> 			{
> 				sscanf(argv[i+1], "%d", &first);
> 				i++;
> 				continue;
> 			}		
> 			printf("Bad parameter: %c\n", argv[i][1]);
> 			return -1;
> 		}
> 	}
> 	
> 	sprintf(message, "Hello, (%d)", event);
> 	
> 	sp.init(next, event);
> 	
> 	sp.join(event);
> 
> 	if (first !=-1)	 sp.send(first, message, strlen(message));
> 	
> 	sp.mainloop();
> 	
> 	sp.leave(event);
> 	
> }
> 
> 
> 
> *********************************
> The spread.conf:
> 
> # Blank lines are permitted in this file.
> # spread.conf sample file
> # 
> # questions to spread at spread.org
> #
> 
> #MINIMAL REQUIRED FILE
> #
> # Spread should work fine on one machine with just the uncommented 
> # lines below. The rest of the file documents all the options and
> # more complex network setups.
> #
> # This configures one spread daemon running on port 4803 on localhost.
> 
> #Spread_Segment  127.0.0.255:4803 {
> #
> #	localhost		127.0.0.1
> #}
> 
> 
> 
> 
> # Spread options
> #---------------------------------------------------------------------------
> #---------------------------------------------------------------------------
> #Set what internal Spread events are logged to the screen or file 
> # (see EventLogFile).
> # Default setting is to enable PRINT and EXIT events only. 
> #The PRINT and EXIT types should always be enabled. The names of others are:
> #    	EXIT PRINT DEBUG DATA_LINK NETWORK PROTOCOL SESSION 
> #	CONFIGURATION MEMBERSHIP FLOW_CONTROL STATUS EVENTS 
> #	GROUPS MEMORY SKIPLIST ALL NONE	
> #    ALL and NONE are special and represent either enabling every type 
> #                                           or enabling none of them.
> #    You can also use a "!" sign to negate a type, 
> #        so { ALL !DATA_LINK } means log all events except data_link ones.
> 
> DebugFlags = { PRINT EXIT }
> 
> #DebugFlags = { PRINT EXIT } Originally it was like this
> 
> #Set whether to log to a file as opposed to stdout/stderr and what 
> # file to log to.
> # Default is to log to stdout.
> #
> #If option is not set then logging is to stdout.
> #If option is set then logging is to the filename specified.
> # The filename can include a %h or %H escape that will be replaced at runtime
> # by the hostname of the machine upon which the daemon is running.
> # For example "EventLogFile = spreadlog_%h.log" with 2 machines 
> # running Spread (machine1.mydomain.com and machine2.mydomain.com) will
> # cause the daemons to log to "spreadlog_machine1.mydomain.com.log" and
> # "spreadlog_machine2.mydomain.com.log" respectively.
> 
> #EventLogFile = testlog.out		Originally it was like this
> 
> #EventLogFile = testlog_%h.log
> 
> #Set whether to add a timestamp in front of all logged events or not.
> # Default is no timestamps. Default format is "[%a %d %b %Y %H:%M:%S]".
> #If option is commented out then no timestamp is added.
> #If option is enabled then a timestamp is added with the default format
> #If option is enabled and set equal to a string, then that string is used
> #   as the format string for the timestamp. The string must be a valid time
> #   format string as used by the strftime() function.
> 
> #EventTimeStamp
> # or
> EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"
> 
> #Set whether to allow dangerous monitor commands 
> # like "partition, flow_control, or kill"
> # Default setting is FALSE.
> #If option is set to false then only "safe" monitor commands are allowed 
> #    (such as requesting a status update).
> #If option is set to true then all monitor commands are enabled. 
> #   THIS IS A SECURITY RISK IF YOUR NETWORK IS NOT PROTECTED!
> 
> #DangerousMonitor = false
> 
> #Set handling of SO_REUSEADDR socket option for the daemon's TCP
> # listener.  This is useful for facilitating quick daemon restarts (OSes
> # often hold onto the interface/port combination for a short period of time
> # after daemon shut down).
> #
> # AUTO - Active when bound to specific interfaces (default).
> # ON   - Always active, regardless of interface.
> #        SECURITY RISK FOR ANY OS WHICH ALLOW DOUBLE BINDS BY DIFFERENT USERS
> # OFF  - Always off.
> 
> #SocketPortReuse = AUTO
> 
> #Sets the runtime directory used when the Spread daemon is run as root
> # as the directory to chroot to.  Defaults to the value of the
> # compile-time preprocessor define SP_RUNTIME_DIR, which is generally
> # "/var/run/spread".
> 
> #RuntimeDir = /var/run/spread
> 
> #Sets the unix user that the Spread daemon runs as (when launched as
> # the "root" user).  Not effective on a Windows system.  Defaults to
> # the user and group "spread".
> 
> #DaemonUser = spread
> #DaemonGroup = spread
> 
> 
> #Set the list of authentication methods that the daemon will allow
> # and those which are required in all cases.
> # All of the methods listed in "RequiredAuthMethods" will be checked,
> # irregardless of what methods the client chooses.
> # Of the methods listed is "AllowedAuthMethods" the client is
> # permitted to choose one or more, and all the ones the client chooses
> # will also be checked.
> #
> # To support older clients, if NULL is enabled, then older clients can
> # connect without any authentication. Any methods which do not require
> # any interaction with the client (such as IP) can also be enabled
> # for older clients. If you enable methods that require interaction,
> # then essentially all older clients will be locked out.
> #
> #The current choices are:
> #	NULL for default, allow anyone authentication
> #	IP for IP based checks using the spread.access_ip file
> 
> #RequiredAuthMethods = "   "
> #AllowedAuthMethods = "NULL"
> 
> #Set the current access control policy.
> # This is only needed if you want to establish a customized policy.
> # The default policy is to allow any actions by authenticated clients.
> #AccessControlPolicy = "PERMIT"
> 
> 
> # network description line.
> # Spread_Segment <multicast address for subnet> <port> {
> # port is optional, if not specified the default 4803 port is used.
> 
> #Spread_Segment  127.0.0.255:4803 {
> 
> # either a name or IP address.  If both are given, then the name is taken
> # as-is, and the IP address is used for that name.
> 
> #	localhost		127.0.0.1
> #}
> # repeat for next sub-network
> 
> #Spread_Segment x.2.2.255 {
> 
> #	other1			128.2.2.10
> #				128.2.2.11
> #	other3.my.com
> #}
> # Spread will feel free to use broadcast messages within a sub-network.
> # if you do not want this to happen, you should specify your machines on
> # different logical sub-networks.
> 
> # IP-Multicast addresses can also be used as the multicast address for
> # the logical sub-network as in this example. If IP-multicast is supported
> # by the operating system, then the messages will only be received
> # by those machines who are in the group and not by all others in the same
> # sub-network as happens with broadcast addresses
> 
> #Spread_Segment 225.0.1.1:3333 {
> #	mcast1			1.2.3.4
> #	mcast2			1.2.3.6
> #}
> 
> # Multi-homed host setup
> #
> # If you run Spread on hosts with multiple interfaces you may want to 
> # control which interfaces Spread uses for client connections and for
> # the daemon-to-daemon (and monitor control) messages. This can be done
> # by adding an extra stanza to each configured machine. 
> #
> #Sample:
> #
> #Spread_Segment 225.0.1.1 {
> # 	multihomed1		1.2.3.4 {
> #		D 192.168.0.4
> #		C 1.2.3.4 }
> #	multihomed2		1.2.3.5 {
> #		D 192.168.0.5
> #		C 1.2.3.5
> #		C 127.0.0.1 }
> #	multihomed3		1.2.3.6 {
> #		192.168.0.6
> #		1.2.3.6 }
> #}
> # This configuration sets up three multihomed machines into a Spread segment.
> # The first host has a 'main' IP address of 1.2.3.4 and listens for client
> # connections only on that interface. All daemon-to-daemon UDP multicasts and
> # the tokens and any monitor messages must use the 192.168.0.4 interface.
> # The second host multihomed2 has a similar setup, except it also listens for
> # client connections on the localhost interface as well as the 1.2.3.5
> # interface.
> # If you make any use of the extra interface stanza ( a { } block ) then you
> # must explicitly configure ALL interfaces you want as Spread removes all
> # defaults when you use the explicit notation.
> # The third multihomed3 host uses a shorthand form of omitting the D or C
> # option and just listening for all types of traffic and events on both the
> # 192.168.0 and 1.2.3 networks. If no letter is listed before the interface
> # address then ALL types of events are handled on that interface.
> 
> Spread_Segment 172.16.63.255:4803 {
> 
> 		localhost	127.0.0.1
> 
> 		linin01		172.16.62.153
> 		linin02		172.16.60.102
> 		linin03		172.16.60.103
> }
> 
> 
> 

-- 
Gautam H. Thaker
Distributed Processing Lab; Lockheed Martin Adv. Tech. Labs
3 Executive Campus; Cherry Hill, NJ 08002
856-792-9754, fax 856-792-9925  email: gthaker at atl.lmco.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gplot_3477.png
Type: image/png
Size: 6619 bytes
Desc: not available
Url : http://lists.spread.org/pipermail/spread-users/attachments/20040727/6d1df32c/attachment.png 

