[Spread-users] SP_error: (-6) Connection rejected, name not unique

Jonathan Stanton jonathan at spread.org
Thu Oct 23 23:53:20 EDT 2008


Three notes here. 

1) In the message showing your spread configuration activated:

> Successfully configured Segment 0 [10.0.7.255:3333] with 2 procs:
>                        wwwt001: 10.0.7.40
> Successfully configured Segment 1 [10.5.5.255:3333] with 1 procs:
>                         noc1: 10.5.5.36
>

It lists "2 procs" in the first segment (Segment 0), but the config you show lists only one. Are you sure this is the 
exact setup, with only one machine listed in that first segment?
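
A quick way to spot this kind of mismatch is to compare the declared "procs" count with the number of hosts actually 
printed under each segment. The following is only a rough sketch: the function name and the regular expressions are my 
own, written against the startup lines quoted above, and they assume that output format does not vary.

```python
import re

# Patterns written against the startup output quoted in this thread (an assumption).
SEGMENT_RE = re.compile(
    r"Successfully configured Segment (\d+) \[([^\]]+)\] with (\d+) procs:"
)
HOST_RE = re.compile(r"^\s*(\S+):\s+(\d+\.\d+\.\d+\.\d+)\s*$")


def find_count_mismatches(log_text):
    """Return (segment index, declared procs, hosts listed) for each mismatch."""
    segments = []
    for line in log_text.splitlines():
        seg = SEGMENT_RE.search(line)
        if seg:
            segments.append(
                {"index": int(seg.group(1)), "declared": int(seg.group(3)), "hosts": []}
            )
            continue
        host = HOST_RE.match(line)
        if host and segments:
            # A host line belongs to the most recent segment header.
            segments[-1]["hosts"].append(host.group(1))
    return [
        (s["index"], s["declared"], len(s["hosts"]))
        for s in segments
        if s["declared"] != len(s["hosts"])
    ]


SAMPLE = """\
Conf_load_conf_file: using file: spread.conf
Successfully configured Segment 0 [10.0.7.255:3333] with 2 procs:
                       wwwt001: 10.0.7.40
Successfully configured Segment 1 [10.5.5.255:3333] with 1 procs:
                        noc1: 10.5.5.36
"""

if __name__ == "__main__":
    for index, declared, listed in find_count_mismatches(SAMPLE):
        print(f"Segment {index}: declared {declared} procs, listed {listed}")
```

Run on the output you pasted, this flags Segment 0 only, which is the discrepancy I am asking about.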

2) The spuser behaviour you see is exactly what I would expect if the two machines were not able to communicate successfully, 
so that the two Spread daemons did not form a common membership (they viewed themselves as two partitioned daemons). I know you 
checked for firewalls (which is my first thought), but could you rerun this configuration with the DebugFlags set to

{ PRINT EXIT PROTOCOL MEMBERSHIP DATA_LINK GROUPS SESSION }

and then send me the output from the log? That will show what the daemons found when they tried to communicate. 
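
For reference, here is a sketch of how that change might look in the spread.conf you posted. It assumes every other 
line stays exactly as in your message; only the DebugFlags line differs.

```
Spread_Segment 10.0.7.255:3333 {
        wwwt001    10.0.7.40
}
Spread_Segment 10.5.5.255:3333 {
        noc1    10.5.5.36
}
DebugFlags = { PRINT EXIT PROTOCOL MEMBERSHIP DATA_LINK GROUPS SESSION }
EventLogFile = spreadlog.out
EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"
DangerousMonitor = false
```

With EventLogFile set as above, the extra output should end up in spreadlog.out on each node.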

3) For the restart problem:

> I then restarted spread on Node 2 and try to rerun spuser and get
> Kill -9 pid of spread
> ./spread -n wwwt001 &
> ./spuser -s 3333
> Spread library version is 4.0.0
> SP_error: (-6) Connection rejected, name not unique
> 
> Verified that only the one process was running, and that it was bound to a
> name in spread.conf
> spread    4705  4620  0 14:30 pts/0    00:00:00 ./spread -n wwwt001
> 

This would normally occur if you killed spuser (not the daemon) and then reran it quickly. In that case 
the daemon still has a record, for a brief time, that a client by the name "#user#wwwt001" exists: it takes a moment to 
synchronize the leave of the client, and until that completes the client name is still marked as in use. This period should 
only be a second or so unless there is a strange problem with messages being passed, which may be occurring here because of 
the other problems you note. 
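
If a client needs to tolerate that short window, one option is to retry the connect a few times before giving up. 
Below is a minimal sketch of that idea, not anything taken from the Spread library itself: `connect` is a hypothetical 
callable standing in for whatever connect call your client makes, and the only value borrowed from this thread is the 
-6 code shown in the SP_error output.

```python
import time

# Error code shown in the SP_error output above ("name not unique").
REJECT_NOT_UNIQUE = -6


def connect_with_retry(connect, attempts=5, delay=0.5, sleep=time.sleep):
    """Call connect() until it returns something other than REJECT_NOT_UNIQUE.

    connect is a hypothetical zero-argument callable that returns a
    Spread-style integer result code. Any result other than
    REJECT_NOT_UNIQUE, success or failure, is returned immediately.
    """
    result = None
    for attempt in range(attempts):
        result = connect()
        if result != REJECT_NOT_UNIQUE:
            return result
        if attempt < attempts - 1:
            # Give the daemon time to finish processing the old client's leave.
            sleep(delay)
            delay *= 2
    return result
```

If the rejection persists well beyond a second or two, retrying will not help and the cause is more likely the 
communication problem described in note 2.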

The log output I asked for above will help diagnose this.

Cheers,

Jonathan

On Thu, Oct 23, 2008 at 03:58:31PM -0500, Valentino, Paul wrote:
> I've just performed a fresh install of spread using Version 4.00.00 Built
> 29/November/2006 on RHEL 4 2.6.9-67.0.20.EL on 2 nodes
> 
> I created a spread.conf as follows on both nodes:
> 
> Spread_Segment 10.0.7.255:3333 {
>         wwwt001    10.0.7.40
> }
> Spread_Segment 10.5.5.255:3333 {
>         noc1    10.5.5.36
> }
> DebugFlags = { PRINT EXIT }
> EventLogFile = spreadlog.out
> EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"
> DangerousMonitor = false
> 
> 
> Then I started spread on both nodes:
> Node 1
> ./spread -n wwwt001 &
> Node 2
> ./spread -n noc1 &
> 
> I get valid output on both nodes when starting:
> Conf_load_conf_file: using file: spread.conf
> Successfully configured Segment 0 [10.0.7.255:3333] with 2 procs:
>                        wwwt001: 10.0.7.40
> Successfully configured Segment 1 [10.5.5.255:3333] with 1 procs:
>                         noc1: 10.5.5.36
> 
> Then I run spuser:
> Node 1
> ./spuser -s 3333
> j www
> Received REGULAR membership for group www with 1 members, where I am member
> 0:
>         #user#wwwt001
> grp id is 167973156 1224790252 1
> Due to the JOIN of #user#wwwt001
> 
> Node 2
> ./spuser -s 3333
> j www
> Received REGULAR membership for group www with 1 members, where I am member
> 0:
>         #user#noc1
> grp id is 167904562 1224789343 1
> Due to the JOIN of #user#noc1
> 
> But when I send message with s www only the node the message I send from
> receives the message
> 
> I verified firewall and iptables and have no problem telnet to 3333 on
> either host
> 
> I then restarted spread on Node 2 and try to rerun spuser and get
> Kill -9 pid of spread
> ./spread -n wwwt001 &
> ./spuser -s 3333
> Spread library version is 4.0.0
> SP_error: (-6) Connection rejected, name not unique
> 
> Verified that only the one process was running, and that it was bound to a
> name in spread.conf
> spread    4705  4620  0 14:30 pts/0    00:00:00 ./spread -n wwwt001
> 
> Please advise 
> 
> Regards,
> 
> Paul





-- 
-------------------------------------------------------
Jonathan Stanton         jonathan at spread.org
Spread Group Messaging   www.spread.org
Spread Concepts LLC      www.spreadconcepts.com
-------------------------------------------------------



