[Spread-users] Problems with more than 2 hosts in a segment

Hans Juergen von Lengerke lengerkeh at sixt.de
Tue Dec 12 10:54:45 EST 2000


I am experiencing problems in a spread segment which looks like this:

  hermes:/usr/local/bin # cat spread.conf
  Spread_Segment  172.21.1.255:3333 {
          hermes  172.21.1.10
          baggins 172.21.1.25
          gamgee  172.21.1.26
          took    172.21.1.27
  }

The config is exactly the same on all of those hosts. All hosts run SuSE
Linux with kernel 2.2.16-SMP apart from hermes which runs a 2.2.14
kernel.
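
As a quick sanity check on the configuration above, here is a minimal
sketch in Python (not part of Spread) that parses the segment block and
lists any host whose address falls outside the network implied by the
broadcast address. The /24 prefix length and the helper names
parse_segment and hosts_outside_broadcast are assumptions made for
illustration; the actual netmask on these hosts is not stated here.

```python
import ipaddress
import re

# The segment definition from spread.conf, copied verbatim.
CONF = """\
Spread_Segment  172.21.1.255:3333 {
        hermes  172.21.1.10
        baggins 172.21.1.25
        gamgee  172.21.1.26
        took    172.21.1.27
}
"""


def parse_segment(text):
    """Return (broadcast, port, {host: ip}) for the first Spread_Segment block."""
    match = re.search(r"Spread_Segment\s+(\S+?):(\d+)\s*\{([^}]*)\}", text)
    if match is None:
        raise ValueError("no Spread_Segment block found")
    hosts = {}
    for line in match.group(3).splitlines():
        parts = line.split()
        if len(parts) == 2:  # "<hostname> <ip>"
            hosts[parts[0]] = ipaddress.ip_address(parts[1])
    return ipaddress.ip_address(match.group(1)), int(match.group(2)), hosts


def hosts_outside_broadcast(text, prefix_len=24):
    """List hosts not in the network implied by the broadcast address.

    prefix_len is an assumption (hypothetical /24), not taken from the config.
    """
    broadcast, _port, hosts = parse_segment(text)
    net = ipaddress.ip_network(f"{broadcast}/{prefix_len}", strict=False)
    return sorted(name for name, ip in hosts.items() if ip not in net)


if __name__ == "__main__":
    print(hosts_outside_broadcast(CONF))
```

Under the /24 assumption this lists no hosts for the configuration
shown, so an address mismatch of that kind would not explain the
behaviour described below.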

When spread runs on any two of the machines, everything works as
expected. As soon as a third machine joins the segment, things go wrong.
New 'user' sessions can no longer connect. For example:

  gamgee:~ # user -s 3333
  Spread library version is 3.14
  recv_nointr_timeout: Timed out
  SP_error: (-8) Connection closed by spread

  Bye.

Also, existing 'user' sessions no longer receive messages sent
from themselves or other group members. For example, I have spread and
'user' sessions running on baggins and gamgee. Everything works fine:

[on gamgee]
  User> j test

  User> s test 
  enter message: foo

  User> 
  ============================
  received SAFE message from #user#gamgee, of type 1, (endian 0) to 1 groups
  (4 bytes): foo

Now, I start spread on hermes. For some reason, it takes a fair amount
of time (probably ~30 seconds) until all spread daemons report that
hermes has joined. Once this has been reported, we continue with the
'user' session:

[still on gamgee]
  User> s test
  enter message: bar

  User> 

Nothing happens. Nobody receives the message although nothing has
changed apart from hermes joining the spread ring.

Can anyone help?

Thx, Hans