[Spread-users] Need DL_send patch !

gulekim at samsung.co.kr gulekim at samsung.co.kr
Wed Nov 20 21:34:21 EST 2002


Hi, everybody.

I have run into a serious problem using Spread with the following setup:

- membership timeouts:
  token_timeout 500ms, hurry_timeout 200ms, lookup_timeout 3sec
- configuration: 13 daemons per machine (and only 3 machines are running)
- only 1 daemon is actually running!

  As expected, the running daemon holds the token after a while,
  generates a hurry token every 200ms,
  and calls lookup_new_members every 3sec.
  lookup_new_members does a Net_ucast to the remaining 12 daemons
  and then goes on to Send_join and Form_or_fail ...

  Now the problem: Net_ucast -> DL_send -> sendto is called,
  and sendto fails for 10 daemons because the machine is down
  (EHOSTDOWN error), so DL_send retries 10 times with a 10ms delay each.

  Now calculate it: 10msec * 10 daemons * 10 tries == 1000msec == 1sec

  The daemon is blocked for about 1sec, which is longer than the 500ms
  token_timeout, and then TOKEN_LOSS occurs!
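
  To make the arithmetic above concrete, here is a small sketch in C.
  The function name and parameters are only for illustration (they are
  not from the Spread source); the numbers are the ones from this setup.

```c
/* Illustrative only: worst-case time DL_send spends retrying, assuming
 * every retry to every unreachable daemon waits the full delay. */
static long worst_case_stall_ms(long unreachable_daemons,
                                long tries_per_daemon,
                                long delay_ms)
{
    return unreachable_daemons * tries_per_daemon * delay_ms;
}

/* Returns 1 if the stall is longer than the token timeout, else 0. */
static int stall_exceeds_timeout(long stall_ms, long token_timeout_ms)
{
    return stall_ms > token_timeout_ms;
}
```

  With 10 unreachable daemons, 10 tries and a 10ms delay,
  worst_case_stall_ms(10, 10, 10) gives 1000, and
  stall_exceeds_timeout(1000, 500) is true for token_timeout 500ms.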


I think this problem can be fixed very easily, like this:
---------------------------------------------------------------------------
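ret = sendto(chan, pseudo_scat, total_len, 0, (struct sockaddr *)&soc_addr, sizeof(soc_addr) );
if (ret < 0) {
  if (sock_errno == EWOULDBLOCK) {
    /* transient error: delay for a short while, then retry */
    .........
  } else {
    /* any other error (e.g. EHOSTDOWN): stop retrying */
    num_try = 10;
  }
}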
---------------------------------------------------------------------------
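
To show the idea in a form that can be compiled on its own, here is a
rough sketch of such a retry loop. This is not the real DL_send code:
send_fn, send_with_retry and the two fake senders are hypothetical names
made up for this example, plain errno is used where Spread uses
sock_errno, and the 10ms delay is done with select() as an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical stand-in for the sendto() call: returns bytes sent,
 * or -1 with errno set. */
typedef long (*send_fn)(void *ctx);

/* Retry only while the error is transient (EWOULDBLOCK/EAGAIN).
 * Any other error, such as EHOSTDOWN, ends the loop at once.
 * *attempts receives the number of send attempts actually made. */
static long send_with_retry(send_fn try_send, void *ctx,
                            int max_tries, int *attempts)
{
    long ret = -1;
    int num_try = 0;

    while (num_try < max_tries) {
        errno = 0;
        ret = try_send(ctx);
        num_try++;
        if (ret >= 0)
            break;                      /* sent successfully */
        if (errno != EWOULDBLOCK && errno != EAGAIN)
            break;                      /* retrying will not help */

        /* Transient condition: wait about 10ms before the next try. */
        struct timeval delay;
        delay.tv_sec = 0;
        delay.tv_usec = 10000;
        select(0, NULL, NULL, NULL, &delay);
    }
    if (attempts != NULL)
        *attempts = num_try;
    return ret;
}

/* Fake sender: behaves like sendto() to a machine that is down. */
static long fake_host_down(void *ctx)
{
    (void)ctx;
    errno = EHOSTDOWN;
    return -1;
}

/* Fake sender: behaves like sendto() on a full socket buffer. */
static long fake_would_block(void *ctx)
{
    (void)ctx;
    errno = EWOULDBLOCK;
    return -1;
}
```

With fake_host_down the loop stops after a single attempt, so a down
machine costs no delay at all; with fake_would_block it still uses all
max_tries attempts, as before.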
Am I right?

bye.




