[Spread-users] Token too long for packet
Melissa Jenkins
melissa-spread at temeletry.co.uk
Fri Dec 3 09:22:46 EST 2010
I've been having some troubles with a specific host trying to join my spread network. Here is the log for my latest attempt... The Spread 'cluster' has been up and running for several weeks no problems.
[Fri 03 Dec 2010 14:09:44] G_handle_join: #dbupbogons#manch1 joins group radius
[Fri 03 Dec 2010 14:09:44] G_handle_join in GOP
[Fri 03 Dec 2010 14:09:44] G_handle_join: #dbupbogons#manch1 joins group resync
[Fri 03 Dec 2010 14:09:44] G_handle_join in GOP
[Fri 03 Dec 2010 14:09:44] G_handle_join: #dbupbogons#manch1 joins group alive
[Fri 03 Dec 2010 14:09:44] G_handle_join in GOP
[Fri 03 Dec 2010 14:09:46] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:09:56] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:10:06] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:10:16] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:10:26] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:10:36] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:10:37] Send_join: State is 4
[Fri 03 Dec 2010 14:10:38] Send_join: State is 4
[Fri 03 Dec 2010 14:10:39] Send_join: State is 4
[Fri 03 Dec 2010 14:10:40] Send_join: State is 4
[Fri 03 Dec 2010 14:10:41] Send_join: State is 4
[Fri 03 Dec 2010 14:10:46] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:10:56] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:11:06] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:11:16] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:11:26] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:11:36] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:11:41] G_handle_leave: #dbupbogons#manch1 leaves group radius
[Fri 03 Dec 2010 14:11:41] G_handle_leave in GOP
[Fri 03 Dec 2010 14:11:41] G_handle_leave: #dbupbogons#manch1 leaves group alive
[Fri 03 Dec 2010 14:11:41] G_handle_leave in GOP
[Fri 03 Dec 2010 14:11:41] G_handle_kill: #dbupbogons#manch1 is killed
[Fri 03 Dec 2010 14:11:41] G_handle_kill in GOP
It seems to join ok - but in the process totally disrupts the whole group :( Shortly after it leaves the following showed up in the spread log of the daemon it had connected to:
[Fri 03 Dec 2010 14:14:05] Handle_alive in OP
[Fri 03 Dec 2010 14:14:05] Memb_token_loss: I lost my token, state is 1
[Fri 03 Dec 2010 14:14:05] Scast_alive: State is 2
[Fri 03 Dec 2010 14:14:05] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:05] Handle_alive in SEG
[Fri 03 Dec 2010 14:14:06] Scast_alive: State is 2
[Fri 03 Dec 2010 14:14:06] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:06] Handle_alive in SEG
[Fri 03 Dec 2010 14:14:06] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:06] Handle_alive in SEG
[Fri 03 Dec 2010 14:14:06] Stat_handle_message: sent status to Monitor at manch2
[Fri 03 Dec 2010 14:14:07] Send_join: State is 4
[Fri 03 Dec 2010 14:14:07] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:07] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:07] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:07] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:08] Send_join: State is 4
[Fri 03 Dec 2010 14:14:08] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:08] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:08] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:08] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:09] Send_join: State is 4
[Fri 03 Dec 2010 14:14:09] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:09] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:09] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:09] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:09] Memb_handle_message: handling join message from 192.168.124.1, State is 4
[Fri 03 Dec 2010 14:14:10] Send_join: State is 4
[Fri 03 Dec 2010 14:14:10] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:10] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:10] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:10] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:10] Memb_handle_message: handling join message from 192.168.124.1, State is 4
[Fri 03 Dec 2010 14:14:11] Send_join: State is 4
[Fri 03 Dec 2010 14:14:11] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:11] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:11] Memb_handle_message: handling alive message
[Fri 03 Dec 2010 14:14:11] Handle_alive in GATHER
[Fri 03 Dec 2010 14:14:11] Memb_handle_message: handling join message from 192.168.124.1, State is 4
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1092099 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1092100 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1092101 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1093635 My_aru is 1092098, Highest_seq is 1093702
<snip - loads of these removed>
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1093663 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1093694 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1093695 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1093696 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] INSERT HOLE 1 IS 1093697 My_aru is 1092098, Highest_seq is 1093702
[Fri 03 Dec 2010 14:14:12] =========== Form Token ==========
[Fri 03 Dec 2010 14:14:12] FORM 1 Token, sent by 192.168.124.203. Seq: 1097035
[Fri 03 Dec 2010 14:14:12] Configuration hash: 1656243324 (local hash 1656243324)
[Fri 03 Dec 2010 14:14:12] ProcID: 192.168.124.203 ARU: 1291385651, ARU LastID: 0.0.0.0
[Fri 03 Dec 2010 14:14:12] FlowControl: -1215 RTR Len: 1572
[Fri 03 Dec 2010 14:14:12] Form Token members list -- Active (1) Pending (2)
0: 192.168.124.203
Pending Members:
1: 192.168.124.213 2: 192.168.124.129
[Fri 03 Dec 2010 14:14:12] Form Token reps list -- Count (2) index (1)
0: 192.168.124.203 (T 1 SegInd 0) 1: 192.168.124.1 (T 1 SegInd 1)
[Fri 03 Dec 2010 14:14:12] Form Token RING list -- Count (1)
[Fri 03 Dec 2010 14:14:12] Ring 0: MembID 192.168.124.203 - 1291260114 TransTime 0
[Fri 03 Dec 2010 14:14:12] ARU: 1092098 HighSeq: 1093702 NumHoles: 204
[Fri 03 Dec 2010 14:14:12] NumCommit: 1 NumTrans: 1
[Fri 03 Dec 2010 14:14:12] Message Holes: 1092099 1092100 1092101 1092102 1092103 1092104 1092105
1092133 1092134 1092135 1092136 1092137 1092138 1092139 1092140 1092141 1092142
1092143 1092144 1092145 1092146 1092147 1092148 1092186 1092187 1092188 1092229
1092230 1092235 1092236 1092237 1092238 1092239 1092240 1092241 1092242 1092243
1092284 1092285 1092286 1092291 1092292 1092293 1092294 1092295 1092335 1092336
1092337 1092372 1092373 1092374 1092375 1092376 1092381 1092382 1092383 1092384
1092385 1092386 1092387 1092388 1092389 1092390 1092391 1092392 1092393 1092394
1092395 1092396 1092397 1092398 1092455 1092456 1092457 1092458 1092466 1092467
1092468 1092469 1092470 1092471 1092472 1092473 1092474 1092475 1092476 1092477
1092478 1092519 1092520 1092544 1092545 1092546 1092547 1092548 1092564 1092565
1092566 1092567 1092568 1092584 1092585 1092586 1092587 1092588 1092604 1092605
1092606 1092607 1092608 1092624 1092625 1092626 1092929 1092942 1092943 1092944
1092945 1092946 1092963 1092964 1092965 1092966 1092967 1092983 1092984 1092985
1092986 1092987 1092988 1092989 1093080 1093081 1093082 1093083 1093084 1093085
1093101 1093102 1093103 1093104 1093105 1093121 1093122 1093123 1093124 1093125
1093141 1093142 1093143 1093144 1093145 1093161 1093162 1093163 1093164 1093165
1093166 1093167 1093168 1093184 1093185 1093186 1093187 1093188 1093204 1093205
1093206 1093207 1093208 1093224 1093225 1093226 1093227 1093228 1093244 1093245
1093246 1093292 1093307 1093322 1093367 1093412 1093413 1093429 1093430 1093446
1093461 1093476 1093521 1093522 1093538 1093539 1093555 1093585 1093600 1093601
1093632 1093635 1093663 1093694 1093695 1093696 1093697
[Fri 03 Dec 2010 14:14:12] Trans List: 0: 192.168.124.203
[Fri 03 Dec 2010 14:14:12] Commit List:
[Fri 03 Dec 2010 14:14:12] ====================================================
[Fri 03 Dec 2010 14:14:12] Net_ucast_token: Token too long for packet!
Exit caused by Alarm(EXIT)
At which point the Spread daemon was no longer running :( Other daemons in the group/segment seemed to start having messages out of order and everything went a bit weird.
I had originally tried to run a spread daemon on the server but that totally breaks all other daemons, with a massively increasing "s retrans" count. The whole network grinds to a halt as that number goes up (and we hit > 90Mbit/sec!) I don't believe there is a networking problem, but the "s retrans" issue would suggest there is. I've confirmed and it's not a firewall issue, another host on the same LAN (next switch port along) is absolutely line. All are running FreeBSD and all have been patched for the size_t 64 bit problem.
Anybody have any idea where to look as I really need to get this machine in the loop!
Thanks,
Mel
More information about the Spread-users
mailing list