Hello.
I’m having a problem with multicasting using FOG 0.32 on CentOS 5.6 (Final) to Windows 7 machines (Dell OptiPlex GX620).
It’s a rather strange problem. I’m trying to multicast to 145 machines, and the first small partition multicasts just fine, but when the second partition is about to start, the clients just sit and wait on the “please wait” screen.
[B]The log in /opt/fog/log/ shows the following:[/B]
[SIZE=2]([COLOR=#ff0000]This is the multicast.log.udpcast.28 log, not the one named just “multicast.log”[/COLOR]) [/SIZE]
Udp-sender 2007-12-28
Using mcast address 236.21.238.31
UDP sender for (stdin) at 172.21.238.31 on eth0
Broadcasting control to 224.0.0.1
New connection from 172.21.238.123 (#0) 00000009
New connection from 172.21.238.156 (#1) 00000009
New connection from 172.21.238.152 (#2) 00000009
New connection from 172.21.238.151 (#3) 00000009
New connection from 172.21.238.136 (#4) 00000009 etc etc…
[B]Then it shows:[/B]
Starting transfer: 00000009
bytes= 97 552 re-xmits=0000001 ( 1.4%) slice=0066 73 709 551 615 - 132
bytes= 193 648 re-xmits=0000001 ( 0.7%) slice=0066 73 709 551 615 - 131
bytes= 289 744 re-xmits=0000001 ( 0.5%) slice=0066 73 709 551 615 - 131
bytes= 385 840 re-xmits=0000001 ( 0.3%) slice=0066 73 709 551 615 - 131
bytes= 481 936 re-xmits=0000001 ( 0.3%) slice=0066 73 709 551 615 - 131
bytes= 578 032 re-xmits=0000001 ( 0.2%) slice=0066 73 709 551 615 - 131
bytes= 674 128 re-xmits=0000001 ( 0.2%) slice=0066 73 709 551 615 - 130 etc etc…
[B]And then the interesting bits happen:[/B]
bytes= 25 370 800 re-xmits=0000001 ( 0.0%) slice=0066 73 709 551 615 - 130
bytes= 25 375 168 re-xmits=0000001 ( 0.0%) slice=0066 73 709 551 615 - 133
bytes= 25 375 612 re-xmits=0000001 ( 0.0%) slice=0066 73 709 551 615 - 132
Timeout notAnswered=[2,4,7,8,10,13,14,15,18,19,20,21,63,106,110,111,114,115,118,119,120,121,122,124,127,128,129,131,133,134,135,136,138,139,140,142,143,145] nrAns=108 nrRead=108 nrPart=146 avg=3661
Disconnecting #24 (172.21.239.10)
Disconnecting #89 (172.21.238.240)
Disconnecting #78 (172.21.238.143)
Disconnecting #28 (172.21.238.100)
Disconnecting #23 (172.21.238.227)
Disconnecting #31 (172.21.238.132)
Disconnecting #25 (172.21.238.111)
Disconnecting #22 (172.21.238.226) etc etc…
[B]And then follows:[/B]
Disconnecting #126 (172.21.238.174)
Disconnecting #130 (172.21.238.119)
Disconnecting #131 (172.21.238.194)
Bad command 0300
Bad command 0300
Bad command 0300
Bad command 0300
Bad command 0300
Bad command 0300 etc etc…
[B]After that I’m getting:[/B]
Dropping client #2 because of timeout
Disconnecting #2 (172.21.238.152)
Dropping client #4 because of timeout
Disconnecting #4 (172.21.238.136)
Dropping client #7 because of timeout
Disconnecting #7 (172.21.238.149)
Dropping client #8 because of timeout
Disconnecting #8 (172.21.238.138)
Dropping client #10 because of timeout
Disconnecting #10 (172.21.238.205)
Dropping client #13 because of timeout
Disconnecting #13 (172.21.238.104) etc etc…
[B]Almost at the end it says:[/B]
Dropping client #142 because of timeout
Disconnecting #142 (172.21.238.120)
Dropping client #145 because of timeout
Disconnecting #145 (172.21.238.173)
Transfer complete.^G
Disconnecting #0 (172.21.238.123)
Disconnecting #1 (172.21.238.156)
Disconnecting #3 (172.21.238.151)
Disconnecting #5 (172.21.238.140)
Disconnecting #6 (172.21.238.129)
Disconnecting #9 (172.21.238.130)
Disconnecting #12 (172.21.238.207)
Disconnecting #16 (172.21.238.211)
Disconnecting #27 (172.21.239.8)
[B]And then finally:[/B]
Udp-sender 2007-12-28
Using mcast address 236.21.238.31
UDP sender for (stdin) at 172.21.238.31 on eth0
Broadcasting control to 224.0.0.1
Any ideas what could be wrong? It complains about timeouts, although all the machines started up within 20 minutes, from the first to the last. After reading [URL='http://fogproject.org/forum/threads/multicast-timeout.529/']this[/URL] post I also checked the UDPSENDER_MAXWAIT setting in Config.php, and it is set to 0, so I assume that means it will wait forever and never time out.
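In case it helps anyone narrow this down, here is a rough Python sketch for turning the client numbers in the "Timeout notAnswered=[...]" line back into IP addresses, using the "New connection from ... (#N)" lines earlier in the same log. The function name and the regular expressions are my own and are based only on the log format pasted above, so treat them as assumptions rather than anything FOG or udpcast provides.

```python
import re

# Patterns inferred from the udp-sender log excerpts above (assumptions,
# not an official log format specification).
CONNECT_RE = re.compile(r"New connection from (\S+)\s+\(#(\d+)\)")
TIMEOUT_RE = re.compile(r"Timeout notAnswered=\[([\d,]*)\]")


def unanswered_clients(log_text):
    """Return (client number, IP) pairs for the last notAnswered list in the log."""
    ip_by_slot = {}   # client number -> IP, built from "New connection" lines
    unanswered = []   # client numbers from the most recent "Timeout" line
    for line in log_text.splitlines():
        match = CONNECT_RE.search(line)
        if match:
            ip_by_slot[int(match.group(2))] = match.group(1)
            continue
        match = TIMEOUT_RE.search(line)
        if match and match.group(1):
            unanswered = [int(n) for n in match.group(1).split(",")]
    # Clients whose connection line is missing from the log are reported as "unknown".
    return [(slot, ip_by_slot.get(slot, "unknown")) for slot in unanswered]
```

To use it, read the log file into a string (for example `open("/opt/fog/log/multicast.log.udpcast.28").read()`) and pass that to `unanswered_clients`. If the log contains several Timeout lines, only the last one is reported. The resulting IP list can then be compared against switch ports or subnets to see whether the silent clients have something in common.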