Multicast does not work to multiple clients - only to single client



  • I can’t multicast an image to multiple hosts in a group. I don’t think this is a FOG issue so much as a UDPCast issue, but I’m running out of ideas.

    Things I CAN do:
    [LIST=1]
    [*]I can create a group with a single host and multicast to that (pointless, but it shows it can work).
    [*]If I multicast to a group and prevent all the hosts except one from checking in (say, by unplugging them), then it will happily multicast to that single client after it times out (UDPSENDER_MAXWAIT = 90).
    [/LIST]
    Below are the log files, where:

    [COLOR=#ff0000]myFOGserver[/COLOR] is running Scientific Linux 6.2 (Carbon), FOG 0.32, FOG Kernel-3.2.4.core and Syslinux 4.05
    [COLOR=#ff0000]TEST-HOST-1[/COLOR] is a Dell Optiplex 990
    [COLOR=#ff0000]TEST-HOST-2[/COLOR] is a Dell Optiplex 780

    I also replaced the IPs in the log files with the host names above.

    Here’s the udp-sender command running on the FOG server (wrapped to make it easier to read):
    [CODE]sh -c gunzip -c "/images/TestLab/rec.img.000" |
    /usr/local/sbin/udp-sender --min-receivers 2 --portbase 63140 --interface eth0 --max-wait 90 --half-duplex --ttl 32 --nokbd;
    gunzip -c "/images/TestLab/sys.img.000" |
    /usr/local/sbin/udp-sender --min-receivers 2 --portbase 63140 --interface eth0 --max-wait 90 --half-duplex --ttl 32 --nokbd;
    /usr/local/sbin/udp-sender --min-receivers 2 --portbase 63140 --interface eth0 --max-wait 90 --half-duplex --ttl 32 --nokbd[/CODE]
    So, the server has fired up udp-sender on port 63140.

    Here are the contents of /tftpboot/pxelinux.cfg/[COLOR=#ff0000]MAC_ADDRESS[/COLOR] for [COLOR=#ff0000]TEST-HOST-1[/COLOR] (identical to [COLOR=#ff0000]TEST-HOST-2[/COLOR]):
    [CODE]append initrd=fog/images/init.gz root=/dev/ram0 rw ramdisk_size=127000 ip=dhcp dns= type=down img=<TestLab>
    mc=yes port=63140 storageip=<myFOGserver> storage=<myFOGserver>:/images/ mac=<foobar>
    ftp=myFOGserver web=<myFOGserver>/fog/ osid=5 imgType=n shutdown= loglevel=4 consoleblank=0
    reboot=bios fdrive= chkdsk=0 hostname=<TEST-HOST-1>[/CODE]
    The clients are given the correct information to launch udp-receiver.

    Here’s /var/log/messages:
    [CODE]rpc.mountd: authenticated mount request from <TEST-HOST-1>:816 for /images (/images)
    udpcast: New connection from <TEST-HOST-1> (#0)
    udpcast: first connection: min wait[0] secs - max wait[90] - min clients[2]
    rpc.mountd: authenticated mount request from <TEST-HOST-2>:822 for /images (/images)
    udpcast: New connection from <TEST-HOST-2> (#1)
    udpcast: min receivers[2] reached: starting
    udpcast: Starting transfer: file[] pipe[] port[63142] if[eth0] participants[2]
    …unrelated entries…
    udpcast: dropped client #0 because of timeout
    udpcast: Disconnecting #0 (<TEST-HOST-1>)
    udpcast: dropped client #1 because of timeout
    udpcast: Disconnecting #1 (<TEST-HOST-2>)[/CODE]
    Both clients connect, broadcast apparently starts and then times out.

    Here’s /opt/fog/log/multicast.log.udpcast.1:
    [CODE]Udp-sender 2007-12-28
    Using mcast address 232.blah.blah.blah
    UDP sender for (stdin) at <myFOGserver> on eth0
    Broadcasting control to 224.0.0.1
    New connection from <TEST-HOST-1> (#0) 00000009
    New connection from <TEST-HOST-2> (#1) 00000009
    Starting transfer: 00000009
    Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
    …repeats above line several times…
    Dropping client #0 because of timeout
    Disconnecting #0 (<TEST-HOST-1>)
    Dropping client #1 because of timeout
    Disconnecting #1 (<TEST-HOST-2>)

    gzip: stdout: Broken pipe
    Udp-sender 2007-12-28
    Using mcast address 232.blah.blah.blah
    UDP sender for (stdin) at <myFOGserver> on eth0
    Broadcasting control to 224.0.0.1[/CODE]
    Same thing here as in /var/log/messages.
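To see at a glance how far a session got, the sender log can be tallied. This is only a rough sketch, assuming the log lines look like the ones quoted above; `summarize_udpcast_log` is a hypothetical helper name, not part of FOG or UDPCast.

```shell
#!/bin/sh
# Hypothetical helper: count the event types seen in a udp-sender log.
# Assumes the line prefixes match the log excerpts quoted in this thread.
summarize_udpcast_log() {
    log="$1"
    # grep -c prints 0 when nothing matches, so each variable is always set.
    connections=$(grep -c '^ *New connection from' "$log")
    timeouts=$(grep -c '^ *Timeout notAnswered=' "$log")
    drops=$(grep -c '^ *Dropping client' "$log")
    echo "connections=$connections timeouts=$timeouts drops=$drops"
}
```

Calling it as `summarize_udpcast_log /opt/fog/log/multicast.log.udpcast.1` would print the three counts on one line. Clients that connect and are then dropped after only timeouts match the pattern described in this post.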

    Meanwhile both clients show “Please wait…” on their screens.
    Does anyone have any ideas on how to fix this or to continue troubleshooting?



  • Bump



  • I have just started receiving the above issue. My multicast reaches about 50% at 4.01GiB/Min, then hangs completely. I see a message in the multicast log saying:

    Bad command 0300
    Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=106672

    then

    Dropping client #0 because of timeout
    Disconnecting #0 (192.168.3.85)
    Dropping client #1 because of timeout
    Disconnecting #1 (192.168.3.170)

    gzip: stdout: Broken pipe
    Udp-sender 2007-12-28
    UDP sender for (stdin) at 192.168.4.2 on eth0
    Broadcasting control to 224.0.0.1

    We recently upgraded all our switches to HP ProCurve 5406zl, and I think this error has something to do with IGMP snooping, not FOG…

    Multicast log shows:
    --mcast-data-address 239.168.4.2

    Has anyone else had any experience with this problem?


  • Moderator

    I think I may have to try your 239 hack. I received a “gzip: stdout: Broken pipe” error today, and multicast has been really slow. The multicast log also has lots of timeout lines, even though it still multicasts at 550 mb.

    Cheers for the how-to.



  • I’d almost forgotten about that - we’re still using the 239 hack for now.
    The main change I wanted to make was to dynamically find the last three octets of the FOG server’s IP address instead of statically assigning it. I can imagine that, if we ever had to change the FOG server’s IP, this static assignment might cause some problems until we remembered that we had done that.


  • Moderator

    [quote=“afmrick, post: 4316, member: 417”]We got it to work manually across our WAN by adding the following flag to the [B]udp-sender[/B] command (above):
    [code]--mcast-data-address 239.blah.blah.blah[/code]
    The default [B]232[/B].blah.blah.blah is reserved/special apparently. Here’s a link that sort of breaks that down:
    [URL]http://www.cisco.com/en/US/tech/tk828/technologies_white_paper09186a00802d4643.shtml[/URL]

    To fix it in FOG, I added that command-line parameter to every [B]udp-sender[/B] command in [B]/opt/fog/service/common/lib/MulticastTask.class.php[/B] and restarted all the FOG services. Everything now works just the way it should have. However, adding this parameter to every udp-sender command in the file is just a “make-it-go” hack that I’ll have to clean up next week. This should really be an option to set within FOG. …Still, Yahoo!!!

    Note: [B]blah.blah.blah[/B] is the last 3 octets of the FOG server’s IP address.[/quote]

    Have you managed to clean this up/sort out the issue?



  • [quote=“Corey Cochran, post: 4932, member: 1582”]Hi, I am having the same issue you have, with only being able to multicast to one client. I am not well versed in PHP. Can you send me a few more details about where in the file you inserted the address, and maybe a screenshot?[/quote]

    Er, how to make it clearer? Let’s say your FOG server’s IP address is 192.168.1.1. Udp-sender will use [B]232[/B].168.1.1 as the data address by default. Something like [B]239[/B] might work better on your network. You can test it from the command line with something like this on your FOG server: [CODE]udp-sender --file /opt/fog/log/multicast.log --ttl 32 --mcast-data-address 239.168.1.1 --min-receivers 2[/CODE] and the following on each of your two clients: [CODE]udp-receiver --mcast-rdv-address 192.168.1.1[/CODE]

    If that works, you could add “--mcast-data-address 239.168.1.1” to all four udp-sender commands in [B]/opt/fog/service/common/lib/MulticastTask.class.php[/B] with whichever text editor makes you happy. The four modified lines will look something like this; the only change is the added “--mcast-data-address 239.168.1.1”: [CODE]$cmd .= "gunzip -c \"" . $strSys . "\" | " . UPDSENDERPATH . " --min-receivers " . $this->getClientCount() . " --portbase " . $this->getPortBase() . " " . $interface . " --mcast-data-address 239.168.1.1 $wait --half-duplex --ttl 32 --nokbd;";[/CODE]
    Eventually I want to add some code to define the 239.168.1.1 dynamically (in case the server IP ever changes) instead of statically assigning it, but it’s pretty low on my list right now.
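A minimal sketch of deriving that address instead of hard-coding it, written as a shell function rather than the PHP that MulticastTask.class.php would actually need. The name `mcast_data_address` is hypothetical, and looking up the server's own IP is left to the caller as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: replace the first octet of an IPv4 address with 239,
# keeping the last three octets, e.g. 192.168.1.1 becomes 239.168.1.1.
mcast_data_address() {
    ip="$1"
    # ${ip#*.} strips everything up to and including the first dot.
    echo "239.${ip#*.}"
}
```

For example, `mcast_data_address 192.168.1.1` prints `239.168.1.1`, which is the value passed to `--mcast-data-address` in the line above.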



  • [quote=“afmrick, post: 4316, member: 417”]We got it to work manually across our WAN by adding the following flag to the [B]udp-sender[/B] command (above):
    [code]--mcast-data-address 239.blah.blah.blah[/code]
    The default [B]232[/B].blah.blah.blah is reserved/special apparently. Here’s a link that sort of breaks that down:
    [URL]http://www.cisco.com/en/US/tech/tk828/technologies_white_paper09186a00802d4643.shtml[/URL]

    To fix it in FOG, I added that command-line parameter to every [B]udp-sender[/B] command in [B]/opt/fog/service/common/lib/MulticastTask.class.php[/B] and restarted all the FOG services. Everything now works just the way it should have. However, adding this parameter to every udp-sender command in the file is just a “make-it-go” hack that I’ll have to clean up next week. This should really be an option to set within FOG. …Still, Yahoo!!!

    Note: [B]blah.blah.blah[/B] is the last 3 octets of the FOG server’s IP address.[/quote]
    I just started to read your thread and was going to suggest you look at Cisco devices with multicast issues.



  • Hi, I am having the same issue you have, with only being able to multicast to one client. I am not well versed in PHP. Can you send me a few more details about where in the file you inserted the address, and maybe a screenshot?



  • We got it to work manually across our WAN by adding the following flag to the [B]udp-sender[/B] command (above):
    [code]--mcast-data-address 239.blah.blah.blah[/code]
    The default [B]232[/B].blah.blah.blah is reserved/special apparently. Here’s a link that sort of breaks that down:
    [URL]http://www.cisco.com/en/US/tech/tk828/technologies_white_paper09186a00802d4643.shtml[/URL]

    To fix it in FOG, I added that command-line parameter to every [B]udp-sender[/B] command in [B]/opt/fog/service/common/lib/MulticastTask.class.php[/B] and restarted all the FOG services. Everything now works just the way it should have. However, adding this parameter to every udp-sender command in the file is just a “make-it-go” hack that I’ll have to clean up next week. This should really be an option to set within FOG. …Still, Yahoo!!!

    Note: [B]blah.blah.blah[/B] is the last 3 octets of the FOG server’s IP address.
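For context on "reserved/special": in the IANA IPv4 multicast assignments, 232.0.0.0/8 is set aside for source-specific multicast and 239.0.0.0/8 is the administratively scoped block. Whether that difference is what stalled the transfer on this particular network is not confirmed here. A small sketch for checking which block an address falls in; `mcast_range` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: name the IPv4 multicast block an address falls in,
# judged by its first octet only.
mcast_range() {
    first="${1%%.*}"   # keep everything before the first dot
    case "$first" in
        232) echo "source-specific multicast (232.0.0.0/8)" ;;
        239) echo "administratively scoped (239.0.0.0/8)" ;;
        22[4-9]|23[0-9]) echo "other multicast (224.0.0.0/4)" ;;
        *) echo "not multicast" ;;
    esac
}
```

With the addresses from this thread, `mcast_range 232.168.1.1` reports the source-specific block and `mcast_range 239.168.1.1` reports the administratively scoped one.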



  • Just more information:

    I’m trying this again manually from the command line:
    From the server I am running:[CODE]udp-sender --file /opt/fog/log/multicast.log --ttl 32 --min-receivers 2[/CODE]
    and from two different clients I’m running:[CODE]udp-receiver --mcast-rdv-address <myFOGserver>[/CODE]
    …and here’s what I get on the server:[CODE]# udp-sender --file /opt/fog/log/multicast.log --ttl 32 --min-receivers 2
    Udp-sender 2007-12-28
    Using mcast address 232.blah.blah.blah
    UDP sender for /opt/fog/log/multicast.log at <myFOGserver> on eth0
    Broadcasting control to 224.0.0.1
    New connection from <TEST-HOST-1> (#0) 00000009
    Ready. Press any key to start sending data.
    New connection from <TEST-HOST-2> (#1) 00000009
    Ready. Press any key to start sending data.
    Starting transfer: 00000009
    Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
    Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
    Bad command 0200
    Bad command 0200
    Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=103166
    Disconnecting #0 (<TEST-HOST-1>)
    Disconnecting #1 (<TEST-HOST-2>)[/CODE]

    The “Bad command 0200” messages appear when I “Press any key to start receiving data!” on the clients. If I add the “--nokbd” flag to all three, then I don’t get the “Bad command 0200” error but still get the “Timeout notAnswered” errors.

    To me, this suggests that the “OK Go” message is not getting communicated correctly. It still works fine to only 1 client.



  • I’m keeping an eye on this one. I have the same problem.

