UNSOLVED FOG Multicast Not working

  • I’ve been trying to get multicast working with FOG but I’m stuck. Unicast works fine, but multicast just stops at preparing. I went through the troubleshooting steps at https://wiki.fogproject.org/wiki/index.php?title=Troubleshooting_a_multicast#Troubleshooting, but when I try testing with 1 client, I don’t get the .fogsettings file listed (or anything at all). I’ve also tried specifying the interface on the server, and the address on the receiver, but still nothing. The multicast log file only shows this:

    10:04:18.973480 Using mcast address
    10:04:18.973604 UDP sender for /opt/fog/.fogsettings at on em2 
    10:04:18.973621 Broadcasting control to

    Server is on 1.5.0, CentOS7. Right now, there is only a simple switch between the server and the client, and the client does boot fine. Anybody know what I’m doing wrong here?

  • @george1421 Didn’t see any packets except for a single UDP packet every 2 seconds or so after the client checked in and loaded up into partclone. Turned out there’s some hardware issue with the server I had been using. Thanks for your help!

  • So I went ahead and installed fog and centos on a different (much older) machine and it worked fine. Looks like there’s some hardware issue with the original server I had been using, but that issue is not the NIC because I also tried a new one of those to no avail. Thanks for all your help @george1421!

  • Moderator

    @awellis If you run tcpdump on the fog server, you could use this capture filter: "ether multicast or host <ip_addr_of_target>"

    Set up a test multicast from the image definition with a single host (client=1), then PXE boot the target system and join the session.

    You should see the beacons from the fog server, and then the client will chat back using a unicast message. Once the stream starts you should see a bunch of multicasts from the fog server and then the client chatting back with checksums.
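As a rough sketch of the capture described above: the helper below only builds the filter string for one target. `mcast_filter` is a hypothetical name, `em2` is the interface that appears in the log earlier in this thread, and `192.0.2.10` is a documentation address standing in for the real target IP.

```shell
# Build the capture filter quoted in the post for one target address.
mcast_filter() {
    printf 'ether multicast or host %s\n' "$1"
}

# Assumed values: replace em2 and 192.0.2.10 with your own interface and target.
# Run on the fog server as root, for example:
#   tcpdump -n -i em2 "$(mcast_filter 192.0.2.10)"
mcast_filter 192.0.2.10
```

Start the capture before the client joins the session, so the server's beacons and the client's unicast replies both show up.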

  • Correct on all counts. It’s running on a single switch on the same subnet. I did hook up Wireshark and saw all the messages go back and forth with the initial test, but when running fog, the server just sends the one packet out and nothing else happens.

  • Moderator

    @awellis Just for clarity, you don’t have ANY firewalls or screening routers between the fog server and the target computer?

    During my testing of multiple subnet multicasting FOG uses 239. (plus the first 3 octets of the fog server IP address) as the multicast address. Such as the fog server IP is so the multicast address will be and the port will be a random high port.

    The fog server will send out messages as multicast, and the target will respond with short unicast messages with checksum data back to the fog server.

    For your testing the target system and fog server are on the same subnet all on ethernet switches? You don’t have any fiber or other links where the MTU will be less than 1400?

  • @george1421 Did another full reinstall with a different NIC just in case and still nothing. Any chance the type of image taken matters? Partclone vs partimage? compression?

  • Moderator

    @awellis Nothing special. In this setup I created a dedicated imaging network, and the fog server had a second network connection for internet access while installing. I didn’t even configure anything in fog. I copied over an image from my production server, created the image definition, and started the multicast right from the image definition. (Note: I did register a target VM first, then rebooted into FOG iPXE, and then set up the multicast in the fog UI.) It multicast imaged right away.

  • @george1421 No it’s bare metal ATM. I’m going to just do a reinstall and see if that fixes it, since at this point I don’t know what else could be going wrong. Did you do anything special with the install or just basically fog installer on the fresh install?

  • Moderator

    @awellis I have some sad news for you. I set up a clean 1.5.0 fog environment and multicasting works “as described on the tin”.

    Is your fog server in a virtual environment?

  • @sebastian-roth said in FOG Multicast Not working:

    It was scheduled for 3 clients and all 3 opened up partclone. That said, shouldn’t the timeout kick in after 10 minutes and go off anyway?

    I did check for the udp process and it is running after starting it. I’ve also tried running it immediately after a server restart and no udp processes running and no tasks scheduled.

    edit: FOG does start up 2 separate udp-senders with the task… is that normal?

    edit 2: looks like there’s 1 task for each partition

  • Moderator

    @awellis said in FOG Multicast Not working:

    … Command: /usr/local/sbin/udp-sender --interface em2 --min-receivers 3 --max-wait 600 …

    When you see this in the log file, was this scheduled for three clients? As you can see, the command wants at least/exactly those three clients to connect and would not start otherwise! Besides that, have you checked whether the process is really running (ps ax | grep udp)? Before your next try, make really sure there is no old multicast task still scheduled, and kill anything still running (killall -9 udp-sender) on your FOG server so that nothing is interfering.
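To make the point about the receiver count concrete, here is a minimal sketch that pulls an option's value out of a logged udp-sender command line. `opt_value` is a hypothetical helper, not part of FOG or udpcast, and the command string is only the fragment quoted above, not the full command.

```shell
# Print the value that follows a given option in a command-line string.
opt_value() {
    _opt=$1
    shift
    # Intentionally unquoted: split the logged command line into words.
    set -- $*
    while [ $# -gt 1 ]; do
        if [ "$1" = "$_opt" ]; then
            printf '%s\n' "$2"
            return 0
        fi
        shift
    done
    return 1
}

# Only the visible part of the logged command; the rest was elided in the post.
cmd='/usr/local/sbin/udp-sender --interface em2 --min-receivers 3 --max-wait 600'
echo "min receivers: $(opt_value --min-receivers "$cmd")"
echo "max wait:      $(opt_value --max-wait "$cmd")"
```

If the first value is higher than the number of clients that actually join, that would explain a session that never starts. The second value, assuming it is in seconds, would correspond to the 10-minute timeout mentioned in this thread.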

  • It actually does join the session, the session just never starts. I left it running overnight yesterday and all the computers got to the blue partclone screen and just sat there. I don’t recall if I did the same thing with just 1 multicast host, but I’ll give that a shot tomorrow morning when I get back to the office.

  • Moderator

    @awellis So just for clarity, if we discount all of the debugging you’ve done up to this point: if you shut down the CentOS firewall and then set up a multicast job with 1 client, the client just sits there and never joins the multicast?

    I have it on my task list to spin up a new fog server to retest multicasting across subnets again. My OS of choice is CentOS, so the system I’ll spin up will be CentOS 7.4. I’ll do everything in a virtual environment to make it easy on myself.

  • @george1421 Yes they are all on the same subnet

  • Moderator

    @awellis Are the fog server and target computers on the same subnet?

  • @george1421 I don’t think the firewall should be an issue for FOG (just udp-sender/udp-receiver), as I don’t see any traffic hitting any port but 59106 and the ones after that, but I did disable the firewall and am still seeing the same issue.

  • Moderator

    @awellis Can you drop the firewall altogether until you can get multicasting working? If you need the firewall per company policy, we can reverse engineer what ports are being used.
    systemctl stop firewalld

  • @george1421 Sure. On the CentOS 7 wiki installation page (https://wiki.fogproject.org/wiki/index.php?title=CentOS_7#Continue_pre-config) the instructions say “Open UDP port 49152 through 65532” and don’t mention port 9000 at all.
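A small sketch of the range check implied here, using only the numbers from this thread: the wiki's 49152 through 65532 range, port 59106 seen in the capture, and port 9000. `in_wiki_range` is a hypothetical helper name.

```shell
# Succeed if a port falls inside the UDP range quoted from the wiki page.
in_wiki_range() {
    [ "$1" -ge 49152 ] && [ "$1" -le 65532 ]
}

# Ports taken from this thread: 59106 was observed, 9000 is not in the wiki.
for port in 59106 9000; do
    if in_wiki_range "$port"; then
        echo "$port: covered by the wiki range"
    else
        echo "$port: not covered by the wiki range"
    fi
done
```

If the firewall has to stay on, opening that range with firewalld would presumably look like `firewall-cmd --permanent --add-port=49152-65532/udp` followed by `firewall-cmd --reload`. Whether port 9000 also needs opening is not settled in this thread.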

  • Moderator

    @awellis said in FOG Multicast Not working:

    whereas the initial firewall setup

    Will you explain this statement? What firewall?