FOG Multicast Not working
-
@awellis Can you drop the firewall altogether until you can get multicasting working? If you need the firewall per company policy, we can reverse engineer which ports are being used.
systemctl stop firewalld
-
@george1421 I don’t think the firewall should be an issue for FOG (just udp-sender/udp-receiver), as I don’t see any traffic hitting any port but 59106 and the ones after that, but I did disable the firewall and am still seeing the same issue.
-
@awellis Are the fog server and target computers on the same subnet?
-
@george1421 Yes they are all on the same subnet
-
@awellis So just for clarity, if we discount all of the debugging you’ve done up to this point: if you shut down the centos firewall and then set up a multicast job with 1 client, the client just sits there and never joins the multicast?
I have it on my task list to spin up a new fog server to retest multicasting across subnets again. My OS of choice is centos, so the system I’ll spin up will be centos 7.4. I’ll do everything in a virtual environment to make it easy on myself.
-
It actually does join the session, the session just never starts. I left it running overnight yesterday and all the computers got to the blue partclone screen and just sat there. I don’t recall if I did the same thing with just 1 multicast host, but I’ll give that a shot tomorrow morning when I get back to the office.
-
@awellis said in FOG Multicast Not working:
… Command: /usr/local/sbin/udp-sender --interface em2 --min-receivers 3 --max-wait 600 …
When you see this in the log file, was this scheduled for three clients? As you see, the command wants at least/exactly those three clients to connect and would not start otherwise! Besides that, have you checked to see if the process is really running (ps ax | grep udp)? Before your next try, make really sure there is no old multicast task still scheduled, and kill any still-running stuff (killall -9 udp-sender) on your FOG server to make sure nothing is interfering.
-
@sebastian-roth said in FOG Multicast Not working:
It was scheduled for 3 clients and all 3 opened up partclone. That said, shouldn’t the timeout kick in after 10 minutes and go off anyway?
I did check for the udp process and it is running after starting it. I’ve also tried running it immediately after a server restart and no udp processes running and no tasks scheduled.
edit: FOG does start up 2 separate udp-senders with the task… is that normal?
edit 2: looks like there’s 1 task for each partition
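For anyone repeating the process check, here is a minimal sketch of it. It assumes pgrep from procps-ng is installed (it normally is on CentOS 7), and list_udp_senders is just a helper name made up for illustration, not anything FOG ships.

```shell
#!/bin/sh
# List running udp-sender processes together with their full command lines.
# list_udp_senders is a hypothetical helper name, not part of FOG.
list_udp_senders() {
    # pgrep -a prints the PID and command line of each matching process;
    # it exits non-zero when nothing matches, so the fallback message is shown.
    pgrep -a udp-sender || echo "no udp-sender processes running"
}

list_udp_senders
```

Seeing the full command line of each sender makes it easier to tell apart the one sender per partition mentioned in the edit above.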
-
@awellis I have some sad news for you. I set up a clean 1.5.0 fog environment and multicasting works “as described on the tin”.
Is your fog server in a virtual environment?
-
@george1421 No it’s bare metal ATM. I’m going to just do a reinstall and see if that fixes it, since at this point I don’t know what else could be going wrong. Did you do anything special with the install or just basically fog installer on the fresh install?
-
@awellis Nothing special. in this setup I created a dedicated imaging network and then fog server had a second network connection for internet access while installing. I didn’t even configure anything in fog. I copied over an image from my production server and created the image definition, and from right in the image definition I started the multicast. (note I did register a target VM first then rebooted into FOG iPXE and then setup the multicast in the fog ui). It multicast imaged right away.
-
@george1421 Did another full reinstall with a different NIC just in case and still nothing. Any chance the type of image taken matters? Partclone vs partimage? compression?
-
@awellis Just for clarity, you don’t have ANY firewalls or screening routers between the fog server and the target computer?
During my testing of multiple subnet multicasting, FOG used 239 plus the first 3 octets of the fog server IP address as the multicast address. For example, if the fog server IP is 192.168.1.20, the multicast address will be 239.192.168.1 and the port will be a random high port.
The fog server will send out messages as multicast and the target will respond with short unicast messages with checksum data back to the fog server.
For your testing the target system and fog server are on the same subnet all on ethernet switches? You don’t have any fiber or other links where the MTU will be less than 1400?
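As a rough sketch of the addressing described above (based only on what I saw in my testing, not on the FOG source), the multicast address can be worked out from the server IP like this. fog_mcast_addr is a hypothetical helper name.

```shell
#!/bin/sh
# Derive the multicast address as described above: 239 followed by the
# first three octets of the server's IPv4 address.
# fog_mcast_addr is a hypothetical helper name, not part of FOG.
fog_mcast_addr() {
    ip=$1
    # ${ip%.*} strips the last octet (everything from the final dot onward).
    echo "239.${ip%.*}"
}

fog_mcast_addr 192.168.1.20   # prints 239.192.168.1
```

That is the address to look for in a packet capture when checking whether the server is sending the stream.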
-
Correct on all counts. It’s running on a single switch on the same subnet. I did hook up wireshark and saw all the messages go back and forth with the initial test, but when running fog, the server just sends the one packet out and nothing else happens.
-
@awellis If you run tcpdump on the fog server you could use this capture filter: "ether multicast or host <ip_addr_of_target>"
Set up a test multicast from the image definition with a single host (client=1), then pxe boot the target system and join the session.
You should see the beacons from the fog server and then the client will chat back using a unicast message. Once the stream starts you should see a bunch of multicasts from the fog server and then the client chatting back with checksums.
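A minimal sketch of putting that capture filter together is below. The helper name capture_filter and the target IP 192.168.1.50 are placeholders made up for illustration, and em2 is assumed only because it is the interface in the udp-sender command quoted earlier in the thread.

```shell
#!/bin/sh
# Build the capture filter suggested above for a given target IP.
# capture_filter is a hypothetical helper name; 192.168.1.50 is a placeholder.
capture_filter() {
    echo "ether multicast or host $1"
}

# Print the command instead of running it, since tcpdump needs root.
# -n skips name resolution, -i selects the interface (em2 is an assumption).
echo "tcpdump -n -i em2 '$(capture_filter 192.168.1.50)'"
```

Run the printed command as root on the fog server before the client joins the session, so the beacons and the client's first reply are both captured.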
-
So I went ahead and installed fog and centos on a different (much older) machine and it worked fine. Looks like there’s some hardware issue with the original server I had been using, but that issue is not the NIC because I also tried a new one of those to no avail. Thanks for all your help @george1421!
-
@george1421 Didn’t see any packets except for a single UDP packet every 2 seconds or so after the client checked in and loaded up into partclone. Turned out there’s some hardware issue with the server I had been using. Thanks for your help!