Multicast Issues on Centos 7

Wayne Workman

@Wayne-Workman said in Multicast Issues on Centos 7:

Look at this:
https://wiki.fogproject.org/wiki/index.php?title=Troubleshoot_Downloading_-_Multicast

https://wiki.fogproject.org/wiki/index.php?title=Multicast

let us know how it goes, we are here to help.

Many times, I’ll post things and later on an answer is found that was already in one of the links I posted… seeing how long this thread has gone for, I’m just going to repost these links to be looked at again by the OP and any future readers.

BedCruncher

@Tom-Elliott

To answer your questions, I will handle the second one first. I went through and verified that yes, I do have a firewall on my server (firewalld). To get around the default public zone that the em2 interface was assigned to, I moved it to the “trusted” zone so that it wouldn’t filter any traffic whatsoever. After that, I was able to get some more multicast testing information for you.

I got to playing around with the commands and debug mode on an individual client, and I noticed a slight discrepancy between the multicast port in the FOG settings and the command that is logged when I trigger the multicast task. The logged command uses --portbase 50028, but the multicast port listed in the FOG settings is 56904. I attempted to change the settings to mirror what is in the log file, but it continually fails at the part where I run the command

cat /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;cat /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;

However, if I follow the part near the bottom middle of this Multicasting page, I can get it to work by running the same command with gunzip -c in place of cat

gunzip -c /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;gunzip -c /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;

Then both clients begin to receive the images as expected.

BedCruncher

@Wayne-Workman

As I just posted, I followed the second link, and with some tweaking I was able to get somewhere on this. I believe you should be able to understand what I’ve done from the second link and its testing steps.

Tom Elliott

@BedCruncher The troubleshooting guide forgets that the client actually does the gunzip action, not the server. So your finding makes perfect sense. To fix the client test, you first need to pipe the received stream through a gunzip equivalent as well.

For example,

The receiver command should be:

mkfifo /tmp/pigz1

udp-receiver --nokbd --portbase 50028 --ttl 32 --mcast-rdv-address <FOGSERVERIP> 2>/dev/null >/tmp/pigz1 &
pigz -d -c </tmp/pigz1 | partclone.restore --ignore_crc -O /dev/sda2

Of course, adjust /dev/sda2 to the respective partition you’re trying to restore.

Potentially, the most important part is that of the --mcast-rdv-address as it’s what tells the client WHERE to get the broadcast stream.
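The FIFO-plus-decompress pattern in the receiver command can be tried locally without any multicast traffic. The following is only a sketch under stated assumptions: gzip stands in for pigz, a background writer stands in for udp-receiver, and stdout stands in for partclone.restore. The function name fifo_roundtrip and the temporary directory are hypothetical and not part of FOG.

```shell
#!/bin/sh
# Local stand-in for the receiver pipeline; no udp-receiver or partclone involved.
# Assumption: gzip replaces pigz here, since both read and write the gzip format.
fifo_roundtrip() {
    workdir=$(mktemp -d) || return 1
    fifo="$workdir/pigz1"
    mkfifo "$fifo" || return 1

    # Writer: plays the role of udp-receiver, pushing compressed bytes into the FIFO.
    printf '%s\n' "$1" | gzip -c >"$fifo" &

    # Reader: plays the role of "pigz -d -c </tmp/pigz1 | partclone.restore ...".
    gzip -d -c <"$fifo"

    wait
    rm -rf "$workdir"
}

fifo_roundtrip "hello multicast"
```

If the decompression step were left out, the reader would see raw gzip bytes, which matches the behaviour described for the troubleshooting guide's client test.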

Sebastian Roth

@BedCruncher It definitely shouldn’t make a difference whether you are using compressed or uncompressed image data (when testing)!!! Sure, pushing the image to disk is a different story, but for just trying to get the multicast thing running it should work either way, I suppose.

Tom Elliott

@Sebastian-Roth If he’s testing real data (d1p1.img for example) it most definitely will.
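Whether a test needs a decompression step depends on whether the image file is actually compressed. One rough way to check, assuming the image was written with gzip or pigz (both produce the gzip format), is to look at the first two bytes of the file. The function name is_gzip is hypothetical.

```shell
#!/bin/sh
# Succeeds if the file starts with the gzip magic bytes 0x1f 0x8b.
# Assumption: the image was compressed with gzip/pigz; other formats are not detected.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example usage with the image path from this thread:
#   is_gzip /images/W7Px64PreSysprep/d1p1.img && echo "compressed"
```

A compressed image sent with cat arrives still compressed, so the receiving side has to decompress it before handing it to partclone.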

BedCruncher

@Tom-Elliott
I will try the command you posted above and report back what it finds. Thanks.

BedCruncher

@Tom-Elliott
Ok, so now we are getting somewhere. I had to run another multicast test really quickly on both of the client computers, as I had earlier blown away the partition tables that were restored. That was just to have them properly put back in place. I have since run the command that you specified on the client and am testing it for both of the stored disk images, and they are restoring back. I will let it run its course (it should only take 15 minutes or less for the restore to take place) and then try doing the multicast test directly and see if it hangs or not.

EDIT:
The manual restore for both partitions that I did worked correctly and both now boot. I will try scheduling a multicast task again and see where we are at.

BedCruncher

OK!!! I think we have something here. I finally figured out that part of the problem seems to be the firewall zone specified in the Centos 7 Setup guide. I think the issue stems from the public zone blocking the multicast ports. I even explicitly told the firewalld daemon to add the interface em2 to the trusted zone, but it never did unless I manually specified ZONE=trusted in the NIC interface file. This seems to be a bug of sorts in that project.

I suppose that you could also specify a port range for the firewalld daemon to allow through. In my case the NIC hosting the FOG Server is segregated from my other network, so I don’t mind having all ports open on that interface; there is no reason to block that traffic. Keep in mind the interface em1 is still in the public zone and more locked down and restricted.

The result seems to be that I am now able to get it to consistently image across at least two devices, and so far it has persisted across device and server reboots. So I think we have made a good leap in that regard. I will keep checking back here for a few days to update if I run into issues.

Thank you all for persistently fighting with me to get this rolling. I do much appreciate all you have done for me and with me.
@Sebastian-Roth @Wayne-Workman @Tom-Elliott
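The ZONE=trusted workaround described above can be sketched as a small helper that sets or replaces the ZONE= line in an interface file. This is only an illustration: the function name set_ifcfg_zone is hypothetical, and the path in the usage comment assumes the usual CentOS 7 location for interface files, which the thread does not spell out.

```shell
#!/bin/sh
# Set ZONE=<zone> in an ifcfg-style file, replacing an existing ZONE= line if present.
set_ifcfg_zone() {
    file=$1
    zone=$2
    if grep -q '^ZONE=' "$file"; then
        # Rewrite via a temporary file so the helper does not depend on sed -i.
        sed "s/^ZONE=.*/ZONE=$zone/" "$file" >"$file.tmp" && mv "$file.tmp" "$file"
    else
        printf 'ZONE=%s\n' "$zone" >>"$file"
    fi
}

# Example usage (assumed path; run as root, then restart networking):
#   set_ifcfg_zone /etc/sysconfig/network-scripts/ifcfg-em2 trusted
```

After a change like this, firewall-cmd --get-zone-of-interface=em2 should report which zone the interface actually ended up in.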

Wayne Workman

@BedCruncher Great find, try this?

firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -m udp -p udp -m pkttype --pkt-type multicast -j ACCEPT

Source:
http://superuser.com/questions/837340/how-do-i-enable-set-multicast-rules-using-firewalld-in-rhel7-centos-7

BedCruncher

@Wayne-Workman
No, I hadn’t. The command I ran was

firewall-cmd --permanent --zone=trusted --change-interface=em2

that was specified on this RHEL Firewalld page. It seems to match this link: firewalld.zones.

So I can change mine around and try it, but I wouldn’t have managed to come up with that particular one myself, as I’m by no means an iptables guru. I will try to apply that tomorrow to test it out and let you know.

Wayne Workman

@BedCruncher If you find that the command I posted - or any command - allows you to multicast, I will immediately update all firewalld documentation we have in the wiki to reflect your success.

dvchuyen

@Wayne-Workman said in Multicast Issues on Centos 7:

Yes, I confirm the command works. I found it and solved my problem a few days ago.

https://forums.fogproject.org/topic/7194/could-not-pxe-boot-input-output-error-when-do-multicast/31

BedCruncher

@Wayne-Workman
Sorry… I reread what I had posted and it didn’t seem clear. The command above that I had run was “supposed” to make it permanent, but failed to do so. I had to specify it statically in the NIC interface file. I will also double check the command you posted and test to see if it persists across reboots.

Wayne Workman

@dvchuyen I’ve updated the Fedora 23 and the CentOS 7 Wiki articles.

BedCruncher

@Wayne-Workman
I am not quite experiencing the same perfect results as @dvchuyen with regards to that firewall rule. I got it to work once, but since then it’s been extremely problematic.

 firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -m udp -p udp -m pkttype --pkt-type multicast -j ACCEPT

I verified it was in there with iptables-save and ran firewall-cmd --reload and systemctl restart firewalld. This was to ensure that it was all properly in there and correct. I even rebooted the server to ensure that there wasn’t something in the network service that was gumming it up. I have also deleted the tasks out of FOG and manually triggered it again, and it still hangs.
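Checking for the rule by eye in a long rule dump is easy to get wrong, so a small filter can help. The sketch below reads iptables-save style output on stdin and reports whether any line both matches the multicast packet type and jumps to ACCEPT. The function name has_multicast_accept is hypothetical, and the check says nothing about rule order or about which chain the rule sits in.

```shell
#!/bin/sh
# Succeeds if stdin contains a rule with "--pkt-type multicast" followed by "-j ACCEPT".
has_multicast_accept() {
    grep -q -e '--pkt-type multicast.*-j ACCEPT'
}

# Example usage on the FOG server (run as root):
#   iptables-save | has_multicast_accept && echo "multicast ACCEPT rule is loaded"
```

Running the same check right after a reboot and again after systemctl restart firewalld would show whether the rule is missing at boot time. firewall-cmd --direct --get-all-rules lists the direct rules firewalld itself knows about.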

Wayne Workman

@BedCruncher can you turn off your firewall and see if multicast works then?

systemctl stop firewalld

BedCruncher

@Wayne-Workman
I will do that really quick, but it seems like it might be a different issue with firewalld now. I got it to work again if I ran systemctl restart firewalld.service after a reboot. Then it would start the imaging, seemingly consistently. For some reason the rules aren’t correctly applying at boot time.

Wayne Workman

@BedCruncher From the behaviour you’ve been describing - I no longer believe this is a firewall issue.

Please just turn off firewall until we can complete troubleshooting with some sort of conclusive findings:

systemctl stop firewalld
systemctl disable firewalld

BedCruncher

@Wayne-Workman
God, I feel like I’m crying wolf all the time now. I disabled the firewalld service and it was hanging there as before. I then ran the commands

systemctl stop FOGMulticastManager
killall udp-sender
killall udp-sender
killall udp-sender
mysql -u root fog
TRUNCATE TABLE multicastSessionsAssoc;
TRUNCATE TABLE multicastSessions;
TRUNCATE TABLE tasks;
quit;
systemctl start FOGMulticastManager

I then ran the multicast test. After that I ran all of the commands above again, rebooted, and ran them once more to ensure I was working clean. With the firewall disabled, I tested again and so far it seems to be working. Please disregard.
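The cleanup sequence above can be collected into one function so that it runs the same way each time. This is a sketch, not an official FOG tool: the names run, reset_multicast and DRY_RUN are hypothetical, and the mysql call assumes the same passwordless root access to the fog database used in the commands above. Truncating the tasks table removes every task, not only multicast ones.

```shell
#!/bin/sh
# With DRY_RUN=1, print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reset_multicast() {
    run systemctl stop FOGMulticastManager
    # killall exits non-zero when no udp-sender is running; that is fine here.
    run killall udp-sender || true
    # Same three TRUNCATE statements as above, passed non-interactively with -e.
    run mysql -u root fog -e 'TRUNCATE TABLE multicastSessionsAssoc; TRUNCATE TABLE multicastSessions; TRUNCATE TABLE tasks;'
    run systemctl start FOGMulticastManager
}

# Preview first, then run for real as root:
#   DRY_RUN=1 reset_multicast
#   reset_multicast
```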
