Multicast Issues on Centos 7
-
I’ve updated to version 7386 to try and get this part of things fixed. Will post back once I test this.
-
Logging is working now, but this is what the log shows:
[04-28-16 7:46:38 am]
(FOG ASCII-art logo)
###########################################
#     Free Computer Imaging Solution      #
#     Credits:                            #
#     http://fogproject.org/credits       #
#     GNU GPL Version 3                   #
###########################################
[04-28-16 7:46:38 am] Interface Ready with IP Address: xx.xx.xx.xx
[04-28-16 7:46:38 am] Interface Ready with IP Address: xx.xx.xx.xx
[04-28-16 7:46:38 am] Interface Ready with IP Address: 192.168.240.10
[04-28-16 7:46:38 am] Interface Ready with IP Address: REMOVED
[04-28-16 7:46:38 am] Interface Ready with IP Address: REMOVED
[04-28-16 7:46:38 am] * Starting MulticastManager Service
[04-28-16 7:46:38 am] * Checking for new items every 10 seconds
[04-28-16 7:46:38 am] * Starting service loop
[04-28-16 7:46:38 am] * No tasks found!
[04-28-16 7:46:48 am] * No tasks found!
[04-28-16 7:46:58 am] * No tasks found!
[04-28-16 7:47:09 am] * No tasks found!
[04-28-16 7:47:19 am] * No tasks found!
[04-28-16 7:47:29 am] * No tasks found!
[04-28-16 7:47:39 am] * No tasks found!
[04-28-16 7:47:49 am] * No tasks found!
[04-28-16 7:48:00 am] * No tasks found!
[04-28-16 7:48:10 am] * No tasks found!
[04-28-16 7:48:20 am] * No tasks found!
[04-28-16 7:48:30 am] * No tasks found!
[04-28-16 7:48:40 am] * No tasks found!
[04-28-16 7:48:50 am] * No tasks found!
[04-28-16 7:49:01 am] * No tasks found!
[04-28-16 7:49:11 am] * No tasks found!
[04-28-16 7:49:21 am] * No tasks found!
[04-28-16 7:49:31 am] * No tasks found!
[04-28-16 7:49:41 am] * No tasks found!
[04-28-16 7:49:51 am] * No tasks found!
[04-28-16 7:50:02 am] | Sleeping for 10 seconds to ensure tasks are properly submitted
[04-28-16 7:50:12 am] | 0 tasks to be cleaned
[04-28-16 7:50:12 am] | 1 task found
[04-28-16 7:50:12 am] | Task (4) Multi-Cast Task is new!
[04-28-16 7:50:12 am] | Task (4) Multi-Cast Task has been cleaned.
Udp-sender 20120424
[04-28-16 7:50:12 am] | Task (4) /images/W7Px64PreSysprep image file found.
Using mcast address 232.168.240.10
UDP sender for (stdin) at 192.168.240.10 on em2
Broadcasting control to 224.0.0.1
[04-28-16 7:50:12 am] | Task (4) 2 client(s) found.
[04-28-16 7:50:12 am] | Task (4) Multi-Cast Task sending on base port: 50028
[04-28-16 7:50:12 am] | CMD: cat /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;cat /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;
[04-28-16 7:50:12 am] | Task (4) Multi-Cast Task has started.
[04-28-16 7:50:22 am] | 0 tasks to be cleaned
[04-28-16 7:50:22 am] | 1 task found
[04-28-16 7:50:22 am] | Task (4) Multi-Cast Task is already running PID 13546
[04-28-16 7:50:32 am] | 0 tasks to be cleaned
[04-28-16 7:50:32 am] | 1 task found
[04-28-16 7:50:32 am] | Task (4) Multi-Cast Task is already running PID 13546
[04-28-16 7:50:42 am] | 0 tasks to be cleaned
[04-28-16 7:50:42 am] | 1 task found
After letting it run for quite a few minutes, the entries at the end of the log above just keep repeating. I also ran the command
ps aux | grep udp
ps aux | grep udp
root 13546 0.0 0.0 115240 1456 ? S 07:50 0:00 sh -c cat /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;cat /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;
root 13548 0.0 0.0 8640 664 ? S 07:50 0:00 /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint
root 13753 0.0 0.0 112644 964 pts/1 S+ 07:57 0:00 grep --color=auto udp
Still hangs at the same partclone screen as before.
ps ax | grep FOG
13062 ? Ss 0:00 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager &
13065 ? S 0:06 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager &
13081 ? Ss 0:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator &
13084 ? S 0:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator &
13100 ? Ss 0:00 /usr/bin/php -q /opt/fog/service/FOGSnapinReplicator/FOGSnapinReplicator &
13103 ? S 0:00 /usr/bin/php -q /opt/fog/service/FOGSnapinReplicator/FOGSnapinReplicator &
13119 ? Ss 0:00 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler &
13122 ? S 0:00 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler &
13134 ? Ss 0:00 /usr/bin/php -q /opt/fog/service/FOGPingHosts/FOGPingHosts &
13137 ? S 0:00 /usr/bin/php -q /opt/fog/service/FOGPingHosts/FOGPingHosts &
ps aux | grep Multicast
root 13062 0.0 0.8 323076 15852 ? Ss 07:46 0:00 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager &
root 13065 0.4 0.6 411168 13028 ? S 07:46 0:05 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager &
root 14024 0.0 0.0 112648 960 pts/1 S+ 08:06 0:00 grep --color=auto Multicast
-
@BedCruncher What’s the output if you go to this URL in a browser:
http://ip.of.fog.here/fog/service/ipxe/boot.php?mac=macofhostwithcolons
where macofhostwithcolons is the MAC address (with colons) of the host that’s set up to do this tasking?
-
@Tom-Elliott
Here’s the info. I also compared it to a second host that I had added to a group for multicasting, and it appears identical, barring the MAC address of course. Let me know what else you can recommend and I’ll be glad to accommodate.
#!ipxe
set fog-ip 192.168.240.10
set fog-webroot fog
set boot-url http://${fog-ip}/${fog-webroot}
kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 keymap= web=192.168.240.10/fog/ consoleblank=0 rootfstype=ext4 rootfstype=ext4 mac=00:26:b9:aa:14:ec ftp=192.168.240.10 storage=192.168.240.10:/images/ storageip=192.168.240.10 web=192.168.240.10/fog/ osid=5 consoleblank=0 irqpoll hostname=0026b9aa14ec chkdsk=0 img=W7Px64PreSysprep imgType=mps imgPartitionType=all imgid=1 imgFormat= PIGZ_COMP=-6 hostearly=1 port=50028 type=down mc=yes
imgfetch init_32.xz
boot
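One thing that can be checked from this output is whether the port= kernel argument matches the --portbase value in the multicast log (50028 in both places here). A minimal sketch, assuming the boot.php output is piped in on stdin; the helper name kernel_arg is made up for illustration and is not part of FOG:

```shell
#!/bin/sh
# kernel_arg NAME: read boot.php output on stdin and print the value of the
# first NAME=value token it contains (hypothetical helper, not part of FOG).
kernel_arg() {
    # Put every whitespace-separated token on its own line, then pick the
    # first one whose text before "=" is exactly NAME.
    tr -s '[:space:]' '\n' |
        awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }'
}
```

For example, curl -s 'http://ip.of.fog.here/fog/service/ipxe/boot.php?mac=macofhostwithcolons' | kernel_arg port should print the port the client will listen on, which can then be compared with the "sending on base port" line in the log.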
-
@BedCruncher When the client boots to begin the tasking, is it on a separate subnet from the 192.168.240.10 server?
-
@Tom-Elliott
No it’s not. I have the em2 interface controlling the DHCP server and have a cable running to a 5 port Gig switch. It’s a dumb switch with no special routing or anything. I even checked under the dhcpd.leases file and can find it there.
lease 192.168.240.30 {
  starts 4 2016/04/28 12:50:28;
  ends 4 2016/04/28 18:50:28;
  cltt 4 2016/04/28 12:50:28;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 00:26:b9:aa:14:ec;
  uid "\001\000&\271\252\024\354";
}
EDIT:
Updated to build 7410 this morning to hopefully see if it might have been fixed.
EDIT 2:
Updated and initiated a new test and still getting the same results.
-
Hey guys, I had to step away over the weekend. Have there been any new updates?
-
@BedCruncher Now that we have the logging back can you please try running udp-sender by hand again using the exact same command as seen in the logs?
-
@Sebastian-Roth
So I tried both of the udp-sender commands. First:
udp-sender --file /opt/fog/.fogsettings --log /tmp/multicast.log --ttl 32 --nopointopoint --interface em2
which results in
13:41:14.387632 Using mcast address 232.168.240.10
13:41:14.387698 UDP sender for /opt/fog/.fogsettings at 192.168.240.10 on em2
13:41:14.387708 Broadcasting control to 224.0.0.1
showing up in the logs, followed by the repeated no task found entries,
and also
cat /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;cat /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;
which results in the same message as above, but not the repeated no active task found entries. I also updated to build 7470 earlier. It is acting differently. I don’t see that the udp-sender and udp-receiver are talking any more.
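For reference, the logged CMD is just one udp-sender invocation per partition file, chained with a semicolon. A rough sketch of the same chain written as a loop, which makes it easier to change one option at a time while testing by hand. The function name and the DRY_RUN switch are my own additions and not anything FOG provides; the udp-sender flags are copied from the log above:

```shell
#!/bin/sh
# send_partitions IMAGE_DIR IFACE PORTBASE RECEIVERS
# Streams every d1p*.img in IMAGE_DIR through udp-sender, one after another.
# With DRY_RUN=1 the commands are only printed and nothing is sent.
# Note: the glob expands in lexical order, so d1p10.img would sort before d1p2.img.
send_partitions() {
    dir=$1 iface=$2 port=$3 receivers=$4
    for img in "$dir"/d1p*.img; do
        [ -e "$img" ] || continue   # the glob matched nothing
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "cat $img | /usr/local/sbin/udp-sender --interface $iface --min-receivers $receivers --max-wait 600 --portbase $port --full-duplex --ttl 32 --nokbd --nopointopoint"
        else
            cat "$img" | /usr/local/sbin/udp-sender --interface "$iface" \
                --min-receivers "$receivers" --max-wait 600 --portbase "$port" \
                --full-duplex --ttl 32 --nokbd --nopointopoint
        fi
    done
}
```

Running DRY_RUN=1 send_partitions /images/W7Px64PreSysprep em2 50028 2 only prints the two commands, so they can be compared against the CMD line in the log before anything is actually sent.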
-
@BedCruncher When using this command you need to bring up at least two clients with udp-receiver to actually make it start sending - see the --min-receivers 2 option:
cat /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint
So please try this again by hand. Start the udp-sender up on your FOG server like this (don’t need the logging option - will just print on screen). Then boot two of your clients in debug mode and run:
udp-receiver --file test.img --portbase 50028 --ttl 32 --nokbd
(on both clients). If your network is set up properly this should work!!!
-
@Sebastian-Roth
I did that through my dumb switch and also removed the min-receivers switch and tried it manually. Still didn’t work. I then plugged the server directly into the switch and tried only the simplified command that I got to work before and it also failed. I will try that tomorrow as I won’t have access to the server until then.
-
@BedCruncher Can you run a join.me session?
-
@BedCruncher Seeing as, as I understand it, you’ve removed managed switch scenarios as a potential cause of problems, but you’re still having issues, does your FOG server (or something else) have a firewall running that’s just blocking any multicast/UDP traffic?
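On a stock CentOS 7 install the firewall is normally firewalld, so one quick check is to see which zone the multicast interface is in. A sketch, assuming firewalld and the em2 interface from the logs above; the wrapper function is hypothetical and is only there so the snippet fails cleanly on a box without firewall-cmd:

```shell
#!/bin/sh
# show_fw_zone IFACE: print the firewalld zone an interface is bound to.
# Returns non-zero if firewall-cmd is missing or firewalld is not running.
show_fw_zone() {
    command -v firewall-cmd >/dev/null 2>&1 || {
        echo "firewall-cmd not found" >&2
        return 1
    }
    firewall-cmd --state >/dev/null 2>&1 || {
        echo "firewalld is not running" >&2
        return 1
    }
    firewall-cmd --get-zone-of-interface="$1"
}
```

Call it as show_fw_zone em2. If it reports public, temporarily moving the interface with firewall-cmd --zone=trusted --change-interface=em2 (a runtime change unless --permanent is added) is one way to rule the firewall out while testing.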
-
@Wayne-Workman said in Multicast Issues on Centos 7:
Look at this:
https://wiki.fogproject.org/wiki/index.php?title=Troubleshoot_Downloading_-_Multicast
https://wiki.fogproject.org/wiki/index.php?title=Multicast
let us know how it goes, we are here to help.
Many times, I’ll post things and later on an answer is found that was already in one of the links I posted… seeing how long this thread has gone for, I’m just going to repost these links to be looked at again by the OP and any future readers.
-
To answer your questions, I will handle the second one first. I went through and verified that yes, I do have a firewall on my server (firewalld), and as a way to get around the default zone of public that the em2 interface was assigned, I moved it to the “trusted” zone so that it wouldn’t filter any traffic whatsoever. After that, I was able to get some more information from multicast testing for you.
I got to playing around with the commands and debug mode on an individual client, and I noticed a slight issue when comparing the multicast port in the fog settings with the command that is logged when I trigger the multicast task. The logged part I am referencing is
--portbase 50028
but the multicast port listed in the fog settings is 56904. I attempted to change the settings to mirror what is in the log file, but it continually fails at the part where I run the command
cat /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;cat /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;
However, if I follow the part near the bottom middle of this Multicasting page, I can get it to work by running the same command with gunzip -c in place of cat:
gunzip -c /images/W7Px64PreSysprep/d1p1.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;gunzip -c /images/W7Px64PreSysprep/d1p2.img | /usr/local/sbin/udp-sender --interface em2 --min-receivers 2 --max-wait 600 --portbase 50028 --full-duplex --ttl 32 --nokbd --nopointopoint;
Then both clients begin to receive the images as expected.
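In case it helps anyone else, one way to see why cat and gunzip -c behave differently here is to check whether the partition files are actually gzip-compressed. A small sketch that looks at the first two bytes (gzip data starts with 1f 8b); is_gzip is just a name I picked, it is not a FOG tool:

```shell
#!/bin/sh
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(head -c 2 -- "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

For example, is_gzip /images/W7Px64PreSysprep/d1p1.img && echo compressed. If the file is compressed, plain cat sends the compressed bytes as they are and something on the receiving end has to decompress them.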
-
As I just posted, I followed up on the second link, and with some tweaking I was able to get somewhere on this. I believe you should be able to understand what I’ve done from the second link and its testing steps.
-
@BedCruncher The troubleshooting guide forgets that the client actually does the gunzip action, not the server. So your finding makes perfect sense. To fix the client test, you first need to pipe what the client receives through a gunzip equivalent as well.
For example,
The receiver command should be:
mkfifo /tmp/pigz1
udp-receiver --nokbd --portbase 50028 --ttl 32 --mcast-rdv-address <FOGSERVERIP> 2>/dev/null >/tmp/pigz1 &
pigz -d -c </tmp/pigz1 | partclone.restore --ignore_crc -O /dev/sda2
Of course adjust the /dev/sda2 to match the respective file you’re trying to load.
Potentially, the most important part is that of the --mcast-rdv-address as it’s what tells the client WHERE to get the broadcast stream.
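If you want to sanity-check the fifo and decompress half of that pipeline without any network involved, here’s a rough local stand-in: cat plays the part of udp-receiver, gzip -dc plays pigz -d -c, and the output goes to a plain file instead of partclone.restore. The function name and the temporary fifo path are made up for the example:

```shell
#!/bin/sh
# fifo_restore SRC_GZ DEST: feed a gzip file through a named pipe and
# decompress it on the other side, mirroring the shape of the receiver pipeline.
fifo_restore() {
    fifo=$(mktemp -u "${TMPDIR:-/tmp}/pigz.XXXXXX")
    mkfifo "$fifo" || return 1
    cat "$1" >"$fifo" &          # stand-in for: udp-receiver ... >/tmp/pigz1 &
    gzip -dc <"$fifo" >"$2"      # stand-in for: pigz -d -c </tmp/pigz1 | partclone.restore ...
    status=$?
    wait                         # reap the background writer
    rm -f "$fifo"
    return "$status"
}
```

If this works locally but the real pipeline does not, the problem is more likely on the udp-receiver side (port, rendezvous address, firewall) than in the decompression step.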
-
@BedCruncher It definitely shouldn’t make a difference whether you are using compressed or uncompressed image data (when testing)!!! Sure, pushing the image to disk is a different story, but for just trying to get the multicast thing running it should work either way, I suppose.
-
@Sebastian-Roth If he’s testing real data (d1p1.img for example) it most definitely will.
-
@Tom-Elliott
I will try the command you posted above and report back what it finds. Thanks.