Multicast without registration starts OK, but hangs and disconnects clients due to timeout.
-
@jmvela2x I updated to 1.5.9-RC1.4, changed line 662 per your earlier instructions and tried to kick off a multicast task. The log shows the task start, then completed, then killed, then completed again, and then it removes itself from active tasks. This seems like a step backwards. Am I missing something?
-
@jmvela2x Can you please post the relevant part of the log here?
-
[05-05-20 12:08:27 pm]
==================================
===        ====    =====      ====
===  =========  ==  ===   ==   ===
===  ========  ====  ==  ====  ===
===  ========  ====  ==  =========
===      ====  ====  ==  =========
===  ========  ====  ==  ===   ===
===  ========  ====  ==  ====  ===
===  =========  ==  ===   ==   ===
===  ==========    =====      ====
==================================
===== Free Opensource Ghost ======
==================================
============ Credits =============
= https://fogproject.org/Credits =
==================================
== Released under GPL Version 3 ==
==================================
[05-05-20 12:08:27 pm] Interface Ready with IP Address: 10.132.81.150
[05-05-20 12:08:27 pm] Interface Ready with IP Address: 127.0.0.1
[05-05-20 12:08:27 pm] Interface Ready with IP Address: 127.0.1.1
[05-05-20 12:08:27 pm] Interface Ready with IP Address: 192.168.122.1
[05-05-20 12:08:27 pm] Interface Ready with IP Address: f223pxefog.fm.intel.com
[05-05-20 12:08:27 pm] * Starting MulticastManager Service
[05-05-20 12:08:27 pm] * Checking for new items every 10 seconds
[05-05-20 12:08:27 pm] * Starting service loop
[05-05-20 12:08:27 pm] * No new tasks found
[05-05-20 12:08:38 pm] * No new tasks found
[05-05-20 12:08:48 pm] * No new tasks found
[05-05-20 12:08:58 pm] * No new tasks found
[05-05-20 12:09:08 pm] * No new tasks found
[05-05-20 12:09:18 pm] * No new tasks found
[05-05-20 12:09:28 pm] * No new tasks found
[05-05-20 12:09:38 pm] * No new tasks found
[05-05-20 12:09:48 pm] | Task ID: 22 Name: Test is new
[05-05-20 12:09:48 pm] | Task ID: 22 Name: Test image file found, file: /images/Ubuntu-16.04-Legacy
[05-05-20 12:09:48 pm] | Task ID: 22 Name: Test 2 clients found
[05-05-20 12:09:48 pm] | Task ID: 22 Name: Test sending on base port 53110
[05-05-20 12:09:48 pm] | Command: /usr/local/sbin/udp-sender --interface eno1 --min-receivers 2 --max-wait 36000 --mcast-rdv-address 10.132.81.150 --portbase 53110 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/Ubuntu-16.04-Legacy/d1p1.img;/usr/local/sbin/udp-sender --interface eno1 --min-receivers 2 --max-wait 30 --mcast-rdv-address 10.132.81.150 --portbase 53110 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/Ubuntu-16.04-Legacy/d1p2.img;
[05-05-20 12:09:48 pm] | Task ID: 22 Name: Test has started
[05-05-20 12:09:58 pm] | Task ID: 22 Name: Test has been completed
[05-05-20 12:09:58 pm] | Task ID: 22 Name: Test has been killed
[05-05-20 12:09:58 pm] | Task ID: 22 Name: Test is now completed
[05-05-20 12:10:08 pm] * No new tasks found
[05-05-20 12:10:18 pm] * No new tasks found
[05-05-20 12:10:28 pm] * No new tasks found
[05-05-20 12:10:38 pm] * No new tasks found
[05-05-20 12:10:48 pm] * No new tasks found
[05-05-20 12:10:58 pm] * No new tasks found
[05-05-20 12:11:08 pm] * No new tasks found
[05-05-20 12:11:18 pm] * No new tasks found
[05-05-20 12:11:28 pm] | Task ID: 23 Name: Test is new
[05-05-20 12:11:28 pm] | Task ID: 23 Name: Test image file found, file: /images/Ubuntu-16.04-Legacy
[05-05-20 12:11:28 pm] | Task ID: 23 Name: Test 2 clients found
[05-05-20 12:11:28 pm] | Task ID: 23 Name: Test sending on base port 60722
[05-05-20 12:11:28 pm] | Command: /usr/local/sbin/udp-sender --interface eno1 --min-receivers 2 --max-wait 36000 --mcast-rdv-address 10.132.81.150 --portbase 60722 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/Ubuntu-16.04-Legacy/d1p1.img;/usr/local/sbin/udp-sender --interface eno1 --min-receivers 2 --max-wait 30 --mcast-rdv-address 10.132.81.150 --portbase 60722 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/Ubuntu-16.04-Legacy/d1p2.img;
[05-05-20 12:11:28 pm] | Task ID: 23 Name: Test has started
[05-05-20 12:11:38 pm] | Task ID: 23 Name: Test has been completed
[05-05-20 12:11:38 pm] | Task ID: 23 Name: Test has been killed
[05-05-20 12:11:38 pm] | Task ID: 23 Name: Test is now completed
[05-05-20 12:11:48 pm] * No new tasks found
[05-05-20 12:11:58 pm] * No new tasks found
[05-05-20 12:12:08 pm] * No new tasks found
[05-05-20 12:12:18 pm] * No new tasks found
[05-05-20 12:12:29 pm] * No new tasks found
[05-05-20 12:12:39 pm] * No new tasks found
[05-05-20 12:12:49 pm] * No new tasks found
[05-05-20 12:12:59 pm] * No new tasks found
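The Command entry in the log chains one udp-sender call per partition image with a semicolon, which is hard to read on a single line. As a rough aid, the sketch below splits such a string and prints the image file and the --max-wait value passed for each call. The command string is copied from the Task ID 22 entry; summarize_senders is a hypothetical helper name and not part of FOG or udpcast.

```shell
#!/bin/sh
# Command string copied from the Task ID 22 log entry in this thread.
cmd='/usr/local/sbin/udp-sender --interface eno1 --min-receivers 2 --max-wait 36000 --mcast-rdv-address 10.132.81.150 --portbase 53110 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/Ubuntu-16.04-Legacy/d1p1.img;/usr/local/sbin/udp-sender --interface eno1 --min-receivers 2 --max-wait 30 --mcast-rdv-address 10.132.81.150 --portbase 53110 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/Ubuntu-16.04-Legacy/d1p2.img;'

# Hypothetical helper (not part of FOG): reads a ';'-chained command
# string on stdin and prints "<image file> max-wait=<value>" per call.
summarize_senders() {
  tr ';' '\n' | awk 'NF {
    maxwait = ""; img = ""
    # Walk the arguments and remember the value following each flag.
    for (i = 1; i < NF; i++) {
      if ($i == "--max-wait") maxwait = $(i + 1)
      if ($i == "--file")     img = $(i + 1)
    }
    printf "%s max-wait=%s\n", img, maxwait
  }'
}

printf '%s\n' "$cmd" | summarize_senders
```

For the string above this lists d1p1.img with a --max-wait of 36000 and d1p2.img with a --max-wait of 30.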
-
@jmvela2x I didn’t even have the chance to join the clients to the session.
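To put a number on how quickly the task was torn down, here is a minimal sketch that measures the time between the "has started" and "has been completed" entries for each task. It assumes the log has one entry per line in the format shown in this thread; the sample lines are copied from Task ID 22, and task_durations is a hypothetical helper name, not something FOG ships.

```shell
#!/bin/sh
# Sample entries copied from the Task ID 22 section of the log in this thread.
log='[05-05-20 12:09:48 pm] | Task ID: 22 Name: Test has started
[05-05-20 12:09:58 pm] | Task ID: 22 Name: Test has been completed
[05-05-20 12:09:58 pm] | Task ID: 22 Name: Test has been killed
[05-05-20 12:09:58 pm] | Task ID: 22 Name: Test is now completed'

# Hypothetical helper (not part of FOG): reads log lines on stdin and
# prints the seconds between "has started" and "has been completed"
# for each task ID. Assumes both entries fall on the same day.
task_durations() {
  awk '
    /Task ID:/ {
      # Fields: $2 is hh:mm:ss, $3 is "am]" or "pm]", $7 is the task ID.
      split($2, t, ":")
      hour = t[1] % 12
      if ($3 == "pm]") hour += 12
      secs = hour * 3600 + t[2] * 60 + t[3]
      id = $7
      if (/has started/)
        start[id] = secs
      else if (/has been completed/ && (id in start))
        printf "task %s: %d s from start to completion\n", id, secs - start[id]
    }'
}

printf '%s\n' "$log" | task_durations
```

On the sample lines this reports 10 seconds for task 22, which matches the 10-second check interval the service prints at startup.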
-
@jmvela2x I see something potentially relevant in the status output of the FOGMulticastManager service.
“PHP Warning: proc_get_status(): supplied resource is not a”(vailable) I presume is the ending, but it’s cut off in the terminal.
-
@Sebastian-Roth I may have an opportunity coming up soon to test this in our production environment, but without multicast working in 1.5.9-RC1.4 I will lose the chance. Barring some kind of fix in the very near future for this issue, can you advise on how to roll back to 1.5.8 so I can at least test functionality of multicast cross-subnet in terms of bandwidth usage, etc., when the opportunity presents itself?
-
@jmvela2x said in Multicast without registration starts OK, but hangs and disconnects clients due to timeout.:
I can at least test functionality of multicast cross-subnet in terms of bandwidth usage,
This can only happen if your subnet router supports IGMP proxying or you have an mrouter in place to manage the multicast traffic. Multicast traffic will not normally traverse a standard router.
-
@george1421 I’m pretty sure we’re covered on this front, but I will run this by our network guy.
-
@jmvela2x I am sorry, but today has been a very busy day and I could not find the time to look at this yet. I will do so first thing in the morning! I don’t recommend rolling back to 1.5.8 just now.
-
@jmvela2x Just a quick note on this. Now that I think more about it, I am fairly sure I did test multicast once before pushing out the RC1 release. So I wonder if this could be a hiccup on your FOG server? Did you try to clear all tasks, restart the FOG server and then schedule a fresh multicast task yet?
Is this multicast scheduled through a group in the FOG Web UI?
-
@Sebastian-Roth I did try several reboots and cleared all the tasks each time to make double sure. The multicast session is being scheduled from the Images tab with a defined client count. We are trying to avoid host registration because the hosts in our environment are not static per se. The amount of work required just makes FOG a nonviable solution for us if we have to do host registration for all our clients.
-
@Sebastian-Roth I will hold off on changing anything until I hear from you tomorrow. Thanks for the follow up.
-
@jmvela2x Good news and bad news. I found the issue but I still need a bit more time to debug and fix it. It’s getting late and I need to rush to work now. Please stay tuned and I will update as soon as I can.
-
@jmvela2x I just pushed a fix to dev-branch which should fix the issue. But I found that multicast from the PXE menu still seems to have an issue: it works, but it spawns several udpcast sessions. This issue was already present in 1.5.8, as I see from my testing. I will look into it and fix that soon as well. For now you can pull the latest fix to get multicast working as in 1.5.8 again:
sudo -i
cd fogproject
git checkout dev-branch
git pull
cd bin
./installfog.sh
-
@Sebastian-Roth That works for the time being. I can just manually cancel the tasks on completion since this is still in validation on our side. Thanks a ton!
-
@george1421 Finally heard back from IT and we have Catalyst 4510r+e in the router department. From some of the documentation I’ve been looking at it seems they have mroute and IGMP Proxy capability. Might need some guidance in that department if anyone has the knowledge or experience.
-
@jmvela2x I just pushed some more commits to dev-branch which should fix the multicast session issue altogether.
-
Hi,
I don’t know if this is the right thread, but I have stumbled upon a somewhat similar problem.
My hosts are all registered and in a group, and multicast for the group starts just fine. But after a while up to 50% of the machines suddenly restart and go back to the “waiting” screen where the image location is shown. The rest multicast up to 95-97 percent and then either do not finish or finish VERY slowly.
Any ideas?
dev-branch from Mon, 11.05.2020 - 12:00
FOG installation/update and server restart happened. No problems on unicast.
-
@spychodelics From what you describe your issue is not related to this topic. May I ask you to open a new one so we can focus on each of them and not mix up things. It helps a lot to not have separate issues discussed in one topic!
Post your FOG version (lower right corner of the web UI after login) - probably 1.5.9-RC1.8 but I wanna be sure. Also tell us more about your setup. Do you have FOG server and clients all in the same subnet, all connected to the exact same switch, or is it distributed across network equipment and possibly even subnets? Are the clients all the same hardware? Please give us more details like make and model when opening the new topic.
-
@Sebastian-Roth I still see the same issue where the multicast task is stuck at “in progress” and continues to run according to the logs. The job finishes successfully, but the task does not auto delete. Updated to 1.5.9-RC1.8 and rebooted server. Checked twice, same issue as before.