Issues with FOG services
-
I just started a new deployment. All computers got an IP, booted, and got to the part where partclone starts. And … this:
https://photos.app.goo.gl/q4EZ7qoHDmCxHEul2
All are frozen like that. And task manager looks like this:
https://photos.app.goo.gl/VwqaAPDqlTd4CQg42
All workstations are connected and waiting. If I reboot the server and do it again, I get the same result.
The only way I could find to fix this is to reinstall FOG.
https://photos.app.goo.gl/96CLWmsBmsCsBqN83
It doesn’t take long because all files are already installed but I believe the installer fixes some services and then it works:
https://photos.app.goo.gl/zqj1WoCoC2RJI8km1
-
@andreiv So if you restart the server you will see the same hang on multicast again? What do you see in /opt/fog/log/multicast.log? Which version of FOG do you use?
-
This is what I have in the log file:
[11-30-17 12:59:02 pm] ==================================
=== ==== ===== ==== === ========= == === == === === ======== ==== == ==== === === ======== ==== == ========= === ==== ==== == ========= === ======== ==== == === === === ======== ==== == ==== === === ========= == === == === === ========== ===== ====
==================================
===== Free Opensource Ghost ======
==================================
============ Credits =============
= https://fogproject.org/Credits =
==================================
== Released under GPL Version 3 ==
==================================
[11-30-17 12:59:03 pm] Interface Ready with IP Address: 127.0.0.1
[11-30-17 12:59:03 pm] Interface Ready with IP Address: 127.0.1.1
[11-30-17 12:59:03 pm] Interface Ready with IP Address: 172.16.1.1
[11-30-17 12:59:03 pm] Interface Ready with IP Address: 192.168.199.199
[11-30-17 12:59:03 pm] Interface Ready with IP Address: 193.231.17.37
[11-30-17 12:59:03 pm] * Starting MulticastManager Service
[11-30-17 12:59:03 pm] * Checking for new items every 10 seconds
[11-30-17 12:59:03 pm] * Starting service loop
[11-30-17 12:59:03 pm] * No tasks found!
[11-30-17 12:59:13 pm] * No tasks found!
[11-30-17 12:59:23 pm] * No tasks found!
[11-30-17 12:59:33 pm] * No tasks found!
[11-30-17 12:59:43 pm] * No tasks found!
[11-30-17 12:59:53 pm] * No tasks found!
[11-30-17 1:00:03 pm] * No tasks found!
[11-30-17 1:00:13 pm] * No tasks found!
[11-30-17 1:00:23 pm] * No tasks found!
[11-30-17 1:00:34 pm] | Task (19) Multi-Cast Task is new!
[11-30-17 1:00:34 pm] | Task (19) /images/S02R1 image file found.
[11-30-17 1:00:34 pm] | Task (19) Multi-Cast Task 19 clients found.
[11-30-17 1:00:34 pm] | Task (19) Multi-Cast Task sending on base port: 55946.
[11-30-17 1:00:34 pm] | Command: /usr/local/sbin/udp-sender --interface enp9s0 --min-receivers 19 --max-wait 600 --portbase 55946 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/S02R1/d1p1.img;/usr/local/sbin/udp-sender --interface enp9s0 --min-receivers 19 --max-wait 10 --portbase 55946 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/S02R1/d1p2.img;/usr/local/sbin/udp-sender --interface enp9s0 --min-receivers 19 --max-wait 10 --portbase 55946 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/S02R1/d1p3.img;
[11-30-17 1:00:34 pm] | Task (19) Multi-Cast Task has started!
[11-30-17 1:00:44 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:00:54 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:01:04 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:01:14 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:01:24 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:01:34 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:01:44 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:01:54 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:02:04 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:02:14 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:02:24 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:02:34 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:02:44 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
[11-30-17 1:02:54 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.
And this line goes on and on…
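For anyone reading along with the same symptom, here is a minimal Python sketch for measuring how long the log has been repeating "is already running" for a task. It assumes the timestamp and message layout shown in the excerpt above; the function name `running_tasks` and the regular expressions are illustrative and not part of FOG.

```python
import re
from datetime import datetime

# Patterns modelled on the log excerpt in this thread (an assumption,
# not an official description of the multicast.log format).
LINE_RE = re.compile(r"\[(\d{2}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} [ap]m)\]\s*(.*)")
STARTED_RE = re.compile(r"Task \((\d+)\) .* has started!")
RUNNING_RE = re.compile(r"Task \((\d+)\) .* is already running with pid: (\d+)")


def parse_ts(text):
    """Parse a timestamp such as '11-30-17 1:00:34 pm'."""
    return datetime.strptime(text, "%m-%d-%y %I:%M:%S %p")


def running_tasks(lines):
    """Return {task_id: (pid, seconds_since_start)} using the latest
    'already running' line seen for each task."""
    started = {}
    result = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # banner lines and anything without a timestamp
        ts, msg = parse_ts(match.group(1)), match.group(2)
        s = STARTED_RE.search(msg)
        if s:
            started[s.group(1)] = ts
            continue
        r = RUNNING_RE.search(msg)
        if r and r.group(1) in started:
            task_id = r.group(1)
            elapsed = int((ts - started[task_id]).total_seconds())
            result[task_id] = (int(r.group(2)), elapsed)
    return result


sample = [
    "[11-30-17 1:00:34 pm] | Task (19) Multi-Cast Task has started!",
    "[11-30-17 1:00:44 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.",
    "[11-30-17 1:02:54 pm] | Task (19) Multi-Cast Task is already running with pid: 30052.",
]
print(running_tasks(sample))
```

Note that this only reports how long the sender process has existed. On its own it cannot tell a sender that is still waiting for clients from one that is actively transferring data.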
-
@andreiv From the log this looks pretty good. Maybe there is just one of the clients missing? You schedule a task for 19 machines. Are you absolutely sure all 19 come up to the blue screen? It doesn’t start if just a single one is missing.
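To check what the sender is actually waiting for, the `Command:` field from the log can be split into its individual udp-sender calls. The sketch below is only an illustration (the helper name `sender_options` is made up). As I understand the udpcast documentation, `--min-receivers` is the number of connected clients that triggers the start, and `--max-wait` is the number of seconds after the first client connects before the sender starts anyway; please verify both against the man page for your udpcast version.

```python
import shlex


def sender_options(command_field):
    """Split a ';'-joined Command field into one option dict per call.

    Flags followed by a value map to that value (as a string);
    flags without a value map to True.
    """
    calls = []
    for part in command_field.split(";"):
        tokens = shlex.split(part)
        if not tokens:
            continue  # trailing ';' leaves an empty piece
        opts = {"binary": tokens[0]}
        i = 1
        while i < len(tokens):
            tok = tokens[i]
            if tok.startswith("--"):
                has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
                if has_value:
                    opts[tok[2:]] = tokens[i + 1]
                    i += 2
                    continue
                opts[tok[2:]] = True
            i += 1
        calls.append(opts)
    return calls


# Command field copied from the log posted in this thread.
command = (
    "/usr/local/sbin/udp-sender --interface enp9s0 --min-receivers 19 --max-wait 600 "
    "--portbase 55946 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/S02R1/d1p1.img;"
    "/usr/local/sbin/udp-sender --interface enp9s0 --min-receivers 19 --max-wait 10 "
    "--portbase 55946 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/S02R1/d1p2.img;"
    "/usr/local/sbin/udp-sender --interface enp9s0 --min-receivers 19 --max-wait 10 "
    "--portbase 55946 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/S02R1/d1p3.img;"
)

for call in sender_options(command):
    print(call["file"], call["min-receivers"], call["max-wait"])
```

For the log in this thread, the first partition is sent with `--min-receivers 19` and `--max-wait 600`, while the two following partitions use `--max-wait 10`.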
-
Sorry for the late reply…
Yes, I triple-checked. All machines were connected and waiting. But I have an idea as to why this is happening. I’ll test it as soon as I can and get back with the results.
-
These instructions have fixed this problem lots of times: https://wiki.fogproject.org/wiki/index.php?title=Troubleshoot_Downloading_-_Multicast#Clear_DB_of_non-essential_multicast_data
-
@wayne-workman
Yes, I am almost sure that that is the problem. There are “leftover” tasks and this is related to an issue I reported in another thread, the fact that checking the “Schedule Shutdown after task completion” causes the task to hang (remain unfinished).
The next time I have to deploy I am going to do a test and deploy twice, once with that checkbox checked and once unchecked and I am going to capture a video of the last steps of the process on the client, to see what happens.
-
@andreiv So were you able to make it work again??
-
@sebastian-roth Yes, but my solution was to reinstall FOG. I didn’t know about the solution you mentioned. The reinstall also seems to solve the problem.
-
@andreiv Don’t think a re-install would have cleared the left over entries in the DB. At least not that I am aware of. Tom would know better though. I’ll mark this solved anyway.
-
@andreiv said in Issues with FOG services:
checking the “Schedule Shutdown after task completion” causes the task to hang (remain unfinished).
That feature doesn’t work for you?
@andreiv said in Issues with FOG services:
The next time I have to deploy I am going to do a test and deploy twice, once with that checkbox checked and once unchecked and I am going to capture a video of the last steps of the process on the client, to see what happens.
Please do.
-
@sebastian-roth said in Issues with FOG services:
Don’t think a re-install would have cleared the left over entries in the DB. At least not that I am aware of.
The FOG Installer does not DELETE anything from the DB at all. The most it ever does is A) make a new database or B) upgrade an older schema database to the latest schema.
A little fog history:
There’s a great number of problems that re-running the installer will FIX though. FTP passwords are one I pushed hard and helped with. If the FTP passwords (storage management node passwords used for FTP purposes) are not correct for a node, all kinds of crap breaks in FOG, and tons of people were needing help diagnosing & fixing that, so we changed the fog installer to just fix it every single time for all storage node addresses that match the local machine’s address. This fixed the FTP password problem basically instantly for both master nodes and storage nodes for everyone.
-
@andreiv I moved your last post to here since it’s a separate issue that you found: https://forums.fogproject.org/topic/11157/checking-schedule-shutdown-after-task-completion-causes-the-running-task-to-hang/2 Please follow it there.