Daily wake-up routine not working after update
-
@altitudehack One more time please?
Yes, I try to be polite. I realize this isn’t a scenario anybody wants, and I also understand people’s work lives.
-
@tom-elliott Well this is getting interesting:
[04-23-21 11:58:31 am]
==================================
===        ====    =====      ====
===  =========  ==  ===   ==   ===
===  ========  ====  ==  ====  ===
===  ========  ====  ==  =========
===      ====  ====  ==  =========
===  ========  ====  ==  ===   ===
===  ========  ====  ==  ====  ===
===  =========  ==  ===   ==   ===
===  ==========    =====      ====
==================================
===== Free Opensource Ghost ======
==================================
============ Credits =============
= https://fogproject.org/Credits =
==================================
== Released under GPL Version 3 ==
==================================
[04-23-21 11:58:31 am] Interface Ready with IP Address: [privateIP]
[04-23-21 11:58:31 am] Interface Ready with IP Address: 127.0.0.1
[04-23-21 11:58:31 am] Interface Ready with IP Address: 127.0.1.1
[04-23-21 11:58:31 am] Interface Ready with IP Address: fog.my.domain.com
[04-23-21 11:58:31 am] * Starting TaskScheduler Service
[04-23-21 11:58:31 am] * Checking for new items every 60 seconds
[04-23-21 11:58:31 am] * Starting service loop
[04-23-21 11:58:31 am] * 59 tasks found.
[04-23-21 11:58:31 am] * Running loop of all tasks.
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 52 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 54 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 62 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 65 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 67 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 81 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 82 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 83 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 84 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 85 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 86 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
[04-23-21 11:58:31 am] * 87 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
...
[04-23-21 11:58:31 am] * 94 Does not appear to be valid
[04-23-21 11:58:31 am] * Attempting to run checks for PowerManagement
It also helps that you guys have made the upgrade process so painless.
-
@altitudehack What does it look like now? (after updating/installing again of course.)
These tasks all appear to be power management tasks, so I separated the logic that handles power management tasks from the logic that handles scheduled tasks.
-
@altitudehack Just cause you like that command so much, here’s a pretty cool command to loop through. Does the same, but in a more compressed format.
for i in FOG{Scheduler,PingHosts,{Image,Snapin}Replicator,MulticastManager,ImageSize,SnapinHash}.service; do systemctl restart "$i"; done
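To see what that one-liner will touch before restarting anything, the same brace expression can be expanded with `echo` instead of `systemctl`. This is only a preview sketch; the function name `list_fog_units` is made up for illustration, and brace expansion needs bash rather than plain POSIX sh.

```shell
#!/usr/bin/env bash
# Preview the unit names the brace expression expands to (bash only;
# plain POSIX sh does not perform brace expansion).
# list_fog_units is a hypothetical helper name, not part of FOG.
list_fog_units() {
    for unit in FOG{Scheduler,PingHosts,{Image,Snapin}Replicator,MulticastManager,ImageSize,SnapinHash}.service; do
        echo "$unit"
    done
}

list_fog_units
```

The nested `{Image,Snapin}Replicator` part expands to two names, so the loop covers seven units in total.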
-
@altitudehack To further help: now that we know the power management tasks were what was causing this, you can clean this up with:
TRUNCATE TABLE powerManagement
Mind you, this will completely clear out the power management table. I don’t know whether these tasks were intentionally created, or whether this was how you were configuring WOL tasks for these hosts.
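Since the truncate cannot be undone, it may be worth dumping the table first. The sketch below is not an official FOG procedure: it assumes the database is named `fog` and that the MySQL user is `root` (as in the `mysql -uroot -p` call used elsewhere in this thread), and the helper names and backup file name are made up. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/usr/bin/env bash
# Sketch: back up powerManagement, truncate it, then restart the scheduler.
# Assumptions (adjust to your install): database "fog", MySQL user "root".
DB_NAME="${DB_NAME:-fog}"
DB_USER="${DB_USER:-root}"
BACKUP_FILE="${BACKUP_FILE:-powerManagement-backup.sql}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

backup_and_truncate() {
    # Dump only the powerManagement table into BACKUP_FILE.
    run mysqldump -u "$DB_USER" -p --result-file="$BACKUP_FILE" "$DB_NAME" powerManagement &&
    # Clear the table (this removes every power management task).
    run mysql -u "$DB_USER" -p -e 'TRUNCATE TABLE powerManagement;' "$DB_NAME" &&
    # Restart the scheduler so it re-reads the now-empty table.
    run systemctl restart FOGScheduler
}

backup_and_truncate
```

If the tasks turn out to be needed after all, the dump can be loaded back with `mysql -u root -p fog < powerManagement-backup.sql`, under the same naming assumptions.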
-
@tom-elliott said in Daily wake-up routine not working after update:
What does it look like now?
[04-23-21 1:27:29 pm] Interface Ready with IP Address: [privateIP]
[04-23-21 1:27:29 pm] Interface Ready with IP Address: 127.0.0.1
[04-23-21 1:27:29 pm] Interface Ready with IP Address: 127.0.1.1
[04-23-21 1:27:29 pm] Interface Ready with IP Address: fog.my.domain.com
[04-23-21 1:27:29 pm] * Starting TaskScheduler Service
[04-23-21 1:27:29 pm] * Checking for new items every 60 seconds
[04-23-21 1:27:29 pm] * Starting service loop
[04-23-21 1:27:29 pm] * 59 tasks found.
[04-23-21 1:27:29 pm] * 0 scheduled task(s) to run.
[04-23-21 1:27:29 pm] * 59 power management task(s) to run.
-
@tom-elliott I created the tasks through the GUI to wake the groups up. After the last update, I see a bunch of this instead:
[04-23-21 2:51:51 pm] Interface not ready, waiting for it to come up:
-
@altitudehack What I meant about the GUI wake was: did you set this up as Group -> Task -> WOL -> Scheduled as cron, or as Group -> Power Management -> Cron time -> WOL?
You should be able to run
systemctl restart FOGScheduler
to fix the “Interface not ready” message. This shouldn’t be anything programming-wise, as that portion hasn’t changed.
-
@tom-elliott Group -> Power Management -> Cron time -> WOL. But I deleted it earlier today so I’m not sure why it’s still there.
After restarting the service the logs look like this:
[04-23-21 3:15:06 pm] * Starting TaskScheduler Service
[04-23-21 3:15:06 pm] * Checking for new items every 60 seconds
[04-23-21 3:15:06 pm] * Starting service loop
[04-23-21 3:15:06 pm] * 59 tasks found.
[04-23-21 3:15:06 pm] * 0 scheduled task(s) to run.
[04-23-21 3:15:06 pm] * 59 power management task(s) to run.
-
@altitudehack Well, when done through Group, all it does is create individual tasks for each of the hosts in the group (hence the high number).
As you’re using the 1.5.9.72 version, I think there’s a potential issue with deleting group items in such a way. I’m assuming that when you go back to Group -> Power Management after you have “deleted it”, the same task has reappeared?
I still believe you will need to clean out the powerManagement table using:
TRUNCATE TABLE powerManagement
Then restart the FOGScheduler service and you should then see: 0 scheduled task(s) to run and 0 power management task(s) to run.
Then I think you should use Group -> Task -> Advanced -> WOL. This will create a single task that does the exact same thing (but simpler) for your hosts in that group. It won’t show x number of tasks; it will show 1 scheduled task to run.
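To confirm those counts after the restart without reading the whole log, a small filter can pull out the most recent values. This is a rough sketch: the log path is the one tailed later in this thread (`/var/log/fog/fogscheduler.log`), and `latest_count` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read the most recent task counts from the FOG scheduler log.
# LOG_FILE defaults to the path used later in this thread; adjust if yours differs.
LOG_FILE="${LOG_FILE:-/var/log/fog/fogscheduler.log}"

# latest_count KIND FILE
# KIND is "scheduled" or "power management".
# Prints the count from the newest matching line, or nothing if no line matches.
latest_count() {
    grep -E "\* [0-9]+ $1 task\(s\) to run" "$2" | tail -n 1 |
        sed -E "s/.*\* ([0-9]+) $1 task\(s\) to run.*/\1/"
}

if [ -r "$LOG_FILE" ]; then
    echo "scheduled:        $(latest_count 'scheduled' "$LOG_FILE")"
    echo "power management: $(latest_count 'power management' "$LOG_FILE")"
fi
```

After the truncate and restart, the power management count should read 0; once the single Group -> Task -> Advanced -> WOL task exists, the scheduled count should read 1.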
-
I went ahead and re-created the wakeup cron using the breadcrumbs described and nothing happened in the logs for ~five minutes. So I restarted the service again and now have 79 tasks found:
[04-23-21 3:21:35 pm] * Starting TaskScheduler Service
[04-23-21 3:21:35 pm] * Checking for new items every 60 seconds
[04-23-21 3:21:35 pm] * Starting service loop
[04-23-21 3:21:35 pm] * 79 tasks found.
[04-23-21 3:21:35 pm] * 0 scheduled task(s) to run.
[04-23-21 3:21:35 pm] * 79 power management task(s) to run.
-
@altitudehack I am running a test and just wanted to follow a similar route:
Here’s what I see:
[04-23-21 3:25:49 pm] * Starting TaskScheduler Service
[04-23-21 3:25:49 pm] * Checking for new items every 60 seconds
[04-23-21 3:25:49 pm] * Starting service loop
[04-23-21 3:25:49 pm] * 1 task found.
[04-23-21 3:25:49 pm] * Power Management Task run time: Fri, 23 Apr 2021 15:50:00 -0500
[04-23-21 3:25:49 pm] * This is a cron style task that should not run now.
[04-23-21 3:26:49 pm] * 1 task found.
[04-23-21 3:26:49 pm] * Power Management Task run time: Fri, 23 Apr 2021 15:50:00 -0500
[04-23-21 3:26:49 pm] * This is a cron style task that should not run now.
-
@tom-elliott said in Daily wake-up routine not working after update:
TRUNCATE TABLE powerManagement
I missed that in your previous post. Thank you! I think this one’s solved!
[04-23-21 3:25:27 pm] * Starting TaskScheduler Service
[04-23-21 3:25:27 pm] * Checking for new items every 60 seconds
[04-23-21 3:25:27 pm] * Starting service loop
[04-23-21 3:25:27 pm] * No tasks found!
[04-23-21 3:26:27 pm] * 1 task found.
[04-23-21 3:26:27 pm] * 1 scheduled task(s) to run.
[04-23-21 3:26:27 pm] * 0 power management task(s) to run.
I’ll find out Monday. Have a great weekend, @Tom-Elliott !
-
@altitudehack Thank you, you too
-
Just wanted to follow up. Hopefully all is working as expected now?
-
@tom-elliott Good morning, Tom! When I checked logs this morning it was full of ‘waiting for the interface to come up’ messages. I checked the order of my commands on Friday and can confirm that I did bounce the scheduler service after truncating the table:
1430 2021-04-23 15:25:07 mysql -uroot -p
1431 2021-04-23 15:25:27 systemctl restart FOGScheduler
I’m not sure what I missed. I restarted the service this morning and the interface came up. It looks like the previous logs were lost when I restarted the service.
Thank you for checking in!
-
@Tom-Elliott I noticed another update was available yesterday, so I ran it, rebooted, and went to bed. Unfortunately, I woke up to more “Interface not ready” messages:
1444 2021-04-26 17:11:15 cd fogproject-dev-branch/
1445 2021-04-26 17:11:18 git pull
1447 2021-04-26 17:11:47 cd bin
1449 2021-04-26 17:12:01 ./installfog.sh
1450 2021-04-26 17:17:26 reboot
...
[04-26-21 5:32:51 pm] Interface not ready, waiting for it to come up:
[04-26-21 5:33:01 pm] Interface not ready, waiting for it to come up:
[04-26-21 5:33:11 pm] Interface not ready, waiting for it to come up:
...
[04-26-21 5:33:21 pm] Interface not ready, waiting for it to come up:
[04-26-21 5:33:31 pm] Interface not ready, waiting for it to come up:
[04-26-21 5:33:41 pm] Interface not ready, waiting for it to come up:
One thing that stands out to me is that the FOG server was rebooted at 5:17 last night, but the first message in the logs was timestamped 5:32. What was happening for those 15 minutes?
I rebooted again this morning and the interface came up immediately. I’ll keep an eye on the logs to see if it goes sideways in a few minutes:
1464 2021-04-27 06:54:30 reboot
...
tail /var/log/fog/fogscheduler.log
[04-27-21 6:54:59 am] Interface Ready with IP Address: [localIP]
[04-27-21 6:54:59 am] Interface Ready with IP Address: 127.0.0.1
[04-27-21 6:54:59 am] Interface Ready with IP Address: 127.0.1.1
[04-27-21 6:54:59 am] Interface Ready with IP Address: fog.my.domain.com
[04-27-21 6:54:59 am] * Starting TaskScheduler Service
[04-27-21 6:54:59 am] * Checking for new items every 60 seconds
[04-27-21 6:54:59 am] * Starting service loop
[04-27-21 6:54:59 am] * 1 task found.
[04-27-21 6:54:59 am] * 1 scheduled task(s) to run.
[04-27-21 6:54:59 am] * 0 power management task(s) to run.
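For the question about the gap between the reboot and the first log entry, one possible starting point is to summarize when the “Interface not ready” messages begin and end, then compare that against the system journal for the same boot. This is a sketch only: `interface_wait_summary` is a hypothetical helper, and the `journalctl` lines assume the unit is named `FOGScheduler` (as in the `systemctl restart FOGScheduler` command above) and that the journal is kept across reboots.

```shell
#!/usr/bin/env bash
# Sketch: summarize "Interface not ready" messages in the scheduler log.
LOG_FILE="${LOG_FILE:-/var/log/fog/fogscheduler.log}"

# interface_wait_summary FILE
# Prints how many waiting messages there are, plus the first and last timestamps.
# The body runs in a subshell, so its variables do not leak into the caller.
interface_wait_summary() (
    log="$1"
    pattern='Interface not ready'
    count="$(grep -c "$pattern" "$log")"
    # Keep only the text between the leading [ and ] of the first/last match.
    first="$(grep "$pattern" "$log" | head -n 1 | sed -E 's/^\[([^]]*)\].*/\1/')"
    last="$(grep "$pattern" "$log" | tail -n 1 | sed -E 's/^\[([^]]*)\].*/\1/')"
    echo "count=$count first=$first last=$last"
)

if [ -r "$LOG_FILE" ]; then
    interface_wait_summary "$LOG_FILE"
fi

# To compare with what systemd recorded (assumed unit name, see above):
#   journalctl -b -u FOGScheduler --no-pager      # current boot
#   journalctl -b -1 -u FOGScheduler --no-pager   # previous boot, if the journal is persistent
```

If the journal shows the service starting right after boot while the log file stays empty until 5:32, that would point at the service itself rather than at logging; the sketch does not decide that on its own.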
-
@altitudehack What’s really strange to me is you should see at least the FOG_WEB_HOST setting.
Can you check disk space and usage for your whole system?
df -h
The fact that FOG_WEB_HOST is coming back with nothing suggests to me that the database is dying or that, somehow, no value is defined for it.
-
# df -hPT
Filesystem               Type      Size  Used Avail Use% Mounted on
devtmpfs                 devtmpfs  1.9G     0  1.9G   0% /dev
tmpfs                    tmpfs     1.9G     0  1.9G   0% /dev/shm
tmpfs                    tmpfs     1.9G  9.0M  1.9G   1% /run
tmpfs                    tmpfs     1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/mapper/cl_fog-root  xfs        16G  6.3G  9.8G  40% /
/dev/sda1                xfs      1014M  312M  703M  31% /boot
/dev/mapper/cl_fog-var   xfs       5.0G  3.4G  1.7G  67% /var
/dev/mapper/cl_fog-home  xfs        10G  930M  9.1G  10% /home
/dev/mapper/fog_images   xfs       1.0T   37G  988G   4% /images
tmpfs                    tmpfs     379M     0  379M   0% /run/user/27129
tmpfs                    tmpfs     379M     0  379M   0% /run/user/0
-
@altitudehack Well, there goes that theory.
Unsure where to look.
Can you check on FOG Configuration -> FOG Settings -> FOG_WEB_HOST (Maybe Web Host on the interface for you?) to see that the web host is defined?
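If the web UI is awkward to reach, the same setting can probably be read straight from the database. Treat the following as a sketch resting on assumptions rather than a documented FOG interface: it assumes the settings live in a table called `globalSettings` with `settingKey` and `settingValue` columns, that the database is named `fog`, and `setting_query` is a made-up helper that only builds the SQL text.

```shell
#!/usr/bin/env bash
# Sketch: build a query that looks up one FOG setting by key.
# Assumed schema (verify on your install): table globalSettings,
# columns settingKey and settingValue.
setting_query() {
    printf "SELECT settingKey, settingValue FROM globalSettings WHERE settingKey = '%s';\n" "$1"
}

# Show the query that would be sent.
setting_query FOG_WEB_HOST

# To actually run it (assumed database name "fog", user "root"):
#   mysql -u root -p -e "$(setting_query FOG_WEB_HOST)" fog
```

An empty result, or an empty `settingValue`, would be consistent with the scheduler getting nothing back for FOG_WEB_HOST.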