Daily wake-up routine not working after update
-
I do wish the message:
* Press [Enter] key when database is updated/installed.
would only be displayed if a database update is actually performed. Kinda feel silly hitting refresh on that page each time I pull in new code.
-
@altitudehack You could also run the installer with:
./installfog.sh -y
-
@Tom-Elliott maybe another error to debug in the code:
Apr 27 11:30:54 fog.int.visionary.com systemd[1]: Started FOGScheduler.
Apr 27 11:30:55 fog.int.visionary.com env[15253]: PHP Warning: _() expects exactly 1 parameter, 2 given in /var/www/html/fog/lib/fog/timer.class.php on line 134
Apr 27 11:31:55 fog.int.visionary.com env[15253]: PHP Warning: _() expects exactly 1 parameter, 2 given in /var/www/html/fog/lib/fog/timer.class.php on line 134
Apr 27 11:32:55 fog.int.visionary.com env[15253]: PHP Warning: _() expects exactly 1 parameter, 2 given in /var/www/html/fog/lib/fog/timer.class.php on line 134
-
@altitudehack Should be fixed in latest again.
Thank you,
-
@tom-elliott yes, those errors stopped after the latest update. got this little bugger, though:
Apr 27 12:47:45 fog.int.visionary.com systemd[1]: [/usr/lib/systemd/system/FOGScheduler.service:17] Unknown lvalue 'StartLimitIntervalSec' in section 'Unit'
-
@altitudehack https://unix.stackexchange.com/questions/463917/systemds-startlimitintervalsec-and-startlimitburst-never-work
This sounds like a systemd version issue.
What version are you running?
systemctl --version
Should output something like this:
systemd 239 (239-45.el8) +PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy
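The version matters because, as far as I know, systemd 230 renamed StartLimitInterval= to StartLimitIntervalSec= and moved it from [Service] to [Unit]; older releases log an "Unknown lvalue" warning and ignore the line. A minimal Python sketch of that check, assuming the 230 cutoff is right (the function names are made up for illustration):

```python
import re

# Assumption: StartLimitIntervalSec= in [Unit] is understood from systemd 230 on.
START_LIMIT_IN_UNIT_SINCE = 230


def systemd_major_version(version_output: str) -> int:
    """Extract the major version from the output of `systemctl --version`."""
    match = re.search(r"systemd\s+(\d+)", version_output)
    if match is None:
        raise ValueError("no systemd version found in output")
    return int(match.group(1))


def supports_start_limit_interval_sec(version_output: str) -> bool:
    """True if this systemd should accept StartLimitIntervalSec= in [Unit]."""
    return systemd_major_version(version_output) >= START_LIMIT_IN_UNIT_SINCE


# Example input copied from the sample output above.
print(supports_start_limit_interval_sec("systemd 239 (239-45.el8) +PAM +AUDIT"))
```

On a release older than the cutoff the same function returns False, in which case the warning is expected and the directive is simply ignored.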
-
# systemctl --version
systemd 219 +PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN
-
@altitudehack what OS are you running?
-
@altitudehack And is this impacting it from running or just a noticed message?
-
@tom-elliott I think it’s just a notice. This is on CentOS 7.9.2.
-
@tom-elliott
I confirm 1.5.9.77 is waking up computers on schedule again.
OS: Debian 10
FOG: dev-branch 1.5.9.77, no HTTPS.
I just noticed that downloading the kernel, init and fog-client binaries takes much longer during the update process: roughly 10 times longer (10-20 minutes) compared to stable. Sometimes it does not even finish and times out. Maybe there are problems getting the client from fogproject.org?
Anyway, great and quick debugging. Thank you, Tom!
Peter
-
@tom-elliott The logs show the service kicked off a WoL this morning, but I was just made aware that it didn’t run and had to run it manually via Groups > Select group > Power Management > Perform Immediately > WoL. Here’s the scheduler log:
[04-28-21 5:59:10 am] * 1 task found.
[04-28-21 5:59:10 am] * 1 scheduled task(s) to run.
[04-28-21 5:59:10 am] * 0 power management task(s) to run.
[04-28-21 5:59:10 am] * Scheduled Task run time: Sat, 01 May 2021 06:00:00 -0500
[04-28-21 5:59:10 am] * This is a cron style task that should run at: 1619866800
[04-28-21 6:00:10 am] * 1 task found.
[04-28-21 6:00:10 am] * 1 scheduled task(s) to run.
[04-28-21 6:00:10 am] * 0 power management task(s) to run.
[04-28-21 6:00:10 am] * Scheduled Task run time: Sat, 01 May 2021 06:00:00 -0500
[04-28-21 6:00:10 am] * This is a cron style task that should run at: 1619866800
[04-28-21 6:01:10 am] * 1 task found.
[04-28-21 6:01:10 am] * 1 scheduled task(s) to run.
[04-28-21 6:01:10 am] * 0 power management task(s) to run.
[04-28-21 6:01:10 am] * Scheduled Task run time: Sat, 01 May 2021 06:00:00 -0500
[04-28-21 6:01:10 am] * This is a cron style task that should run at: 1619866800
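The epoch value in the scheduler log can be decoded to check which day the cron-style task is actually due. A small Python sketch (the fixed -05:00 offset is copied from the log line; the helper name is only illustrative):

```python
from datetime import datetime, timedelta, timezone

# Offset copied from the log line "Sat, 01 May 2021 06:00:00 -0500".
LOG_OFFSET = timezone(timedelta(hours=-5))


def describe_epoch(epoch_seconds: int, tz: timezone = LOG_OFFSET) -> str:
    """Render a Unix timestamp in the same style the scheduler log uses."""
    moment = datetime.fromtimestamp(epoch_seconds, tz)
    return moment.strftime("%a, %d %b %Y %H:%M:%S %z")


print(describe_epoch(1619866800))  # Sat, 01 May 2021 06:00:00 -0500
```

The decoded time matches the "Scheduled Task run time" line, so on 28 April the scheduler appears to consider the next run to be Saturday 1 May. That would be consistent with nothing firing at 6:00 that morning; it may be worth double-checking the cron fields of the task.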
Curiously, the Service Master log has no entries since the last restart:
/var/log/fog/servicemaster.log
[04-27-21 11:39:09 am] FOGImageSize forked child process (18037).
[04-27-21 11:39:09 am] Service_Signal_handler (15271) received signal 15.
[04-27-21 11:39:09 am] Service_Signal_handler (15271) killing child (15518).
[04-27-21 11:39:09 am] Service_Signal_handler (15271) exiting.
[04-27-21 11:39:09 am] FOGImageSize child process (18037) is running.
[04-27-21 11:39:09 am] FOGSnapinHash Start
[04-27-21 11:39:09 am] FOGSnapinHash forked child process (18126).
[04-27-21 11:39:09 am] FOGSnapinHash child process (18126) is running.
[04-27-21 12:23:48 pm] Service_Signal_handler (17870) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (17870) killing child (17966).
[04-27-21 12:23:48 pm] Service_Signal_handler (17870) exiting.
[04-27-21 12:23:48 pm] Service_Signal_handler (17840) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (17840) killing child (17861).
[04-27-21 12:23:48 pm] Service_Signal_handler (17840) exiting.
[04-27-21 12:23:48 pm] Service_Signal_handler (17853) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (17853) killing child (17883).
[04-27-21 12:23:48 pm] Service_Signal_handler (17853) exiting.
[04-27-21 12:23:48 pm] Service_Signal_handler (17817) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (17817) killing child (17826).
[04-27-21 12:23:48 pm] Service_Signal_handler (17817) exiting.
[04-27-21 12:23:48 pm] Service_Signal_handler (17825) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (17825) killing child (17839).
[04-27-21 12:23:48 pm] Service_Signal_handler (17825) exiting.
[04-27-21 12:23:48 pm] Service_Signal_handler (18058) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (18058) killing child (18126).
[04-27-21 12:23:48 pm] Service_Signal_handler (18058) exiting.
[04-27-21 12:23:48 pm] Service_Signal_handler (17973) received signal 15.
[04-27-21 12:23:48 pm] Service_Signal_handler (17973) killing child (18037).
[04-27-21 12:23:48 pm] Service_Signal_handler (17973) exiting.
[04-27-21 12:24:09 pm] FOGMulticastManager Start
[04-27-21 12:24:09 pm] FOGMulticastManager forked child process (32107).
[04-27-21 12:24:09 pm] FOGMulticastManager child process (32107) is running.
[04-27-21 12:24:09 pm] FOGImageReplicator Start
[04-27-21 12:24:09 pm] FOGImageReplicator forked child process (32112).
[04-27-21 12:24:09 pm] FOGImageReplicator child process (32112) is running.
[04-27-21 12:24:09 pm] FOGTaskScheduler Start
[04-27-21 12:24:09 pm] FOGTaskScheduler forked child process (32127).
[04-27-21 12:24:09 pm] FOGTaskScheduler child process (32127) is running.
[04-27-21 12:24:09 pm] FOGSnapinReplicator Start
[04-27-21 12:24:09 pm] FOGSnapinReplicator forked child process (32133).
[04-27-21 12:24:09 pm] FOGSnapinReplicator child process (32133) is running.
[04-27-21 12:24:09 pm] FOGPingHosts Start
[04-27-21 12:24:09 pm] FOGPingHosts forked child process (32315).
[04-27-21 12:24:09 pm] FOGPingHosts child process (32315) is running.
[04-27-21 12:24:09 pm] FOGSnapinHash Start
[04-27-21 12:24:09 pm] FOGSnapinHash forked child process (32379).
[04-27-21 12:24:09 pm] FOGSnapinHash child process (32379) is running.
[04-27-21 12:24:09 pm] FOGImageSize Start
[04-27-21 12:24:09 pm] FOGImageSize forked child process (32424).
[04-27-21 12:24:09 pm] FOGImageSize child process (32424) is running.
Is that expected? Would any other logs be useful?
Thank you for your work towards resolving this!
-
@altitudehack I’m going to guess your apache access logs would be handy.
Do you use HTTPS as part of your web gui?
I did make a small change to try to send WOL packets first across HTTPS, then across HTTP (so it will send packets twice for each machine that needs waking). When HTTP is forced to redirect to HTTPS, the MAC address to wake up is lost in the redirect.
By trying HTTPS before HTTP, the input variables (including the MAC address) should get through properly.
This is just a guess.
Of course, if you’d like, you can also delete the scheduled task and retry using the PowerManagement Scheduled WOL as @Petěrko shows that it is working now. I’m sorry as I feel like I’m making you jump through hoops and that is definitely not my intent.
As for the service logs, this is good. It was restarting the service constantly because of the errors it was seeing from PHP (the missing ; and the _() expects one parameter but 2 given).
-
@tom-elliott I’m completely lost regarding the http/https talk. I understand that a WoL command is a specially-crafted UDP packet, usually sent by the ether-wake program. What’s the connection to Apache other than how we manage the application? Are magic packets generated by Apache within FOG?
I will try setting a PowerManagement Scheduled WOL as suggested and let you know how it goes.
Thanks for sticking with this!
-
@altitudehack Yes, we have programmed the magic packet generation which then passes to the nodes. This way WOL can theoretically work across sites (if you have multiple storage nodes/groups with a centrally managed server system.)
So apache sends a URL request to each node which then generates and sends the magic packets.
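For reference, a magic packet is just 6 bytes of 0xFF followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. A minimal Python sketch of that layout follows; it is not FOG's PHP code, and the function names, broadcast address and port 9 are assumptions chosen for illustration:

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN payload: 6 x 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the packet over UDP (address and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))


# Example MAC is a placeholder, not a host from this thread.
packet = build_magic_packet("00:11:22:33:44:55")
print(len(packet))  # 102
```

The only input the packet needs is the MAC address, which is why a request that arrives at the node without it cannot wake anything.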
-
@tom-elliott Interesting. I’m just using a single node so guessing any communication between apache & the WoL function would remain internal to the fogserver.
-
@altitudehack yes. But if it’s redirected at all, the request would lose the MAC address
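One way a redirect can drop the MAC address: if the address travels in a POST body (an assumption here; the thread does not say how FOG passes it), many HTTP clients replay a 301, 302 or 303 redirect as a bodiless GET, while 307 and 308 are required to keep the method and body. A toy Python model of that rule, with hypothetical names, URLs and MAC:

```python
from typing import NamedTuple, Optional


class Request(NamedTuple):
    method: str
    url: str
    body: Optional[bytes]


def follow_redirect(original: Request, status: int, location: str) -> Request:
    """Model how a typical client re-issues a request after a redirect."""
    if status in (307, 308):
        # Method and body must be preserved.
        return Request(original.method, location, original.body)
    if status == 303 or (status in (301, 302) and original.method == "POST"):
        # Commonly rewritten to GET (HEAD stays HEAD); the body is discarded.
        method = "HEAD" if original.method == "HEAD" else "GET"
        return Request(method, location, None)
    if status in (301, 302):
        # Non-POST requests are normally repeated unchanged.
        return Request(original.method, location, original.body)
    raise ValueError(f"not a redirect status: {status}")


# Hypothetical wake request; the URL and field name are placeholders.
wake = Request("POST", "http://fogserver.example/wol", b"mac=00:11:22:33:44:55")
print(follow_redirect(wake, 301, "https://fogserver.example/wol"))
```

Under that assumption, asking for the HTTPS URL directly avoids the redirect entirely, which fits the HTTPS-first change described earlier in the thread.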
-
@tom-elliott I’m not very familiar with this part of the system, but it does look like it is redirected to php-fpm:
<VirtualHost *:80>
    <FilesMatch "\.php$">
        SetHandler "proxy:fcgi://127.0.0.1:9000/"
    </FilesMatch>
    KeepAlive Off
    ServerName [privateIP]
    ServerAlias fog.my.domain.com
    ServerAlias fogserver.domain.com
    DocumentRoot /var/www/html/
    <Directory /var/www/html/fog/>
        DirectoryIndex index.php index.html index.htm
    </Directory>
    RewriteEngine On
    RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
    RewriteRule .* - [F]
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-d
    RewriteRule ^/fog/(.*)$ /fog/api/index.php [QSA,L]
</VirtualHost>
-
@altitudehack
Hi again,
Apache+PHP are creating WOL packets. We had problems with HTTPS on. Now (1.5.9.78) Scheduled and Power Tasks are working. I am on Debian 10.
I always reboot the fogserver after a dev-branch upgrade and all services are running properly as far as I know.
I kept Tasks from MASTER before fog upgrade and fogscheduler picked them up with no hiccup (after restart).
Peter
-
Confirmed that WoL did not work this morning using the PowerManagement Scheduled WOL technique. Will update to the latest and give 'er another go.