504 Gateway Timeout hitting fog/scheduledtask/list
-
We’ve been running Fog 1.5.8 without issue for months and I just noticed that no deployments have occurred for any machine in our lab since 11/6/2020.
While looking into this issue, I noticed I cannot bring up the scheduled task view in Fog here http://<server>/fog/management/index.php?node=task&sub=activescheduled. After a period of time, the page states “504 - Gateway Timeout” and the entire site is mostly unusable until I restart PHP (/etc/init.d/ph7.1-fpm restart).
Then, everything works again until someone tries going to the Scheduled Tasks page again…
The scheduledTasks mysql table had only 17 entries in it but I truncated them as a troubleshooting step. After truncating the table, I can go to this page without issue but If I add a single task back, the page becomes unavailable again.
Here’s what I’ve tried so far:
Changed MEMORY LIMIT in Fog Settings from 128 to 512
Updated to FOG 1.5.9
Restarted the entire serverThe only log entries I could find that seem important are:
[Wed Nov 11 14:02:35.200217 2020] [proxy_fcgi:error] [pid 8670] [client 207.173.84.70:63948] AH01067: Failed to read FastCGI header, referer: http://lab/fog/management/index.php?node=task&sub=active
[Wed Nov 11 14:02:35.200225 2020] [proxy_fcgi:error] [pid 22650] [client 207.173.84.70:63989] AH01067: Failed to read FastCGI header, referer: http://lab/fog/management/index.php?node=task&sub=active
[Wed Nov 11 14:02:35.200240 2020] [proxy_fcgi:error] [pid 8670] (104)Connection reset by peer: [client 207.173.84.70:63948] AH01075: Error dispatching request to : , referer: http://lab/fog/management/index.php?node=task&sub=active
[Wed Nov 11 14:02:35.200244 2020] [proxy_fcgi:error] [pid 22650] (104)Connection reset by peer: [client 207.173.84.70:63989] AH01075: Error dispatching request to : , referer: http://lab/fog/management/index.php?node=task&sub=activeAny ideas?
-
@nehsa Hmm this is interesting one.
Apache shells out php commands to the php-fpm application since php-fpm is much faster a processing php code than apache.
So the 504 gateway timeout is probably because php-fpm hasn’t responded within the timeout period, so apache throws the error.
So in /var/logs/php-fpm there should be an error log. Lets tail that file and see if there is anything helpful.
Also check to see if php-fpm is running by keying in
ps aux | grep php-fpm
Also it looks like you’ve looked at the apache error file, those errors don’t really tell us anything.
The memory limit concerns me depending on where you found that. If that was under the TFTP section that should have been 127MB (old value) and 275MB (new value). That value allocates memory with FOS Linux when capturing or deploying images. It has nothing to do with the web ui or scheduling a task. PXE booting and imaging yes.
-
Hi @george1421,
Thanks for your response. Here’s more info:
I don’t see anything useful in the php-fpm log. This appears when I first click on the scheduled tasks link but nothing else that I can find:
[12-Nov-2020 10:12:04.930531] DEBUG: pid 24205, fpm_children_make(), line 421: [pool www] child 24495 started
[12-Nov-2020 10:12:04.930551] DEBUG: pid 24205, fpm_pctl_perform_idle_server_maintenance(), line 427: [pool www] 1 child(ren) have been created dynamicallyI changed php-fpm log level to debug to ensure we see everything.
php-fpm is running:
ps aux | grep php-fp
root 25007 0.0 0.2 412524 44260 ? Ss 10:15 0:00 php-fpm: master process (/etc/php/7.1/fpm/php-fpm.conf)
www-data 25010 0.4 0.1 413220 23072 ? S 10:15 0:01 php-fpm: pool www
www-data 25011 0.3 0.1 641468 32120 ? S 10:15 0:01 php-fpm: pool www
www-data 25012 0.3 0.1 413212 22432 ? S 10:15 0:01 php-fpm: pool www
www-data 25013 0.3 0.1 413248 22956 ? S 10:15 0:01 php-fpm: pool www
www-data 25014 0.3 0.1 640588 31276 ? S 10:15 0:01 php-fpm: pool www
www-data 25071 0.3 0.1 413200 22716 ? S 10:15 0:01 php-fpm: pool wwwIn regards to memory limit, the setting I changed was under FOG Settings->General Settings->Memory Limit. The tooltip indicates: “Default setting is the memory limit set in php.ini” I’m not sure if this value is actually set anywhere. It’s set to 512 in the UI but I dont see it reflrected here:
/etc/php/7.1/fpm/php.ini still has memory_limit = 128MThanks for your help!
-
@nehsa May I ask which Linux OS and version you use? Looks like Debian or Ubuntu put please let us know which one exactly. With one of those there might be a
/var/log/php-fpm/www-error.log
file…Manually adjust the memory limit in the file and restart php-fpm (
systemctl restart php7.1-fpm
). -
Hi Sebastian,
We’re running Ubuntu 16.04. I don’t have an /var/log/php-fpm/www-error.log. In my case, all logging goes into php7.1-fpm.log. I set the log level to DEBUG but don’t see anything valuable being logged. Only theses messages when I click on the “Scheduled Tasks” link:
[12-Nov-2020 10:12:04.930531] DEBUG: pid 24205, fpm_children_make(), line 421: [pool www] child 24495 started
[12-Nov-2020 10:12:04.930551] DEBUG: pid 24205, fpm_pctl_perform_idle_server_maintenance(), line 427: [pool www] 1 child(ren) have been created dynamicallyIncreasing the memory doesn’t appear to help anything.
Thanks,
Jesse
-
Hi @george1421 and @Sebastian-Roth,
I figured out what the issue was…
I have a custom lab website that has a button “Create Fog Task” that creates an CRON-style scheduled task via the FOG api. Last week, I decided to change this button from creating the tasks at 1AM to Midnight.
This was the problem…
Tasks were created with stHour of NULL (in the FOG mySQL database) because I was passing in 0 (for midnight) via the API call. This broke the website when it was trying to build the “Start Time” field. Change the site back to passing 1 for 1AM fixed the issue.
e.g.
http://lab/fog/scheduledtask/new
{
“id”:“18”,
“name”:“MITX-PC”,
“description”:“”,
“type”:“C”,
“taskType”:“1”,
“minute”:“35”,
“hour”:“0”, <-- HERE
“dayOfMonth”:““,
“month”:””,
“dayOfWeek”:“*”,
“isGroupTask”:“”,
“hostID”:“4”,
“shutdown”:“”,
“other1”:“”,
“other2”:“-1”,
“other3”:“fog”,
“other4”:“1”,
“other5”:“”,
“scheduleTime”:“0”,
“isActive”:“1”,
“imageID”:“0”,Out of curiousity, how would you create a scheduled task for midnight via the API?
Thanks for both of your help!
Jesse
-
@nehsa said in 504 Gateway Timeout hitting fog/scheduledtask/list:
“hour”:“0”, <-- HERE
So setting a zero in the api causes it to write a null character into the database?
@Tom-Elliott @JJ-Fullmer Have you guys seen this with the API. I would expect maybe a blank to cause a null but not a zero.
-
@george1421 I have seen some strange behavior with 1’s and 0’s related to the hostPending field in the database that I’m still working out with @Sebastian-Roth . I believe there’s some notes buried somewhere in my post here https://forums.fogproject.org/topic/14607/weird-host-behavior-some-disappearing-losing-primary-mac-some-suddenly-needing-approval where I discovered I had some hosts says pending = 1 some pending = ‘1’ and then pending = 0 and pending = ‘0’.
I ended up making all of them string values (‘1’ or ‘0’ rather than 1 or 0).
It doesn’t seem to be universally true throughout everything but I have seen it.@nehsa
Also, I also want to know how to make a scheduled task via the api. I’ve messed with that idea and haven’t been able to figure out the json needed. My workaround has been to schedule a reboot in windows using a combination of powershell date math (to figure out seconds till a given time) andshutdown.exe /r /t $seconds
then I stop the fog service on the host (so it doesn’t reboot the host) and queue the task for right now.
It would be much more elegant to do a delayed scheduled task.I think the required fields may be something like
{ "taskTypeID": 1, "type":"S", "scheduleTime":"1585616580" }
I have that commented out in some code but sadly didn’t write down what that time meant… but regardless I’m pretty sure it didn’t actually work. I think I probably just scheduled a task and did a get call on tasks to see what it looked like. If I didn’t, that’s what I should have done…
@Tom-Elliott might have some insight on scheduling a task via the api -
@nehsa Great you figured out it was the hour set to zero causing this!! Never would have figured this one out I guess.
Now digging through the code I found that it’s not just the API setting an empty value instead of 0 in the DB but also another issue in the code that moved the “empty hour” down all the way to the “day of week” variable. If that one is empty it’s actually causing the dead lock I found. So I have nailed this down. Just needs some fixing which I will do soon!
Thank you for bringing this issue up and helping in finding what’s causing this!
-
@Tom-Elliott The reported behavior of a zero value input (“0”) to the API being stored as empty/null is surely caused by the check in line 916 in router.class.php. I am wondering why we want to skip on
!$val
(essentially skip zero, empty strings/arrays) as it ends up in the DB as empty string anyway. What do you think? I am not sure how much of the known API behavior this will break if changed? -
@sebastian-roth Should we do something like:
if ($key == 'id' || (!is_numeric($val) && !$val) ) {
Or (I like this one):
if ($key == 'id' || NULL === $val ) {
Or:
if ($key == 'id' || is_null($val) } {
-
@nehsa I finally figured out scheduled tasks via the api.
I have it written out with a powershell example (with verbose output so you can see the POST url used and the json data) in post
https://forums.fogproject.org/post/139328Give that a looksee and let me know if you want more examples. Also in that Thread @Tom-Elliott explained what some of the needed fields for a scheduled or crontask do.
-
@nehsa Please be aware of the fixes we pushed, one to address the timeout as discussed below and the other to change the API parameter “taskType” to “taskTypeID” to distintuish it from a sub element that was named alike (see details here).
Using the latest version of
dev-branch
you will be able to schedule at 0 PM as well.Moved the topic to bug reports.
-
Hi there, my apologies for not responding earlier. I was on a mini-vacation. As much as one can have in the US now anyway not being able to travel anywhere…
I’m glad you weren’t stuck in my absense. I want to look at the PowerShell stuff you’ve created for FOG. I was just reading about it. I use PowerShell a lot in my work and can use it.
Here’s what I did for your reference in JavaScript/JQuery:
I have an HTML button set to hit this JavaScript function. It just passes the string name of the machine to create the task for. e.g. “LABMACHINE1”
function create_fog_scheduled_task(node_name){ var hour_to_deploy = 1; // 1am // get the host list $.ajax({ url: "http://lab/fog/host/list", type: "GET", headers: { "fog-api-token": fog_api_token, "fog-user-token": fog_user_token }, contentType: 'application/json', success: function (data) { console.log("Retrieved host list:"); console.log(data); // get the host id var host_id = -1 for (x = 0; x <= data.hosts.length; x++) { if (data.hosts[x].name.toUpperCase() == node_name.toUpperCase()) { host_id = data.hosts[x].id; break; } } console.log("Creating scheduled task for: " + node_name + ", ID: " + host_id); // randomize when task starts to prevent all starting at once.. var minute = Math.floor(Math.random() * 60) // 0 to 59 var post_data = { "name": node_name, "description": "", "type": "C", "taskType": 1, "minute": parseInt(minute), "hour": hour_to_deploy, "dayOfMonth": "*", "month": "*", "dayOfWeek": "*", "isGroupTask": 0, "hostID": parseInt(host_id), "shutdown": 0, "other1": "", "other2": "-1", "other3": "fog", "other4": true, "other5": "", "scheduleTime": "", "isActive": 1, "imageID": 0 } $.ajax({ url: "http://lab/fog/scheduledtask/new", type: "POST", headers: { "fog-api-token": fog_api_token, "fog-user-token": fog_user_token }, contentType: 'application/json', data: JSON.stringify(post_data), success: function (data) { console.log("Scheduled task created for " + node_name); } }); } }); }
@Sebastian-Roth, @JJ-Fullmer, @Tom-Elliott -
In troubleshooting this issue, I noticed my code can end up passing minute as zero as well which also ends up as NULL in the database. This doesn’t appear to cause an issue like stHour.The UI/PHP set the minute to 0 in this case:
2020-11-24 01:00Even though the database shows (look at SKYLAKE-PC):
mysql> select stName, stHour, stMinute, stActive from scheduledTasks; +-----------------+--------+----------+----------+ | stName | stHour | stMinute | stActive | +-----------------+--------+----------+----------+ | EEAP-PC | 1 | 33 | 1 | | BEELINK-PC | 1 | 42 | 1 | | CALPELLA-PC | 1 | 49 | 1 | | EVGAQUAD-PC | 1 | 49 | 1 | | ICELAKE2-NUC | 1 | 13 | 1 | | JBMIX-PC | 1 | 53 | 1 | | MAMMOTH-PC | 1 | 51 | 1 | | MASTER-PC | 1 | 53 | 1 | | MITX-PC | 1 | 35 | 1 | | MODISC-PC | 1 | 27 | 1 | | OPTIPLEX7010-PC | 1 | 39 | 1 | | PRECISION-PC | 1 | 42 | 1 | | SKYLAKE-PC | 1 | | 1 | | SMC-X10SAE | 1 | 13 | 1 | | THERMALTAKE-PC | 1 | 19 | 1 | | THINK-PC | 1 | 24 | 1 | | SMC-X11SAE | 1 | 23 | 1 | | VALUELINEI7-PC | 1 | 39 | 1 | | ICELAKE-NUC | 1 | 17 | 1 | +-----------------+--------+----------+----------+ 19 rows in set (0.00 sec)
I still added this just in case to my website:
// randomize when task starts to prevent all starting at once.. var minute = Math.floor(Math.random() * 60) // 0 to 59 if (minute == 0) { minute = 1; };
Thanks everyones for your help on this! It’s very much appreciated!
Jesse