503 Error and stuck at "attempting to update database"
I’m having an issue that I have no idea how to solve. I am capturing an image and deploying a different image to 20 computers. Once the upload finished, it gave me an error when trying to update the database.
I am not able to get into the web interface; it gives me a 503 error.
I am able to SSH into the server though!
I know that I can just restart the server, but this is not the first time this has happened, so I would like to fix it for good.
FOG Version: 1.5.4
Server has 4TB storage and 32GB RAM (don’t think this is the issue but in case someone asks)
Any help is extremely appreciated! Thanks!
!!Update: It now gives me a 504 Gateway error (bottom of screen)
So I changed the pm.max_children setting to 150
Good catch on the max_children setting. If you had been running a current version of FOG, that setting would be 35. I think 150 is a bit excessive, since each child consumes RAM. I think 50 might be a better choice to start with. Most people don’t start 20 simultaneous unicast streams, so I can say we’ve not run into this issue before with the standard 35 max children. 5 IS a bit low.
So to recap, you had two issues: one was the memory limit and the other was the max children setting.
Once you get past your image push crunch, you might consider upgrading to the latest version of FOG.
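As a rough illustration of why 150 children may be excessive, the sketch below multiplies the number of children by the 256M memory_limit discussed in this thread. It is only a worst-case ceiling under the assumption that every child reaches memory_limit at the same time; real usage is normally much lower. The function name worst_case_mb is made up for this example.

```shell
#!/bin/sh
# Worst-case PHP memory if every php-fpm child reached memory_limit at once.
# $1 = pm.max_children, $2 = memory_limit in MB (illustrative inputs only).
worst_case_mb() {
    echo $(( $1 * $2 ))
}

worst_case_mb 35 256    # current FOG default mentioned above
worst_case_mb 50 256    # suggested starting point
worst_case_mb 150 256   # value chosen in this thread
```

On those assumptions, 150 children at 256 MB each come to 38400 MB, which is more than the 32 GB (32768 MB) in this server, while 50 children stay at 12800 MB.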
@george1421 So I am pretty sure it is fixed now. I looked at the PHP logs and found this error:
[20-Aug-2019 17:28:19] WARNING: [pool www] server reached pm.max_children setting (5), consider raising it
So I changed the pm.max_children setting to 150, just to be sure, and it is now working perfectly!
I re-imaged the same lab with 20 computers and monitored the server to see how many php-fpm processes were running. At one point I had 10 instances, so bringing that up from 5 definitely helped.
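One possible way to do that kind of monitoring is to count the pool workers in the ps output. This is a minimal sketch, assuming the worker processes show up with a title like "php-fpm: pool www" (the pool name www comes from the log line above); the function name count_pool_workers is hypothetical.

```shell
#!/bin/sh
# Count php-fpm pool workers in `ps` output read from stdin.
# Assumes worker titles contain "php-fpm: pool". The [p] bracket keeps
# the grep process itself from being counted when fed by `ps aux`.
count_pool_workers() {
    grep -c '[p]hp-fpm: pool' || true
}

# One-off count on the server:
#   ps aux | count_pool_workers
# Refresh every 2 seconds while a lab is imaging:
#   watch -n 2 "ps aux | grep -c '[p]hp-fpm: pool'"
```

If the count keeps sitting at the pm.max_children value during a deployment, the pool is probably saturated.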
@Sebastian-Roth I restarted the service and the server too just to be sure.
Made the changes on the server and deployed the image to a lab of 20 computers and it’s giving me the same issue.
Did you actually restart the php7.1-fpm service, or the whole server, to make it apply those changes?
@george1421 I checked the logs and I think I found the issue. I’m not at my office anymore so I’ll test it tomorrow and update the thread.
Thank you so much for your support, quick response, and excellent help!!
@rodluz OK, we’ll need to look in the /var/log/php-fpm directory; there should be an error log in there. We’ll need to see what is throwing the error. If there isn’t anything usable in there, then check the Apache error log. But it does sound like a memory exhaustion issue.
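A small sketch of how that log check could look. The log path is passed in as an argument because it varies by distro and PHP version; the path in the usage comment is an assumption, not something confirmed in this thread. Both function names are hypothetical.

```shell
#!/bin/sh
# Print WARNING and ERROR lines from a php-fpm log file given as $1.
fpm_log_problems() {
    grep -E 'WARNING|ERROR' "$1"
}

# Count how many times the pool reported hitting pm.max_children.
max_children_hits() {
    grep -cF 'reached pm.max_children' "$1" || true
}

# Example use (the log path is an assumption, adjust to your system):
#   fpm_log_problems /var/log/php7.1-fpm.log | tail -n 20
#   max_children_hits /var/log/php7.1-fpm.log
```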
So just to be clear, you are doing 20 simultaneous unicast images and it’s throwing the 503 error? (Personally, I think that’s a bit more than a single 1 GbE link can manage.) But this error means that php-fpm is not responding back to Apache within the Apache timeout value.
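To put a rough number on the 1 GbE remark, the sketch below divides the theoretical line rate evenly across the clients. It ignores protocol overhead, disk speed, and compression, so treat the result as an upper bound; the function name per_client_mbytes is made up for this example.

```shell
#!/bin/sh
# Theoretical per-client throughput in MB/s when a link is shared evenly.
# $1 = link speed in Mbit/s, $2 = number of simultaneous unicast clients.
per_client_mbytes() {
    awk -v mbps="$1" -v n="$2" 'BEGIN { printf "%.2f\n", mbps / 8 / n }'
}

per_client_mbytes 1000 20   # 1 GbE shared by 20 clients
```

With these inputs, each client gets at most about 6.25 MB/s, so every deployment runs for a long time and keeps the server busy throughout.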
@george1421 Made the changes on the server and deployed the image to a lab of 20 computers and it’s giving me the same issue.
@george1421 Thank you so much! I just did that and will test it by imaging more labs. If I keep having this issue today or tomorrow, I will update the thread.
@rodluz OK, that is the right one; it shifts around depending on the distro and the version of PHP installed. Adjust the values as I posted, then reboot the FOG server. Oh, your FOG server should have at least 4GB of RAM allocated to it. It only needs 1 or 2 vCPUs. You may need more vCPUs if you have a large campus where more than 500 computers running the FOG Client will be hitting your FOG server.
@george1421 I found it in /etc/php/7.1/fpm/pool.d
@rodluz What path did you find it in? There should (really) only be one www.conf in the php path.
If it was commented out, the default may be 32MB. Set it to 256M, save the file, and reboot. That should address the failure under heavy load; basically it was running out of process memory and couldn’t recycle fast enough.
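For anyone who prefers not to edit the file by hand, here is a hedged sketch of the same change done with sed. It uncomments the memory_limit line if needed and sets the value, writing the result to stdout so the original file is not touched. The function name set_memory_limit is hypothetical, and the path in the usage comment is the one reported in this thread.

```shell
#!/bin/sh
# Print the pool config in $1 with php_admin_value[memory_limit] set to $2.
# A leading ";" (php-fpm comment marker) on that line is removed.
# The input file itself is left unchanged.
set_memory_limit() {
    sed -E 's/^[;[:space:]]*php_admin_value\[memory_limit\][[:space:]]*=.*/php_admin_value[memory_limit] = '"$2"'/' "$1"
}

# Example use (review the diff before replacing the real file):
#   conf=/etc/php/7.1/fpm/pool.d/www.conf
#   set_memory_limit "$conf" 256M > /tmp/www.conf.new
#   diff "$conf" /tmp/www.conf.new
```

After copying the new file into place, php-fpm (or the whole server, as suggested above) still needs a restart for the change to take effect.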
@george1421 Thank you! The php_admin_value line is actually commented out (I don’t know why). Does this mean some other setting could be active somewhere else?
Also, what about adding more than 256M to that value?
@rodluz OK, so it runs fine until you have a heavy demand on the FOG server. Since you are running an older version of FOG, we will probably need to tweak a value in the www.conf file. Search the /etc directory for it; it will probably be in the /etc/php/7.1 path. Edit the www.conf file: there is a line towards the bottom where it allocates 32MB of RAM. We need to bump that to 256MB, then restart the FOG server. That should address the fail-under-heavy-load issue.
Let me get the exact setting.
There should be a line like this:
php_admin_value[memory_limit] = 32M
Adjust it to:
php_admin_value[memory_limit] = 256M
Also confirm that the following setting has the right value of 2000:
pm.max_requests = 2000
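The search and the check described above could be sketched like this. The helper names find_pool_conf and show_pool_settings are made up for this example. Commented-out lines (starting with ";") are shown too, since a commented line means the setting is not actually applied from this file.

```shell
#!/bin/sh
# List every www.conf under the directory in $1 (defaults to /etc).
find_pool_conf() {
    find "${1:-/etc}" -type f -name www.conf 2>/dev/null
}

# Show the three settings discussed here, with line numbers,
# including lines that are commented out with a leading ";".
show_pool_settings() {
    grep -nE '^[;[:space:]]*(php_admin_value\[memory_limit\]|pm\.max_children|pm\.max_requests)[[:space:]]*=' "$1"
}

# Example use:
#   find_pool_conf /etc
#   show_pool_settings /etc/php/7.1/fpm/pool.d/www.conf
```

If find_pool_conf prints more than one path, it is worth checking which one the installed php-fpm version actually loads before editing.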
@george1421 I can’t access the web interface at all, it gives me either a 503 error or 504. The FOG server works fine for day-to-day imaging, but this happens when I am trying to image multiple labs at a time.
I ran the command that you put in your comment and this is the result
I find it strange that it’s replying with html codes in the picture.
You can access everything from the web ui OK? With the gateway timeouts I might expect that php-fpm isn’t running on the FOG server.
ps aux|grep php-fpm
Has the fog server ever worked correctly or was it working fine one day and then broken the next?