High MySQL CPU Usage Bogging Down Server
-
@tom-elliott I need to look at the document to make sure it's not capable of the zombie attack that impacted GitHub, but here is what I set up before: https://forums.fogproject.org/topic/10717/can-php-fpm-make-fog-web-gui-fast/3
[edit] Never mind, I already put the bind to 127.0.0.1 in the tutorial, so we are protected already. [/edit]
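For reference, the relevant line sits in the MySQL server config; on Ubuntu that is usually /etc/mysql/mysql.conf.d/mysqld.cnf (the path may differ on other distros):

[mysqld]
# Listen on the loopback interface only, so MySQL is never reachable
# from the network; local processes like Apache/PHP can still connect.
bind-address = 127.0.0.1

With that in place, remote hosts cannot talk to the database directly, which is what shuts the door on that class of attack.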
-
I will keep an eye on the server tomorrow and let you guys know how it goes.
-
Update
Today was much better. The load stayed way down, nothing like we have seen before, so the fix is working in that respect. CPU utilization is still running close to 100%, but the server never crashed like it did before. Because of the CPU usage the GUI was slow to respond. Did some talking today, and we might just blow away all the images (we have about 600 of them, though not all of them have image files on the server) and the database, and start brand new when we move to the new server build. So I am going to see if we can quickly identify a good chunk of hosts we can remove in the meantime.
@Tom-Elliott Our sysadmin used phpMyAdmin to access the MySQL server and that was not working for him today. What do we need to do to get that to work with your fix from yesterday?
-
@uwpviolator We may have to manually install phpMyAdmin. The one Ubuntu ships relies on libapache2-mod-php, so moving to php-fpm probably broke its ability to communicate. I can work with you tomorrow around 3pm if that works?
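If we do reinstall it, the piece that changes is how Apache hands .php requests to PHP. With mod_php gone, the vhost needs a proxy handler along these lines (a sketch; it requires mod_proxy and mod_proxy_fcgi, and the port must match your php-fpm pool):

# Route .php files to the php-fpm daemon instead of mod_php.
# Enable the needed modules first: a2enmod proxy proxy_fcgi
<FilesMatch "\.php$">
    SetHandler "proxy:fcgi://127.0.0.1:9000"
</FilesMatch>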
-
@tom-elliott CST or EST?
-
@uwpviolator EST
-
@uwpviolator Is this a virtual machine? If so, what is its configuration (vCPU and memory)? What makes up the disk subsystem?
Now that you are running php-fpm, what is the top CPU hog? Is it mysql?
(My intuition is telling me it is.) I wonder if mysql needs to be tuned, or if there is an underlying performance issue (like the disk subsystem) that is being taxed and causing the high CPU load. I did some benchmarking a while ago comparing different setups. I'm not saying any of it is relevant to the case at hand, just trying to connect more data points. Ref: https://forums.fogproject.org/topic/10459/can-you-make-fog-imaging-go-fast
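A quick way to answer the CPU-hog question (just a sketch; any equivalent tooling works):

# Top 10 processes by CPU usage (plus the header line).
ps aux --sort=-%cpu | head -n 11

# If mysqld is at the top, check whether it is actually computing or
# mostly waiting on disk; a high %iowait here points at the disk subsystem.
# (iostat is in the sysstat package)
iostat -x 5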
-
@tom-elliott Yeah that works.
-
@Tom-Elliott @Joe-Schmitt @george1421
FOG is tanking. Here is the Apache log. Any ideas?
[Fri Mar 09 10:43:21.688093 2018] [proxy_fcgi:error] [pid 974] [client 10.75.1.5:61438] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.688999 2018] [proxy:error] [pid 974] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.689017 2018] [proxy_fcgi:error] [pid 974] [client 10.129.153.197:56845] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.696951 2018] [proxy:error] [pid 973] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.696965 2018] [proxy_fcgi:error] [pid 973] [client 10.75.1.21:50126] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.698089 2018] [proxy:error] [pid 973] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.698102 2018] [proxy_fcgi:error] [pid 973] [client 10.77.150.58:50923] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.702536 2018] [proxy:error] [pid 974] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.702549 2018] [proxy_fcgi:error] [pid 974] [client 10.76.1.216:58741] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.703391 2018] [proxy:error] [pid 974] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.703404 2018] [proxy_fcgi:error] [pid 974] [client 10.86.150.85:49674] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.704294 2018] [proxy:error] [pid 974] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.704307 2018] [proxy_fcgi:error] [pid 974] [client 10.94.151.240:61237] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.708168 2018] [proxy:error] [pid 974] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.708181 2018] [proxy_fcgi:error] [pid 974] [client 10.120.153.201:53910] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.722465 2018] [proxy:error] [pid 974] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.722485 2018] [proxy_fcgi:error] [pid 974] [client 10.84.150.64:53527] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:21.726900 2018] [proxy:error] [pid 976] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Fri Mar 09 10:43:21.726916 2018] [proxy_fcgi:error] [pid 976] [client 10.119.151.67:63447] AH01079: failed to make connection to backend: 127.0.0.1
[Fri Mar 09 10:43:29.286285 2018] [mpm_prefork:error] [pid 971] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
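Those (111) Connection refused / AH00957 lines mean nothing was answering on 127.0.0.1:9000, i.e. php-fpm itself was down or not listening there; the MaxRequestWorkers warning at the end is most likely just the downstream pile-up while Apache waits on the dead backend. A quick check might look like this (the service name depends on the installed PHP version, e.g. php7.0-fpm):

# Is php-fpm running, and is anything listening on 127.0.0.1:9000?
# (service name varies with the PHP version: php7.0-fpm, php7.2-fpm, etc.)
systemctl status php7.0-fpm
ss -lnt | grep ':9000'

# If it is down, restart it and look at its log for the reason it died.
systemctl restart php7.0-fpm
journalctl -u php7.0-fpm --since "15 minutes ago"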
-
Looks like encryption broke again. I am starting to think it's a timing issue. Because we have so many hosts, we have to set the check-in time so long that when we hit a peak in the day we blow past the encryption window of 30 minutes (I think that's what I heard before), and then the clients all freak out.
-
So, worked remotely and was able to help determine part of the problem.
Installed apachetop and took a look; found a lot of requests still hitting servicemodule-active.php. We moved away from this call as it had some weird issues of its own. Moved the file out of the way and the GUI became much less sluggish. Worked to get phpMyAdmin going as well. That wasn't a problem with php-fpm vs. libapache2-mod-php (though I did make sure phpMyAdmin would run through the php-fpm setup).
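For anyone wanting to repeat the apachetop step, it is roughly this (the log path is the Ubuntu default; adjust to your setup):

# Watch which URLs Apache is serving hardest, in real time.
apt-get install apachetop
apachetop -f /var/log/apache2/access.log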
The other part of the problem was the rewrite rules that handle the API system. Moved those rules into the Directory stanza of fog.conf, and made this update in the working-1.5.1 branch as well. I haven't had time to test whether this move still allows the system to work, but it should. Will work on that this weekend (testing to see that it works properly).
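For context, moving the rules "into the Directory stanza" means the rewrite lines now live inside the <Directory> block of fog.conf rather than at the vhost level, roughly like this (illustrative only, not the exact rules FOG ships, and your web root may differ):

<Directory /var/www/fog>
    RewriteEngine On
    # Hand anything that is not a real file or directory to the API entry point.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^api/.* api/index.php [QSA,L]
</Directory>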
Also, we are still working to help purge the db of old stuff that’s no longer needed.