Very high CPU usage httpd, mysqld, FOGMulticastManager FOG trunk@5224
-
Hi all, due to a non-FOG related issue on a couple of FOG boxes I had to rebuild them and since doing so am encountering very high CPU usage and multiple httpd processes.
top - 13:09:42 up 22:33, 1 user, load average: 56.32, 45.19, 42.16
Tasks: 221 total, 47 running, 174 sleeping, 0 stopped, 0 zombie
Cpu(s): 56.9%us, 42.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 1922092k total, 1666716k used, 255376k free, 92020k buffers
Swap: 4128764k total, 2304k used, 4126460k free, 644060k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28249 mysql 20 0 1373m 59m 6588 S 11.0 3.2 1:16.55 mysqld
28903 apache 20 0 328m 15m 3840 R 4.2 0.8 0:02.84 httpd
28452 apache 20 0 421m 15m 4484 R 3.9 0.8 0:19.04 httpd
28666 apache 20 0 421m 15m 4480 R 3.9 0.8 0:13.70 httpd
28830 apache 20 0 425m 18m 4304 R 3.9 1.0 0:05.28 httpd
28838 apache 20 0 420m 14m 4392 R 3.9 0.8 0:02.58 httpd
28855 apache 20 0 424m 18m 4692 R 3.9 1.0 0:04.97 httpd
28883 apache 20 0 421m 14m 4256 R 3.9 0.8 0:03.23 httpd
28937 apache 20 0 425m 19m 4368 R 3.9 1.0 0:03.41 httpd
28966 apache 20 0 326m 13m 3872 R 3.9 0.7 0:02.60 httpd
28477 apache 20 0 422m 16m 4692 R 3.6 0.9 0:20.11 httpd
28723 apache 20 0 425m 19m 4684 R 3.6 1.1 0:11.49 httpd
28841 apache 20 0 326m 13m 3732 R 3.6 0.7 0:05.58 httpd
28885 apache 20 0 423m 16m 4056 R 3.6 0.9 0:04.08 httpd
28902 apache 20 0 425m 18m 4088 R 3.6 1.0 0:04.18 httpd
28919 apache 20 0 421m 14m 4112 S 3.6 0.8 0:03.90 httpd
28543 apache 20 0 421m 15m 4436 D 3.2 0.8 0:16.87 httpd
28846 apache 20 0 420m 14m 4416 R 3.2 0.8 0:04.00 httpd
28858 apache 20 0 420m 14m 4376 R 3.2 0.8 0:05.29 httpd
28910 apache 20 0 325m 13m 3844 S 3.2 0.7 0:04.08 httpd
28947 apache 20 0 420m 14m 4392 R 3.2 0.8 0:02.68 httpd
29022 apache 20 0 421m 14m 3752 S 3.2 0.7 0:01.36 httpd
29030 apache 20 0 420m 13m 3760 S 3.2 0.7 0:01.22 httpd
28704 apache 20 0 424m 17m 4772 R 2.9 0.9 0:14.33 httpd
28767 apache 20 0 421m 15m 4312 S 2.9 0.8 0:10.92 httpd
28848 apache 20 0 326m 14m 3892 S 2.9 0.8 0:05.33 httpd
tail /var/log/httpd/access_log
10.***.***.51 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/servicemodule-active.php?moduleid=displaymanager&mac=A4:1F:72:85:89:F4%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"
10.***.***.47 - - [20/Apr/2016:12:59:51 +0100] "GET /fog/service/servicemodule-active.php?moduleid=snapinclient&mac=74:27:EA:EB:14:97%7C%7C00:00:00:00:00:00:00:E0%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"
10.***.***.2 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/snapins.checkin.php?mac=FC:AA:14:19:72:40%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 4 "-" "-"
10.***.***.9 - - [20/Apr/2016:12:59:51 +0100] "GET /fog/service/servicemodule-active.php?moduleid=autologout&mac=78:45:C4:0D:EE:BD%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"
10.***.***.4 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/Printers.php?mac=8C:89:A5:90:59:6B%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 12 "-" "-"
10.***.***.56 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/servicemodule-active.php?moduleid=displaymanager&mac=FC:AA:14:12:F8:DE%7C%7C00:00:00:00:00:00:00:E0%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"
10.***.***.59 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/servicemodule-active.php?moduleid=clientupdater&mac=A4:1F:72:85:87:EE%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"
10.***.***.7 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/servicemodule-active.php?moduleid=greenfog&mac=FC:AA:14:19:76:E2%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"
10.***.***.31 - - [20/Apr/2016:12:59:52 +0100] "POST /fog/service/getversion.php HTTP/1.1" 200 4 "-" "-"
10.***.***.95 - - [20/Apr/2016:12:59:51 +0100] "GET /fog/service/hostname.php?moduleid=hostnamechanger&mac=74:27:EA:CE:0F:26%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 233 "-" "-"
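For anyone wanting to see which service endpoints dominate a log like this, here is a minimal Python sketch that tallies requests per URL path. The function name tally_endpoints and the regular expression are my own for illustration, not part of FOG, and the sample lines are a handful copied from the excerpt above.

```python
import re
from collections import Counter

# Pulls the request target out of an access log line, e.g.
# "GET /fog/service/Printers.php?mac=... HTTP/1.1"
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')

def tally_endpoints(lines):
    """Count requests per URL path, ignoring the query string."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts

# A few lines from the access_log excerpt (IP addresses masked as posted).
sample = [
    '10.***.***.51 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/servicemodule-active.php?moduleid=displaymanager&mac=A4:1F:72:85:89:F4%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"',
    '10.***.***.9 - - [20/Apr/2016:12:59:51 +0100] "GET /fog/service/servicemodule-active.php?moduleid=autologout&mac=78:45:C4:0D:EE:BD%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 5 "-" "-"',
    '10.***.***.4 - - [20/Apr/2016:12:59:52 +0100] "GET /fog/service/Printers.php?mac=8C:89:A5:90:59:6B%7C%7C00:00:00:00:00:00:00:E0&newService=1 HTTP/1.1" 200 12 "-" "-"',
    '10.***.***.31 - - [20/Apr/2016:12:59:52 +0100] "POST /fog/service/getversion.php HTTP/1.1" 200 4 "-" "-"',
]

counts = tally_endpoints(sample)
for path, hits in counts.most_common():
    print(hits, path)
```

To run it against the real log, pass an open file instead of the sample list, for example tally_endpoints(open("/var/log/httpd/access_log")).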
I suspect that, due to the rebuild, the clients are trying to check in to the FOG server with the old client version and CA cert. If I stop all 3 services (httpd, mysqld, FOGMulticastManager) and only start httpd, all the processes spawn again and CPU ramps up again, making FOG unusable.
Other than updating the new client/SSL combo on all my hosts, is there a way around this to stop the httpd processes maxing out the CPU? Assuming of course that’s what’s causing the problem…
cheers. Kiweegie.
-
@Quazz Hi Quazz, yes, the main problem of high CPU usage appears to be much better. The main server, which has the most hosts, is sitting at roughly 5-30% CPU load with no active tasks running. I’m testing now with images running to double-check under load.
cheers Kiweegie.
-
@Kiweegie Any update on this? Is it better now?
-
@Quazz You’re quite right, apologies Quazz I stand corrected. I was in /opt/git/trunk and presumed running ./bin/installfog.sh would suffice. Must not be enough caffeine in the system this morning…
-
@Kiweegie No, you ran this command
./bin/installfog.sh
which indicates you are not in the bin directory.
-
@Quazz Thanks Quazz but I am inside the bin directory as my screenshot shows I believe. It would fail well before there if not.
-
@Kiweegie You have to run installfog.sh while inside the bin directory, otherwise it will fail in more ways than one (just checked this on my installation to be sure).
-
@Sebastian-Roth Just upgraded trunk (5554) using git now and getting this error on install
./bin/installfog.sh: line 452: installPackages: command not found
* Confirming package installation
./bin/installfog.sh: line 456: confirmPackageInstallation: command not found
* Configuring services
* What is the storage location for your images directory? (/images)
-
@Kiweegie There have been a lot of improvements in the last two weeks! Mainly in the FOG client. Please upgrade to the latest version (first try the client, then FOG server) and see if things have changed for you as well.
-
@Wayne-Workman Hi Wayne
apologies for not getting back to you til now, other projects keep piling up… I’ve checked the DB as you suggested via the cli and although only 3 hosts are listed in the GUI the cli shows 270 hosts…
| 270 | FOG01-current | | | 3 | 0 | 2016-05-21 11:28:20 | 2016-05-21 11:44:32 | foguser | 0 | | | | | | | | | | | | | | 0000-00-00 00:00:00 | 6 | | | 1 |
Above is last snippet.
I guess I can just drop the DB and start over?
cheers Kiweegie.
-
@Kiweegie Lots of improvements in the last week to the new client, can you update and see if this affects the CPU load at all?
I suspect there will be a quite dramatic decrease after moving to client 010.5
-
@Kiweegie go into mysql via CLI and look at the actual data, and let us see what you see.
mysql
use fog
select * from hosts;
-
@Tom-Elliott Hey Tom
another anomaly turned up since most recent changes. My Helpdesk team complained that all desktops with FOG client installed were boot looping. Work around was to remove all clients from the host management section in FOG. This permitted desktops to boot and not restart - suspect this is down to the old FOG client.
Only today getting time to investigate this, and I noticed that when adding a brand new host it seems to be accepted OK, but nothing shows in the GUI when clicking List all hosts. If I export the list it shows one host only (I’ve added 2), but if I try to re-add the 2nd host I get the error Hostname already exists.
Something funky going on with the database?
cheers Kiweegie
-
@Tom-Elliott Hey Tom, many thanks. Both servers now have access to the reports, FOG settings etc. pages. CPU usage is still high but the servers are usable enough, albeit slow. I’ll keep an eye on Jbob’s progress with the new client.
thanks again, Kiweegie.
-
Just pushed a fix to hopefully make things functional again (meaning cancelling/deleting/viewing everything).
-
For each module the client has, it does 1 request to servicemodule-active to see if it’s enabled. That means that in every cycle of the client it’s calling that file about 5-6 times. This is one of the things that v0.10.0 removes completely.
Keep in mind the new client was originally designed for a completely different form of server <-> client communication (socket connections). In order to be able to release it in a timely fashion, we retrofitted the legacy client’s communication method onto the new client. This is why we have so much “useless” network traffic. v0.10.0 provides a hybrid of sorts between the ideal network communication model we originally wanted and the legacy one.
-
Ok, @Developers what does servicemodule-active.php do? On Kiweegie’s servers, it’s getting 7 requests per second.
-
@Wayne-Workman Hi Wayne
here’s the output of apachetop on my main machine
last hit: 15:16:16 atop runtime: 0 days, 00:21:30 15:16:17
All: 22036 reqs ( 17.1/sec) 4721.2K ( 3747.7B/sec) 219.4B/req
2xx: 22036 ( 100%) 3xx: 0 ( 0.0%) 4xx: 0 ( 0.0%) 5xx: 0 ( 0.0%)
R ( 30s): 454 reqs ( 15.1/sec) 90.0K ( 3072.1B/sec) 203.0B/req
2xx: 454 ( 100%) 3xx: 0 ( 0.0%) 4xx: 0 ( 0.0%) 5xx: 0 ( 0.0%)

REQS REQ/S KB KB/S URL
210 7.00 1.0 0.0*/fog/service/servicemodule-active.php
53 1.77 86.9 2.9 /fog/management/other/ssl/srvpublic.crt
43 1.43 0.5 0.0 /fog/service/Printers.php
30 1.00 0.2 0.0 /fog/service/autologout.php
28 0.97 0.1 0.0 /fog/service/snapins.checkin.php
17 0.59 0.9 0.0 /fog/management/index.php
16 0.53 0.1 0.0 /fog/service/greenfog.php
15 0.50 0.1 0.0 /fog/service/printerlisting.php
14 0.47 0.1 0.0 /fog/service/getversion.php
14 0.48 0.1 0.0 /fog/service/jobs.php
13 0.45 0.1 0.0 /fog/service/hostname.php
1 0.04 0.0 0.0 *
and from my second machine
last hit: 15:15:47 atop runtime: 0 days, 00:21:10 15:15:48
All: 5274 reqs ( 4.2/sec) 1008.8K ( 813.4B/sec) 195.9B/req
2xx: 5259 (99.7%) 3xx: 15 ( 0.3%) 4xx: 0 ( 0.0%) 5xx: 0 ( 0.0%)
R ( 30s): 190 reqs ( 6.3/sec) 50.5K ( 1722.4B/sec) 272.0B/req
2xx: 190 ( 100%) 3xx: 0 ( 0.0%) 4xx: 0 ( 0.0%) 5xx: 0 ( 0.0%)

REQS REQ/S KB KB/S URL
80 2.67 0.4 0.0*/fog/service/servicemodule-active.php
30 1.07 49.2 1.8 /fog/management/other/ssl/srvpublic.crt
13 0.45 0.2 0.0 /fog/service/Printers.php
9 0.35 0.0 0.0 /fog/service/hostname.php
9 0.33 0.1 0.0 /fog/service/getversion.php
9 0.35 0.0 0.0 /fog/service/jobs.php
9 0.38 0.0 0.0 /fog/service/printerlisting.php
9 0.36 0.0 0.0 /fog/service/snapins.checkin.php
9 0.38 0.0 0.0 /fog/service/greenfog.php
8 0.35 0.4 0.0 /fog/management/index.php
5 0.17 0.0 0.0 /fog/service/autologout.php
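As a quick sanity check on those numbers, this small Python sketch works out what share of the 30-second window goes to servicemodule-active.php on each box. The figures are copied from the two apachetop outputs above; the helper name share_percent is just for illustration.

```python
# (requests to servicemodule-active.php, total requests) in the 30-second
# window, copied from the two apachetop outputs.
windows = {
    "main server": (210, 454),
    "second server": (80, 190),
}

def share_percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total

for name, (hits, total) in windows.items():
    print(f"{name}: {share_percent(hits, total):.1f}% of requests")
# main server: 46.3% of requests
# second server: 42.1% of requests
```

So on both machines that one file accounts for a bit under half of all requests in the window.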
Heading off home now but will be online later if more information required.
regards Kiweegie.
-
Some info for people experiencing high loads:
v0.10.0 of the client (currently undergoing release candidate testing) drastically cuts the amount of traffic used. In the “heaviest” cycle, the client will only make 3 requests. 1 to authenticate, 1 to get module settings, and 1 to get server settings (cycle time, client version,…). It also prevents run-away authentication, where the client had the potential to constantly spam the server’s authentication method.
Hopefully when v0.10.0 is released it should resolve any load issues. However, v0.10.0 will more than likely contain several bugs, as the entire code base has changed to support any OS and a completely new middleware API has been made to support the decrease in traffic. Since so much has changed, it’s almost guaranteed that I will have overlooked some minor things.
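To get a rough feel for what that reduction could mean on the server side, here is a back-of-the-envelope Python sketch. The per-cycle figures come from this thread (about 5-6 servicemodule-active checks per cycle for the current client, at most 3 requests per cycle for v0.10.0); the host count and cycle time are purely hypothetical, so plug in your own. Note the current-client figure covers only the module checks, not the other per-module requests, so the real total is higher.

```python
def requests_per_second(hosts, requests_per_cycle, cycle_seconds):
    """Average request rate if every host completes one cycle per interval."""
    return hosts * requests_per_cycle / cycle_seconds

HOSTS = 200          # hypothetical number of clients checking in
CYCLE_SECONDS = 60   # hypothetical client cycle time

# Current client: roughly 6 servicemodule-active checks per cycle.
current_checks = requests_per_second(HOSTS, 6, CYCLE_SECONDS)

# v0.10.0: at most 3 requests in the heaviest cycle.
new_worst_case = requests_per_second(HOSTS, 3, CYCLE_SECONDS)

print(f"current client, module checks only: {current_checks:.1f} req/s")
print(f"v0.10.0, heaviest cycle: {new_worst_case:.1f} req/s")
```

This assumes clients are spread evenly across the interval, which will not hold right after a server restart when everything checks in at once.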
-
@Wayne-Workman Not a problem. Installed on both boxes and currently running. I need to run myself shortly but will fire over the results before I go in about 20 mins.
cheers, Kiweegie.