Fog SVN 5020 and above CPU Hammered thread.
- 
 I am also still seeing this on this mornings SVN 5082 on ubuntu 14. 
 It does not affect the storage node whose cpu load is around 1. Here is TOP of the main server. lsof -i :80 output 
 lsof output.txtapache -v output jhales@934fog1:~$ apache2 -V [Tue Oct 27 13:18:00.309460 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_LOCK_DIR} is not defined [Tue Oct 27 13:18:00.309558 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_PID_FILE} is not defined [Tue Oct 27 13:18:00.309593 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_RUN_USER} is not defined [Tue Oct 27 13:18:00.309610 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_RUN_GROUP} is not defined [Tue Oct 27 13:18:00.309633 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_LOG_DIR} is not defined [Tue Oct 27 13:18:00.321367 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_RUN_DIR} is not defined [Tue Oct 27 13:18:00.321865 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_LOG_DIR} is not defined [Tue Oct 27 13:18:00.322102 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_LOG_DIR} is not defined [Tue Oct 27 13:18:00.322130 2015] [core:warn] [pid 21052] AH00111: Config variable ${APACHE_LOG_DIR} is not defined AH00526: Syntax error on line 74 of /etc/apache2/apache2.conf: Invalid Mutex directory in argument file:${APACHE_LOCK_DIR}mysqladmin output. 
 
- 
 mytop output 
- 
 Tools that might be interesting (work in progress): - Check the output of apache2 -Vorhttpd -Vfor “Server MPM”, is it Worker or Prefork??
- Apache server status http://fog-ip/server-status (might be enabled already but only accessable from localhost - check out the apache mod-configs)
- Mysql logging, a good talk about this: http://stackoverflow.com/questions/568564/how-can-i-view-live-mysql-queries
- mysqladmin -u root -p -i 1 processlist;
- Tools like mytop and mtop (http://www.tecmint.com/mysql-performance-monitoring/)
 
- Check the output of 
- 
 new symptom when I create an all snapins task the task is created but no snapins are queued. 
- 
 This might be the error in apache when it happens hard to tell. [Tue Oct 27 13:39:30.293918 2015] [:error] [pid 2548] [client 10.200.50.158:50298] PHP Warning: trim() expects parameter 1 to be string, array given in /var/www/html/fog/lib/fog/FOGController.class.php on line 193 [Tue Oct 27 13:39:32.665850 2015] [:error] [pid 2548] [client 10.200.50.158:50298] PHP Warning: trim() expects parameter 1 to be string, array given in /var/www/html/fog/lib/fog/FOGController.class.php on line 193 [Tue Oct 27 13:40:59.185332 2015] [:error] [pid 2596] [client 10.200.50.139:53802] PHP Warning: trim() expects parameter 1 to be string, array given in /var/www/html/fog/lib/fog/FOGController.class.php on line 193 [Tue Oct 27 13:41:00.465521 2015] [:error] [pid 2596] [client 10.200.50.139:53802] PHP Warning: trim() expects parameter 1 to be string, array given in /var/www/html/fog/lib/fog/FOGController.class.php on line 193
- 
 As all I have are theories, can you all update? I am fairly sure things aren’t 100%, but I’m hoping i’ve made SOME progress cause I’m really hitting it blind. 
- 
 httpd -V Server version: Apache/2.2.15 (Unix) 
 Server built: Aug 18 2015 02:00:22
 Server’s Module Magic Number: 20051115:25
 Server loaded: APR 1.3.9, APR-Util 1.3.9
 Compiled using: APR 1.3.9, APR-Util 1.3.9
 Architecture: 64-bit
 Server MPM: Prefork
 threaded: no
 forked: yes (variable process count)Server-status 404 not found mysqladmin -u root -p -i 1 processlist; ±------±-----------±-------------------±----±--------±-----±------±-----------------+ 
 | Id | User | Host | db | Command | Time | State | Info |
 ±------±-----------±-------------------±----±--------±-----±------±-----------------+
 | 11 | root | localhost:51961 | fog | Sleep | 75 | | |
 | 14 | root | localhost:51964 | fog | Sleep | 548 | | |
 | 15 | root | localhost:51965 | fog | Sleep | 393 | | |
 | 2853 | fogstorage | classroomfog:52337 | fog | Sleep | 231 | | |
 | 3019 | fogstorage | 10.32.8.220:36559 | fog | Sleep | 182 | | |
 | 57280 | root | localhost:53739 | fog | Sleep | 2 | | |
 | 57307 | root | localhost:53780 | fog | Sleep | 0 | | |
 | 57328 | root | localhost:53833 | fog | Sleep | 0 | | |
 | 57330 | root | localhost:53839 | fog | Sleep | 1 | | |
 | 57334 | root | localhost:53843 | fog | Sleep | 2 | | |
 | 57338 | root | localhost:53847 | fog | Sleep | 1 | | |
 | 57339 | root | localhost:53848 | fog | Sleep | 1 | | |
 | 57350 | root | localhost:53859 | fog | Sleep | 1 | | |
 | 57364 | root | localhost:53881 | fog | Sleep | 0 | | |
 | 57366 | root | localhost:53883 | fog | Sleep | 2 | | |
 | 57367 | root | localhost:53885 | fog | Sleep | 2 | | |
 | 57368 | root | localhost:53886 | fog | Sleep | 0 | | |
 | 57376 | root | localhost:53894 | fog | Sleep | 0 | | |
 | 57381 | root | localhost:53908 | fog | Sleep | 0 | | |
 | 57383 | root | localhost:53910 | fog | Sleep | 0 | | |
 | 57387 | root | localhost:53914 | fog | Sleep | 1 | | |
 | 57395 | root | localhost:53925 | fog | Sleep | 0 | | |
 | 57398 | root | localhost:53966 | fog | Sleep | 2 | | |
 | 57399 | root | localhost:53967 | fog | Sleep | 1 | | |
 | 57400 | root | localhost:53968 | fog | Sleep | 0 | | |
 | 57401 | root | localhost:53969 | fog | Sleep | 2 | | |
 | 57402 | root | localhost:53970 | fog | Sleep | 2 | | |
 | 57405 | root | localhost:53980 | fog | Sleep | 0 | | |
 | 57406 | root | localhost | | Query | 0 | | show processlist |
 | 57407 | root | localhost:53983 | fog | Sleep | 1 | | |
 | 57408 | root | localhost:53984 | fog | Sleep | 1 | | |
 | 57409 | root | localhost:53985 | fog | Sleep | 1 | | |
 | 57410 | root | localhost:53986 | fog | Sleep | 0 | | |
 | 57411 | root | localhost:53987 | fog | Sleep | 1 | | |
 ±------±-----------±-------------------±----±--------±-----±------±-----------------+Top top - 15:06:50 up 1 day, 5:26, 2 users, load average: 38.21, 23.90, 20.41 
 Tasks: 258 total, 42 running, 215 sleeping, 0 stopped, 1 zombie
 Cpu(s): 89.1%us, 10.5%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
 Mem: 3899360k total, 1538960k used, 2360400k free, 82128k buffers
 Swap: 4194300k total, 145492k used, 4048808k free, 320464k cachedPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
 2027 mysql 20 0 2404m 79m 3300 S 23.9 2.1 334:03.04 mysqld
 14615 apache 20 0 343m 20m 3768 R 9.2 0.5 0:05.08 httpd
 14454 apache 20 0 343m 21m 3764 R 8.8 0.6 0:05.61 httpd
 14457 apache 20 0 343m 20m 3772 R 8.8 0.5 0:05.60 httpd
 14475 apache 20 0 343m 20m 3768 R 8.8 0.5 0:05.23 httpd
 14602 apache 20 0 343m 20m 3828 R 8.8 0.5 0:04.97 httpd
 14617 apache 20 0 343m 20m 3768 R 8.8 0.5 0:05.04 httpd
 14216 apache 20 0 343m 21m 3768 R 8.5 0.6 0:27.19 httpd
 14384 apache 20 0 343m 20m 3768 R 8.5 0.5 0:10.49 httpd
 14435 apache 20 0 343m 20m 3764 R 8.5 0.5 0:05.88 httpd
 14459 apache 20 0 331m 10m 3764 S 8.5 0.3 0:05.70 httpd
 14460 apache 20 0 343m 20m 3704 R 8.5 0.5 0:05.41 httpd
 14472 apache 20 0 343m 20m 3768 R 8.5 0.5 0:05.33 httpd
 14473 apache 20 0 343m 20m 3764 R 8.5 0.5 0:05.35 httpd
 14613 apache 20 0 343m 21m 3768 R 8.5 0.6 0:04.74 httpd
 14672 apache 20 0 343m 21m 3764 R 8.5 0.6 0:03.01 httpd
 14707 apache 20 0 343m 20m 3552 R 8.5 0.5 0:00.61 httpd
 14708 apache 20 0 343m 20m 3552 R 8.5 0.5 0:00.62 httpd
 14408 apache 20 0 336m 14m 3768 S 8.2 0.4 0:08.21 httpd
 14696 apache 20 0 336m 14m 3764 S 8.2 0.4 0:00.82 httpd
 14166 apache 20 0 343m 20m 3772 R 7.9 0.5 0:30.77 httpd
 14438 apache 20 0 343m 20m 3768 R 7.5 0.5 0:05.64 httpd
 14443 apache 20 0 343m 20m 3768 R 7.5 0.5 0:05.60 httpd
 14476 apache 20 0 343m 20m 3764 R 7.5 0.5 0:05.38 httpd
 14704 apache 20 0 343m 20m 3760 R 7.5 0.6 0:00.71 httpd
 14368 apache 20 0 331m 10m 3768 S 7.2 0.3 0:10.54 httpd
 14455 apache 20 0 343m 20m 3768 R 7.2 0.5 0:05.52 httpd
 14356 apache 20 0 343m 21m 3768 R 6.9 0.6 0:10.81 httpd
 14411 apache 20 0 336m 14m 3768 S 6.9 0.4 0:07.78 httpd
 14720 apache 20 0 338m 16m 3552 R 6.9 0.4 0:00.52 httpd
 14481 apache 20 0 333m 11m 3768 S 6.5 0.3 0:05.50 httpd
 14717 apache 20 0 343m 20m 3552 R 6.5 0.5 0:00.55 httpd
 14718 apache 20 0 343m 20m 3552 R 6.5 0.5 0:00.55 httpd
 14471 apache 20 0 338m 16m 3768 R 6.2 0.4 0:05.34 httpd
 14715 apache 20 0 343m 20m 3704 R 6.2 0.5 0:00.52 httpd
 14719 apache 20 0 343m 20m 3544 R 6.2 0.5 0:00.50 httpd
 14721 apache 20 0 336m 14m 3552 R 6.2 0.4 0:00.51 httpd
 14722 apache 20 0 338m 16m 3552 R 6.2 0.4 0:00.52 httpd
 14442 apache 20 0 333m 11m 3768 R 5.9 0.3 0:05.58 httpd
 14485 apache 20 0 332m 11m 3768 R 5.9 0.3 0:05.33 httpd
 14665 apache 20 0 334m 12m 3768 R 5.9 0.3 0:03.34 httpd
 14710 apache 20 0 337m 15m 3760 R 5.9 0.4 0:00.63 httpd
 14703 apache 20 0 331m 9988 3552 S 5.6 0.3 0:00.69 httpd
 14365 apache 20 0 337m 16m 3768 S 5.2 0.4 0:10.44 httpd
 14478 apache 20 0 332m 10m 3768 R 5.2 0.3 0:05.41 httpd
 14603 apache 20 0 331m 10m 3768 S 5.2 0.3 0:04.88 httpd
 14716 apache 20 0 337m 15m 3760 S 5.2 0.4 0:00.48 httpd
 14428 apache 20 0 331m 10m 3768 R 4.9 0.3 0:06.22 httpd
 14477 apache 20 0 331m 10m 3768 S 4.9 0.3 0:05.22 httpd
 14486 apache 20 0 333m 11m 3768 R 4.9 0.3 0:05.25 httpd
 14456 apache 20 0 331m 10m 3768 R 4.6 0.3 0:05.40 httpd
- 
 @Adam-Taylor Is this after updating to the most current version which Tom uped just a few minutes ago? 
- 
 Yes, this was just after updating. 
- 
 @Joseph-Hales Are you using the legacy client or the new client? I’m going to attempt to replicate your issue. 
- 
 Running strace on an apache process give me this:\ Process 28758 attached - interrupt to quit 
 ^CProcess 28758 detached
 % time seconds usecs/call calls errors syscall
 28.32 0.002985 1 2262 poll 
 18.47 0.001947 0 26340 getdents
 18.10 0.001908 4 505 brk
 13.05 0.001375 0 32536 90 lstat
 7.50 0.000790 0 31140 399 stat
 4.72 0.000497 0 11704 open
 2.57 0.000271 0 9684 8663 access
 2.25 0.000237 0 11723 close
 1.48 0.000156 0 11107 33 read
 0.73 0.000077 0 10781 lseek
 0.73 0.000077 0 771 sendto
 0.72 0.000076 0 9307 fstat
 0.30 0.000032 0 893 munmap
 0.28 0.000030 2 16 write
 0.28 0.000029 2 16 writev
 0.27 0.000028 0 2246 recvfrom
 0.23 0.000024 0 893 mmap
 0.00 0.000000 0 15 rt_sigaction
 0.00 0.000000 0 15 rt_sigprocmask
 0.00 0.000000 0 16 pwrite
 0.00 0.000000 0 77 setitimer
 0.00 0.000000 0 16 accept
 0.00 0.000000 0 16 shutdown
 0.00 0.000000 0 16 getsockname
 0.00 0.000000 0 32 semop
 0.00 0.000000 0 79 fcntl
 0.00 0.000000 0 15 flock
 0.00 0.000000 0 60 getcwd
 0.00 0.000000 0 31 chdir
 0.00 0.000000 0 15 getuid
 0.00 0.000000 0 16 epoll_wait
 100.00 0.010539 162343 9185 total This was done for 30 seconds. The err is access make me look at why access is uphappy. See next post. 
- 
 These are some of the access_httpd.log files are not there: access(“/var/www/html/fog/lib/plugins/ldap/reg-task/Template.class.php”, F_OK) = -1 ENOENT (No such file or directory) 
 access(“/var/www/html/fog/lib/plugins/ldap/service/Template.class.php”, F_OK) = -1 ENOENT (No such file or directory)Something is looking for alot of files that are not there… There was 9727 requests for files that don’t exist. 
- 
 From this point on, I’m only going to post what is asked of me to post because i don’t want to flood the thread  I just found it odd that there was that many requests for missing files. Adam 
- 
 I’m fairly certain that the FOG Dashboard is causing the high CPU usage. after moving the dashboard’s js file and refreshing the FOG Dashboard, I saw a 0.02% 1-minute CPU Load average while sitting on the FOG Dashboard page. mv /var/www/html/fog/management/js/fog/fog.dashboard.js /var/www/html/fog/management/js/fog/fog.dashboard.js.movedAfter putting it back and refreshing the FOG Dashboard, I saw a 0.24% 1-minute CPU Load average while sitting on the FOG Dashboard page. mv /var/www/html/fog/management/js/fog/fog.dashboard.js.moved /var/www/html/fog/management/js/fog/fog.dashboard.jsOpening three tabs with the file in the correct place, the 1-minute CPU Load average was at 0.96% Also, each FOG Dashboard page creates approximately 2 httpd processes. So with 3 pages open and sitting on the dashboard, I was seeing about 6 httpd instances in top. So, my hypothesis is that organizations with multiple FOG users under 1 main server probably have the FOG Dashboard open all the time, causing the load.I’m not saying they should close the dashboard or not leave it open all day… it’s probably a recent code change somewhere on the dashboard causing this new issue… SO… if you all could please test, and just temporarily move the file and then tell everybody to refresh their FOG tabs, just see what the CPU does and please report back. 
- 
 @Wayne-Workman Interesting catch we might have to look into as well. On the other hand I am pretty sure this is not causing the high load on those servers with a lot of clients in the environment. On Joseph Hales’ server we see a lot of apache processes spawned and kept open by clients. Those clients never request Javascript files AFAIK. I will be setting up an older revision with Joseph later today and I really hope to see a difference. We then step up the revisions to see where it all started. Keep your fingers crossed. 
- 
 I know for certain it started somewhere between r4960 and r4982. 
- 
 For what it’s worth, there are about 500ish PCs here in my building that are using the legacy client, and I’m using the latest as of yesterday and I’m not seeing the large CPU loads as everyone else is… Also, nobody here but me and my co-worker even use the fog web interface, and we don’t leave it sitting open either. 
- 
 @Wayne-Workman Even if the web interface isn’t used by anyone the load will always be high. However, out of the ~1000 clients I have approximately 900 have the new client installed. Maybe this issue is related to these newest revisions with regard to communication with the new client (v0.95). 
- 
 Can all update? My suspicion as to a part of the problem is specific to the fog multicast service. I noticed things were not starting for it and would continuously be respawned. This would create quite a high load on the system as it would be constantly creating new sessions and spawning new instances. While I’m sure there may be some cases of weirdness I think this may be a largely attributing factor. 
- 
 I am still seeing the issue after updating and restarting SVN 5106. Also after updating to 5108. 



