Fog SVN 5020 and above CPU Hammered thread.
-
Still seeing the issue too. Note the huge number of threads and tasks!
-
Guys,
Do me a favor, it’s going to break things, but it’s very specific.
It’s the service module active checks. They’re spawning way too many processes.
For now, just run: mv /var/www/fog/service/servicemodule-active.php /var/www/fog/service/service-active.php
-
Well it broke it but it stopped the “thing”!
-
@Trevelyan I’m aware of it breaking the client stuff, but it’s all I can do for right now.
I’m remoted into another system having the same type of issues and trying to use it as a way to figure out exactly what’s up. You can image, and do all that, just no FOG Client stuff at the moment. I’ll be trying to narrow exactly where it is.
-
While it’s not perfect, I found 4168 is still good, and at 4169 things are “decent”. At 4170, all hell breaks loose. Please go back to 4168 to be “functional”.
You can do this with svn by
cd /opt/trunk;svn up -r 4168;cd bin;./installfog.sh
Of course change the /opt/trunk path to the location of your svn folder.
-
4168 == sanity!
Never get past about 30% now.
Adam
-
OK, I’m functional at 4168. Just a note: all the snapins that weren’t showing up are there now.
-
I’m now fairly certain these CPU load issues are fixed. They were due to HookManager. I’m sorry it took so long to figure out.
-
Everything seems happy now!
Thanks
PS - Hosts and Users report shows up as error 500 - links to (fog server…) /node=report&sub=file&f=SG9zdHMgYW5kIFVzZXJzLnBocA==
-
@Trevelyan I’ll fix the hosts and users report when I get into work.
Can you all verify if things are better and let me know? I’m sorry about the problem this caused, but hey progress or something.
-
I want to thank Tom again for this! I tried to assist and help but I just don’t know the code (and its history) as much as Tom does. He’s done all the hard work to find and fix this issue. Along the way he also fixed a couple other things as well.
@Tom-Elliott Don’t say sorry. FOG is a work in progress and you are pushing things way forward!! :metal:
-
@Tom-Elliott Thanks for all your work on FOG. Long-time user. I’m on 5124 and CPU utilization looks good. I echo @Trevelyan’s concern about the user tracker report. CPU utilization goes high and then comes back down after the 500 error is sent to the browser. I know you said you’d fix it when you got to work. Is there a way for me to find out what version fixes it? Then I won’t bother posting this question in the forum.
Thanks,
Mark
-
I updated my main and storage nodes. After the update to SVN 5126 the CPU usage is normal, but something is still wrong with MySQL. The database hits MAX_Workers right away, and this prevents the storage node from connecting.
```
[Thu Oct 29 13:34:44.593182 2015] [:error] [pid 21227] [client 10.109.49.21:54036] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Thu Oct 29 13:34:44.593262 2015] [:error] [pid 21227] [client 10.109.49.21:54036] PHP Warning: mysqli::mysqli(): (HY000/2002): Connection refused in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Thu Oct 29 13:34:44.681319 2015] [:error] [pid 21145] [client 10.114.51.136:49375] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Thu Oct 29 13:34:44.681414 2015] [:error] [pid 21145] [client 10.114.51.136:49375] PHP Warning: mysqli::mysqli(): (HY000/2002): Connection refused in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Thu Oct 29 13:34:44.837520 2015] [mpm_prefork:notice] [pid 21097] AH00169: caught SIGTERM, shutting down
[Thu Oct 29 13:36:56.973677 2015] [mpm_prefork:notice] [pid 1510] AH00163: Apache/2.4.16 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Thu Oct 29 13:36:56.981375 2015] [core:notice] [pid 1510] AH00094: Command line: '/usr/sbin/apache2'
[Thu Oct 29 13:38:00.215860 2015] [mpm_prefork:error] [pid 1510] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
```
-
After reverting, the storage node is back up.
The Apache log shows slightly different errors, but the node is working?
```
[Thu Oct 29 13:58:49.245894 2015] [:error] [pid 2338] [client 10.44.32.31:61425] PHP Strict Standards: Declaration of FOGController::getSubObjectIDs() should be compatible with FOGBase::getSubObjectIDs($object = 'Host', $findWhere = Array, $getField = 'id', $not = false, $operator = 'AND', $orderBy = 'name') in /var/www/html/fog/lib/fog/FOGController.class.php on line 0
[Thu Oct 29 13:58:49.245960 2015] [:error] [pid 2338] [client 10.44.32.31:61425] PHP Fatal error: Access level to FOGManagerController::orderBy() must be public (as in class FOGBase) in /var/www/html/fog/lib/fog/FOGManagerController.class.php on line 2
[Thu Oct 29 13:58:49.319201 2015] [:error] [pid 2222] [client 10.102.50.209:53776] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Thu Oct 29 13:58:49.320596 2015] [:error] [pid 2222] [client 10.102.50.209:53776] PHP Strict Standards: Declaration of FOGController::getSubObjectIDs() should be compatible with FOGBase::getSubObjectIDs($object = 'Host', $findWhere = Array, $getField = 'id', $not = false, $operator = 'AND', $orderBy = 'name') in /var/www/html/fog/lib/fog/FOGController.class.php on line 0
[Thu Oct 29 13:58:49.320645 2015] [:error] [pid 2222] [client 10.102.50.209:53776] PHP Fatal error: Access level to FOGManagerController::orderBy() must be public (as in class FOGBase) in /var/www/html/fog/lib/fog/FOGManagerController.class.php on line 2
[Thu Oct 29 13:58:49.362058 2015] [:error] [pid 2317] [client 10.41.32.7:64706] PHP Strict Standards: Declaration of FOGController::getSubObjectIDs() should be compatible with FOGBase::getSubObjectIDs($object = 'Host', $findWhere = Array, $getField = 'id', $not = false, $operator = 'AND', $orderBy = 'name') in /var/www/html/fog/lib/fog/FOGController.class.php on line 0
[Thu Oct 29 13:58:49.362127 2015] [:error] [pid 2317] [client 10.41.32.7:64706] PHP Fatal error: Access level to FOGManagerController::orderBy() must be public (as in class FOGBase) in /var/www/html/fog/lib/fog/FOGManagerController.class.php on line 2
[Thu Oct 29 13:58:49.402927 2015] [:error] [pid 2281] [client 10.43.32.103:64848] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Thu Oct 29 13:58:49.404219 2015] [:error] [pid 2281] [client 10.43.32.103:64848] PHP Strict Standards: Declaration of FOGController::getSubObjectIDs() should be compatible with FOGBase::getSubObjectIDs($object = 'Host', $findWhere = Array, $getField = 'id', $not = false, $operator = 'AND', $orderBy = 'name') in /var/www/html/fog/lib/fog/FOGController.class.php on line 0
[Thu Oct 29 13:58:49.404267 2015] [:error] [pid 2281] [client 10.43.32.103:64848] PHP Fatal error: Access level to FOGManagerController::orderBy() must be public (as in class FOGBase) in /var/www/html/fog/lib/fog/FOGManagerController.class.php on line 2
[Thu Oct 29 13:58:49.415663 2015] [:error] [pid 2223] [client 10.1.50.216:51730] PHP Strict Standards: Declaration of FOGController::getSubObjectIDs() should be compatible with FOGBase::getSubObjectIDs($object = 'Host', $findWhere = Array, $getField = 'id', $not = false, $operator = 'AND', $orderBy = 'name') in /var/www/html/fog/lib/fog/FOGController.class.php on line 0
[Thu Oct 29 13:58:49.415714 2015] [:error] [pid 2223] [client 10.1.50.216:51730] PHP Fatal error: Access level to FOGManagerController::orderBy() must be public (as in class FOGBase) in /var/www/html/fog/lib/fog/FOGManagerController.class.php on line 2
[Thu Oct 29 13:58:49.897820 2015] [mpm_prefork:notice] [pid 1510] AH00169: caught SIGTERM, shutting down
[Thu Oct 29 13:58:50.626613 2015] [mpm_prefork:notice] [pid 4285] AH00163: Apache/2.4.16 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Thu Oct 29 13:58:50.626657 2015] [core:notice] [pid 4285] AH00094: Command line: '/usr/sbin/apache2'
[Thu Oct 29 13:59:00.773873 2015] [mpm_prefork:error] [pid 4285] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
```
-
@Joseph-Hales Do your max_connections settings get wiped when you update? Can you check them?
-
I checked: max_workers and max_connections were still edited properly. To test further, I set max_workers and max_connections to 1000, and I am still seeing the same issue.
-
On SVN 5155 the CPU is fine, but the storage node still can’t connect, and max_workers hits its limit again even when set to 1000.
```
[Fri Oct 30 14:02:09.746764 2015] [:error] [pid 11232] [client 10.1.50.108:63311] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Fri Oct 30 14:02:09.746866 2015] [:error] [pid 11232] [client 10.1.50.108:63311] PHP Warning: mysqli::mysqli(): (HY000/2002): Connection refused in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Fri Oct 30 14:02:09.784735 2015] [:error] [pid 11087] [client 10.109.49.5:59983] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Fri Oct 30 14:02:09.784862 2015] [:error] [pid 11087] [client 10.109.49.5:59983] PHP Warning: mysqli::mysqli(): (HY000/2002): Connection refused in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Fri Oct 30 14:02:09.839016 2015] [:error] [pid 14086] [client 10.43.48.199:63489] PHP Warning: mysqli::mysqli(): (HY000/2002): Connection refused in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64
[Fri Oct 30 14:02:10.027273 2015] [mpm_prefork:notice] [pid 11050] AH00169: caught SIGTERM, shutting down
[Fri Oct 30 14:04:21.875728 2015] [mpm_prefork:notice] [pid 1544] AH00163: Apache/2.4.16 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Fri Oct 30 14:04:21.888867 2015] [core:notice] [pid 1544] AH00094: Command line: '/usr/sbin/apache2'
[Fri Oct 30 14:05:08.065535 2015] [mpm_prefork:error] [pid 1544] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
```
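For anyone comparing logs like these, a rough way to see which message dominates is to tally entries by message text. The sketch below is only an illustration, not part of FOG: it assumes the log has one entry per line (as Apache writes it), and the names `ENTRY`, `SAMPLE`, and `tally_messages` are made up here. The sample lines are copied from the log in this post.

```python
import re
from collections import Counter

# One Apache 2.4 error-log entry in the layout seen in this thread:
# [timestamp] [module:level] [pid N] optional [client ip:port] message
ENTRY = re.compile(
    r"\[(?P<time>[^\]]+)\] \[(?P<module>[^\]:]*):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)\] (?:\[client (?P<client>[^\]]+)\] )?(?P<message>.*)"
)

# A few entries taken from the log above, one per line.
SAMPLE = [
    "[Fri Oct 30 14:02:09.746764 2015] [:error] [pid 11232] [client 10.1.50.108:63311] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64",
    "[Fri Oct 30 14:02:09.746866 2015] [:error] [pid 11232] [client 10.1.50.108:63311] PHP Warning: mysqli::mysqli(): (HY000/2002): Connection refused in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64",
    "[Fri Oct 30 14:02:09.784735 2015] [:error] [pid 11087] [client 10.109.49.5:59983] PHP Warning: mysqli::mysqli(): MySQL server has gone away in /var/www/html/fog/lib/fog/FOGBase.class.php on line 64",
    "[Fri Oct 30 14:05:08.065535 2015] [mpm_prefork:error] [pid 1544] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting",
]


def tally_messages(lines):
    """Count log entries by message, ignoring timestamp, pid and client."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip anything that is not a recognisable entry
        # Drop the trailing " in /path/file.php on line N" so that the same
        # PHP warning raised from different requests is grouped together.
        message = re.sub(r" in /\S+ on line \d+$", "", match.group("message"))
        counts[message] += 1
    return counts


if __name__ == "__main__":
    for message, count in tally_messages(SAMPLE).most_common():
        print(count, message)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `tally_messages(open(path))` with `path` pointing at your Apache error log.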
-
@Joseph-Hales Well, maybe you just have “too many” hosts talking to your FOG server…
I will mark this thread as solved, as the performance/load issue is fixed. Feel free to go on about max_connections here or open a new topic.
-
@Uncle-Frank, @Joseph-Hales
I’ve added what I am terming “sleeper code”.
Essentially, it will not continuously hammer the MySQL server when too many connections are being made. Rather, it will force the item that is waiting for data to sleep for 5 seconds, in hopes that the next time it hits, it can actually operate. I doubt this will fix the Apache max request workers problem, but it should ensure usability of the system, as MySQL is the largest problem. It should also allow the GUI to function in these instances, as it too will wait for connections to become available.
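To illustrate the idea only, here is a minimal sketch of that sleep-and-retry pattern. It is not the actual FOG code (FOG is PHP); it is written in Python, and `TooManyConnections`, `connect_with_sleep`, and the `max_attempts` parameter are hypothetical names introduced for this example. Only the 5-second wait comes from the description above.

```python
import time


class TooManyConnections(Exception):
    """Hypothetical stand-in for the database's 'too many connections' error."""


def connect_with_sleep(connect, sleep_seconds=5, max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, sleeping between failed attempts.

    connect       -- any callable that returns a connection or raises
                     TooManyConnections (an assumption for this sketch)
    sleep_seconds -- pause between attempts; 5 matches the post above
    max_attempts  -- optional cap; None means keep waiting indefinitely
    sleep         -- injectable so the wait can be replaced in tests
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except TooManyConnections:
            if max_attempts is not None and attempt >= max_attempts:
                raise  # give up and let the caller see the error
            # Back off instead of immediately hitting the server again.
            sleep(sleep_seconds)
```

The point of the design is that a waiting request costs one idle sleep rather than a burst of failed connection attempts, so the database gets a chance to free up connections.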
-
Updated to 5207. The sleeper code fixed my storage node not connecting, but you are correct: we are still seeing the max connections issue. I will start a new thread on just that issue. As for the comment about too many hosts talking, I would note that the max connections issue is a recent thing, and my host count has actually decreased a small amount this year. I am also far from the only site running this many hosts, but it is likely that I am the only one trying to run SVN at this time; most large shops won’t upgrade until the final release.