HIGH CPU Fog Services after update r5029 v6759
-
@Raymond-Bell Just going to guess but the Storage nodes, currently, are unable to talk with the main FOG Server?
-
@Tom-Elliott said:
@Raymond-Bell Just going to guess but the Storage nodes, currently, are unable to talk with the main FOG Server?
Editing the rc.local file now; will let you know after reboot.
-
@Tom-Elliott
After editing rc.local and rebooting everything, I went to the web sign-in and got this: Database Schema Installer / Updater
But I did not get this during setup of the latest SVN. I click install and get:
Install/Upgrade Successful! The following errors occured
Update ID: 1 - 0
Database Error: Too many connections, Message: Check that database is running
Database SQL: CREATE DATABASE fog
Update ID: 1 - 1
Database Error: Too many connections, Message: Check that database is running
Database SQL: CREATE TABLE `fog`.`groupMembers` ( `gmID` int(11) NOT NULL auto_increment, `gmHostID` int(11) NOT NULL, `gmGroupID` int(11) NOT NULL, PRIMARY KEY (`gmID`), KEY `new_index` (`gmHostID`), KEY `new_index1` (`gmGroupID`) ) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC
But the second time, login worked just fine and the server is seeing the nodes.
After the reboot and rc.local edit:
Server
top - 07:58:10 up 8 min, 2 users, load average: 127.49, 100.92, 50.49
Tasks: 360 total, 36 running, 324 sleeping, 0 stopped, 0 zombie
%Cpu(s): 65.6 us, 33.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 1.1 si, 0.0 st
KiB Mem: 4355692 total, 1476112 used, 2879580 free, 95208 buffers
KiB Swap: 1046524 total, 0 used, 1046524 free. 496208 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1363 root 20 0 39820 20000 15036 R 89.2 0.5 7:10.67 FOGSnapinReplic
1399 root 20 0 39820 19996 15024 S 84.5 0.5 7:10.53 FOGTaskSchedule
1254 mysql 20 0 363320 75940 9864 S 22.5 1.7 1:58.49 mysqld
3441 www-data 20 0 108088 15672 10396 D 9.6 0.4 0:01.11 apache2
3052 www-data 20 0 110464 18732 12840 R 8.9 0.4 0:06.03 apache2
2983 www-data 20 0 110820 18828 12840 R 8.3 0.4 0:05.70 apache2
3405 fog 20 0 5692 2916 2344 R 6.6 0.1 0:01.32 top
3523 www-data 20 0 108232 15724 10312 R 6.6 0.4 0:00.72 apache2
3060 www-data 20 0 111052 19272 12792 R 6.3 0.4 0:05.67 apache2
3541 www-data 20 0 107928 15280 10012 R 5.6 0.4 0:00.45 apache2
SN 1
top - 07:59:33 up 9 min, 2 users, load average: 5.07, 4.46, 2.47
Tasks: 221 total, 7 running, 214 sleeping, 0 stopped, 0 zombie
%Cpu(s): 69.8 us, 29.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem: 7735460 total, 881520 used, 6853940 free, 79604 buffers
KiB Swap: 7828476 total, 0 used, 7828476 free. 405380 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2643 root 20 0 39948 19864 14908 R 86.8 0.3 7:11.84 FOGImageReplica
2679 root 20 0 39948 20068 15116 R 86.8 0.3 7:09.10 FOGMulticastMan
2609 root 20 0 39948 19848 14888 R 76.8 0.3 7:09.03 FOGPingHosts
2626 root 20 0 39948 19888 14928 R 74.8 0.3 7:09.72 FOGTaskSchedule
2662 root 20 0 39948 19984 15024 R 74.1 0.3 7:11.40 FOGSnapinReplic
41 root 20 0 0 0 0 R 0.3 0.0 0:00.17 kworker/0:1
2845 fog 20 0 5544 2752 2344 R 0.3 0.0 0:00.35 top
1 root 20 0 4740 3888 2616 S 0.0 0.1 0:01.44 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 0:00.29 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
SN 2
top - 08:00:12 up 10 min, 3 users, load average: 5.12, 4.54, 2.58
Tasks: 225 total, 7 running, 218 sleeping, 0 stopped, 0 zombie
%Cpu(s): 72.2 us, 27.8 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 7735460 total, 886896 used, 6848564 free, 79612 buffers
KiB Swap: 7828476 total, 0 used, 7828476 free. 405392 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2662 root 20 0 39948 19984 15024 R 88.8 0.3 7:41.96 FOGSnapinReplic
2643 root 20 0 39948 19864 14908 R 88.4 0.3 7:41.97 FOGImageReplica
2626 root 20 0 39948 19888 14928 R 76.6 0.3 7:41.53 FOGTaskSchedule
2609 root 20 0 39948 19848 14888 R 70.5 0.3 7:40.83 FOGPingHosts
2679 root 20 0 39948 20068 15116 R 69.7 0.3 7:40.03 FOGMulticastMan
2308 root 20 0 53452 6756 5804 S 5.7 0.1 0:00.33 udisksd
2845 fog 20 0 5544 2876 2344 S 0.4 0.0 0:00.45 top
2901 fog 20 0 5688 2768 2344 R 0.4 0.0 0:00.01 top
1 root 20 0 4740 3888 2616 S 0.0 0.1 0:01.44 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 0:00.31 rcu_sched
-
I still don’t understand why I have so many apache2 processes running on the server either…
-
@Raymond-Bell are you disabling the services from booting completely? Then using the rc.local?
The rc.local won’t work if it can’t stop the original tasking anyway.
-
@Tom-Elliott said:
@Raymond-Bell are you disabling the services from booting completely? Then using the rc.local?
The rc.local won’t work if it can’t stop the original tasking anyway.
Yes, and I rebooted also.
-
I have 11 apache processes running right now; I’m fairly certain it’s normal behavior. Load is under 0.5% total.
-
@Wayne-Workman I will try this; I think this is what Tom is talking about.
-
@Raymond-Bell said:
Too many connections, Message: Check that database is running
Can you edit the /etc/mysql/my.cnf file and, in the [mysqld] portion, add (or change) max_connections = 500, then restart the mysql service (sudo service mysql restart)?
-
@Tom-Elliott said:
@Raymond-Bell said:
Too many connections, Message: Check that database is running
Can you edit the /etc/mysql/my.cnf file and, in the [mysqld] portion, add (or change) max_connections = 500, then restart the mysql service (sudo service mysql restart)?
Yes, that helped with the too many connections.
Also, this helped on the server but not on the nodes:
Try disabling the FOG Services from starting at boot:
sudo update-rc.d FOG*service* disable
Then, edit the /etc/rc.local file. Add:
sleep 30
/etc/init.d/FOGPingHosts stop
/etc/init.d/FOGScheduler stop
/etc/init.d/FOGImageReplicator stop
/etc/init.d/FOGSnapinReplicator stop
/etc/init.d/FOGMulticastManager stop
/etc/init.d/FOGPingHosts start
/etc/init.d/FOGScheduler start
/etc/init.d/FOGImageReplicator start
/etc/init.d/FOGSnapinReplicator start
/etc/init.d/FOGMulticastManager start
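For anyone following along with the max_connections change: a minimal sketch for confirming which value is actually set in the [mysqld] section of the config file. The function name `mysqld_max_connections` is made up for illustration, and the parsing assumes a plain `key = value` layout with no included files; it does not prove the running server picked the value up.

```shell
# Print the max_connections value found in the [mysqld] section of a
# my.cnf-style file. Hypothetical helper, not part of FOG or MySQL.
mysqld_max_connections() {
    awk -F'=' '
        # A line starting with "[" opens a new section; remember whether it is [mysqld].
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/); next }
        # Inside [mysqld], print the value of an uncommented max_connections line.
        in_mysqld && $1 ~ /^[ \t]*max_connections[ \t]*$/ {
            gsub(/[ \t]/, "", $2)
            print $2
        }
    ' "$1"
}

# Example (path taken from the post above):
#   mysqld_max_connections /etc/mysql/my.cnf
#
# To ask the running server instead (needs valid MySQL credentials):
#   mysql -u root -p -e "SHOW VARIABLES LIKE 'max_connections';"
#   mysql -u root -p -e "SHOW STATUS LIKE 'Max_used_connections';"
```

If the file shows 500 but the server still reports the old value, the service probably was not restarted or is reading a different config file.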
-
Guys, @Raymond-Bell & @Tom-Elliott,
do you have a patch or solution for this?
After the patch from this afternoon I also get high-CPU services (now on 6777). The server is freaking out.
Ubuntu 14.04
-
@Raymond-Bell The rc.local stuff is also on the Storage Nodes?
-
@Tom-Elliott said:
@Raymond-Bell The rc.local stuff is also on the Storage Nodes?
Yes, I edited rc.local on the Storage Nodes also.
-
@Raymond-Bell Did you restart the services after adding the rc.local? I’m sorry if I’m asking obvious questions, but I really just need to make sure. Current update should help alleviate CPU usage as well though.
-
@Tom-Elliott said:
@Raymond-Bell Did you restart the services after adding the rc.local? I’m sorry if I’m asking obvious questions, but I really just need to make sure. Current update should help alleviate CPU usage as well though.
Yes, restarted and rebooted, and also updated to r5038 6777.
I will do it again to make sure.
-
@Tom-Elliott Yes, same result after:
sudo service FOGMulticastManager stop
sudo service FOGImageReplicator stop
sudo service FOGSnapinReplicator stop
sudo service FOGScheduler stop
sudo service FOGPingHosts stop
sudo update-rc.d FOGMulticastManager disable
sudo update-rc.d FOGImageReplicator disable
sudo update-rc.d FOGSnapinReplicator disable
sudo update-rc.d FOGScheduler disable
sudo update-rc.d FOGPingHosts disable
sudo service FOGMulticastManager start
sudo service FOGImageReplicator start
sudo service FOGSnapinReplicator start
sudo service FOGScheduler start
sudo service FOGPingHosts start
top - 10:48:18 up 1:48, 2 users, load average: 4.14, 4.92, 5.02
Tasks: 202 total, 6 running, 196 sleeping, 0 stopped, 0 zombie
%Cpu(s): 66.6 us, 32.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
KiB Mem: 1014980 total, 966204 used, 48776 free, 19852 buffers
KiB Swap: 1037308 total, 12316 used, 1024992 free. 529972 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5970 root 20 0 41920 20164 15184 R 45.6 2.0 0:07.37 FOGSnapinReplic
6026 root 20 0 41920 20392 15408 R 41.3 2.0 0:05.85 FOGPingHosts
5990 root 20 0 41920 20240 15256 R 39.6 2.0 0:06.34 FOGTaskSchedule
5950 root 20 0 41920 20488 15504 R 39.3 2.0 0:06.90 FOGImageReplica
5930 root 20 0 41920 20116 15132 R 34.3 2.0 0:05.32 FOGMulticastMan
5779 fog 20 0 11284 3900 3128 S 0.3 0.4 0:00.04 sshd
1 root 20 0 4732 3788 2548 S 0.0 0.4 0:02.83 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.78 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 0:05.27 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:00.01 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:00.01 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:00.01 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:00.12 ksoftirqd/1
-
Something weird: after the new update, the CPU load on my server went down to 10-20%. After a reboot, it went back to 100%.
Pretty annoying.
-
I just realized after going through another git pull upgrade that the CPU spike will probably happen every time since the services are reinstalled. Here’s how I fixed my problem again, hopefully in a cleaner way. After upgrading to 6791, it seems to be working so far with a couple reboot tests.
Add this to your rc.local. Adjust sleep time based on your setup. My FOG setup is virtual, so 5 seconds seems to work well.
sleep 5
service FOGImageReplicator restart
service FOGMulticastManager restart
service FOGPingHosts restart
service FOGScheduler restart
service FOGSnapinReplicator restart
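A possible variation on the fixed sleep, sketched under assumptions: instead of guessing a delay, retry a readiness check until MySQL answers, then restart the services. `wait_for` and `restart_fog_services` are hypothetical helper names, and the 30 tries and 2 second delay are arbitrary. `mysqladmin ping` only reports whether a server is reachable, so on a storage node it would need to point at the main server's database host.

```shell
# wait_for TRIES DELAY CMD...: run CMD until it succeeds or TRIES attempts are used up.
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Restart the same five services listed in the post above.
restart_fog_services() {
    for svc in FOGImageReplicator FOGMulticastManager FOGPingHosts FOGScheduler FOGSnapinReplicator; do
        service "$svc" restart
    done
}

# In /etc/rc.local (assumption: mysqladmin is installed and can reach the DB):
#   wait_for 30 2 mysqladmin ping --silent && restart_fog_services
```

The trade-off is that a database that never comes up leaves the services unrestarted after roughly a minute, rather than restarting them blindly.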
-
@baggar11, @Raymond-Bell I believe I may have fixed this now.
My issue – in theory?
I am moving almost EVERYTHING to static form (where I can). This allows me to get items back without having to instantiate a whole class object. The problem: the DB has to be available for FOG to read its info. The services were checking for the item before the DB connection was established. This was totally an oversight, and for that I’m sorry to all.
With any luck, this issue will be gone now. Please update and let me know.
Thanks.
-
Why might this affect the CPU? The service looks to the DB to get timeout values (and logs). The DB is fully operational, but FOG has not initiated its connection yet, so as far as the service is concerned the DB is not available. The lookup returns an int of 0 for the sleep time. Put that into an infinite loop (there are at least 2 that manage start/restart of the services). The loop is already told to sleep for some period of time on each pass. However, that period is set to 0 (meaning 0 seconds), so it gets stuck spinning in an infinite loop without ever actually starting.
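To illustrate the failure mode only: the sketch below is not FOG's actual code, just a shell rendering of the idea that a sleep time of 0 turns a polling loop into a busy loop, plus one way to guard against it. `effective_sleep` and the 10 second fallback are hypothetical.

```shell
# effective_sleep VALUE FALLBACK: print VALUE if it is a positive integer,
# otherwise print FALLBACK. Guards against a 0 or empty value read from
# a source that was not ready yet.
effective_sleep() {
    case "$1" in
        ''|*[!0-9]*) echo "$2" ;;   # empty or non-numeric: use the fallback
        *)
            if [ "$1" -gt 0 ]; then
                echo "$1"
            else
                echo "$2"           # 0 would mean "never pause": use the fallback
            fi
            ;;
    esac
}

# A polling loop that trusts the raw value spins at full CPU when it is 0:
#   while true; do do_work; sleep "$raw_value"; done
# With the guard, each pass always pauses for at least the fallback:
#   while true; do do_work; sleep "$(effective_sleep "$raw_value" 10)"; done
```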