Around 100% CPU usage constantly
Up until recently, my FOG server was running pretty well in my VMware environment. I am now getting alarms for CPU usage on my server. It constantly shows 100% CPU usage both in the VMware console and within Ubuntu. Here are my stats: I am running Ubuntu 14.04 with FOG trunk 7723 (just updated to 7725 to see if that helps), on VMware with 4 vCPUs and 8 GB of memory. Recently, rebooting helped for a little while, but now it shoots up to 100% as soon as it is fully booted. I looked at System Monitor and 4 FOG services were sitting at 20% CPU each. I should have taken a screenshot of that, but now Multicast Manager is the only one eating up resources. It is at 25% CPU. I do not really use multicast in my environment, nor am I actively running that task. See below for screenshots:
Interestingly enough, the dip you see is me deploying an image.
During the worst of it.
The remaining Process still using too much
That’s why I post both. I have no idea which init system your machine is going to use, so I just brute-force them all.
In either case, this shouldn’t be a problem even on subsequent updates now.
@Tom-Elliott : Thank you, that works (the second version, with systemctl)
@coco65 Run the script as it was designed earlier.
service FOGMulticastManager stop && sleep 5 && service FOGMulticastManager start
service FOGPingHosts stop && sleep 5 && service FOGPingHosts start
service FOGImageReplicator stop && sleep 5 && service FOGImageReplicator start
service FOGSnapinReplicator stop && sleep 5 && service FOGSnapinReplicator start
service FOGScheduler stop && sleep 5 && service FOGScheduler start
systemctl stop FOGMulticastManager && sleep 5 && systemctl start FOGMulticastManager
systemctl stop FOGPingHosts && sleep 5 && systemctl start FOGPingHosts
systemctl stop FOGImageReplicator && sleep 5 && systemctl start FOGImageReplicator
systemctl stop FOGSnapinReplicator && sleep 5 && systemctl start FOGSnapinReplicator
systemctl stop FOGScheduler && sleep 5 && systemctl start FOGScheduler
This is going to make sure the services have enough time to restart. The Installer appears to not satisfy the “timing” properly.
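The stop, pause, start sequence above can also be written as a loop. This is only a sketch: the function name restart_fog_services and its two arguments are made up for illustration, and it uses the `service` form from the thread. Passing `echo` as the first argument prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: restart each FOG service with a pause between stop and start.
# The service names are the five listed in the thread.
FOG_SERVICES="FOGMulticastManager FOGPingHosts FOGImageReplicator FOGSnapinReplicator FOGScheduler"

# restart_fog_services RUN DELAY
#   RUN   (hypothetical knob) command prefix: "" to execute, "echo" to preview
#   DELAY pause in seconds between stop and start (the thread uses 5)
restart_fog_services() {
    run=$1
    delay=$2
    for svc in $FOG_SERVICES; do
        # Same chaining as in the thread: start only runs if stop and sleep succeed.
        $run service "$svc" stop && $run sleep "$delay" && $run service "$svc" start
    done
}
```

To preview the commands, call `restart_fog_services echo 5`; to actually restart the services, call `restart_fog_services "" 5` as root.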
Just updated both the fog-server and the fog-node to 7945, both on Debian 8. Nothing has changed, I am afraid. The FOG server runs fine but the storage node is at a constant 100% processor load. The 4 FOG services (ImageReplicator etc.) keep using every processor cycle they can get.
I’ve added some code in hopes of preventing runaway services such as this. With any luck the 100% CPU due to FOG services will be no more.
The Defaults, in case of issues:
FOGImageReplicator = 600 seconds (10 Minutes)
FOGSnapinReplicator = 600 seconds (10 Minutes)
FOGPingHosts = 300 seconds (5 Minutes)
FOGScheduler = 60 seconds (1 Minute)
FOGMulticastManager = 10 seconds
The writing to log files isn’t as important as the timing. It’s the timing that’s causing the runaway CPU cycles, as they’re running infinitely. Without a sleep to slow it all down, the services will keep doing work without a break, causing the spikes you’re describing.
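To illustrate why a missing or zero sleep value pegs the CPU, here is a small shell sketch of the idea. It is not FOG's actual service code: the function names effective_interval and run_cycles are invented for this example, and the fallback numbers are the defaults listed above.

```shell
#!/bin/sh
# Sketch only, not FOG's implementation.

# effective_interval VALUE DEFAULT
# Print VALUE if it is a positive integer, otherwise print DEFAULT.
effective_interval() {
    value=$1
    default=$2
    case $value in
        ''|*[!0-9]*)
            # Empty or non-numeric, e.g. the setting could not be read.
            echo "$default" ;;
        *)
            if [ "$value" -gt 0 ]; then echo "$value"; else echo "$default"; fi ;;
    esac
}

# run_cycles VALUE DEFAULT COUNT COMMAND...
# Run COMMAND COUNT times, sleeping between cycles. A real daemon would
# loop forever; COUNT just keeps the sketch finite.
run_cycles() {
    interval=$(effective_interval "$1" "$2")
    cycles=$3
    shift 3
    i=0
    while [ "$i" -lt "$cycles" ]; do
        "$@"
        i=$((i + 1))
        sleep "$interval"   # without this pause the loop spins at full speed
    done
}
```

For example, `effective_interval 0 600` prints 600, so a value of 0 read from the database would still leave a 10 minute pause between cycles.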
@x23piracy just a guess that this goes crazy when updating.
That’s the case for me.
Hi, after thinking it was fixed, I found this this morning:
Now the Multicast service is burning like hell.
After another service restart via the script it’s quiet again. I will report back tomorrow morning if the hell comes back.
@Wayne-Workman ok we can do this later
I did the following and it helped.
At the last line of installfog.sh I added a call to restart_services.sh.
Additionally, as root, I added a call to restart_services.sh to cron (crontab -e); put @reboot at the front of the crontab line.
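A sketch of the cron part, for anyone who wants to script it instead of editing by hand. The path /root/restart_services.sh is an assumption (the thread does not say where the script lives), and the helper name with_reboot_entry is made up here.

```shell
#!/bin/sh
# Sketch only. RESTART_SCRIPT is a hypothetical path; adjust it to your system.
RESTART_SCRIPT=/root/restart_services.sh

# Read an existing crontab on stdin and print it with one @reboot entry
# for the restart script at the end. Any existing line that mentions the
# script is dropped first, so running this twice does not duplicate the entry.
with_reboot_entry() {
    grep -vF "$RESTART_SCRIPT" || true   # grep exits non-zero when nothing is left
    echo "@reboot $RESTART_SCRIPT"
}

# Usage (as root), not executed here:
#   crontab -l 2>/dev/null | with_reboot_entry | crontab -
```

The function only transforms text, so it can be tried on a sample crontab before it is piped into `crontab -`.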
@Tom-Elliott Got it fixed, and it’s embarrassing for me: MySQL wasn’t reachable from outside 127.0.0.1. Some automatically installed security patch (I have this enabled) must have caused it, since it worked before. I changed bind-address in my.cnf from 127.0.0.1 to 0.0.0.0; you can also use the machine’s current IP address on the interface you are using. 0.0.0.0 binds to every network interface.
No need for any postprocessing script
Just an idea, the installer could check the bind address while installing/updating and warn if bind is 127.0.0.1 or localhost.
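A rough sketch of what such a check could look like. This is not part of the FOG installer; the function name check_bind_address is invented, and the config file path is passed in because its location differs between distributions.

```shell
#!/bin/sh
# Sketch only: warn when a MySQL option file binds the server to loopback.

# check_bind_address FILE
# Returns 1 and prints a warning if the last active bind-address line in
# FILE is a loopback address; returns 0 otherwise.
check_bind_address() {
    cnf=$1
    # Take the value of the last uncommented bind-address (or bind_address) line.
    addr=$(sed -n 's/^[[:space:]]*bind[-_]address[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$cnf" | tail -n 1)
    case $addr in
        127.0.0.1|localhost|::1)
            echo "WARNING: bind-address is $addr; remote hosts cannot connect to MySQL"
            return 1 ;;
        *)
            echo "OK: bind-address is ${addr:-not set in this file}"
            return 0 ;;
    esac
}
```

Called as `check_bind_address /etc/mysql/my.cnf` (a common Debian/Ubuntu location, but verify it on your system), it only reads the file and changes nothing.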
Not fully solved: my FOG master is now burning. Maybe 0.0.0.0 does not make it reachable via 127.0.0.1 (localhost)… I’ll get back to you.
OK, that was not the problem. 0.0.0.0 really binds everything; access worked via the mysql client. So I restarted the services on the FOG master with the script and it worked. I will try reboots and so on…
@x23piracy If you have time, I can find out exactly what’s happening via TeamViewer and make a bug report afterwards.
I need solid proof if possible.
If you can find a revision where all is working properly and when it stops, I can probably be of more help.
Seeing as this seems to be a sporadic thing and, to me, only seems to appear on Debian-based OSes, I can guess maybe it’s got something to do with the session crontab?
Just a guess.
I haven’t made a change to the fog services that would cause this to sporadically occur.
Normally, when I do see the issue, the services are stuck in a forever loop. This happens when a service fails to read data from the database and gets 0 back.
@Tom-Elliott - I got tired of sudo so enabled root login. Does sudo -i make any difference then? Tried it.
@Tom-Elliott What should I do? I cannot get away from the CPU burning. Any suggestions?
Also, I cannot use systemctl like you suggested for 14.04:
root@fog-image:/home/fog# systemctl stop FOGMulticastManager
systemctl: command not found
Ubuntu 14.04 uses Upstart as the init system; the switch to systemd is planned for 14.10+. Parts of “systemd” have been used in Ubuntu for a long time, but for most intents and purposes when people say “systemd” they mean systemd-as-init.
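Since systemctl is simply missing on Upstart systems like 14.04, a script can check for the command before choosing which form to use. The sketch below is one way to do that; the function names service_tool and restart_cmds are made up, and note that finding a systemctl binary does not strictly prove systemd is the running init.

```shell
#!/bin/sh
# Sketch only: choose between the two command styles posted in this thread.

# Print "systemctl" when that command exists on this machine, else "service".
service_tool() {
    if command -v systemctl >/dev/null 2>&1; then
        echo systemctl
    else
        echo service
    fi
}

# restart_cmds NAME [TOOL]
# Print (do not run) the stop, pause, start commands for one service.
# TOOL defaults to whatever service_tool detects.
restart_cmds() {
    svc=$1
    tool=${2:-$(service_tool)}
    case $tool in
        systemctl)
            printf 'systemctl stop %s\nsleep 5\nsystemctl start %s\n' "$svc" "$svc" ;;
        *)
            printf 'service %s stop\nsleep 5\nservice %s start\n' "$svc" "$svc" ;;
    esac
}
```

For example, `restart_cmds FOGMulticastManager` prints the three commands for the detected tool; piping that output to `sh` as root would run them.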
And it burns for me. Can someone lend me a fire extinguisher? It’s hot here.
@x23piracy The code that’s running the 100% cpu hasn’t changed.
@coco65 This is highly dependent upon how you run sudo.
If you first log in as a user other than root and then sudo to root, you should run:
sudo -i <- if you’re logging in to become root.
To ensure a script run under sudo gets root’s environment sourced first, run the script as:
sudo bash -c 'script.sh'
Hi, I am getting the same on Debian 8, Git version 7917, with a FOG storage node running inside VirtualBox on a NAS. Running the commands listed below as root did not make any difference in my case. Constant 100% processor load.
@Sebastian-Roth I was overthinking this. I simply added it to the end of my script and set up a cron job as root. For proof of concept, I ran my script as root and it worked!