Around 100% cpu usage constantly


  • Testers

    Hi all,
    Up until recently, my FOG server was running pretty well in my VMware environment. I am now getting alarms for CPU usage on my server. It constantly shows 100% CPU usage both in the VMware console and within Ubuntu. Here are my stats: I am running Ubuntu 14.04 with FOG trunk 7723 (just updated to 7725 to see if that helps), on VMware with 4 vCPUs and 8 GB of memory. Until recently, rebooting helped for a little while, but now usage shoots up to 100% as soon as the server is fully booted. I looked at the system monitor and 4 FOG services were sitting at 20% CPU each. I should have taken a screenshot of that, but now Multicast Manager is the only one eating up resources, at 25% CPU. I do not really use multicast in my environment, nor am I actively running that task. See below for screenshots:

    0_1463748932326_Capturefog.PNG
    Interestingly enough, the dip you see is me deploying an image.

    0_1463748967217_Capturegog.PNG
    During the worst of it.

    0_1463749010113_Capturecpu.PNG
    The remaining process, still using too much CPU.
    Thanks,
    Paul


  • Senior Developer

    That’s why I post both. I have no idea what type your system is going to use so I just brute force them all.

    In either case, this shouldn’t be a problem even on subsequent updates now.



  • @Tom-Elliott: Thank you, that works (the second version, with systemctl).


  • Senior Developer

    @coco65 Run the script as it was designed earlier.

    service FOGMulticastManager stop && sleep 5 && service FOGMulticastManager start
    service FOGPingHosts stop && sleep 5 && service FOGPingHosts start
    service FOGImageReplicator stop && sleep 5 && service FOGImageReplicator start
    service FOGSnapinReplicator stop && sleep 5 && service FOGSnapinReplicator start
    service FOGScheduler stop && sleep 5 && service FOGScheduler start
    

    OR:

    systemctl stop FOGMulticastManager && sleep 5 && systemctl start FOGMulticastManager
    systemctl stop FOGPingHosts && sleep 5 && systemctl start FOGPingHosts
    systemctl stop FOGImageReplicator && sleep 5 && systemctl start FOGImageReplicator
    systemctl stop FOGSnapinReplicator && sleep 5 && systemctl start FOGSnapinReplicator
    systemctl stop FOGScheduler && sleep 5 && systemctl start FOGScheduler
    

    This is going to make sure the services have enough time to restart. The Installer appears to not satisfy the “timing” properly.
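The two command lists above differ only in the init front end, so they can be folded into one loop. The following is a minimal sketch, not part of FOG: the function names, `PAUSE`, and `DRY_RUN` are hypothetical, and it assumes that the directory `/run/systemd/system` exists only when systemd is the running init. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: restart the FOG services named in this thread,
# pausing between stop and start as the commands above do.
SERVICES="FOGMulticastManager FOGPingHosts FOGImageReplicator FOGSnapinReplicator FOGScheduler"
PAUSE="${PAUSE:-5}"        # seconds to wait between stop and start
DRY_RUN="${DRY_RUN:-1}"    # 1 = print the commands only, 0 = execute them

# Assumption: this directory is present only when systemd is the running init.
uses_systemd() {
    [ -d /run/systemd/system ]
}

# Print the stop or start command for one service.
service_cmd() {   # usage: service_cmd <systemd|sysv> <stop|start> <name>
    if [ "$1" = "systemd" ]; then
        echo "systemctl $2 $3"
    else
        echo "service $3 $2"
    fi
}

# Stop each service, wait, then start it again.
restart_all() {   # usage: restart_all <systemd|sysv>
    for svc in $SERVICES; do
        stop_cmd=$(service_cmd "$1" stop "$svc")
        start_cmd=$(service_cmd "$1" start "$svc")
        if [ "$DRY_RUN" = "1" ]; then
            echo "$stop_cmd && sleep $PAUSE && $start_cmd"
        else
            $stop_cmd && sleep "$PAUSE" && $start_cmd
        fi
    done
}

if uses_systemd; then
    restart_all systemd
else
    restart_all sysv
fi
```

Running it as root with `DRY_RUN=0` would execute the restarts instead of printing them.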



  • @Tom-Elliott

    Just updated both the fog-server and the fog-node to 7945, both on Debian 8. Nothing has changed, I am afraid. The FOG server runs fine, but the storage node is at a constant 100% processor load. The 4 FOG services (ImageReplicator etc.) keep using every processor cycle they can get.


  • Senior Developer

    I’ve added some code in the hope of preventing runaway services such as this. With any luck, the 100% CPU due to FOG services will be no more.

    The Defaults, in case of issues:

    FOGImageReplicator = 600 seconds (10 Minutes)
    FOGSnapinReplicator = 600 seconds (10 Minutes)
    FOGPingHosts = 300 seconds (5 Minutes)
    FOGScheduler = 60 seconds (1 Minute)
    FOGMulticastManager = 10 seconds
    

    Writing to the log files isn’t as important as the timing. It’s the timing that’s causing the runaway CPU cycles, as the services are looping infinitely. Without a sleep to slow it all down, the services will keep doing work without a break, causing the spikes you’re describing.
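The timing point can be illustrated with a small polling loop. This is only a sketch of the general pattern, not the FOG service code, which is not shown in this thread: `safe_interval` and `poll` are hypothetical names, and the sketch assumes the configured interval may come back empty, non-numeric, or zero, in which case it falls back to a default such as those listed above.

```shell
#!/bin/sh
# Illustrative only: not the actual FOG service code.

# Print a usable sleep interval, falling back to a default when the
# configured value is empty, non-numeric, or zero.
safe_interval() {   # usage: safe_interval <configured> <default>
    case "$1" in
        ''|*[!0-9]*) echo "$2" ;;    # empty or contains a non-digit
        *)
            if [ "$1" -gt 0 ]; then
                echo "$1"
            else
                echo "$2"
            fi
            ;;
    esac
}

# Run a task repeatedly and always sleep between iterations,
# even when the task itself fails.
poll() {   # usage: poll <interval> <max_iterations> <command...>
    interval=$1
    max=$2
    shift 2
    i=0
    while [ "$i" -lt "$max" ]; do
        "$@" || echo "task failed; sleeping anyway" >&2
        sleep "$interval"
        i=$((i + 1))
    done
}
```

For example, `poll "$(safe_interval "" 60)" 3 true` sleeps 60 seconds per iteration even though no interval was configured. A real service would loop without an iteration limit; the limit here only keeps the sketch easy to try out.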


  • Testers

    @Tom-Elliott said in Around 100% cpu usage constantly:

    @x23piracy just a guess that this goes crazy when updating.

    That’s the case for me.


  • Senior Developer

    @x23piracy just a guess that this goes crazy when updating.



  • Hi, after thinking it was fixed, I found this this morning:

    [screenshot]

    Now the multicast service is burning like hell.
    After another service restart via the script it’s quiet again. I will report back tomorrow morning if the hell comes back.

    Regards X23



  • @Wayne-Workman OK, we can do this later.

    I did the following and it helped:
    I added a call to restart_services.sh as the last line of installfog.sh.
    Additionally, I added (as root) a call to restart_services.sh to cron (crontab -e), with @reboot at the front of the crontab line.

    Regards X23
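The cron step described above could also be scripted. A minimal sketch under stated assumptions: the thread does not say where restart_services.sh lives, so `/path/to/restart_services.sh` is a placeholder, and `with_reboot_entry` is a hypothetical helper name. It relies on the `@reboot` keyword, which the cron shipped with Debian and Ubuntu supports.

```shell
#!/bin/sh
# Placeholder path: the thread does not say where restart_services.sh lives.
ENTRY='@reboot /path/to/restart_services.sh'

# Read an existing crontab on stdin and print it with ENTRY present exactly once.
with_reboot_entry() {
    grep -vxF "$ENTRY"    # pass through every line except an identical entry
    echo "$ENTRY"         # then append the entry once
}

# To install it for root (run as root):
#   crontab -l 2>/dev/null | with_reboot_entry | crontab -
```

Filtering before appending keeps the entry from being duplicated if the command is run again.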



  • @Tom-Elliott Got it fixed, and it’s embarrassing for me: MySQL wasn’t reachable from outside 127.0.0.1. Some automatically installed security patch (I have this enabled) must have caused it, since it worked before. I changed the bind address in my.cnf from 127.0.0.1 to 0.0.0.0; you can also use the machine’s current IP address on the eth interface you are using. 0.0.0.0 binds to every network device.

    No need for any postprocessing script

    Just an idea: the installer could check the bind address while installing/updating and warn if it is 127.0.0.1 or localhost.

    EDIT:

    Not fully solved: my FOG master is now burning. Maybe 0.0.0.0 does not make it reachable via 127.0.0.1 (localhost)… I’ll get back to you.
    OK, that was not the problem; 0.0.0.0 really does bind to everything, and access via the mysql client worked. So I restarted the services on the FOG master with the script and it worked. I will try reboots and so on…

    Regards X23
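The bind-address check suggested above might look roughly like this. It is a sketch, not something the FOG installer is known to do: `bind_address` and `check_bind` are hypothetical names, the config path in the usage comment is an assumption (the post only says my.cnf), and files pulled in through include directives are not followed.

```shell
#!/bin/sh
# Print the last bind-address value found in a MySQL config file (empty if none).
bind_address() {   # usage: bind_address <config-file>
    sed -n 's/^[[:space:]]*bind[-_]address[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" | tail -n 1
}

# Warn, and return non-zero, when MySQL is bound to the loopback address only.
check_bind() {   # usage: check_bind <config-file>
    addr=$(bind_address "$1")
    case "$addr" in
        127.0.0.1|localhost)
            echo "WARNING: bind-address is $addr; other hosts cannot reach MySQL"
            return 1
            ;;
        '')
            echo "no bind-address set in $1"
            ;;
        *)
            echo "bind-address is $addr"
            ;;
    esac
}

# Example (path is an assumption; adjust for your distribution):
#   check_bind /etc/mysql/my.cnf
```

Commented-out lines are ignored because the pattern only matches lines that begin with the option name.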


  • Moderator

    @x23piracy If you have time, I can find out exactly what’s happening via TeamViewer and make a bug report afterwards.


  • Senior Developer

    I need solid proof if possible.

    If you can find a revision where all is working properly and when it stops, I can probably be of more help.

    Seeing as this seems to be sporadic and, to me, only seems to appear on Debian-based OSes, my guess is that it may have something to do with the session crontab.

    Just a guess.

    I haven’t made a change to the fog services that would cause this to sporadically occur.

    Normally, when I do see the issue, the services are stuck in a forever loop. This happens when a service fails to read data from the database and gets 0 back.



  • @Tom-Elliott - I got tired of sudo, so I enabled root login. Does sudo -i make any difference then? I tried it anyway.
    0_1464544102361_2016-05-29_19-44-27.jpg
    0_1464544113171_2016-05-29_19-45-18.jpg



  • @Tom-Elliott What should I do? I cannot get away from the CPU burning. Any suggestions?
    Also, I cannot use systemctl as you suggested, since I am on 14.04:

    root@fog-image:/home/fog# systemctl stop FOGMulticastManager
    systemctl: command not found.
    

    Ubuntu 14.04 uses Upstart as the init system; the switch to systemd is planned for 14.10+. There are parts of “systemd” that have been used in Ubuntu for a long time, but for most intents and purposes when people say “systemd” they mean systemd-as-init.

    And it is still burning for me. Can someone lend me a fire extinguisher? It’s hot here :)

    Regards X23


  • Senior Developer

    @x23piracy The code that’s running the 100% cpu hasn’t changed.



  • @Tom-Elliott Even if I run the script via sudo bash -c 'script.sh', the high load remains; that doesn’t fix it. And the problem wasn’t there with older trunk versions.


  • Senior Developer

    @coco65 This is highly dependent upon how you run sudo.

    If you first log in as a user other than root and then sudo to root, you should run:

    sudo -i <- if you’re logging in to become root.

    To ensure a script run under sudo gets root’s environment sourced first, run the script as:
    sudo bash -c 'script.sh'



  • Hi, I am getting the same on Debian 8, Git version 7917, with a FOG storage node running inside VirtualBox on a NAS. Running the commands listed below as root did not make any difference in my case. Constant 100% processor load.

    0_1464522356633_2016-05-29_13-40-40.jpg


  • Testers

    @Sebastian-Roth I was overthinking this. I simply added it to the end of my script and set up a cron job as root. For proof of concept, I ran my script as root and it worked!

