Fog server keeps going down
-
@Scott-B I apologize, I am not sure how to pull up the logs on the server. I am running Fedora Server 22 and it is running on a virtual machine.
-
FOG GUI->FOG Configuration Page->Log Viewer.
-
@Tom-Elliott I go to that screen and there is nothing under file. Am I supposed to type something here?
-
What version of fog are you running?
Are you sure this is because of fog and not a more serious issue (i.e. memory issues, hdd dying, etc…)?
-
@Tom-Elliott I am running FOG 5453 and latest version is 1.2.0
It may not be a FOG issue, but this is on a virtual machine and the error we are seeing is about CPU utilization. We are about to increase the CPU allocation to see if that does the trick, but wanted to know whether you guys have seen this before.
-
What is your virtualization environment?
-
@need2 We have vSphere 5.5. We upped the CPUs and are not getting an error anymore, but this morning I cannot get on to FOG. I have to reboot again.
-
@need2 I just rebooted and it works fine again, but this is an everyday event and it causes issues if computers reboot due to Windows updates.
-
@Scott-B said:
What do the logs on the server suggest the issue could be?
Please provide some details of the server. Ubuntu? Fedora? Versions? Do the log files on the server point to anything?
@szecca1 What OS? Also, you can look at Apache Errors via CLI as well if the web interface is not working. If we know the OS, we can give you the exact command to do this (along with a lot of other logs to check too).
-
@Wayne-Workman The OS of the fog server is Fedora Server 22
-
@szecca1 Just saw that you already said that … sorry. I’ll get some commands together to send your way.
When we have a mystery problem, we need to look in a lot of places until we find a clue.
-
Run these commands the next time the server goes down: grab the output, and then just restart it to keep problems in your environment to a minimum. Post what you find and we’ll go from there. If you get errors while posting the output of any of those, just upload the output in a .txt file instead.
The -n 100 is for the number of lines to return; you can adjust that as needed.
Apache Error log:
tail -n 100 /var/log/httpd/error_log
MariaDB log in Fedora 22:
tail -n 100 /var/log/mariadb/mariadb.log
Also, run the top command to see what the load averages are, and what is running:
top
Sample output from my server:
top - 09:28:32 up 51 days, 20:58, 1 user, load average: 0.08, 0.11, 0.18
Tasks: 166 total, 1 running, 165 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.5 us, 0.2 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 4028388 total, 61580 free, 341508 used, 3625300 buff/cache
KiB Swap: 4063228 total, 4053892 free, 9336 used. 3621740 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30282 apache 20 0 409732 24140 13936 S 1.7 0.6 0:10.38 httpd
9 root 20 0 0 0 0 S 0.3 0.0 50:57.07 rcuos/0
672 dbus 20 0 47108 3504 3048 S 0.3 0.1 4:56.67 dbus-daemon
27066 mysql 20 0 2190100 97052 16348 S 0.3 2.4 2:06.60 mysqld
1 root 20 0 187052 6152 3000 S 0.0 0.2 8:43.03 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:01.77 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 1:12.82 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 71:43.84 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0
11 root rt 0 0 0 0 S 0.0 0.0 0:11.47 migration/0
12 root rt 0 0 0 0 S 0.0 0.0 0:29.96 watchdog/0
13 root rt 0 0 0 0 S 0.0 0.0 0:28.28 watchdog/1
14 root rt 0 0 0 0 S 0.0 0.0 0:14.37 migration/1
15 root 20 0 0 0 0 S 0.0 0.0 0:20.91 ksoftirqd/1
17 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
18 root 20 0 0 0 0 S 0.0 0.0 6:21.82 rcuos/1
19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/1
20 root rt 0 0 0 0 S 0.0 0.0 0:28.58 watchdog/2
21 root rt 0 0 0 0 S 0.0 0.0 0:12.49 migration/2
22 root 20 0 0 0 0 S 0.0 0.0 2:10.35 ksoftirqd/2
24 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0H
25 root 20 0 0 0 0 S 0.0 0.0 11:36.20 rcuos/2
26 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/2
27 root rt 0 0 0 0 S 0.0 0.0 0:27.90 watchdog/3
28 root rt 0 0 0 0 S 0.0 0.0 0:14.79 migration/3
29 root 20 0 0 0 0 S 0.0 0.0 0:10.40 ksoftirqd/3
31 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/3:0H
32 root 20 0 0 0 0 S 0.0 0.0 5:01.15 rcuos/3
33 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/3
34 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 khelper
35 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
36 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns
37 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 perf
38 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback
39 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
40 root 39 19 0 0 0 S 0.0 0.0 0:00.00 khugepaged
41 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 crypto
42 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd
43 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
44 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd
45 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ata_sff
46 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 md
47 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 devfreq_wq
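If it is easier, the commands above could be wrapped into one small helper that writes everything to a single .txt file for uploading. This is only a rough sketch: the function name collect_fog_diagnostics and the default file name fog-diagnostics.txt are made up for illustration, and the log paths are the Fedora 22 ones given above, so adjust them if your install differs.

```shell
#!/bin/sh
# Sketch only: gathers the diagnostics listed above into one text file.
# collect_fog_diagnostics and fog-diagnostics.txt are hypothetical names.
# Usage: collect_fog_diagnostics [output-file] [number-of-lines]
collect_fog_diagnostics() {
    fog_out="${1:-fog-diagnostics.txt}"
    fog_lines="${2:-100}"
    {
        echo "=== Apache error log (last $fog_lines lines) ==="
        tail -n "$fog_lines" /var/log/httpd/error_log 2>&1 \
            || echo "(could not read /var/log/httpd/error_log)"

        echo "=== MariaDB log (last $fog_lines lines) ==="
        tail -n "$fog_lines" /var/log/mariadb/mariadb.log 2>&1 \
            || echo "(could not read /var/log/mariadb/mariadb.log)"

        echo "=== top snapshot ==="
        # -b is batch mode (plain text), -n 1 takes a single snapshot
        top -b -n 1 2>&1 || echo "(top is not available)"
    } > "$fog_out"
}
```

Save the function in a script, add a line calling collect_fog_diagnostics at the end, and run it as root (the logs are usually not readable by normal users). The second argument plays the same role as -n 100 above.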
-
@Wayne-Workman How can I run those commands when I can’t connect to the server? Either from the virtual machine console or from PuTTY, I can’t log in to the server until after I reboot it!
And no worries, I know you guys are busy with other things. Any information you need I’ll be happy to give you several times if needed.
-
@szecca1 That is definitely information we needed. That leads me to believe there is a problem with the NIC associated with the server. Does your server go to sleep/hibernate?
-
@Tom-Elliott No, the server doesn’t go to sleep. The other VMs on that server are working perfectly fine. It’s just FOG that seems to be having a problem. But once it is rebooted, it comes up and works fine for the day.
-
@szecca1 When I read that post below I immediately thought exactly what Tom posted - NIC issues. Maybe even switch issues.
Try a different port on the switch, maybe even try a different switch… Maybe try a different type of virtual adapter.
-
@Wayne-Workman No other VMs attached to this device are having issues. It could be the virtual adapter, but can I change that? What you guys are saying makes sense, but it has to be something with the FOG machine because the others are working fine.
-
@szecca1 We just checked to see if we could add a new driver for the Ethernet card and we couldn’t. Do you recommend a certain Ethernet adapter driver for me to use? We currently have the E1000 Ethernet adapter on it.
-
@szecca1 I don’t think it’s the driver. I think it’s the physical adapter.
-
@Tom-Elliott It can’t be the physical adapter. The adapter that is being used is for several different VMs and no other VMs are having issues besides the FOG server.