Fog server keeps going down
-
I’m still seeing a conflict here, but let’s run with it.
The console is frozen, which might indicate the VM has crashed, because the console is isolated from any running application like FOG. Even if something was consuming 100% of the CPU, the console should still respond, although slowly. Assuming that the VM guest has hung, the network stack is still operational (it is responding to a ping). In my experience this combination is unusual and should not happen.
Let’s see if we can acquire a bit more info. Look at the ESXi console (not the vSphere Web interface accessed by a browser) when that VM is unresponsive. Are the VM tools reporting to ESXi correctly (Summary tab)? Is that VM posting any alerts to ESXi (Alarms tab)? When the VM is in this lost state, what does the Consumed Host CPU value show (Summary tab)? Is it high, low, or about the same as when it’s working correctly? There has to be some external indication that this server has gone away.
I still have the feeling that you might have a machine with the same address out there causing problems. That would explain the FOG server dropping off the network (hanging) but still being pingable. Plus, this is a new installation and not something that has been in place for a while. All other VMs are running without issue on the same hypervisor. All of this is making me think there is some external source at play here. Understand this is just an attempt to read the tea leaves based on what you’ve said thus far.
-
@george1421 Where is this conflict that you’re seeing?
We just turned off the FOG server and the IP address is no longer pingable, which means there is no conflict with IPs.
-
@szecca1 Devices don’t always respond to pings. You can configure Windows, Linux, or OS X not to respond to them. It could be a switch, a UPS, an IP camera, a printer, lots of things.
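For what it’s worth, a device that ignores pings will normally still answer ARP on the local subnet, so one way to double-check is to power the FOG VM off and ARP for its address from another Linux box on the same VLAN. A rough sketch, assuming the iputils version of arping (reply lines of the form "Unicast reply from IP [MAC]"); eth0 and 10.0.0.5 are placeholders, and count_reply_macs is just a helper name made up for this example:

```shell
# Count the distinct MAC addresses that answered, reading arping output on stdin.
# Assumes iputils-style reply lines: "Unicast reply from 10.0.0.5 [AA:BB:CC:DD:EE:FF]  0.7ms"
count_reply_macs() {
    awk 'match($0, /\[[0-9A-Fa-f:]+\]/) {
             # Normalise case so the same MAC is only counted once.
             seen[toupper(substr($0, RSTART, RLENGTH))] = 1
         }
         END { n = 0; for (mac in seen) n++; print n }'
}

# Example (needs root; interface and address are placeholders):
#   arping -I eth0 -c 3 10.0.0.5 | count_reply_macs
```

With the FOG server powered off, anything other than 0 would suggest another device is holding that address.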
-
@Wayne-Workman Can you guys trust me when I tell you it is not a duplicate IP address? No server, IP camera, UPS, switch, or any of those things has been added to cause this problem. We do not configure any of our devices to be unpingable because that would make the devices less manageable. I can assure you this is not a duplicate IP issue.
-
@szecca1 I don’t think @george1421 means a “conflict” in the terms of IP Addressing, but rather in logic.
If the FOG server is not accessible via the console but can be pinged, something is way off. The console is the only thing that should ALWAYS be accessible regardless of the state of the network, services, or anything else. Even if you cannot do anything on the console, you should be able to see the server in its funked-up state. The fact that the server is pingable but you cannot access the console is what’s conflicting, not that you have multiple IP addresses, or duplicates, or what have you.
-
@Tom-Elliott said:
@szecca1 I don’t think @george1421 means a “conflict” in the terms of IP Addressing, but rather in logic.
Tom is right on. I was focusing on the logic of what you were saying, which led me to the duplicate IP address conclusion. Maybe I need to choose my words a bit better too.
I don’t offer to do this very often, but since we have a similar virtualization environment, I can create a VM with FOG running on CentOS 6.7. Assuming your boss will allow it, I can export the VM and you can upload it to your ESXi platform. The only issue is whether I can make the VM small enough to get it to you via my Dropbox.
-
@george1421 Google Drive…
-
@Tom-Elliott I agree this doesn’t make any sense. I wish I was making this up, but in the end I have no access to the server while it is down until I tell vSphere to reboot the VM. The console on vSphere is not able to communicate and basically has the same result as when I try to use PuTTY to remote in.
I can already tell you that my boss will not allow that, although I really do appreciate the offer. In a school district that would be too high of a security risk, unfortunately. Thank you though!
-
@szecca1 Is the OS installed with a GUI?
-
-
@Tom-Elliott No, Fedora Server doesn’t have a GUI.
-
@szecca1 Then the only thing I can think of is that the console is blanked out, not necessarily unusable. Do you have a keyboard on which you can press the space bar or another key to see if the display comes back?
-
@Wayne-Workman Do you want me to do this?
-
@Tom-Elliott Yes, we tried entering something and nothing is returned at all.
-
@szecca1 I just walked in today and the server is not reachable. I figured before I reboot it I would see if you guys want me to do anything. I can ping the server and it responds, but when I try to PuTTY into the server it just hangs and never connects.
-
@szecca1 Can you get to it through the hypervisor?
-
@Wayne-Workman I apologize, I was trying to get a hold of my boss to log onto vSphere. When I go to the console, it asks for the localhost login and then gives an error:
(38484.938838) Out of memory: kill process 2739 (httpd) score 104 or sacrifice child
(38484.942704) killed process 2739 (httpd) total-vm:1247792kB, anon-rss:412668kB, file-rss:0kB
-
@Wayne-Workman I am holding off on rebooting for a little bit just so you guys can see that error message.
-
@szecca1 It displays that error AT the login screen, or after?
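If you can get a shell at any point, the kernel log should hold the same messages. Here is a small sketch for tallying which processes the OOM killer has taken out, assuming the usual "Killed process PID (name)" wording. oom_victims is a made-up helper name, and whether messages from earlier boots are still available depends on how logging is set up on your box:

```shell
# Read kernel log text on stdin and print "name count" for each process
# the OOM killer reports as killed.
oom_victims() {
    awk 'tolower($0) ~ /killed process [0-9]+ \(/ {
             name = $0
             sub(/^.*[Kk]illed process [0-9]+ \(/, "", name)  # drop everything up to the opening parenthesis
             sub(/\).*$/, "", name)                           # drop the closing parenthesis and the rest
             count[name]++
         }
         END { for (name in count) print name, count[name] }'
}

# Examples:
#   dmesg | oom_victims
#   journalctl -k | oom_victims
```

If httpd is the only name that ever shows up, that points at Apache (or what it is serving) rather than something else on the box.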
-
@szecca1 Check the amount of free RAM.
free -m
http://www.cyberciti.biz/faq/linux-check-memory-usage/
What does
top
say for memory usage in httpd?
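If it helps, here is a quick way to total up what all the httpd workers are holding, rather than reading them one by one in top. This is only a sketch: it assumes the procps version of ps, sum_rss_mb is a name picked for this example, and summing RSS counts shared pages more than once, so treat the figure as an upper bound:

```shell
# Sum resident set sizes given in kB (one value per line on stdin)
# and print the total in whole MB.
sum_rss_mb() {
    awk '{ total += $1 } END { printf "%d\n", total / 1024 }'
}

# Example: total resident memory of every process named httpd
#   ps -C httpd -o rss= | sum_rss_mb
```

Compare that number against the "total" column from free -m to see how much of the RAM Apache alone is eating.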