Kernel panic - not syncing: Attempted to kill init!
-
Server
- FOG Version: 1.3.0-RC-26
- OS: CentOS 7
Client
- Service Version:
- OS: Win 7
Description
So I went to check on my FOG server because it’s been a while and, after updating everything, I ran an image on a test machine to make sure it’s still working. I get a different screen each time I reboot. See the examples below.
I haven’t been able to find anything on the forum or the net that helps.
-
More often than not, this is presented if you’ve crossed inits and kernels (32-bit init with 64-bit kernel or vice versa).
That said, based on the “inode referenced” issue this could be caused by a few factors.
- If you edited the inits and failed to compress correctly.
- If the init was not fully downloaded.
- Memory on the device is bad or going bad.
- Memory on the server is bad or going bad.
- Disk space is full, causing the file to fail to download fully.
- HDD going bad on the server causing the file to be sent in a corrupted state (at random).
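For the "not fully downloaded" and "corrupted in transit" cases, one rough way to check is to compare the init's checksum against a copy you trust. This is only a sketch: `verify_sha256` is a made-up helper name, and the known-good checksum has to come from somewhere you trust (for example, the same file fetched fresh on another machine).

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 against a known-good value.
# Usage: verify_sha256 FILE EXPECTED_HEX
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<hex>  <name>"; keep only the hex digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file"
        return 1
    fi
}

# Example call (the path is an assumption about a typical CentOS install,
# and the checksum is a placeholder you would fill in yourself):
#   verify_sha256 /var/www/html/fog/service/ipxe/init.xz "<known-good sha256>"
```

Separately, `xz -t` on the init file tests whether the compressed stream is intact, which would cover the "failed to compress correctly" case.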
Of course, I’m sure there are more potential causes, but start with the basics first.
Seeing as I know that, at some point, your disk space has been full, I’d say start there (though this seems unlikely if the panic is always different).
If this all checks out, try reseating the host’s RAM (maybe try known-good RAM too?). There’s more to try, but please just start here and see what you find.
-
@Tom-Elliott said in Kernel panic - not syncing: Attempted to kill init!:
Seeing as I know, at some point, your disk space has been full, I’d say start there
Please ask us about how to partition your disks to avoid a full root partition.
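Until the partitioning is sorted out, a small check like the one below could at least warn before the root partition fills up again. It is a sketch only: `full_mounts` is a made-up name, the 90% threshold is arbitrary, and it assumes mount points without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print mount points at or above a usage threshold.
# Reads `df -P` style output on stdin (POSIX format: column 5 is the
# usage percentage, column 6 is the mount point).
# Usage: df -P | full_mounts 90
full_mounts() {
    limit=${1:-90}
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)          # strip the trailing percent sign
        if (pct + 0 >= limit) print $6, $5
    }'
}
```

Running `df -P | full_mounts 90` prints nothing when every filesystem is below 90%. With GNU df, `df -Pi` puts inode usage in the same columns, so the same filter should work for inode exhaustion too.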
-
@Tom-Elliott
So I checked the partition sizes:
[root@localhost fogadmin]# df -h --o
Filesystem                     Type     Inodes IUsed IFree IUse% Size Used Avail Use% File Mounted on
/dev/mapper/centos00-root00    ext4     1.3M   158K  1.1M  13%   20G  7.5G 12G   41%  -    /
devtmpfs                       devtmpfs 470K   494   469K  1%    1.9G 0    1.9G  0%   -    /dev
tmpfs                          tmpfs    473K   9     473K  1%    1.9G 4.5M 1.9G  1%   -    /dev/shm
tmpfs                          tmpfs    473K   643   473K  1%    1.9G 8.9M 1.9G  1%   -    /run
tmpfs                          tmpfs    473K   13    473K  1%    1.9G 0    1.9G  0%   -    /sys/fs/cgroup
/dev/sda5                      ext4     63K    365   63K   1%    969M 329M 574M  37%  -    /boot
/dev/mapper/fog-opt_fog_images ext4     26M    10K   26M   1%    395G 80G  295G  22%  -    /opt
/dev/sdb1                      ext4     261M   15    261M  1%    8.1T 91M  7.7T  1%   -    /images
tmpfs                          tmpfs    473K   29    473K  1%    379M 12K  379M  1%   -    /run/user/1000
I re-seated the memory and now I’m getting this new screen:
Should I start a new post for this one?
-
@ManofValor
By the way, this screen is consistent now.
-
@ManofValor Then, based on the fact that it seems to be pointing at the network, I’d say check your patch cable and make sure the connection is solid.
-
Not sure if this was the right thing to try but I ran:
git clone git://git.ipxe.org/ipxe.git
And it ended with this:
[AR] bin/blib.a
ar: creating bin/blib.a
[HOSTCC] util/zbin
util/zbin.c:7:18: fatal error: lzma.h: No such file or directory
 #include <lzma.h>
                  ^
compilation terminated.
make: *** [util/zbin] Error 1
-
@ManofValor No, it didn’t. That looks like you downloaded the ipxe repository and then ran:
make
After that, as it was building, you received that output.
As for LZMA, I suspect you need the lzma-devel packages (not sure of the exact name, but I think it’s xzutils-devel and xzutils on Fedora/Red Hat based systems).
-
@Tom-Elliott Sorry, for Red Hat/CentOS/Fedora it appears:
yum -y install xz-devel ldconfig
Should do the trick (no guarantees) though I don’t know why you need to download the ipxe repository.
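If you do want to confirm the header is there before re-running make, a quick check along these lines would do. This is a sketch: `have_header` is a made-up name, and /usr/include as the location where xz-devel puts lzma.h is an assumption about a stock CentOS install.

```shell
#!/bin/sh
# Hypothetical helper: report whether a C header exists in an include dir.
# Usage: have_header lzma.h [/usr/include]
have_header() {
    header=$1
    incdir=${2:-/usr/include}   # default location is an assumption
    if [ -f "$incdir/$header" ]; then
        echo "found $incdir/$header"
    else
        echo "missing $header"
        return 1
    fi
}
```

`have_header lzma.h` reporting "missing" would mean the build is still going to stop at the same `#include <lzma.h>` line.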
-
@Tom-Elliott
I just tried it cause I went to https://ipxe.org/0f0a6039, from the screen shot, and that was one of the suggestions.
I’ll try that and see…
-
Well, that didn’t work. The cable is fine. I get internet and I can ping it.
-
@ManofValor Cable being fine doesn’t tell us much. (You pulled it and put in a new one just in case?)
Where are you getting the ipxe error from? The error from iPXE is not a kernel panic.
-
Seeing as the error that you’re seeing states “Connection reset” that could mean any number of things, though I wouldn’t know where to begin looking.
Maybe you have port security turned on? I don’t know. If that’s the problem, now, it would seem to indicate (probably) a problem with the specific system you’re running into these issues with.
First it was RAM, now the NIC won’t maintain a connection, etc…???
-
I started thinking the same thing. I’m going to try a different PC and see what happens.
-
I believe httpd is the problem right now. I tried to go into FOG management and got an "Unable to connect" page. So I tried to run the installer, and apache2 failed to start. I’ve looked at a lot of sites describing similar issues and cannot find a solution that works for me. I’ve started, restarted, stopped, reloaded, killed, and tried to update apache/httpd. Here is some info that I hope can help.
[root@localhost httpd]# service httpd start
Redirecting to /bin/systemctl start httpd.service
Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.
[root@localhost httpd]# systemctl status httpd.service
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2016-12-01 15:08:00 CST; 30s ago
     Docs: man:httpd(8)
           man:apachectl(8)
  Process: 13974 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=1/FAILURE)
  Process: 13959 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 13959 (code=exited, status=1/FAILURE)

Dec 01 15:08:00 localhost.localdomain systemd[1]: Starting The Apache HTTP Server...
Dec 01 15:08:00 localhost.localdomain httpd[13959]: AH00558: httpd: Could not reliably determine the server's fully qu...ssage
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
Dec 01 15:08:00 localhost.localdomain kill[13974]: kill: cannot find process ""
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service: control process exited, code=exited status=1
Dec 01 15:08:00 localhost.localdomain systemd[1]: Failed to start The Apache HTTP Server.
Dec 01 15:08:00 localhost.localdomain systemd[1]: Unit httpd.service entered failed state.
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
[root@localhost httpd]# journalctl -xn
-- Logs begin at Wed 2016-11-30 13:17:57 CST, end at Thu 2016-12-01 15:08:31 CST. --
Dec 01 15:08:00 localhost.localdomain kill[13974]: kill: cannot find process ""
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service: control process exited, code=exited status=1
Dec 01 15:08:00 localhost.localdomain systemd[1]: Failed to start The Apache HTTP Server.
-- Subject: Unit httpd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit httpd.service has failed.
--
-- The result is failed.
Dec 01 15:08:00 localhost.localdomain systemd[1]: Unit httpd.service entered failed state.
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service failed.
Dec 01 15:08:01 localhost.localdomain dhcpd[2890]: DHCPDISCOVER from 36:02:86:28:b1:8b (MCWPL53) via enp30s0
Dec 01 15:08:01 localhost.localdomain dhcpd[2890]: DHCPOFFER on 10.10.0.10 to 36:02:86:28:b1:8b (MCWPL53) via enp30s0
Dec 01 15:08:01 localhost.localdomain polkitd[777]: Unregistered Authentication Agent for unix-process:13944:9300666 (system b
Dec 01 15:08:28 localhost.localdomain dhcpd[2890]: DHCPINFORM from 10.10.1.167 via enp30s0: not authoritative for subnet 10.10
Dec 01 15:08:31 localhost.localdomain dhcpd[2890]: DHCPINFORM from 10.10.1.130 via enp30s0: not authoritative for subnet 10.10
I just don’t understand it enough to completely know what I’m looking at.
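For anyone reading a dump like this later: the journal mixes dhcpd and polkitd messages in with the httpd ones, and the AH00558 line is usually just a notice about a missing ServerName rather than the reason httpd exits. A rough filter for pasted log text might look like the sketch below (`httpd_lines` is a made-up name, and treating AH00558 as noise is an assumption that holds in most setups, not a guarantee).

```shell
#!/bin/sh
# Hypothetical helper: keep only httpd-related lines from journal text on
# stdin, dropping the AH00558 "fully qualified domain name" notice.
# Usage: journalctl -xn 50 | httpd_lines
httpd_lines() {
    grep 'httpd' | grep -v 'AH00558'
}
```

On a live system, `journalctl -u httpd.service -l --no-pager` does the per-unit filtering directly and shows the lines that were ellipsized, and `apachectl configtest` reports configuration syntax errors without starting the service.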
-
@ManofValor I’m going to guess that your main filesystem (which doesn’t appear to be a problem) is full again. I say that because all was working, then it wasn’t.
-
@ManofValor That, and httpd has absolutely nothing to do with DHCP, TFTP, or the FOS system with kernel panics.
-
File system still seems fine:
Filesystem                     Type     Inodes IUsed IFree IUse% Size Used Avail Use% File Mounted on
/dev/mapper/centos00-root00    ext4     1.3M   164K  1.1M  13%   20G  7.7G 11G   42%  -    /
devtmpfs                       devtmpfs 470K   494   469K  1%    1.9G 0    1.9G  0%   -    /dev
tmpfs                          tmpfs    473K   10    473K  1%    1.9G 5.3M 1.9G  1%   -    /dev/shm
tmpfs                          tmpfs    473K   645   473K  1%    1.9G 17M  1.9G  1%   -    /run
tmpfs                          tmpfs    473K   13    473K  1%    1.9G 0    1.9G  0%   -    /sys/fs/cgroup
/dev/sda5                      ext4     63K    365   63K   1%    969M 329M 574M  37%  -    /boot
/dev/mapper/fog-opt_fog_images ext4     26M    10K   26M   1%    395G 80G  295G  22%  -    /opt
/dev/sdb1                      ext4     261M   15    261M  1%    8.1T 91M  7.7T  1%   -    /images
tmpfs                          tmpfs    473K   30    473K  1%    379M 16K  379M  1%   -    /run/user/1000
-
OK, I fixed the Apache/HTTPD issue. After much, much, much searching I found this site for a fix and it worked for me.
http://awsadminz.com/httpd-service-main-process-exited-kill-cannot-find-process/
Should I post a separate tutorial as a fix for this issue?
As for the kernel thing, I tried a different PC and it is working fine. I guess I’ll have to figure that one out.