Kernel panic - not syncing: Attempted to kill init!

ManofValor

Server

FOG Version: 1.3.0-RC-26
OS: Centos 7

Client

Service Version:
OS: Win 7

Description

I went to check on my FOG server because it had been a while and, after updating everything, ran an image on a test machine to make sure it was still working. I get a different screen each time I reboot. See the examples below.

Haven’t been able to find anything that helps me on the forum or net.

Tom Elliott

More often than not, this is presented if you’ve crossed inits and kernels (32-bit init with 64-bit kernel or vice versa).

That said, based on the “inode referenced” issue, this could be caused by a few factors.

If you edited the inits and failed to compress them correctly.
If the init was not fully downloaded.
Memory on the device is bad or going bad.
Memory on the server is bad or going bad.
Disk space is full, causing the file to fail to download fully.
An HDD going bad on the server, causing the file to be sent in a corrupted state (at random).

Of course, I’m sure there are more potential causes, but start with the basics first.
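As a rough sketch of how the first two causes could be checked on the server: `xz -t` tests an archive's integrity without extracting it. The directory below is an assumption (a default FOG web root on CentOS), and `init_ok` is a hypothetical helper name, not part of FOG.

```shell
# Assumed location of the FOG boot files; override IPXE_DIR if yours differs.
IPXE_DIR="${IPXE_DIR:-/var/www/html/fog/service/ipxe}"

# Return 0 if the given .xz archive passes xz's built-in integrity test.
# Requires the xz command to be installed.
init_ok() {
    xz -t "$1" 2>/dev/null
}

# Check both the 64-bit and 32-bit inits (file names assumed).
for f in "$IPXE_DIR"/init.xz "$IPXE_DIR"/init_32.xz; do
    if [ ! -f "$f" ]; then
        echo "missing: $f"
    elif init_ok "$f"; then
        echo "ok: $f"
    else
        echo "corrupt or truncated: $f"
    fi
done
```

A passing test only shows the file is a complete xz stream; it does not prove the init matches the kernel's architecture.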

Seeing as I know, at some point, your disk space has been full, I’d say start there (though this seems unlikely if the panic is always different).
If this all checks out, try reseating the host’s RAM (maybe try known-good RAM too?).
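A minimal sketch of the disk-space check, assuming GNU or POSIX `df`: it lists mount points at or above a usage threshold. `full_mounts` is a hypothetical helper name and the 90 percent threshold is arbitrary.

```shell
# Print mount points whose usage is at or above a threshold (percent).
# Reads `df -P` style output on stdin, so it also works with `df -Pi` (inodes),
# because both formats put the percentage in column 5 and the mount point in column 6.
full_mounts() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= t + 0) print $6 }'
}

echo "Space at or above 90%:"
df -P | full_mounts 90

echo "Inodes at or above 90%:"
df -Pi | full_mounts 90
```

Empty output under both headings means no filesystem is close to full right now; it says nothing about whether one filled up earlier.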

There’s more to try out first, but please just start here and see what you might find.

Wayne Workman

@Tom-Elliott said in Kernel panic - not syncing: Attempted to kill init!:

Seeing as I know, at some point, your disk space has been full, I’d say start there

Please ask us about how to partition your disks to avoid a full root partition.

ManofValor

@Tom-Elliott
So I checked the partition sizes:

[root@localhost fogadmin]# df -h --o
Filesystem                     Type     Inodes IUsed IFree IUse%  Size  Used Avail Use% File Mounted on
/dev/mapper/centos00-root00    ext4       1.3M  158K  1.1M   13%   20G  7.5G   12G  41% -    /
devtmpfs                       devtmpfs   470K   494  469K    1%  1.9G     0  1.9G   0% -    /dev
tmpfs                          tmpfs      473K     9  473K    1%  1.9G  4.5M  1.9G   1% -    /dev/shm
tmpfs                          tmpfs      473K   643  473K    1%  1.9G  8.9M  1.9G   1% -    /run
tmpfs                          tmpfs      473K    13  473K    1%  1.9G     0  1.9G   0% -    /sys/fs/cgroup
/dev/sda5                      ext4        63K   365   63K    1%  969M  329M  574M  37% -    /boot
/dev/mapper/fog-opt_fog_images ext4        26M   10K   26M    1%  395G   80G  295G  22% -    /opt
/dev/sdb1                      ext4       261M    15  261M    1%  8.1T   91M  7.7T   1% -    /images
tmpfs                          tmpfs      473K    29  473K    1%  379M   12K  379M   1% -    /run/user/1000

I re-seated the memory and now I’m getting this new screen:

Should I start a new post for this one?

ManofValor

@ManofValor
By the way, this screen is consistent now.

Tom Elliott

@ManofValor Then, based on the fact that it seems to be pointing at network, I’d say check your patch cable and make sure the connection is solid.

ManofValor

Not sure if this was the right thing to try but I ran:

git clone git://git.ipxe.org/ipxe.git

And it ended with this:

  [AR] bin/blib.a
ar: creating bin/blib.a
  [HOSTCC] util/zbin
util/zbin.c:7:18: fatal error: lzma.h: No such file or directory
 #include <lzma.h>
                  ^
compilation terminated.
make: *** [util/zbin] Error 1

Tom Elliott

@ManofValor No, it didn’t. That looks like you downloaded the iPXE repository and then ran:

make

After that, as it was building, you received that error.

For LZMA, I would suspect you need the lzma-devel packages (not sure of the exact name, but I think it’s xzutils-devel and xzutils on Fedora/Red Hat based systems).

Tom Elliott

@Tom-Elliott Sorry, for Red Hat/CentOS/Fedora it appears to be:

yum -y install xz-devel
ldconfig

That should do the trick (no guarantees), though I don’t know why you need to download the iPXE repository.
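A small sketch for confirming the header is present before re-running `make`. `have_header` is a hypothetical helper, and `/usr/include` is assumed to be the include directory the package installs into.

```shell
# Return 0 if a C header exists under an include directory (default /usr/include).
have_header() {
    [ -f "${2:-/usr/include}/$1" ]
}

if have_header lzma.h; then
    echo "lzma.h found"
else
    echo "lzma.h missing; install xz-devel first"
fi
```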

ManofValor

@Tom-Elliott
I just tried it because I went to https://ipxe.org/0f0a6039, from the screenshot, and that was one of the suggestions.
I’ll try that and see…

ManofValor

Well, that didn’t work. The cable is fine. I get internet and I can ping it.

Tom Elliott

@ManofValor Cable being fine doesn’t tell us much. (You pulled it and put in a new one just in case?)

Where are you getting the ipxe error from? The error from iPXE is not a kernel panic.

Tom Elliott

Seeing as the error that you’re seeing states “Connection reset” that could mean any number of things, though I wouldn’t know where to begin looking.

Maybe you have port security turned on? I don’t know. If that’s the problem now, it would (probably) indicate a problem with the specific system you’re seeing these issues on.

First it was RAM, now the NIC won’t maintain a connection, etc…???
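One hedged way to narrow down a “Connection reset” is to request the boot script over HTTP from another machine on the same network and see whether the server answers at all. The URL path below is an assumption based on a default FOG web root. `FOG_SERVER` and `boot_url` are placeholder names.

```shell
# Build the URL that the boot script is assumed to live at on a FOG server.
boot_url() {
    printf 'http://%s/fog/service/ipxe/boot.php\n' "$1"
}

# Set FOG_SERVER to the server's IP or hostname before running.
if [ -n "${FOG_SERVER:-}" ] && command -v curl >/dev/null 2>&1; then
    # Print only the HTTP status code; curl prints 000 when no HTTP response
    # was received (for example, connection refused or reset).
    curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$(boot_url "$FOG_SERVER")" || true
fi
```

A 200 points back toward the client or the network path. A 000 or a 5xx status suggests looking at the web server on the FOG host.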

ManofValor

I started thinking the same thing. I’m going to try a different PC and see what happens.

ManofValor

I believe httpd is the problem right now. I tried to go into FOG management and got an "Unable to connect" page. So I tried to run the installer, and apache2 failed to start. I’ve looked at a lot of sites describing a similar issue and cannot determine a solution that works for me. I’ve started, restarted, stopped, reloaded, killed, and tried to update apache/httpd. Here is some info that I hope can help.

[root@localhost httpd]# service httpd start
Redirecting to /bin/systemctl start  httpd.service
Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.
[root@localhost httpd]# systemctl status httpd.service
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2016-12-01 15:08:00 CST; 30s ago
     Docs: man:httpd(8)
           man:apachectl(8)
  Process: 13974 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=1/FAILURE)
  Process: 13959 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 13959 (code=exited, status=1/FAILURE)

Dec 01 15:08:00 localhost.localdomain systemd[1]: Starting The Apache HTTP Server...
Dec 01 15:08:00 localhost.localdomain httpd[13959]: AH00558: httpd: Could not reliably determine the server's fully qu...ssage
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
Dec 01 15:08:00 localhost.localdomain kill[13974]: kill: cannot find process ""
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service: control process exited, code=exited status=1
Dec 01 15:08:00 localhost.localdomain systemd[1]: Failed to start The Apache HTTP Server.
Dec 01 15:08:00 localhost.localdomain systemd[1]: Unit httpd.service entered failed state.
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
[root@localhost httpd]# journalctl -xn
-- Logs begin at Wed 2016-11-30 13:17:57 CST, end at Thu 2016-12-01 15:08:31 CST. --
Dec 01 15:08:00 localhost.localdomain kill[13974]: kill: cannot find process ""
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service: control process exited, code=exited status=1
Dec 01 15:08:00 localhost.localdomain systemd[1]: Failed to start The Apache HTTP Server.
-- Subject: Unit httpd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit httpd.service has failed.
-- 
-- The result is failed.
Dec 01 15:08:00 localhost.localdomain systemd[1]: Unit httpd.service entered failed state.
Dec 01 15:08:00 localhost.localdomain systemd[1]: httpd.service failed.
Dec 01 15:08:01 localhost.localdomain dhcpd[2890]: DHCPDISCOVER from 36:02:86:28:b1:8b (MCWPL53) via enp30s0
Dec 01 15:08:01 localhost.localdomain dhcpd[2890]: DHCPOFFER on 10.10.0.10 to 36:02:86:28:b1:8b (MCWPL53) via enp30s0
Dec 01 15:08:01 localhost.localdomain polkitd[777]: Unregistered Authentication Agent for unix-process:13944:9300666 (system b
Dec 01 15:08:28 localhost.localdomain dhcpd[2890]: DHCPINFORM from 10.10.1.167 via enp30s0: not authoritative for subnet 10.10
Dec 01 15:08:31 localhost.localdomain dhcpd[2890]: DHCPINFORM from 10.10.1.130 via enp30s0: not authoritative for subnet 10.10

I just don’t understand it enough to completely know what I’m looking at.
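For anyone reading similar output, a hedged sketch of how to get at the underlying error. AH00558 is normally only a warning about a missing ServerName. The `kill: cannot find process ""` line comes from the unit's ExecStop command running after the main process had already exited, so neither is likely to be the root cause on its own. The commands are standard on CentOS 7. `httpd_lines` is a hypothetical helper, and the log path is the distribution default, assumed here.

```shell
# Keep only journal lines written by the httpd process itself,
# dropping the systemd and kill follow-up messages.
httpd_lines() {
    grep -E ' httpd\[[0-9]+\]: '
}

# Full, untruncated messages for the unit (-l disables ellipsizing).
if command -v journalctl >/dev/null 2>&1; then
    journalctl -u httpd.service -l --no-pager | httpd_lines || true
fi

# Ask Apache to validate its configuration without starting the service.
if command -v httpd >/dev/null 2>&1; then
    httpd -t || true
fi

# Apache's own error log often holds the real reason for the exit.
if [ -r /var/log/httpd/error_log ]; then
    tail -n 20 /var/log/httpd/error_log
fi
```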

Tom Elliott

@ManofValor I’m going to guess that your main filesystem (which doesn’t appear to be a problem) is full again. I say that because all was working, then it wasn’t.

Tom Elliott

@ManofValor That, and httpd has absolutely nothing to do with DHCP, TFTP, or the FOS system with kernel panics.

ManofValor

File system still seems fine:

Filesystem                     Type     Inodes IUsed IFree IUse%  Size  Used Avail Use% File Mounted on
/dev/mapper/centos00-root00    ext4       1.3M  164K  1.1M   13%   20G  7.7G   11G  42% -    /
devtmpfs                       devtmpfs   470K   494  469K    1%  1.9G     0  1.9G   0% -    /dev
tmpfs                          tmpfs      473K    10  473K    1%  1.9G  5.3M  1.9G   1% -    /dev/shm
tmpfs                          tmpfs      473K   645  473K    1%  1.9G   17M  1.9G   1% -    /run
tmpfs                          tmpfs      473K    13  473K    1%  1.9G     0  1.9G   0% -    /sys/fs/cgroup
/dev/sda5                      ext4        63K   365   63K    1%  969M  329M  574M  37% -    /boot
/dev/mapper/fog-opt_fog_images ext4        26M   10K   26M    1%  395G   80G  295G  22% -    /opt
/dev/sdb1                      ext4       261M    15  261M    1%  8.1T   91M  7.7T   1% -    /images
tmpfs                          tmpfs      473K    30  473K    1%  379M   16K  379M   1% -    /run/user/1000

ManofValor

OK, I fixed the Apache/HTTPD issue. After much, much, much searching I found this site for a fix and it worked for me.
http://awsadminz.com/httpd-service-main-process-exited-kill-cannot-find-process/
Should I post a separate tutorial as a fix for this issue?
As for the kernel thing, I tried a different PC and it is working fine. I guess I’ll have to figure that one out.
