Linux kernel
-
-
@Sebastian-Roth said in Linux kernel:
@Maorui2k Sorry for the long delay, I have been away. So it looks like the ubuntu kernel patches don’t make a difference. So next we can try to use the ubuntu kernel config to see if that helps. Find a newly compiled bzImage here. Please try this and let me know!
It didn’t work… Also, the USB keyboard didn’t work with this version, so I couldn’t test the reboot in debug mode manually. I tried two different keyboards and all USB 2/3 ports. The keyboard stopped responding after the new kernel was loaded.
-
@Maorui2k Sorry for the USB issue. Didn’t think that the USB keyboard driver would be built as a module by default. I added it to the kernel and uploaded a new version. Could you please try again?
Have you ever installed Ubuntu 16.04 on that hardware and tried to reboot? I am just wondering if we are on the wrong track with this. Maybe booting a live CD is just a little different.
By the way: Are you working on the exact same PC all the time? Just to rule out that BIOS settings might be causing the issue on one of the PCs?
-
@Sebastian-Roth The reboot/shutdown and USB still didn’t work with the new kernel…
I installed Ubuntu 16.04 desktop and server edition, and both could reboot successfully. Live CD was also fine.
I tried everything on the same PC. I checked all BIOS options and didn’t see anything related to this issue.
-
Moved to Hardware Compatibility.
-
@Maorui2k I found this in the kernel config:
    config X86_REBOOTFIXUPS
        bool "Enable X86 board specific fixups for reboot"
        depends on X86_32
        ---help---
          This enables chipset and/or board specific fixups to be done
          in order to get reboot to work correctly. This is only needed on
          some combinations of hardware and BIOS. The symptom, for which
          this config is intended, is when reboot ends with a stalled/hung
          system.

          Currently, the only fixup is for the Geode machines using
          CS5530A and CS5536 chipsets and the RDC R-321x SoC.
Although it sounds promising at first I don’t think this is what we are after. Those chipsets don’t seem to match yours. And this option is only available on 32 bit kernel builds!! So I highly doubt this would fix the reboot for you.
But this made me think about CPU architecture. Are you sure the 64 bit kernel is loaded by iPXE?? Pay attention to the screen when it says:
    bzImage... ok
    init.xz... ok
Nothing about 32 there, right???
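Besides watching the iPXE screen, the architecture can also be checked from a shell once the client is up, since `uname -m` reports the machine type the running kernel was built for. A minimal sketch; the helper name `kernel_bits` is made up for illustration and is not part of FOG:

```shell
#!/bin/sh
# Map a machine string as printed by `uname -m` to a word size.
# kernel_bits is an illustrative helper, not something FOG ships.
kernel_bits() {
    case "$1" in
        x86_64|amd64) echo 64 ;;
        i[3-6]86)     echo 32 ;;
        *)            echo unknown ;;
    esac
}

# On the client: prints 64 when a 64 bit x86 kernel is running.
kernel_bits "$(uname -m)"
```

A 32 bit kernel running on a 64 bit CPU would show up as i686 here, which is the case the question above is trying to rule out.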
I updated that kernel once again as I still was missing the USB HID settings, too bad. Hope that your keyboard is working now but probably won’t help the reboot anyway.
As well I’ve used the 3.17.3 kernel config from our repo and updated it to build a 4.4 kernel from that. So if anything was in that config it should still be in there. Please try both kernels.
I’ve been reading through this and searched the web over and over again. I just don’t understand why none of the kernels I built is doing any better. Just scratching my head… Why do 3.17.3 and the Ubuntu kernels reboot properly? Boot up a live system again (or from disk if you still have Ubuntu installed) and get a full listing of the kernel message buffer. I feel this could be quite important in finding the issue. Get a FAT32 formatted USB key and run:
    sudo -i
    mkdir /usbkey
    mount -t vfat /dev/sdb1 /usbkey
    dmesg > /usbkey/ubuntu_working.txt
    umount /usbkey
Do the same in a FOG debug task, just without sudo and with a different filename for the dmesg output. Please upload both full outputs and post download links here.
-
@sebastian-roth I checked kernel version, it’s x64. The USB works this time, thx!
I uploaded three logs, pls take a look. https://drive.google.com/open?id=0Bx_soHaLoSYETXhEeUVBRVllNVE
dmesg-Ubuntu-16.04.2.log: original Ubuntu 16.04.2 kernel which has no reboot/shutdown issue
dmesg-44-Ubuntu-config.log: bzImage4.4-pretty-close-to-the-ubuntu-build
dmesg-44-config-3.17.3.log: bzImage4.4-with-config-uped-from-3.17.3
-
@Maorui2k Thanks for the logs! One important piece of information is missing from your post, and I suspect the answer is no: both new kernels still don’t reboot/shutdown, right?
The logs are a really good starting point I think. The most obvious thing is that I had a different kernel version: 4.4.0 for Ubuntu and 4.4.67 for mine. Although this is kind of close I still think we should try to match everything as closely as we can to rule everything out. Thanks for patiently testing all the kernels I upload. This is going to be a long endeavor and we just need to keep going to find what the issue is.
I am building a new kernel now, exact same version, ubuntu patches, ubuntu config. Just to see if we can make it work by imitating the ubuntu kernel as close as possible. Will upload soon.
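For anyone who wants to repeat the comparison of the saved logs, here is a rough sketch of how it could be done in plain sh. The helper names are made up for illustration, and the assumption is that the logs carry the usual `[    0.000000] ` timestamp prefix and the `Linux version ...` banner near the top:

```shell
#!/bin/sh
# Illustrative helpers (names are made up) for comparing saved dmesg logs.

# Remove the leading "[    0.000000] " timestamp so lines can be compared.
strip_ts() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //'
}

# Print the version string from the "Linux version ..." banner on stdin.
kernel_version() {
    strip_ts | sed -n 's/^Linux version \([^ ]*\).*/\1/p' | head -n 1
}

# Show lines that differ between two logs, ignoring timestamps.
# Returns the exit status of diff (0 means no differences).
dmesg_diff() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    strip_ts < "$1" > "$a"
    strip_ts < "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Example (file names as uploaded in this thread):
#   kernel_version < dmesg-Ubuntu-16.04.2.log
#   dmesg_diff dmesg-Ubuntu-16.04.2.log dmesg-44-Ubuntu-config.log
```

Stripping the timestamps first matters because otherwise nearly every line differs between two boots and the real differences drown in noise.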
-
It’s good to redo things and have a closer look. I found out that I had probably used the wrong ubuntu kernel config file last time. Got it here and thought that was good. But turns out the real original configs are included in the Ubuntu patchset for this kernel.
Please understand I am now trying to keep this kernel build as close to the ubuntu kernel as possible. So I try to only change kernel settings that are absolutely necessary to make this kernel boot on your machine. You won’t be able to boot this on many other platforms and I really hope I get all the settings right the first time.
So I uploaded two new kernels (download - one is still building and I will upload soon!). Please test both these kernels and let me know. init.xz is still the same as always.
Edit: Ok I give up after hours of watching it compile. Trying to build a kernel without module support (all in one big blob) but having all the drivers included that Ubuntu does is just insane. So that leaves you with that one kernel to test with for now. Possibly I will generate an initrd file including all the ubuntu modules in case this kernel is still not rebooting/shutting down your hardware properly.
-
@sebastian-roth I was away for days again… I uploaded the new logs here. https://drive.google.com/open?id=0Bx_soHaLoSYETXhEeUVBRVllNVE Reboot/shutdown still failed.
I would be happy to help in this investigation, so don’t hesitate to ask for more testing and logs. I’ve dedicated the same PC to this testing.
-
@Maorui2k I start to feel really dumb! Again the version number in the kernel log (thanks!) is different to the Ubuntu one that seems to work. One issue is that the Ubuntu packages page links to the “wrong” (?) patch source (right side under “Download Source Package”). Eventually I found the correct patch but I am not sure if this would make a difference.
So let’s try something new. Now we want to use the very original Ubuntu kernel bzImage-4.4.0-62-generic, and because it does not have all the needed modules compiled into the kernel I built a special FOG init-ubuntu.xz for you which has the kernel modules and loads the important ones on startup. As always, download and put those in /var/www/fog/service/ipxe (back up the original ones first). For this bigger initrd file you also need to increase FOG_KERNEL_RAMDISK_SIZE in the FOG Settings (FOG web GUI) to 172032. Boot your client and see if it all comes up properly and we’ll see if shutdown/reboot is working with this very Ubuntu kernel!!!
As I said I found the correct patch I think but still the version numbers do not match. I think I am kind of lost with this as I don’t really know how exactly Ubuntu builds their kernels. I might not be that far off - at least I was able to build several different kernels that booted on your machine. Still none of them actually made it shut down/reboot properly. In case you want to build a kernel yourself, follow these steps:
Download original kernel code and ubuntu patch and run the following commands:
    cd /path/to/downloads
    gunzip linux_4.4.0-62.83.diff.gz
    tar xzf linux_4.4.0.orig.tar.gz
    cd linux-4.4
    patch -p1 <../linux_4.4.0-62.83.diff
    cp debian.master/config/config.common.ubuntu .config
    cat debian.master/config/amd64/config.common.amd64 >> .config
    cat debian.master/config/amd64/config.flavour.generic >> .config
Then edit the .config file and make sure the following kernel settings are all compiled into the kernel (CONFIG_...=y) instead of just built as a module (CONFIG_...=m):
    CONFIG_BLK_DEV_RAM=y
    CONFIG_R8169=y
    CONFIG_USB_STORAGE=y
    CONFIG_HID=y
    CONFIG_HID_GENERIC=y
    CONFIG_HID_LOGITECH=y
    CONFIG_USB_HID=y
    CONFIG_SATA_AHCI=y
    CONFIG_SATA_AHCI_PLATFORM=y
    CONFIG_PATA_OLDPIIX=y
    CONFIG_PATA_MPIIX=y
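Since the .config is glued together from three fragments, the same option can show up more than once, and it is easy to miss one that is still =m. A small sketch for double-checking the edit; `check_builtin` is a made-up helper name, and it assumes that when an option is assigned several times the last assignment is the one that counts:

```shell
#!/bin/sh
# check_builtin is an illustrative helper: it reports whether each given
# option is set to =y in a kernel config file. If an option is assigned
# more than once, only the last assignment is inspected (assumption:
# the last one is what ends up being used).
check_builtin() {
    cfg=$1
    shift
    rc=0
    for opt in "$@"; do
        val=$(grep "^${opt}=" "$cfg" | tail -n 1)
        if [ "$val" = "${opt}=y" ]; then
            echo "built-in  $opt"
        else
            echo "CHECK     $opt (${val:-not set})"
            rc=1
        fi
    done
    return $rc
}

# Example, run inside the linux-4.4 directory:
#   check_builtin .config CONFIG_BLK_DEV_RAM CONFIG_R8169 CONFIG_USB_STORAGE \
#       CONFIG_HID CONFIG_HID_GENERIC CONFIG_HID_LOGITECH CONFIG_USB_HID \
#       CONFIG_SATA_AHCI CONFIG_SATA_AHCI_PLATFORM CONFIG_PATA_OLDPIIX \
#       CONFIG_PATA_MPIIX
```

The function returns non-zero if any of the listed options is not =y, so it can also be used as a guard before starting a long compile.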
Now you are done preparing. Go ahead and compile the kernel:
    make oldconfig
    make bzImage
… which will/should, after some time, result in
    Kernel: arch/x86/boot/bzImage is ready...
Grab this kernel binary and the init.xz I provide in my google drive and you should be running your very own kernel.
-
@sebastian-roth I got a kernel panic this time. The last line of the error message was “VFS: unable to mount root fs on unknown-block(1,0)”. I wonder if the initrd.xz was broken.
Thanks for the detailed instructions on compiling the kernel! I found some Wiki pages on the Ubuntu website, but failed to get through them on Ubuntu 16.04. I will try your way this time.
-
@Maorui2k I just downloaded kernel and initrd again and it worked great. Are you sure you used the right files, bzImage-4.4.0-62-generic and init-ubuntu.xz? As well you need to increase FOG_KERNEL_RAMDISK_SIZE (FOG web interface -> FOG Settings)! If I don’t change that I get the “VFS: unable to mount root fs on unknown-block(1,0)” as well. Change this setting to 172032 and it should work.
-
@sebastian-roth I might have forgotten to save the FOG_KERNEL_RAMDISK_SIZE parameter… The boot is fine now. And the new kernel has around 50% chance to reboot/shutdown!
It seems I had made too few reboot/shutdown attempts before. The official FOG 4.4.0 kernel has a similar chance. So I went through the testing again with more attempts. Here is the result table.
    Kernel                      Attempts  Passed  Failed
    3.17.3                      10        10      0
    bzImage-4.4.0-62-generic    22        12      10
    Fog-4.4.0                   10        4       6
    Other 4.4 you compiled      28        14      14
    Fog-4.8.11                  10        0       10
    Ubuntu-16.04                12        12      0
It seems the kernel got some changes between 4.4 and 4.8 which made things much worse, and Ubuntu 16.04 indeed got something fixed here. Also, reboot/shutdown on Ubuntu 16.04 looked smoother. The kernels you provided would wait 1-2 s before the reboot/shutdown actually took effect.
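To make the rows easier to compare, the counts from the table above can be turned into pass rates. A small sketch in plain sh; `pass_percent` is a made-up helper and the row label "Other-4.4-builds" is just a shortened form of the "Other 4.4 you compiled" row:

```shell
#!/bin/sh
# pass_percent PASSED ATTEMPTS -> pass rate in whole percent (rounded).
# Illustrative helper; integer arithmetic only, so busybox sh is enough.
pass_percent() {
    echo $(( (100 * $1 + $2 / 2) / $2 ))
}

# Counts copied from the table above: label, attempts, passed.
while read -r label attempts passed; do
    printf '%-26s %3s%%\n' "$label" "$(pass_percent "$passed" "$attempts")"
done <<'EOF'
3.17.3                    10 10
bzImage-4.4.0-62-generic  22 12
Fog-4.4.0                 10  4
Other-4.4-builds          28 14
Fog-4.8.11                10  0
Ubuntu-16.04              12 12
EOF
```

This gives 100% for 3.17.3 and Ubuntu-16.04, 55% for bzImage-4.4.0-62-generic, 50% for the other 4.4 builds, 40% for Fog-4.4.0 and 0% for Fog-4.8.11. With only 10 to 28 attempts per row these numbers are rough.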
I uploaded two dmesg logs https://drive.google.com/open?id=0Bx_soHaLoSYETXhEeUVBRVllNVE
dmesg-bzImage-fog-4.4.0.log & dmesg-bzImage-4.4.0-62-generic.log
-
@Maorui2k Oh, this is a new turn in this story. Thanks for the thorough testing.
The boot is fine now. And the new kernel has around 50% chance to reboot/shutdown!
Hmm, I don’t really understand. Where is this kernel in the table? Is it the last line (100% ok) or is it not even on the list?
How can it be that Ubuntu from CD does reboot/shutdown properly but it won’t when booted over iPXE? And why is kernel 3.17.3 fine…???
-
@sebastian-roth The new kernel is bzImage-4.4.0-62-generic, and the other 4.4.0 based kernels had similar results. Fog-* means the official FOG kernels. The last line is the official Ubuntu 16.04 kernel. I guess APIC or ACPI or some related module has a stability/compatibility issue, and Ubuntu fixed it.
-
@Maorui2k Ok, so I compared the dmesg outputs of bzImage-4.4.0-62-generic (original Ubuntu kernel but PXE booted plus modules packed into the init.xz) and Ubuntu-16.04 (original Ubuntu kernel booted via CD). There are still some minor differences that mostly come from the fact that Ubuntu does load more kernel modules on boot. So I added some more of those to the initrd. Give the new init-ubuntu.xz a try but I am pretty sure the drivers won’t make a difference.
The other thing I noticed is that all dmesg outputs except the Ubuntu 16.04 one show the below messages:
    Misrouted IRQ fixup and polling support enabled
    This may significantly impact system performance
This comes from FOG adding the irqpoll kernel parameter. I am not sure if this is causing the issue but I would ask you to give this a try. Make a backup copy of the file /var/www/fog/lib/fog/bootmenu.class.php and then edit it. There should be two lines with irqpoll (e.g. line 895 and 1577 in FOG version 1.4.4). Just comment out those two lines (// in PHP). It should then look like this:
    ...
    "osid=$osid",
    // "irqpoll",
    "chkdsk=$chkdsk",
    ...
Save the file and PXE boot again with the bzImage-4.4.0-62-generic kernel and init-ubuntu.xz. Try several times to see if this has any positive effect. You can also check in the dmesg output (dmesg | grep Misrouted).
-
@sebastian-roth The new initrd brought a little improvement: I did 20 reboots/shutdowns and 70% passed. I’m not sure if this is just chance.
The irqpoll change had no effect, and it caused a hang during the boot process before the network initialization messages. The pass ratio was still around 70% over 10 reboots/shutdowns.
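On the question of whether 70% out of 20 could just be luck: a rough way to judge it is to ask how often a kernel that really passes about half the time (the earlier estimate in this thread) would still show 14 or more passes in 20 tries. A minimal sketch using awk; `binom_tail` is a made-up helper, and treating the attempts as independent with a fixed 50% pass rate is an assumption:

```shell
#!/bin/sh
# binom_tail N K P -> probability of at least K successes in N independent
# tries, each succeeding with probability P (0 < P < 1).
# Illustrative helper; it walks through the binomial terms with
#   P(X = i+1) = P(X = i) * (N - i) / (i + 1) * P / (1 - P)
binom_tail() {
    LC_ALL=C awk -v n="$1" -v k="$2" -v p="$3" 'BEGIN {
        term = (1 - p) ^ n              # P(X = 0)
        sum = 0
        for (i = 0; i <= n; i++) {
            if (i >= k) sum += term     # add P(X = i) for i >= K
            term = term * (n - i) / (i + 1) * p / (1 - p)
        }
        printf "%.4f\n", sum
    }'
}

# 70% of 20 attempts is 14 passes; assume a true pass rate of 50%.
binom_tail 20 14 0.5
```

This prints 0.0577, so a kernel with a true 50% pass rate would show a result at least this good in roughly 6% of such 20-attempt runs. That makes 70% over 20 attempts only weak evidence of a real improvement, and more attempts are needed to tell.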
-
@Maorui2k Could you please grab a new dmesg output of the latest initrd? I’d compare those again and hopefully we’ll figure this out at some point.
This is all really strange. I could imagine it making a difference if you boot Ubuntu from CD versus PXE booting it. But why the heck does the 3.17.3 kernel work even though it is PXE booted as well? Could you please do some more testing on those three (at least 25, better 50 each):
- 3.17.3 kernel - PXE boot
- bzImage-4.4.0-62-generic and latest initrd - PXE boot
- Ubuntu-16.04 - CD boot
Does it make a difference if you cold boot (from fully switched off state) or warm boot?
-
@sebastian-roth Log uploaded: https://drive.google.com/open?id=0Bx_soHaLoSYETXhEeUVBRVllNVE (dmesg-bzImage-4.4.0-62-generic-with-new-initrd.log)
I did 50 reboots/shutdowns each. Here is the result.
- 3.17.3 kernel - PXE boot: 46 passed, 4 failed.
- bzImage-4.4.0-62-generic and latest initrd - PXE boot: 27 passed, 23 failed.
- Ubuntu-16.04 - CD boot: 50 passed, 0 failed.
It seems 3.17.3 also has a small chance of failure, but that is acceptable for me.
Is it possible to use Ubuntu 16.04 kernel and generate an initrd from it? It’s the best one.