Client hangs at EFI stub:
-
@george1421 Thank you all for your continued support in trying to figure out why this is not working. I appreciate all the work and time you have put into this for me and I have learned a lot so far about how this all works.
-
@sgilbe Do you still have access to this server?
-
@george1421 Yes I do
-
@sgilbe Well my first attempt to rebuild the kernel gave me the same results as you. Not what I expected so I need to work a bit more. If I can get something that boots in the next day or so, are you willing to test to see if it resolves your booting issue?
-
@george1421 I am more than willing to test any kernel that you have for me to try. If you need to build several to test different configurations, I can work with that as well. It does not take much time to switch between kernels with the USB drive. I have CentOS on the system and can map my share so I can just grab and add the kernels as needed.
-
@george1421 Any update on a new kernel to try out? Just checking in.
-
@george1421 What are the required FOS settings? I have been trying to build a FOS bzImage but have had no success getting it to boot yet. Would I need a new init.xz if I move to a newer kernel?
-
@sgilbe I haven’t yet found the right combination for starting from a clean kernel config and getting it to run even on a standard system. I do have to admit I haven’t had a lot of extra time lately to work on this.
As for needing a new init.xz: it’s not at that point yet. The kernel boots, initializes the hardware, and then hands off to init.xz to start up Linux. The issue at this point is within the kernel itself. It may be, as Sebastian mentioned, that Ubuntu added a patch to make the kernel boot. I’m not ready to give up; there has to be a solution here.
-
@sgilbe @george1421 I think I have an idea of what kernel modules need to be enabled for this type of CPU. I don’t have my dev laptop with me now, but I’ll work on it tonight/tomorrow so you can try it out.
-
@rodluz If you have an idea, I’m interested, since I can’t seem to get the FOS kernel to boot on this hardware, and without having the hardware in hand it’s difficult to debug the issue too.
The FOS Linux original kernel configuration to start with is here: https://github.com/FOGProject/fos/blob/master/configs/kernelx64.config
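One way to narrow down which options matter is to diff the FOS config against the config of a kernel that does boot on this hardware (for example, one copied from a working distro install). Below is a minimal Python sketch of that comparison. The function names and the command-line file paths are hypothetical, and treating an option that is missing from one file as "n" is an assumption (it may simply not exist in that kernel version).

```python
import re
import sys

# Lines in a kernel .config look like "CONFIG_FOO=y" or "# CONFIG_FOO is not set".
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return a dict mapping option name to its value ('y', 'm', 'n', or a literal)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(base, other):
    """Return {option: (base_value, other_value)} for every option that differs.

    Assumption: an option absent from one file is treated as 'n'.
    """
    changes = {}
    for name in sorted(set(base) | set(other)):
        base_value = base.get(name, "n")
        other_value = other.get(name, "n")
        if base_value != other_value:
            changes[name] = (base_value, other_value)
    return changes


# Hypothetical usage: python diffconfig.py kernelx64.config working.config
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f:
        base_options = parse_config(f.read())
    with open(sys.argv[2]) as f:
        other_options = parse_config(f.read())
    for option, (old, new) in diff_configs(base_options, other_options).items():
        print(f"{option}: {old} -> {new}")
```

The output will likely be long when the two kernels are different versions, so it is a starting point for picking options to test rather than a list of required changes.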
-
@sgilbe Try this kernel out. https://drive.google.com/drive/folders/1sP6dfRymYaFTCr8iRiK64hN2pp2X836n?usp=sharing
This is kernel 6.5.6 with some config changes specific for gen 3/4 scalable Xeon CPUs. Please let us know if this works so I can document the changes.
Something else to look at… I had an issue like this with another Linux system last week. The issue turned out to be a Mellanox 40G PCIe card not playing nicely with the kernel. Have you tried taking non-essential PCIe cards out of the host to test?
-
@rodluz Thank you for your help. I have tested the kernel by putting it on the USB drive that I have set up for FOG, and it is still hanging.
As far as PCIe cards go, I will get a list from lshw and post it here when I get a chance, most likely later today.
-
@george1421 If possible would it be helpful if I could get you remote access to the system?
-
@rodluz Here are the devices in the system: lshw.txt lshw-businfo.txt
Let me know if that helps.
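For anyone who wants to pull just the add-in cards out of a listing like this, here is a rough Python sketch that parses `lshw -businfo` output and filters the PCI entries. It assumes the usual fixed-width layout with a `Bus info  Device  Class  Description` header line; the function names are made up for illustration, and the actual contents of the attached files are not reproduced here.

```python
import sys

COLUMNS = ("Bus info", "Device", "Class", "Description")
KEYS = ("bus", "device", "class", "description")


def parse_businfo(text):
    """Parse `lshw -businfo` text into a list of dicts, one per device row.

    Assumption: columns are aligned under the header names, so the
    header positions can be used to slice each row.
    """
    lines = text.splitlines()
    header_index = next(
        i for i, line in enumerate(lines) if line.startswith("Bus info")
    )
    header = lines[header_index]
    starts = [header.index(name) for name in COLUMNS]
    ends = starts[1:] + [None]  # the last column runs to the end of the line

    rows = []
    for line in lines[header_index + 1:]:
        stripped = line.strip()
        if not stripped or set(stripped) == {"="}:
            continue  # skip blank lines and the "====" separator
        fields = [line[start:end].strip() for start, end in zip(starts, ends)]
        rows.append(dict(zip(KEYS, fields)))
    return rows


def pci_devices(rows, device_class=None):
    """Return rows on the PCI bus, optionally limited to one lshw class."""
    return [
        row
        for row in rows
        if row["bus"].startswith("pci@")
        and (device_class is None or row["class"] == device_class)
    ]


# Hypothetical usage: python businfo.py lshw-businfo.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        parsed = parse_businfo(f.read())
    for row in pci_devices(parsed, device_class="network"):
        print(row["bus"], row["device"], row["description"])
```

Filtering on the `network` class is only an example; passing no class lists every PCI device, which is a quick way to pick candidates to pull from the host for testing.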
-
@rodluz I can also try and get you remote access to this system as well if that would help in debugging this issue.
-
@sgilbe Have you tried removing the QSFP card to see if it is still giving you that issue? I doubt that it’s the problem, but it wouldn’t hurt to try.
I’ll keep looking at the kernel config options to see if I find something else that could be missing. It may be a lot of back-and-forth trying different kernel options, since I don’t have any system with those CPUs.
-
@sgilbe I made a few changes to the kernel. Can you try the new one here? https://drive.google.com/drive/folders/1sP6dfRymYaFTCr8iRiK64hN2pp2X836n
-
@rodluz I will try and remove the QSFP card and am trying the new kernel now. Will let you know of the outcome.
-
@rodluz It is still hanging. Removing card now.
-
@sgilbe After removing the card it is still hanging.