Problems with Optiplex 5260 in an advanced PXE environment
-
Hi,
I’m trying to use FOG (1.5.4) with several Dell Optiplex 5260s.
I can access the FOG PXE boot menu.
If I select “Perform Full Host registration and inventory”, “Quick Registration” or anything else, I only see
bzImage32.xz ok
init_32.xz ok
and then it loops back to the FOG PXE menu.
I’ve tried several kernels, but it doesn’t change anything.
How can I solve this problem or debug it? Thanks
Pierre
-
@pbriec said in Problems with Optiplex 5260:
bzImage32.xz ok
init_32.xz ok
I find this interesting. For some reason iPXE is detecting that it needs the 32 bit kernels to boot. While you can boot a 64 bit machine with 32 bit kernels, you should not get a boot loop. Have you by chance replaced the kernels by hand, or did you use the FOG Configuration page to do it? That system should have a Core i5, so it’s surely a 64 bit machine.
Does the kernel even load, or does it return to the FOG iPXE menu as soon as the kernel and inits are sent?
Some hints I can give you
- Make sure the firmware is up to date on the target computer.
- Make sure secure boot is disabled on the target computer. You can turn it back on after imaging is complete.
Also, until FOG 1.5.5 is released, please downgrade your FOG kernels to version 4.15.2 to bypass the GPT/MBR disk partition creation delay. The developers have the solution worked out for version 1.5.5, but it hasn’t been released yet.
-
Hi @george1421
thanks for your answer.
I’ve tried to change the kernel through the FOG Configuration page. I also tried by hand, but it’s the same.
As soon as the kernel and inits are sent, it returns to the FOG PXE menu. It’s very fast.
What do you mean by latest firmware, the latest BIOS of the PC?
Of course, I’ve disabled Secure Boot.
So I will try tomorrow the kernel 4.15.2, it should work?
thanks
Pierre
-
@pbriec said in Problems with Optiplex 5260:
What do you mean by latest firmware, the latest BIOS of the PC?
Yes, I mean the BIOS of the PC. The issue I was trying to avoid is the naming conflict between BIOS mode, UEFI mode and the actual BIOS firmware that you update on the computer. I see I caused more confusion than clarity.
So I will try tomorrow the kernel 4.15.2, it should work?
4.15.2 will fix a specific issue you WILL HAVE once we can get it to boot correctly. It’s still not clear to me why it’s trying to download the 32 bit kernel and init. If you are IN the iPXE menu, that means you did successfully transfer the right iPXE boot image to the target computer. Where we are stuck is booting FOS (the FOG customized Linux OS that runs on the target computer). Since it bounces back to the iPXE menu right after loading init_32.xz and does not attempt to boot FOS, the kernel must have failed to launch.
So how to fix… I think the first step is to rerun the fog installer using the current settings. It should redownload the fog base kernels and inits over again (I might rename bzImage, bzImage32, init.xz and init_32.xz to something else, that way when the installer runs and replaces the files you can see easily). Then attempt to pxe boot this target computer to see if you get the same results. Rerunning the installer on an installed system will not break it, only reset the default environment.
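The renaming step suggested above could be scripted roughly like this. It is only a sketch: the default directory /var/www/fog/service/ipxe is the one mentioned later in this thread, while the function name backup_fog_kernels and the .pre-reinstall suffix are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: move the stock FOG kernels and inits aside so it is
# obvious afterwards whether the installer dropped in fresh copies.
# Usage: backup_fog_kernels [directory]
backup_fog_kernels() {
    dir="${1:-/var/www/fog/service/ipxe}"   # assumed default location
    for f in bzImage bzImage32 init.xz init_32.xz; do
        if [ -f "$dir/$f" ]; then
            mv "$dir/$f" "$dir/$f.pre-reinstall"
            echo "renamed $f -> $f.pre-reinstall"
        fi
    done
}
```

After calling backup_fog_kernels as root, rerun the installer and check with ls -l that the four original file names are back in that directory with a fresh timestamp.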
-
@pbriec Please follow George’s advice to make sure your setup is right. On the mentioned machine you should definitely see bzImage and init.xz boot up, as this is a 64 bit CPU. Try to find out why it boots bzImage32 first.
From what I read between the lines this seems to be specific to the Optiplex 5260 machines, right? So other clients PXE boot fine and can register as well as capture and deploy images?!
If you have it boot bzImage properly and you still see it looping, then you might want to try some more advanced debug steps. From what I know about this it will be fairly hard to diagnose, but let’s give it a try. Download this kernel image and save it in /var/www/fog/service/ipxe on your FOG server. Now register that client with its MAC address in the web interface by hand and set the option Host Kernel in the host settings to Kernel.SR.SKIP_PCI.4.8.11.64 and Host Init to init.xz. Also set Host Kernel Arguments to earlyprintk=efi.
Now get your camera/smartphone ready to take a video. Rest it on a pile of books so we get a steady picture. Best if you can capture a 60 fps video that we can step through later on. We want to see what’s happening between loading init.xz and jumping back to the FOG menu.
-
I’ve run again the installer. It is the same
In fact, I’m new to the FOG world. I’ve tested PXE with another PC and I can access the Full Inventory. With VMs, it also works. Now I have to deploy 50 PCs with the Optiplex 5260 configuration :-(. I hope I will not have to deploy them by hand.
I don’t need to make a video, as here is the output (no bootloop):
Kernel.SR.SKIP_PCI.4.8.11.64... ok
init.xz... ok
done
exit_boot..._
thanks
Pierre
-
@pbriec Will have to check my code when I get home. Don’t have that here at the moment. But I am wondering about the third line, done. Not sure where this is coming from, but I can’t remember having that in my debugging output. Possibly I am wrong, will see later.
I worked with JJ on a similar issue where it seems to hang on exit_boot. But that is an HP Stream 11 x360 device. I really wonder if we see buggy UEFI firmware in both cases?!?
Can you please try booting Live Linux Distros, like Ubuntu Live, System Rescue Disk or so. Will be interesting to see if those boot properly.
The other thing you can try is booting that downloaded kernel image directly from a UEFI USB boot stick. To create one, format it as FAT32, create the directory X:\EFI\boot\, put the kernel binary into it and name it bootx64.efi. Now boot off this USB stick and see how far you get. If it proceeds beyond the exit_boot stage it will crash in a kernel panic later because there is no root filesystem to mount and boot off. That’s ok. We just want to see if it can get past exit_boot when started in a different way from a USB stick instead of PXE boot.
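On a Linux machine the copy step for that USB stick could look roughly like the sketch below. It assumes the stick is already formatted as FAT32 and mounted somewhere; the function name stage_efi_kernel is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: copy a kernel image onto a mounted FAT32 stick as the
# default UEFI boot file (EFI/boot/bootx64.efi).
# Usage: stage_efi_kernel <kernel-file> <mount-point-of-stick>
stage_efi_kernel() {
    kernel="$1"
    stick="$2"
    if [ ! -f "$kernel" ] || [ ! -d "$stick" ]; then
        echo "usage: stage_efi_kernel <kernel-file> <mount-point>" >&2
        return 1
    fi
    # Create the directory the firmware looks in, then copy under the
    # file name it expects for 64 bit UEFI.
    mkdir -p "$stick/EFI/boot" &&
        cp "$kernel" "$stick/EFI/boot/bootx64.efi"
}
```

For example, with the stick mounted at /mnt/usb (an assumed mount point): stage_efi_kernel Kernel.SR.SKIP_PCI.4.8.11.64 /mnt/usb, then unmount the stick before pulling it out.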
-
@pbriec said in Problems with Optiplex 5260:
I’ve run again the installer. It is the same
Have you confirmed that you have the latest firmware installed on these 5260s? That was also part of my request.
The second thing is what you posted just now:
Kernel.SR.SKIP_PCI.4.8.11.64... ok
init.xz... ok
Now the 64 bit kernels and inits are being sent? What changed? This is what I would have expected the first time. Did the installer fix this issue, or did you specifically define init.xz when you defined the kernel to use? I’m only focusing on FOG system behavior here.
I think the next thing we need to test is if you can usb boot into FOS (FOG customized linux that runs on the target computer). FWIW FOS is the combination of bzImage and the virtual hard drive init.xz.
Review this tutorial, pay special attention to the caveats mentioned. https://forums.fogproject.org/topic/7727/building-usb-booting-fos-image/4
Look at the FOG Forum chat bubble for some additional instructions to take you to step 7 in the tutorial. “Burn” the image onto a USB flash drive of at least 512MB in size. Edit the grub.cfg file as in the tutorial. Then boot from this USB drive and select option 6 from the GRUB menu. The whole purpose of this is to see if FOS runs on the target computer. That will tell us if the error is coming from FOS or iPXE not starting FOS correctly. I’m suspecting the issue is not with FOS, but iPXE, but that is only a guess based on what we know right now.
-
I’ve updated the BIOS firmware, nothing changed.
I’ve tried your FOS and it starts, so the problem comes from iPXE.
Pierre
-
Ok, I’ve made some changes on my systems.
As I already have a PXE server in my company, I’ve tried swapping the PXE servers in the configuration, so now the FOG PXE server is the main PXE server. I’ve made the changes on the DHCP server.
I’ve deleted the host in the FOG GUI and now I can access the full inventory.
How can I use 2 PXE servers with PXE chainloading?
BOOT --> 1st PXE server --> chainloading to FOG iPXE server
What is the best practice?
Here is what I have in the PXE menu config
:fogpxe
dhcp
cpuid --ext 29 && set arch x86_64 || set arch i386
params
param mac0 ${net0/mac}
param arch ${arch}
param platform ${platform}
param product ${product}
param manufacturer ${product}
param ipxever ${version}
param filename ${filename}
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
:bootme
#chain http://172.16.6.157/fog/service/ipxe/boot.php?mac=${net0/mac}
chain http://172.16.0.8/fog/service/ipxe/boot.php?mac=${net0/mac}
In fact, everything was commented out except the last line.
If the lines are uncommented, it doesn’t boot to FOG from my iPXE server.
I’m using the other PXE server for purposes other than FOG.
Can you help me?
-
@pbriec I’m not sure why you’ve gone down this path. But based on your previous test FOS boots to the debug console prompt. So that tells me the issue is not with FOS, but with iPXE not being able to start FOS for some reason. We have seen this before with mainly Lenovo systems with faulty firmware (why I suggested you update the firmware).
I think we are at the point where we need to get @Sebastian-Roth involved with debugging iPXE to find out why it’s not launching the FOS kernel correctly.
-
@george1421 @pbriec Not sure where we are headed with this. Directly booting the kernel as UEFI binary like I suggested seemed to cause the kernel to hang just the same. So I am very confused on what exactly is being done. It doesn’t add up for me because the FOS USB boot suggested by George is very similar to what I asked you to test…
-
@Sebastian-Roth Just a second on chat with OP. He has a different configuration than we are expecting.
-
@Sebastian-Roth OK I have a bit more understanding what is going on here.
The OP has 2 iPXE boot servers. He has a main one that does other things already. He has set up a chain statement in his main pxe boot server to load boot.php from the FOG server. So the booting process is still under control of the iPXE kernel running from the primary pxe boot server (not-FOG server). If he updates his dhcp server to point to the FOG server, this Optiplex 5260 pxe boots correctly. If he chain loads from his main pxe boot server to FOG, it fails. According to the OP the chain loading works with other hardware, just not with this specific model.
From his existing iPXE boot server he is using this command to chain:
chain http://x.x.x.x/fog/service/ipxe/boot.php?mac=${net0/mac}
The version of iPXE running on his main pxe boot server is 1.0.0 1b104 (pretty old).
For bios based systems the OP is using pxelinux.0, for uefi based systems it’s ipxe.efi (on the main pxe boot server).
So to make this work, we need to get his main pxe boot server to load the proper iPXE kernels from the FOG server to support iPXE booting, or possibly replace the iPXE boot kernels on the primary PXE boot server with the newer ones from the FOG server. The only question is if the built in iPXE script in the FOG iPXE kernels would cause his main pxe boot server problems.
I know I talked in a circle here. I needed to get everything down in one place to see if we can come up with a solution.
At this time I don’t feel it’s a FOG component level issue, but the OP’s environment.
One possible solution would be to use the FOG server as the main pxe boot server and then just migrate the menus from the existing main server to the FOG server.
-
Please understand I’m not faulting anything at the moment. What we have is a puzzle and I’m trying to find all of the bits that makeup this environment. I think there is a solution here, we just need to find the right combination.
Looking back over our chat session the OP kept saying iPXE but my intuition was telling me syslinux. Then I looked back for the dhcp config file posted.
option tftp-server-name "172.16.0.8";
option root-path "/data/tftpboot/";
next-server 172.16.0.8;
filename "pxelinux.0";
if option arch = 00:06 {
    filename "efi32/syslinux.efi";
} else if option arch = 00:07 {
    filename "efi64/syslinux.efi";
    filename "ipxe.efi";
} else if option arch = 00:09 {
    filename "efi64/syslinux.efi";
} else {
    filename "pxelinux.0";
}
}
There may be some formatting errors above that the chat introduced into the config file, but I can clearly see syslinux kernels being called. The OP also made reference to pxelinux.cfg in the chat, which again is a syslinux thing.
-
@george1421 OK this problem is making my head explode.
In your main pxe boot server, I assume you have a menu entry that points to the FOG server. The issue we have is we need to replace the in memory iPXE kernel with the FOG iPXE kernels (like what you did when you changed the dhcp option 66 and 67).
So in your main pxe boot server for the menu entry for FOG see if this replaces iPXE with FOG’s iPXE.
:call_fog
set next-server <fog_server_ip>
iseq ${platform} efi && set filename ipxe.efi || set filename undionly.kpxe
chain tftp://${next-server}/${filename}
That should chain load the fog iPXE kernel. There is an embedded script in the FOG iPXE kernel here: https://github.com/FOGProject/fogproject/blob/master/src/ipxe/src/ipxescript
It’s a generic script, the only thing we have to pay attention to is line 29.
chain tftp://${next-server}/default.ipxe ||
What that script will do is again acquire the pxe boot server {next-server} from dhcp, which will point back to your main pxe boot server, but it ignores dhcp option 67 and calls default.ipxe. On your main pxe boot server you need to create a new text file called default.ipxe and in it place this script
#!ipxe
chain tftp://<fog_server_ip>/default.ipxe
That should call the FOG based iPXE boot script. You will be running under the FOG iPXE kernel loaded by the first chain, and the second chain will jump you into the FOG iPXE menu.
Will it work?? Maybe, maybe not, but it sounds good on paper.
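Creating that small default.ipxe on the main pxe boot server could be done by hand or with something like the sketch below. The function name write_fog_chain_script is made up, and the TFTP root and FOG address have to be filled in for the actual environment.

```shell
#!/bin/sh
# Hypothetical helper: write a default.ipxe into the main server's TFTP root
# that simply chains on to default.ipxe on the FOG server.
# Usage: write_fog_chain_script <tftp-root> <fog-server-ip>
write_fog_chain_script() {
    tftp_root="$1"
    fog_ip="$2"
    # Refuse to run without an existing directory and an address.
    [ -d "$tftp_root" ] && [ -n "$fog_ip" ] || return 1
    printf '#!ipxe\nchain tftp://%s/default.ipxe\n' "$fog_ip" \
        > "$tftp_root/default.ipxe"
}
```

For example: write_fog_chain_script /data/tftpboot 192.0.2.10, where /data/tftpboot is the root-path from the dhcp config posted above and 192.0.2.10 only stands in for the FOG server’s address. Note that this overwrites any default.ipxe already present in that directory.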
-
Hi @george1421, thanks for taking the time to solve this problem.
Here is my LAN, where x=0 for site1 and x=4 for site2. Both have the same configuration.
My DHCP/DNS servers are 172.16.x.3
My FileServer/TFTPBOOT servers are 172.16.x.5
My FOG servers are 172.16.x.8
:call_fog
set next-server 172.16.4.8
iseq ${platform} efi && set filename ipxe.efi || set filename undionly.kpxe
chain tftp://${next-server}/${filename}
Here is the output from my VM when I select the new item call_fog
thanks
Pierre
-
@pbriec Oh we are very close. I see it’s doing very much what we need.
- I see that it’s using an updated (different) iPXE kernel since the build number has changed (960d1) and it has all of the proper modules (DNS, FTP, HTTP…). That tells me you have the FOG iPXE kernel running (goal 1).
- It’s chain loading from your main pxe boot server to the FOG server (tftp://172.16.4.5 -> 172.16.4.8). So it’s making it to your FOG server (goal 2).
The issue is it’s not supplying the needed parameters along the way. I think we need to adjust the default.ipxe on your main server to detect the required parameters. We should review the embedded iPXE script from the GitHub site to see what is missing. https://github.com/FOGProject/fogproject/blob/master/src/ipxe/src/ipxescript
The problem is very close to a solution since we are getting all of the bits in place to make this work.
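One way to see what difference the parameters make, without rebooting the client each time, might be to ask boot.php directly from a workstation. The sketch below only builds the request pieces. The parameter names mac0, arch and platform are taken from the menu config posted earlier in this thread; the function names are made up, and it is an assumption that boot.php answers a form POST from curl the same way it answers iPXE.

```shell
#!/bin/sh
# Hypothetical helper: build the form body that the "param" lines of the
# posted menu config would send to boot.php (only three of them, for brevity).
# Usage: fog_boot_params <mac> <arch> [platform]
fog_boot_params() {
    mac="$1"
    arch="$2"
    platform="${3:-efi}"   # iPXE reports "efi" or "pcbios"
    printf 'mac0=%s&arch=%s&platform=%s\n' "$mac" "$arch" "$platform"
}

# Hypothetical helper: URL of boot.php on a given FOG server.
# Usage: fog_boot_url <fog-server-ip>
fog_boot_url() {
    printf 'http://%s/fog/service/ipxe/boot.php\n' "$1"
}
```

With 192.0.2.10 standing in for the FOG server and a made-up MAC, the two cases could then be compared like this: curl -s -d "$(fog_boot_params 00:11:22:33:44:55 x86_64)" "$(fog_boot_url 192.0.2.10)" versus curl -s "$(fog_boot_url 192.0.2.10)?mac=00:11:22:33:44:55", which mirrors the bare chain line used from the main server. If the kernel and init names differ between the two answers, that would point at the missing parameters.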
-
@pbriec Would you post the content of /tftpboot/default.ipxe from your FOG server? The more I look at this setup the more I think it should work as we configured it. I have to be missing something…