Cannot exit IPXE menu and boot from hard drive?
-
@Sebastian-Roth I did not know that refind.efi was an older version and refind_x64.efi was a newer one. That’s a nice touch if it’s still true.
Also, @wmw509, sorry for not double-checking my guide. I forgot that you had to rename the files after extracting. I have a script that restores my rEFInd binary and config on every update, so I haven’t had to extract and look at the file names in quite some time.
-
No worries! The files were actually named the same as they are in FOG in one of the packages I installed. It was just that refind.efi that threw me for a bit of a loop.
Seems like refind might not be what was causing me problems after all
-
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
Seems like refind might not be what was causing me problems after all
As long as platform == pcbios, then it is working on the BIOS side, but that doesn’t line up with what you reported from inside Ubuntu. Are you working with multiple pieces of hardware? Or is iPXE misreporting the platform?
OK, I’m trying to think how this is possible (iPXE reporting pcbios and Ubuntu reporting efi). I think I might have an idea.
Some firmware is dynamic: it switches between BIOS and UEFI mode depending on the boot media it detects. If the boot media is BIOS based, it switches into BIOS mode; if the firmware detects EFI boot media, it switches into UEFI mode. There is no firmware switch; it’s all dynamic.
So let’s assume you are PXE booting this computer and you send undionly.kpxe to it. The computer will see that it’s a BIOS boot loader and boot it in BIOS mode. FOG imaging works just fine in BIOS mode, and you can deploy a UEFI image with FOG in BIOS mode. Then, after the imaging reboot, you again send undionly.kpxe to the target computer. Again it sees a BIOS boot loader, so it switches to BIOS mode. Now you are in the iPXE menu and the menu times out to boot from the disk, but SANBOOT hangs because it’s trying to chain to a BIOS-mode disk while the disk contains a UEFI boot loader. When you change the boot order so the disk is first, the firmware sees that the disk is a UEFI disk, switches to UEFI mode, and boots from the hard drive.
I think this is how iPXE (undionly.kpxe) is reporting pcbios while the actual OS is UEFI. There are a lot of suppositions here, but it makes a logical connection.
So what specifically do you have configured for DHCP option 67?
-
@george1421 said in Cannot exit IPXE menu and boot from hard drive?:
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
Seems like refind might not be what was causing me problems after all
As long as platform == pcbios, then it is working on the BIOS side, but that doesn’t line up with what you reported from inside Ubuntu. Are you working with multiple pieces of hardware? Or is iPXE misreporting the platform?
I could be wrong, but I believe iPXE is misreporting the platform. For these tests today I did a fresh new Ubuntu 18.04 install on the host, so no hardware has been changed and it appears to be efi according to ubuntu.
OK, I’m trying to think how this is possible (iPXE reporting pcbios and Ubuntu reporting efi). I think I might have an idea.
Some firmware is dynamic: it switches between BIOS and UEFI mode depending on the boot media it detects. If the boot media is BIOS based, it switches into BIOS mode; if the firmware detects EFI boot media, it switches into UEFI mode. There is no firmware switch; it’s all dynamic.
I kind of follow you on this, but am a little confused. I was under the impression that most major linux distribution installers were the dynamic part and would detect what the host system supported and use that method to install?
So let’s assume you are PXE booting this computer and you send undionly.kpxe to it. The computer will see that it’s a BIOS boot loader and boot it in BIOS mode. FOG imaging works just fine in BIOS mode, and you can deploy a UEFI image with FOG in BIOS mode. Then, after the imaging reboot, you again send undionly.kpxe to the target computer. Again it sees a BIOS boot loader, so it switches to BIOS mode. Now you are in the iPXE menu and the menu times out to boot from the disk, but SANBOOT hangs because it’s trying to chain to a BIOS-mode disk while the disk contains a UEFI boot loader. When you change the boot order so the disk is first, the firmware sees that the disk is a UEFI disk, switches to UEFI mode, and boots from the hard drive.
This certainly sounds like the behavior I am getting!
I think this is how iPXE (undionly.kpxe) is reporting pcbios while the actual OS is UEFI. There are a lot of suppositions here, but it makes a logical connection.
So what specifically do you have configured for DHCP option 67?
I am using pfSense for the DHCP server and option 67 is set to undionly.kpxe.
-
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
I could be wrong, but I believe iPXE is misreporting the platform.
If this is true, it’s the first time (ever) I’ve heard of that. Could it happen? I doubt it, for the simple fact that the boot loader has to be UEFI or BIOS, and it has to match the hardware or the computer doesn’t boot. A UEFI-based computer CANNOT boot undionly.kpxe. Unless…
I kind of follow you on this, but am a little confused.
What I’m referring to is the firmware not the target OS. There is some firmware that will reconfigure itself on the fly based on what type of boot media is installed. I know our one Lenovo server is that way. There is no firmware mode switch inside the firmware settings, the firmware itself picks its mode.
I am using pfSense for the DHCP server and option 67 is set to undionly.kpxe.
OK, good. With pfSense, did you populate the EFI fields with ipxe.efi for the 64-bit and i386/ipxe.efi for the 32-bit versions of EFI?
If we are still in doubt, as long as the FOG server, DHCP server, and PXE booting client are on the same subnet, we can use tcpdump to collect what the target computer is being told to do. This is a good mystery and I’d really like to know why your system is acting the way it is. Following https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue grab a pcap of the PXE booting process; PXE booting to the iPXE menu is enough of a capture. Then upload the file to a file share site and post the link here, and I’ll take a look at what is going down the wire. The capture filter provided will only give us the DHCP and TFTP communications and nothing more.
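For anyone following along, here is a rough sketch of how such a capture command could be assembled. It is not the exact filter from the linked article (check the article for that). The port list (67 and 68 for DHCP, 69 for TFTP, 4011 for ProxyDHCP), the interface name, and the output file name are all assumptions made for illustration.

```python
from typing import List

# Ports assumed to matter for a PXE boot: DHCP server/client (67/68),
# TFTP (69) and ProxyDHCP (4011). Adjust if your setup differs.
PXE_PORTS = [67, 68, 69, 4011]


def capture_command(interface: str, outfile: str) -> List[str]:
    """Build a tcpdump argument list that records only PXE-related traffic."""
    # Join the ports into a single BPF expression: "port 67 or port 68 ..."
    bpf_filter = " or ".join(f"port {port}" for port in PXE_PORTS)
    # -i selects the interface, -w writes raw packets to a pcap file.
    return ["tcpdump", "-i", interface, "-w", outfile, bpf_filter]


if __name__ == "__main__":
    # "eth0" and "output.pcap" are placeholder values.
    print(" ".join(capture_command("eth0", "output.pcap")))
```

The resulting list could be passed to subprocess.run as root on the FOG server. Start it, PXE boot the target to the iPXE menu, then stop the capture with Ctrl+C.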
-
@wmw509 It would be interesting to see if it reports a different platform when your pfSense also hands out the UEFI binaries, as mentioned by George!
There are two more things that come to mind. First, I am wondering if there is some confusion about the wording. Host can mean different things, but in the FOG world it’s used for the client machines that PXE boot for deployments. So I am wondering if your Ubuntu server system is your FOG server or really one of the PXE booting systems.
Second, I may ask you to schedule a debug deploy task. It’s the same as creating a normal task, but just before you click the create button in the FOG web UI there is a checkbox for debug mode. Now PXE boot the machine in question and wait for it to give you a Linux shell. Run the ls -al /sys/firmware/efi command in there and report what it looks like.
-
Haha ok, I won’t point the finger at iPXE. Just didn’t see the result I was expecting there.
As far as the pfSense DHCP settings, no I did not have those ipxe.efi files in the 32 and 64 bit sections. I have changed that now and attached a picture of the config.
Here is a link to the pcap as well.
https://drive.google.com/file/d/1tYG7Klc5OGYMeZJTocrV73uvkWs6FftK/view?usp=sharing
-
My bad if I made that confusing. When I refer to host I mean the computer that I am trying to boot up using PXE/Fog. The actual Fog client is not the host, although coincidentally it’s running on an Ubuntu server as well.
Just ran the debug task. Once I got to a Linux shell, the efi directory did NOT exist. The only directories in /sys/firmware were acpi, dmi, and memmap.
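The same check can be sketched in a few lines for anyone who wants to script it on an installed Linux system. This assumes Python 3 is available, which may not be the case in the FOG debug shell. The function name firmware_mode and its sysfs_root parameter are made up for illustration. The idea is simply that the kernel only exposes /sys/firmware/efi when it was started by UEFI firmware.

```python
from pathlib import Path


def firmware_mode(sysfs_root: str = "/sys/firmware") -> str:
    """Return "efi" if an efi directory exists under sysfs_root, else "bios"."""
    # A Linux kernel booted through UEFI creates /sys/firmware/efi;
    # a kernel booted in legacy BIOS (or CSM) mode does not.
    return "efi" if (Path(sysfs_root) / "efi").is_dir() else "bios"


if __name__ == "__main__":
    print(firmware_mode())
```

With only acpi, dmi, and memmap present, as reported above, this would come back as bios. That matches iPXE reporting pcbios for the same boot.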
-
@wmw509 ok looking at the pcap (you can follow along in wireshark if you care).
If you look at the DISCOVER packet, the target computer is saying it’s in BIOS mode (DHCP option 93 in the DISCOVER packet).
The OFFER packet from your dhcp server is properly crafted in that it sends the boot server and undionly.kpxe to the target computer.
A few packets down we can see it pull the undionly.kpxe file from the FOG server.
Mystery solved: since undionly.kpxe was sent to the target computer, it will report that it’s pcbios when queried. In this case SANBOOT will be called and of course hang, because it can’t identify a UEFI disk.
The root issue is that your computer is announcing that it’s a BIOS-based computer.
SOOOOOOOO, in your firmware settings, do you need to enable the UEFI network stack? I know that on Dells you need to specifically enable this so that network booting shows up under the UEFI section in the boot manager; otherwise only the LEGACY section will show network booting options.
BTW, you have pfSense configured correctly, so once you get the target computer announcing itself correctly it should boot. When you get the target computer to think it’s UEFI, DHCP option 93 will be either 7 (UEFI BC) or 9 (UEFI X86_X64).
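To make the option 93 logic concrete, here is a minimal sketch of how the client architecture value could be pulled out of the raw DHCP options bytes and mapped to a boot file. It assumes the standard code/length/value option layout and a 16-bit big-endian architecture value. The function names are hypothetical, and the boot file names are just the ones mentioned in this thread, so verify them against your own TFTP directory. This is not how pfSense itself is implemented.

```python
import struct
from typing import Optional

# Architecture values discussed in this thread (DHCP option 93).
ARCH_NAMES = {0: "BIOS (x86 PC)", 6: "EFI IA32", 7: "EFI BC", 9: "EFI x86-64"}

# Boot files as named in this thread; treat them as assumptions.
BOOT_FILES = {0: "undionly.kpxe", 6: "i386/ipxe.efi", 7: "ipxe.efi", 9: "ipxe.efi"}


def client_arch(options: bytes) -> Optional[int]:
    """Return the first option 93 value found in a DHCP options field."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == 0:  # pad option: a single byte with no length field
            i += 1
            continue
        if code == 255 or i + 1 >= len(options):  # end option or truncated data
            break
        length = options[i + 1]
        data = options[i + 2:i + 2 + length]
        if code == 93 and len(data) >= 2:
            # Option 93 carries one or more 16-bit big-endian values.
            return struct.unpack("!H", data[:2])[0]
        i += 2 + length
    return None


def boot_file(arch: Optional[int]) -> Optional[str]:
    """Pick the boot file for an architecture value, or None if unknown."""
    return BOOT_FILES.get(arch)


if __name__ == "__main__":
    # Hand-built options: message type DISCOVER, option 93 = 0, end marker.
    sample = bytes([53, 1, 1, 93, 2, 0, 0, 255])
    arch = client_arch(sample)
    print(arch, ARCH_NAMES.get(arch), boot_file(arch))
```

In the capture above the client announced 0, which is why it was handed undionly.kpxe. Once the firmware announces 7 or 9, the same lookup lands on ipxe.efi.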
-
IT WORKS NOW!!! Is it too early to celebrate with a beer?
I made a few changes in the BIOS of my host, mainly disabling CSM. I think turning that off, in combination with updating the files you mentioned in my pfSense DHCP server, finally did it.
When I check the platform type with the default.ipxe script I now get the proper response, efi.
Now I can start testing this on some other hardware types and see how it goes. I really can’t say thank you enough, @george1421 @Sebastian-Roth you guys have been extremely helpful.
-
@wmw509 Disabling CSM makes sense. I’ve seen a lot of BIOSes where, if CSM is enabled, you can only use the legacy/BIOS PXE boot. I’ve had that disabled on all my hardware for so long I forgot it was a thing.