Cannot exit IPXE menu and boot from hard drive?
-
Same results after making those config changes. I think I might try @JJ-Fullmer’s suggestions.
I know this is more or less unrelated to Fog, but are you familiar with changing the refind binaries? When I download/decompress the image he suggests I do not see any filenames matching the ones in the /fog/service/ipxe/ directory. Am I just supposed to rename these files as they are in fog?
This is what I get after unpacking refind
bootaa64.efi bootia32.efi bootx64.efi refind.conf
-
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
bootx64.efi
This should be renamed to match the name FOG uses (refind_x64.efi), and the same goes for the 32 bit version (bootia32.efi becomes refind_ia32.efi). You can discard the bootaa64 version since that is for ARM processors.
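For anyone repeating the rename step, here is a minimal sketch of how it could be scripted. The function name and both directory arguments are placeholders of mine, not something FOG ships; point the second argument at wherever your FOG install keeps its service/ipxe directory. It backs up the existing FOG binaries once before overwriting them.

```shell
#!/bin/sh
# Sketch only: copy unpacked rEFInd binaries into FOG's iPXE directory
# under the file names FOG serves. install_refind is a hypothetical helper.
#   $1 = directory holding bootx64.efi / bootia32.efi from the rEFInd download
#   $2 = FOG's service/ipxe directory (location depends on your install)
install_refind() {
    src=$1
    dest=$2
    # rEFInd name : FOG name
    for pair in bootx64.efi:refind_x64.efi bootia32.efi:refind_ia32.efi; do
        from=${pair%%:*}
        to=${pair##*:}
        [ -f "$src/$from" ] || { echo "missing $src/$from" >&2; return 1; }
        # Keep the binary FOG shipped (first run only) so rolling back is easy.
        if [ -f "$dest/$to" ] && [ ! -f "$dest/$to.orig" ]; then
            cp -p "$dest/$to" "$dest/$to.orig" || return 1
        fi
        cp "$src/$from" "$dest/$to" || return 1
    done
}

# Example call (both paths are placeholders):
# install_refind /tmp/refind-unpacked /path/to/fog/service/ipxe
```

Since different rEFInd versions behave differently on different hardware, keeping the .orig copies makes it cheap to try a version and go back.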
I just pxe booted into ubuntu 20.04 on a uefi VM. refind found grub64.efi and booted that. So the issue might be hardware related. Can you describe the number of disks and their structure on this mobo? Are they sata or nvme? Do you have 2 NVMe drives installed?
-
This host just has a single 60gb SATA drive. Here’s how the partitions look…
Device        Start       End   Sectors  Size Type
/dev/sda1      2048   1050623   1048576  512M EFI System
/dev/sda2   1050624   3151359   2100736    1G Linux filesystem
/dev/sda3   3151360 117231103 114079744 54.4G Linux filesystem
-
Thanks for your reply, that post looks very similar to my issue. I am attempting to get those binaries right now but am a little confused about what files are needed specifically. The binaries in the /fog/service/ipxe directory are…
refind_aa64.efi
refind.efi
refind_ia32.efi
refind_x64.efi
The files I found after decompressing the files from sourceforge are below…
refind_x64.efi
refind_ia32.efi
refind_aa64.efi
I do not see a refind.efi file in any of the sourceforge downloads. Is this file needed/used by fog? If so, any ideas where it is?
-
@wmw509 The refind.efi file is an older version (0.9.4). Tom added this a while ago as this seemed to be kind of a stable version. I have to say that I am not sure this one really is version 0.9.4 or perhaps a different one. You’d use that one by renaming the file. FOG serves refind_x64.efi to 64 bit architecture and refind_ia32.efi for 32 bit.
Unfortunately there does not seem to be a general solution to this. People report that version xxx of rEFInd doesn’t work with their hardware while others have no problems and vice versa. So trying to provide binaries that suit every case is probably not possible. You can download older versions and try out which one works best for you.
-
Aha, that makes sense, thank you.
I have noticed some other potentially weird behavior in my attempts at getting this working. When I have been testing my Boot Exit Settings I have been testing each available selection for both EFI and non-EFI exit methods. I have gotten a few different results as I have done this, but it seems that Fog might be trying to use my non-EFI exit method even though I am definitely on a UEFI system.
If I leave the non-efi boot exit setting as SANBOOT for example, no matter what my efi boot exit setting is the host will always end up hanging with a blinking cursor in the top corner. Is it possible Fog (or my settings of Fog/the host) are causing some confusion as to what method to use?
-
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
If I leave the non-efi boot exit setting as SANBOOT for example, no matter what my efi boot exit setting is the host will always end up hanging with a blinking cursor in the top corner. Is it possible Fog (or my settings of Fog/the host) are causing some confusion as to what method to use?
That’s interesting. We rely on iPXE determining the correct mode, be it UEFI or legacy BIOS, and query its platform config parameter.
To check which platform it sees on your systems you could quickly edit /tftpboot/default.ipxe and make the script look like this:
#!ipxe
show platform
Make a backup copy of the original file so you can switch back and forth quickly.
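The backup-and-swap could be done with two small shell helpers, roughly as sketched here. The function names are made up for illustration; the only things taken from the post are the file path and the two-line iPXE script.

```shell
#!/bin/sh
# Sketch only: swap default.ipxe for a two-line platform probe and back.
# probe_platform_script / restore_platform_script are hypothetical helpers.

probe_platform_script() {
    file=$1
    # Back up the original once, so repeated runs never clobber the backup.
    if [ -f "$file" ] && [ ! -f "$file.bak" ]; then
        cp -p "$file" "$file.bak" || return 1
    fi
    # Write the probe script exactly as given in the post.
    printf '%s\n' '#!ipxe' 'show platform' > "$file"
}

restore_platform_script() {
    file=$1
    if [ -f "$file.bak" ]; then
        mv "$file.bak" "$file"
    else
        echo "no backup found for $file" >&2
        return 1
    fi
}

# Example:
# probe_platform_script /tftpboot/default.ipxe     # then PXE boot the client
# restore_platform_script /tftpboot/default.ipxe   # when done testing
```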
-
Once the changes were made in that script the host booted (unsuccessfully of course) but it displayed the following.
builtin/platform:string = pcbios
This is a bit above my head so forgive me if this is a stupid question, but if I expect to use Fog with only UEFI systems could I hardcode this somehow? Any other ideas?
-
@Sebastian-Roth I did not know that refind.efi was an older version and refind_x64.efi was a newer one. That’s a nice touch if it’s still true.
Also @wmw509 sorry for not double checking my guide, I forgot that you had to rename the files after extracting. I have a script that manages restoring my refind binary and config on every update, so I haven’t had to extract and look at the file names in quite some time.
-
No worries! The files were actually named the same as they are in Fog in one of the packages I installed. It was just that refind.efi that threw me for a bit of a loop.
Seems like refind might not be what was causing me problems after all
-
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
Seems like refind might not be what was causing me problems after all
As long as the platform == pcbios, then iPXE is working on the bios side, but that doesn’t match what you reported from inside ubuntu. Are you working with multiple pieces of hardware? Or is iPXE misreporting the platform?
OK I’m trying to think how this is possible (ipxe reporting pcbios and ubuntu reporting efi). I think I might have an idea.
Some firmware is dynamic: it switches between bios and uefi mode depending on the boot media it detects. If the boot media is bios based it switches into bios mode; if the firmware detects efi boot media it switches into uefi mode. There is no firmware switch, it’s all dynamic.
So let’s assume you are pxe booting this computer and you send undionly.kpxe to this computer. This computer will see, oh that’s a bios boot loader, and boot it in bios mode. FOG Imaging will work just fine in bios mode and you can deploy a uefi image with FOG Imaging in bios mode. Then after the imaging reboot, again you send undionly.kpxe to the target computer. Again it sees that is a bios boot loader so it switches to bios mode. So now you are in the iPXE menu and the menu times out to boot to the disk, but SANBOOT hangs because it’s trying to chain to a bios mode disk, but the disk contains a uefi boot loader. When you change the boot order so the disk is first in the order, the firmware sees that the disk is a uefi disk, switches to uefi mode and boots from the hard drive.
I think this is how iPXE (undionly.kpxe) is reporting pcbios yet the actual OS is uefi. There are a lot of suppositions here, but it makes a logical connection.
So what specifically do you have configured for dhcp option 67?
-
@george1421 said in Cannot exit IPXE menu and boot from hard drive?:
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
Seems like refind might not be what was causing me problems after all
As long as the platform == pcbios, then iPXE is working on the bios side, but that doesn’t match what you reported from inside ubuntu. Are you working with multiple pieces of hardware? Or is iPXE misreporting the platform?
I could be wrong, but I believe iPXE is misreporting the platform. For these tests today I did a fresh new Ubuntu 18.04 install on the host, so no hardware has been changed and it appears to be efi according to ubuntu.
OK I’m trying to think how this is possible (ipxe reporting pcbios and ubuntu reporting efi). I think I might have an idea.
Some firmware is dynamic: it switches between bios and uefi mode depending on the boot media it detects. If the boot media is bios based it switches into bios mode; if the firmware detects efi boot media it switches into uefi mode. There is no firmware switch, it’s all dynamic.
I kind of follow you on this, but am a little confused. I was under the impression that most major linux distribution installers were the dynamic part and would detect what the host system supported and use that method to install?
So let’s assume you are pxe booting this computer and you send undionly.kpxe to this computer. This computer will see, oh that’s a bios boot loader, and boot it in bios mode. FOG Imaging will work just fine in bios mode and you can deploy a uefi image with FOG Imaging in bios mode. Then after the imaging reboot, again you send undionly.kpxe to the target computer. Again it sees that is a bios boot loader so it switches to bios mode. So now you are in the iPXE menu and the menu times out to boot to the disk, but SANBOOT hangs because it’s trying to chain to a bios mode disk, but the disk contains a uefi boot loader. When you change the boot order so the disk is first in the order, the firmware sees that the disk is a uefi disk, switches to uefi mode and boots from the hard drive.
This certainly sounds like the behavior I am getting!
I think this is how iPXE (undionly.kpxe) is reporting pcbios yet the actual OS is uefi. There are a lot of suppositions here, but it makes a logical connection.
So what specifically do you have configured for dhcp option 67?
I am using pfSense for the DHCP server and option 67 is set to undionly.kpxe.
-
@wmw509 said in Cannot exit IPXE menu and boot from hard drive?:
I could be wrong, but I believe iPXE is misreporting the platform.
If this is true it’s the first time (ever) I’ve heard of that. Could it happen? I doubt it, for the simple fact that the boot loader has to be uefi or bios and it has to match the hardware or the computer doesn’t boot. A uefi based computer CAN NOT boot undionly.kpxe. Unless…
I kind of follow you on this, but am a little confused.
What I’m referring to is the firmware not the target OS. There is some firmware that will reconfigure itself on the fly based on what type of boot media is installed. I know our one Lenovo server is that way. There is no firmware mode switch inside the firmware settings, the firmware itself picks its mode.
I am using pfSense for the DHCP server and option 67 is set to undionly.kpxe.
OK Good, with pfsense did you populate the efi fields with ipxe.efi for the 64bit and i386/ipxe.efi for the 32 bit versions of efi?
If we are still in doubt, as long as the fog server, dhcp server, and pxe booting client are on the same subnet, we can use tcpdump to collect what the target computer is being told to do. This is a good mystery and I’d really like to know why your system is acting the way it is. Following https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue grab a pcap of the pxe booting process; pxe booting to the iPXE menu is enough of a capture. Then upload the file to a file share site and post the link here and I’ll take a look at what is going down the wire. The capture filter provided will only give us the dhcp and tftp communications and nothing more.
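As a rough idea of what such a capture looks like, here is a sketch that limits tcpdump to the standard DHCP (67/68) and TFTP (69) ports. The exact filter in the linked tutorial is not quoted in this thread and may differ, so treat the filter string as my assumption; the interface name, output file name, wrapper function and the TCPDUMP override are also placeholders.

```shell
#!/bin/sh
# Sketch only: capture the DHCP and TFTP part of a PXE boot.
# Assumed filter: standard ports for DHCP (67, 68) and TFTP (69);
# the linked tutorial may use a slightly different expression.
PXE_FILTER='port 67 or port 68 or port 69'

# capture_pxe IFACE [OUTFILE]
# TCPDUMP can be overridden (a hypothetical knob, handy for dry runs).
capture_pxe() {
    iface=$1
    outfile=${2:-output.pcap}
    # -i picks the interface, -w writes the raw packets to a pcap file.
    "${TCPDUMP:-tcpdump}" -i "$iface" -w "$outfile" "$PXE_FILTER"
}

# Example (run as root on the FOG server, then PXE boot the client
# and stop the capture with Ctrl-C once the iPXE menu shows up):
# capture_pxe eth0 output.pcap
```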
-
@wmw509 It would be interesting to see if it reports a different platform once your pfSense also hands out the UEFI binaries as mentioned by George!
Two more things come to mind. First, I am wondering if there is some confusion about the wording. Host can mean different things, but in the FOG world it’s used for the client machines that PXE boot for deployments. So I am wondering if your Ubuntu server system is your FOG server or really one of the PXE booting systems.
Second, may I ask you to schedule a debug deploy task? It’s the same as creating a normal task, but just before you click the create button in the FOG web UI there is a checkbox for debug mode. Now PXE boot the machine in question and wait for it to give you a Linux shell. Run the ls -al /sys/firmware/efi command in there and report what it looks like.
-
Haha ok, I won’t point the finger at iPXE. Just didn’t see the result I was expecting there.
As far as the pfSense DHCP settings, no I did not have those ipxe.efi files in the 32 and 64 bit sections. I have changed that now and attached a picture of the config.
Here is a link to the pcap as well.
https://drive.google.com/file/d/1tYG7Klc5OGYMeZJTocrV73uvkWs6FftK/view?usp=sharing
-
My bad if I made that confusing. When I refer to host I mean the computer that I am trying to boot up using PXE/Fog. The actual Fog client is not the host, although coincidentally it’s running on an ubuntu server as well.
Just ran the debug task, once I got to a linux shell the efi directory did NOT exist. Only directories in /sys/firmware were acpi, dmi, and memmap.
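The check above boils down to whether /sys/firmware/efi exists: Linux only creates that directory when the kernel was booted through UEFI. A minimal sketch of the same test as a function (the name and the optional root-prefix argument are my own additions, the prefix is only there to make it easy to try against a test directory):

```shell
#!/bin/sh
# Sketch only: report how the running Linux system was booted.
# boot_mode is a hypothetical helper; the optional argument is a root
# prefix and defaults to the real filesystem root.
boot_mode() {
    root=${1:-}
    if [ -d "$root/sys/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}

# Example, from the FOG debug shell:
# boot_mode
```

With only acpi, dmi and memmap present, as reported here, this would print bios, which matches iPXE showing pcbios.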
-
@wmw509 ok looking at the pcap (you can follow along in wireshark if you care).
If you look at the discover packet, the target computer is saying it’s in BIOS mode (dhcp option 93 in the discover packet).
The OFFER packet from your dhcp server is properly crafted in that it sends the boot server and undionly.kpxe to the target computer.
A few packets down we can see it pull the undionly.kpxe file from the FOG server.
Mystery solved: since undionly.kpxe was sent to the target computer, it will report that it’s pcbios when queried. In this case SANBOOT will be called and of course hang because it can’t identify a uefi disk.
The root issue is your computer is announcing that it’s a bios based computer.
SOOOOOOOO, do you need to enable the UEFI network stack in your firmware settings? I know on the Dells you need to specifically enable this so network booting shows up under the UEFI section in the boot manager; otherwise only the LEGACY section will show network booting options.
BTW you have pfsense configured correctly. So once you get the target computer announcing right it should boot. When you get the target computer to think it’s uefi, the dhcp option 93 will be either 7 (UEFI BC) or 9 (UEFI X86_X64).
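If you are reading option 93 out of a pcap yourself, a small lookup like the sketch below can help. It only covers the x86 values relevant to this thread (0, 6, 7, 9, as defined for the client architecture option in RFC 4578); the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: decode DHCP option 93 (client system architecture)
# for the x86 values discussed here. Hypothetical helper names.
pxe_arch_name() {
    case $1 in
        0) echo "BIOS (x86 PC)" ;;
        6) echo "UEFI IA32" ;;
        7) echo "UEFI BC" ;;
        9) echo "UEFI x86-64" ;;
        *) echo "other ($1)" ;;
    esac
}

# Succeeds when the value means the x86 client booted its network stack in UEFI mode.
is_x86_uefi() {
    case $1 in
        6|7|9) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# pxe_arch_name 0    # the value this client sent before the firmware change
```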
-
IT WORKS NOW!!! Is it too early to celebrate with a beer?
I made a few changes in the BIOS of my host - mainly disabling CSM. I think turning that off, in combination with updating the files you mentioned in my pfSense DHCP server finally did it.
When I check the platform type with the default.ipxe script I now get the proper response, efi.
Now I can start testing this on some other hardware types and see how it goes. I really can’t say thank you enough, @george1421 @Sebastian-Roth you guys have been extremely helpful.
-
@wmw509 Disabling CSM makes sense. I’ve seen a lot of BIOSes that only let you use the legacy/bios pxe boot if CSM is enabled. I’ve had that disabled on all my hardware for so long I forgot it was a thing.