Unable to boot to disk after PXE Menu timeout
-
@jmvela2x On one of these windows computers, will you post a picture of the disk manager showing the disk and partition layouts?
-
@george1421 The boot drives are SATA. The system is a Gigabyte Z390 Aorus Ultra.
I am finding that I see different behavior (in 1.5.9.119) by changing the exit type at the host configuration page rather than Fog Configuration>Fog Settings>FOG Boot Settings though.
I thought the latter would override the former.
-
@jmvela2x The global settings are applied first, then host specific settings will override the global settings. This way if there is a one off situation you can fix the boot method for individual computers without impacting all computers.
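A minimal sketch of that precedence, assuming the host-level exit type may be left unset. The function and parameter names are hypothetical, and this is not FOG's actual code:

```python
from typing import Optional


def effective_exit_type(global_exit: str, host_exit: Optional[str] = None) -> str:
    """Return the exit type a host ends up with.

    Hypothetical helper: a host-specific value, when set, wins over the
    global default; otherwise the global default applies.
    """
    return host_exit if host_exit else global_exit
```

With a global default of SANBOOT, a single odd host could be given REFIND without changing what every other host gets.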
Understand that when you say you have a bios computer and SANBOOT doesn’t work, it’s equivalent to saying the sky is green. While it can happen, it shouldn’t. That is why I have so many questions; I must be missing something.
-
It just sits here with a flashing cursor.
-
Here’s the output of <http://<fog_server_ip>/fog/service/ipxe/boot.php?mac=<mac_address_of_vm>> if it helps any:
#!ipxe
set fog-ip 10.132.81.150
set fog-webroot fog
set boot-url http://${fog-ip}/${fog-webroot}
set storage-ip 10.132.81.150
cpuid --ext 29 && set arch x86_64 || set arch i386
goto get_console
:console_set
colour --rgb 0x00567a 1 ||
colour --rgb 0x00567a 2 ||
colour --rgb 0x00567a 4 ||
cpair --foreground 7 --background 2 2 ||
goto MENU
:alt_console
cpair --background 0 1 ||
cpair --background 1 2 ||
goto MENU
:get_console
console --picture http://10.132.81.150/fog/service/ipxe/nothingisbeyondourreach.png --left 100 --right 80 && goto console_set || goto alt_console
:MENU
menu
colour --rgb 0x00567a 0 ||
cpair --foreground 1 1 ||
cpair --foreground 0 3 ||
cpair --foreground 4 4 ||
item --gap Host is registered as f223-100-15-z3!
item --gap -- -------------------------------------
item fog.local Boot from hard disk
item fog.memtest Run Memtest86+
item fog.keyreg Update Product Key
item fog.deployimage Deploy Image
item fog.multijoin Join Multicast Session
item fog.quickdel Quick Host Deletion
item fog.sysinfo Client System Information (Compatibility)
choose --default fog.local --timeout 30000 target && goto ${target}
:fog.local
sanboot --no-describe --drive 0x80 || goto MENU
:fog.memtest
kernel memdisk initrd=memtest.bin iso raw
initrd memtest.bin
boot || goto MENU
:fog.keyreg
login
params
param mac0 ${net0/mac}
param arch ${arch}
param username ${username}
param password ${password}
param keyreg 1
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
param sysuuid ${uuid}
:fog.deployimage
login
params
param mac0 ${net0/mac}
param arch ${arch}
param username ${username}
param password ${password}
param qihost 1
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
param sysuuid ${uuid}
:fog.multijoin
login
params
param mac0 ${net0/mac}
param arch ${arch}
param username ${username}
param password ${password}
param sessionJoin 1
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
param sysuuid ${uuid}
:fog.quickdel
login
params
param mac0 ${net0/mac}
param arch ${arch}
param username ${username}
param password ${password}
param delhost 1
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
param sysuuid ${uuid}
:fog.sysinfo
kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=275000 web=http://10.132.81.150/fog/ consoleblank=0 rootfstype=ext4 storage=10.132.81.150:/images/ storageip=10.132.81.150 nvme_core.default_ps_max_latency_us=0 loglevel=4 mode=sysinfo
imgfetch init_32.xz
boot || goto MENU
:bootme
chain -ar http://10.132.81.150/fog/service/ipxe/boot.php##params ||
goto MENU
autoboot
-
@jmvela2x said in Unable to boot to disk after PXE Menu timeout:
:fog.local
sanboot --no-describe --drive 0x80 || goto MENU
This is what I would expect for a bios exit.
Again can you post a picture of your disk layout using the windows disk manager for this target computer?
-
@george1421 Missed this the first time. Apologies.
-
@jmvela2x The only thing I see strange is there is a mfg recovery partition in the first position on the disk, where normally under bios the first partition is the OS partition.
Using diskpart can you see which partition is marked active?
-
@george1421 Nothing reports as active it seems. This is a GPT disk.
diskpart.txt
-
@jmvela2x I’ll need to check a bios based computer, but I think the C drive needs to be marked active. Some of these OEM recovery systems will insert themselves in the boot order in case the main OS is corrupt. They chain load from their partition to the C drive partition. So I could see why sanboot isn’t working: it can’t find an active partition to chain to. But it’s been a few years since I dealt with a bios based win10 install, so I need to confirm it.
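For reference, the "active" flag and the GPT marker both live in the first 512-byte sector of the disk. Below is a rough sketch of reading them, assuming the standard MBR layout (partition table at offset 446, four 16-byte entries, 0x55AA signature). The function names are made up for illustration:

```python
from typing import List, Tuple

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # partition table starts here in the first sector
ENTRY_SIZE = 16         # four 16-byte entries follow
ACTIVE_FLAG = 0x80      # status byte of a bootable ("active") entry
GPT_PROTECTIVE = 0xEE   # type byte of the protective entry on a GPT disk


def read_mbr_entries(sector: bytes) -> List[Tuple[int, int]]:
    """Return (status, type) for each non-empty MBR partition entry."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        status, part_type = sector[start], sector[start + 4]
        if part_type != 0:  # type 0 means the slot is unused
            entries.append((status, part_type))
    return entries


def describe_disk(sector: bytes) -> str:
    """Classify the sector as 'gpt', 'mbr-active' or 'mbr-no-active'."""
    entries = read_mbr_entries(sector)
    if any(part_type == GPT_PROTECTIVE for _, part_type in entries):
        return "gpt"
    if any(status == ACTIVE_FLAG for status, _ in entries):
        return "mbr-active"
    return "mbr-no-active"
```

A GPT disk carries only a protective entry with no active flag, which would be consistent with diskpart reporting nothing as active.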
-
@george1421 We also have a legacy (MBR) based install master image of Win10. I will test SANBOOT exit for this OS tomorrow when I am back on site.
Refind EFI should work though right? I’m booting PXE with legacy FW and the OS is UEFI based. This is what I see in that case:
-
@jmvela2x said in Unable to boot to disk after PXE Menu timeout:
Refind EFI should work though right? I’m booting PXE with legacy FW and the OS is UEFI based. This is what I see in that case:
I’m not sure I understand the statement. With a uefi system (pxe booting ipxe.efi) the exit mode should be refind. Now if you would have said bios works perfectly but uefi kind of hangs, then there are settings in the refind.conf file that we might need to tweak. But refind does do a pretty good job finding the uefi boot partition on the disk. What you have on the screen is typical of a uefi exit.
-
@george1421 The OS disk is UEFI so there’s no active partition to find with SANBOOT. The NIC Oprom booting PXE is BIOS (Legacy) using undionly.kkpxe.
-
@jmvela2x said in Unable to boot to disk after PXE Menu timeout:
The OS disk is UEFI so there’s no active partition to find with SANBOOT
I think it’s been a day, because I’m not following. The disk image you provided below is surely bios. There needs to be an active partition to boot from.
Now for UEFI that is a different critter. I can’t say for sure if there is an active partition or not. The uefi firmware looks at each partition for a specific directory structure containing a boot file [bootx64.efi] in a directory called \efi\boot. If the uefi firmware finds bootx64.efi it will load it, which then bootstraps into the OS.
<sidebar> If you take a flash drive, format it fat32, create a directory efi\boot, and then drop FOG’s ipxe.efi into that directory renamed as bootx64.efi, you have just created a usb boot drive for FOG imaging. </sidebar>
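The lookup described above can be sketched as a simple path check on a mounted partition or flash drive. This is only an illustration of the idea, not how any particular firmware implements it, and the function name is hypothetical:

```python
from pathlib import Path
from typing import Optional

# The efi\boot\bootx64.efi path from the post, split into components.
FALLBACK_PATH = ("efi", "boot", "bootx64.efi")


def find_fallback_loader(mount_point: Path) -> Optional[Path]:
    """Return the path to bootx64.efi under mount_point, or None.

    Names are compared case-insensitively because FAT does not
    distinguish case, while the OS that mounted it may.
    """
    current = mount_point
    for wanted in FALLBACK_PATH:
        if not current.is_dir():
            return None
        matches = [child for child in current.iterdir()
                   if child.name.lower() == wanted]
        if not matches:
            return None
        current = matches[0]
    return current if current.is_file() else None
```

Pointing it at the mount point of the flash drive from the sidebar should return the renamed ipxe.efi, and None for a drive without that directory structure.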
You can image with FOG in bios mode and lay down an image that is uefi. FOG doesn’t care about the disk structure. It only moves disks from here to there. But when you boot through PXE to the FOG menu and then on to the target OS, the boot loader you use (undionly.kpxe or ipxe.efi) needs to match the firmware mode. The boot loader then determines which exit mode is used, SANBOOT or REFIND. You cannot load a bios boot loader such as undionly.kpxe with UEFI firmware. The same goes the other way around.
-
@george1421 The disk image I provided is EFI. See EFI 100MB EFI system partition in screenshot. Since the disk structure doesn’t matter to FOG that seems to be a non-issue either way.
My BIOS has two places where I can change settings for PXE boot. In the first image you can see that boot options 1 and 4 are legacy and UEFI PXE boot respectively. UEFI is only enabled for testing and all of my troubleshooting here has been done booting to the IBA* device seen in the screenshot (what I understand to be BIOS FW for PXE booting). Booting to the UEFI PXE device fails (since the bootloader I am using is not ipxe.efi).
-
@jmvela2x Ok… well I had a complete post where I went in a totally different direction. As part of that I grabbed disk structures from both a bios and a uefi disk. I also downloaded the image you provided for reference, and I’m ashamed to say that IS a uefi disk. It just looks like whatever created that disk structure had drunk monkeys in the layout creation.
Your efi partition is partition 5. While this is technically OK, it is far from standard. Below is a standard windows 10 disk format.
So all this talk about undionly.kpxe and SANBOOT can be thrown out the window. We should be working with rEFInd here. The only thing I did not see in your picture (I guess it is on the system tab) is whether the system is in bios or uefi mode. I can see it’s pxe booting uefi. The issue with some of this advanced firmware is that you can net boot in uefi mode but have the default firmware mode as bios. You can do that with Dells that support both bios and uefi. You can take a uefi computer and, via the boot menu, pxe boot in bios mode.
So now that we are dealing with a uefi computer, you should ensure in the host definition for this computer, you have the uefi exit mode as REFIND.
Just to restate what mobo this is: Gigabyte Z390 Aorus Ultra.
-
@george1421 I tried that too. I posted a screenshot previously of my output for REFIND. I end up with a flashing cursor and no progress.
The system has CSM support enabled (to boot to a UEFI OS) but I am netbooting in BIOS mode using undionly.kkpxe. I will attempt to change to ipxe.efi tomorrow and try again and see what results that yields.
-
@jmvela2x said in Unable to boot to disk after PXE Menu timeout:
I am netbooting in BIOS mode using undionly.kkpxe. I will attempt to change to ipxe.efi tomorrow
Just saying, doing this will confuse things. The bios boot loader will be used (undionly.kpxe), so SANBOOT will be tried and will fail because the disk structure is efi and there is no MBR to reference. undionly.kpxe is a bios based boot loader, so it can’t shell out to refind either: refind is an efi boot loader, which is right for the disk but can’t be started from bios (or CSM). Now the problem bits are falling in line.
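Put as a simplified check, under the assumption that a boot file ending in .efi means UEFI and anything else means bios, and using the exit mode names from this thread (the helper names are hypothetical):

```python
def firmware_mode(boot_file: str) -> str:
    """Guess the firmware mode from the iPXE binary name (.efi means UEFI)."""
    return "uefi" if boot_file.lower().endswith(".efi") else "bios"


def exit_mode(boot_file: str) -> str:
    """Exit mode that goes with the boot loader, as named in this thread."""
    return "REFIND" if firmware_mode(boot_file) == "uefi" else "SANBOOT"


def can_exit_to_disk(boot_file: str, disk_is_uefi: bool) -> bool:
    """True when the boot loader's firmware mode matches the installed OS."""
    return (firmware_mode(boot_file) == "uefi") == disk_is_uefi
```

Here can_exit_to_disk("undionly.kkpxe", disk_is_uefi=True) comes out False, which is the combination being discussed.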
So, a bit off topic: if your dhcp server is Windows 2012 or newer, are you using profiles to send the right boot file to the target computer, or are you using a static dhcp option 67? If you have a Windows 2012 or newer dhcp server, there is a wiki page that shows how to set up dhcp profiles: https://wiki.fogproject.org/wiki/index.php/BIOS_and_UEFI_Co-Existence#Using_Windows_Server_2012_.28R1_and_later.29_DHCP_Policy If you go this path, make sure you activate the profile within your scope.
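The idea behind those profiles, roughly: the dhcp server looks at the architecture code the client announces and hands out a matching boot file instead of one static option 67 value. A sketch of that selection, assuming the decision keys on the PXE vendor class string; the mapping below is illustrative, not a copy of the wiki's policy list:

```python
from typing import Optional

# Illustrative mapping from PXE client architecture code to a boot file.
BOOT_FILES = {
    0: "undionly.kpxe",  # legacy BIOS
    7: "ipxe.efi",       # UEFI (commonly reported by 64-bit clients)
    9: "ipxe.efi",       # UEFI x86-64
}


def boot_file_for(vendor_class: str) -> Optional[str]:
    """Pick a boot file from a string like 'PXEClient:Arch:00007:UNDI:003016'."""
    parts = vendor_class.split(":")
    if len(parts) < 3 or parts[0] != "PXEClient" or parts[1] != "Arch":
        return None
    try:
        arch = int(parts[2])
    except ValueError:
        return None
    return BOOT_FILES.get(arch)
```

With something like this in place, the same scope can serve both a bios client and a uefi client without editing option 67 by hand each time.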
-
@george1421 I did run across that wiki at one point. I am relatively sure our DHCP server is Windows Server 2019 at this point. Our whole organization has been moving away from 2012 for over a year now.
Since we use both types of disks, the DHCP profile sounds like the best way to go.
Just for clarification and my own curiosity, will this issue happen in reverse if I use ipxe.efi and try to sanboot to an MBR OS disk? I intend to test this either way for my own learning (as well as undionly.kkpxe with an MBR disk). Also, what is the ‘Exit’ boot option for, if not to just fall back to the next boot option in BIOS?