Cannot boot hard drive from PXE menu
-
Running git master as of Nov 29 2020.
FOG server is installed and running OK.
I can network boot the target host PC and register successfully with FOG. The host entry is visible in the FOG web console.
When the target host reboots and is left alone to boot from the hard drive through the PXE menu, I get a blank screen with a cursor.
The computer is 100% UEFI, with the latest BIOS version. There is no “Legacy” boot option anywhere. Strangely, in its DHCP request it identifies itself as legacy (arch 00000):
Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
Network card from manufacturer is motherboard RealTek 8111E
I tried symlinking many different files under /tftpboot. When a .efi file is used, the PXE boot hangs. For ipxe.efi, I get the error message “NBP is too big to fit in memory”. The default, “undionly.kpxe”, works (I can network boot and register the PC in FOG), but booting from the hard drive still fails with just a blank screen and flashing cursor.
If I set boot order to be Windows 10 in BIOS, PC will boot Windows no problems.
I have tried SANBOOT, EXIT, REFIND, GRUB, GRUB* options.
For GRUB, I get hang with:
Launching Grub… begin pxe scan… Starting cmain() …
For EXIT, I get:
Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds
All others I just get blank screen and flashing cursor.
With the default exit mode for this host, I can see HTTP 200 responses for “refind_x64.efi” and “refind.conf” in the HTTP logs, but still just a cursor. I tried the latest refind_x64.efi with the same outcome.
Any help?
-
I think this issue is similar:
https://forums.fogproject.org/topic/14851/cannot-exit-ipxe-menu-and-boot-from-hard-drive/27?page=2
My packet capture shows the DHCP client is identifying as “pcbios”:
ARCH Option 93, length 2: 0
The BIOS does not offer any CSM options, so I think this motherboard “Asus F1A75-I Deluxe” is junk for FOG.
-
@blankzapper It would be interesting to see the pcap of the pxe booting process taken from the fog server perspective. https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue
This will not resolve anything, only confirm what you are seeing. FWIW, the actual DHCP option field you want to look at is 93, as you posted.
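As a rough aid for reading a capture like this, here is a minimal Python sketch that pulls the option 93 values out of `tcpdump -v` text. The function names are made up for illustration, and the code table only lists a few of the client architecture types defined in RFC 4578; it assumes the `ARCH Option 93, length 2: N` wording that tcpdump prints in this thread.

```python
import re

# A few client architecture types from RFC 4578 (DHCP option 93).
# This table is deliberately incomplete; it is only an illustration.
ARCH_NAMES = {
    0: "x86 BIOS (legacy PXE)",
    6: "EFI IA32",
    7: "EFI BC (commonly sent by x86-64 UEFI firmware)",
    9: "EFI x86-64",
}

# Matches the way tcpdump -v prints option 93 in the capture text.
ARCH_RE = re.compile(r"ARCH Option 93, length 2: (\d+)")


def client_archs(tcpdump_text):
    """Return the distinct option 93 values seen in the tcpdump text, sorted."""
    return sorted({int(value) for value in ARCH_RE.findall(tcpdump_text)})


def describe(code):
    """Return a human-readable name for an option 93 value."""
    return ARCH_NAMES.get(code, f"unknown architecture code {code}")
```

If every DHCP request from the client reports 0, both the firmware PXE ROM and the iPXE stage that follows it are running in BIOS mode, whatever the setup screen calls itself.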
So you say you can get into the fog iPXE menu and register the computer. What boot file are you sending to the target computer? What do you have set for dhcp option 67? Or are you using dnsmasq or windows dhcp server profiles to dynamically set the boot file?
I can tell you that you cannot boot a BIOS boot loader (undionly.kpxe) on a UEFI system or an EFI boot loader (ipxe.efi) on a BIOS system. The formats are different and they will not boot. I know some systems are dynamic in that they will switch between BIOS and UEFI mode based on the disk image format.
Now, we’ve seen some luck rolling back rEFInd to version 0.11.0; the newer versions of rEFInd seem not to find the boot media. But first we need to identify what boot file you are using so we can focus on the exit modes.
-
@george1421 said in Cannot boot hard drive from PXE menu:
So you say you can get into the fog iPXE menu and register the computer. What boot file are you sending to the target computer? What do you have set for dhcp option 67? Or are you using dnsmasq or windows dhcp server profiles to dynamically set the boot file?
Yes, registering is 100% successful (when not using any *.efi file). I use ISC DHCPd on the gateway computer. A subnet is allocated for FOG building:
subnet 10.10.88.96 netmask 255.255.255.224 {
    option subnet-mask 255.255.255.224;
    option routers 10.10.88.97;
    option ntp-servers 10.10.88.97;
    option domain-name-servers 8.8.8.8;
    option bootfile-name "undionly.kpxe";
    next-server 10.10.88.100;
    range 10.10.88.100 10.10.88.110;
    authoritative;
}
.100 is FOG server.
I can tell you that you cannot boot a BIOS boot loader (undionly.kpxe) on a UEFI system or an EFI boot loader (ipxe.efi) on a BIOS system. The formats are different and they will not boot. I know some systems are dynamic in that they will switch between BIOS and UEFI mode based on the disk image format.
I think then this must be my issue. The target computer is 100% UEFI. The BIOS title is “ASUS UEFI BIOS Utility”, and boot options are (example) “Windows Boot Manager (P1: Samsung SSD 850 Pro 128GB)”. I looked many times in all options, and there is no “Legacy” boot options, or CSM options. For “PRO” model of this (old) motherboard, I see such options.
When undionly.kpxe is symlinked to a .efi file (any except “ipxe.efi”, which causes the “NBP is too big…” error), the PXE boot hangs after the DHCP IP is allocated…
Now, we’ve seen some luck rolling back rEFInd to version 0.11.0; the newer versions of rEFInd seem not to find the boot media. But first we need to identify what boot file you are using so we can focus on the exit modes.
Maybe I can try an older rEFInd. With rEFInd you should see a GUI or menu, but after the HTTP 200 transfers of “refind_x64.efi” and “refind.conf”, I see nothing, just a blinking cursor. Is it not even loading?
-
reading from file output.pcap, link-type EN10MB (Ethernet)
08:12:13.298167 IP (tos 0x0, ttl 20, id 0, offset 0, flags [none], proto UDP (17), length 576)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from c8:60:00:9a:64:b1, length 548, xid 0x19a64b1, secs 4, Flags [Broadcast] (0x8000)
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Discover
        Parameter-Request Option 55, length 36:
          Subnet-Mask, Time-Zone, Default-Gateway, Time-Server
          IEN-Name-Server, Domain-Name-Server, RL, Hostname
          BS, Domain-Name, SS, RP
          EP, RSZ, TTL, BR
          YD, YS, NTP, Vendor-Option
          Requested-IP, Lease-Time, Server-ID, RN
          RB, Vendor-Class, TFTP, BF
          Option 128, Option 129, Option 130, Option 131
          Option 132, Option 133, Option 134, Option 135
        MSZ Option 57, length 2: 1260
        GUID Option 97, length 17: 0.224.29.129.46.242.184.220.17.166.228.200.96.0.154.100.177
        ARCH Option 93, length 2: 0
        NDI Option 94, length 3: 1.2.1
        Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
08:12:13.298296 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.10.88.97.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x19a64b1, secs 4, Flags [Broadcast] (0x8000)
      Your-IP 10.10.88.102
      Server-IP 10.10.88.100
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Offer
        Server-ID Option 54, length 4: 10.10.88.97
        Lease-Time Option 51, length 4: 566762
        Subnet-Mask Option 1, length 4: 255.255.255.224
        Default-Gateway Option 3, length 4: 10.10.88.97
        Domain-Name-Server Option 6, length 4: 8.8.8.8
        NTP Option 42, length 4: 10.10.88.97
        BF Option 67, length 13: "undionly.kpxe"
08:12:17.295057 IP (tos 0x0, ttl 20, id 1, offset 0, flags [none], proto UDP (17), length 576)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from c8:60:00:9a:64:b1, length 548, xid 0x19a64b1, secs 4, Flags [Broadcast] (0x8000)
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Request
        Requested-IP Option 50, length 4: 10.10.88.102
        Parameter-Request Option 55, length 36:
          Subnet-Mask, Time-Zone, Default-Gateway, Time-Server
          IEN-Name-Server, Domain-Name-Server, RL, Hostname
          BS, Domain-Name, SS, RP
          EP, RSZ, TTL, BR
          YD, YS, NTP, Vendor-Option
          Requested-IP, Lease-Time, Server-ID, RN
          RB, Vendor-Class, TFTP, BF
          Option 128, Option 129, Option 130, Option 131
          Option 132, Option 133, Option 134, Option 135
        MSZ Option 57, length 2: 1260
        Server-ID Option 54, length 4: 10.10.88.97
        GUID Option 97, length 17: 0.224.29.129.46.242.184.220.17.166.228.200.96.0.154.100.177
        ARCH Option 93, length 2: 0
        NDI Option 94, length 3: 1.2.1
        Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
08:12:17.295074 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.10.88.97.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x19a64b1, secs 4, Flags [Broadcast] (0x8000)
      Your-IP 10.10.88.102
      Server-IP 10.10.88.100
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: ACK
        Server-ID Option 54, length 4: 10.10.88.97
        Lease-Time Option 51, length 4: 566758
        Subnet-Mask Option 1, length 4: 255.255.255.224
        Default-Gateway Option 3, length 4: 10.10.88.97
        Domain-Name-Server Option 6, length 4: 8.8.8.8
        NTP Option 42, length 4: 10.10.88.97
        BF Option 67, length 13: "undionly.kpxe"
08:12:17.297001 IP (tos 0x0, ttl 20, id 2, offset 0, flags [none], proto UDP (17), length 58)
    10.10.88.102.ah-esp-encap > 10.10.88.100.tftp: [udp sum ok] 30 RRQ "undionly.kpxe" octet tsize 0
08:12:17.307127 IP (tos 0x0, ttl 20, id 4, offset 0, flags [none], proto UDP (17), length 63)
    10.10.88.102.acp-port > 10.10.88.100.tftp: [udp sum ok] 35 RRQ "undionly.kpxe" octet blksize 1456
08:12:22.767562 IP (tos 0x0, ttl 64, id 256, offset 0, flags [none], proto UDP (17), length 437)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from c8:60:00:9a:64:b1, length 409, xid 0x11ab33c, secs 4, Flags [Broadcast] (0x8000)
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Discover
        MSZ Option 57, length 2: 1472
        ARCH Option 93, length 2: 0
        NDI Option 94, length 3: 1.2.1
        Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
        User-Class Option 77, length 4: instance#1: ERROR: invalid option
        Parameter-Request Option 55, length 23:
          Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG
          Hostname, Domain-Name, RP, MTU
          Vendor-Option, Vendor-Class, TFTP, BF
          Option 119, Option 128, Option 129, Option 130
          Option 131, Option 132, Option 133, Option 134
          Option 135, Option 175, Option 203
        T175 Option 175, length 57: 177.5.1.16.236.129.104.235.3.1.20.1.23.1.1.34.1.1.22.1.1.19.1.1.20.1.1.17.1.1.39.1.1.25.1.1.41.1.1.16.1.2.33.1.1.21.1.1.24.1.1.38.1.1.18.1.1
        Client-ID Option 61, length 7: ether c8:60:00:9a:64:b1
        GUID Option 97, length 17: 0.224.29.129.46.242.184.220.17.166.228.200.96.0.154.100.177
08:12:22.767629 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.10.88.97.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x11ab33c, secs 4, Flags [Broadcast] (0x8000)
      Your-IP 10.10.88.103
      Server-IP 10.10.88.100
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Offer
        Server-ID Option 54, length 4: 10.10.88.97
        Lease-Time Option 51, length 4: 542768
        Subnet-Mask Option 1, length 4: 255.255.255.224
        Default-Gateway Option 3, length 4: 10.10.88.97
        Domain-Name-Server Option 6, length 4: 8.8.8.8
        BF Option 67, length 13: "undionly.kpxe"
08:12:23.883733 IP (tos 0x0, ttl 64, id 514, offset 0, flags [none], proto UDP (17), length 437)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from c8:60:00:9a:64:b1, length 409, xid 0x11ab33c, secs 10, Flags [Broadcast] (0x8000)
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Discover
        MSZ Option 57, length 2: 1472
        ARCH Option 93, length 2: 0
        NDI Option 94, length 3: 1.2.1
        Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
        User-Class Option 77, length 4: instance#1: ERROR: invalid option
        Parameter-Request Option 55, length 23:
          Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG
          Hostname, Domain-Name, RP, MTU
          Vendor-Option, Vendor-Class, TFTP, BF
          Option 119, Option 128, Option 129, Option 130
          Option 131, Option 132, Option 133, Option 134
          Option 135, Option 175, Option 203
        T175 Option 175, length 57: 177.5.1.16.236.129.104.235.3.1.20.1.23.1.1.34.1.1.22.1.1.19.1.1.20.1.1.17.1.1.39.1.1.25.1.1.41.1.1.16.1.2.33.1.1.21.1.1.24.1.1.38.1.1.18.1.1
        Client-ID Option 61, length 7: ether c8:60:00:9a:64:b1
        GUID Option 97, length 17: 0.224.29.129.46.242.184.220.17.166.228.200.96.0.154.100.177
08:12:23.883849 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.10.88.97.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x11ab33c, secs 10, Flags [Broadcast] (0x8000)
      Your-IP 10.10.88.103
      Server-IP 10.10.88.100
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Offer
        Server-ID Option 54, length 4: 10.10.88.97
        Lease-Time Option 51, length 4: 542767
        Subnet-Mask Option 1, length 4: 255.255.255.224
        Default-Gateway Option 3, length 4: 10.10.88.97
        Domain-Name-Server Option 6, length 4: 8.8.8.8
        BF Option 67, length 13: "undionly.kpxe"
08:12:25.805952 IP (tos 0x0, ttl 64, id 771, offset 0, flags [none], proto UDP (17), length 449)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from c8:60:00:9a:64:b1, length 421, xid 0x11ab33c, secs 18, Flags [Broadcast] (0x8000)
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: Request
        MSZ Option 57, length 2: 1472
        ARCH Option 93, length 2: 0
        NDI Option 94, length 3: 1.2.1
        Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
        User-Class Option 77, length 4: instance#1: ERROR: invalid option
        Parameter-Request Option 55, length 23:
          Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG
          Hostname, Domain-Name, RP, MTU
          Vendor-Option, Vendor-Class, TFTP, BF
          Option 119, Option 128, Option 129, Option 130
          Option 131, Option 132, Option 133, Option 134
          Option 135, Option 175, Option 203
        T175 Option 175, length 57: 177.5.1.16.236.129.104.235.3.1.20.1.23.1.1.34.1.1.22.1.1.19.1.1.20.1.1.17.1.1.39.1.1.25.1.1.41.1.1.16.1.2.33.1.1.21.1.1.24.1.1.38.1.1.18.1.1
        Client-ID Option 61, length 7: ether c8:60:00:9a:64:b1
        GUID Option 97, length 17: 0.224.29.129.46.242.184.220.17.166.228.200.96.0.154.100.177
        Server-ID Option 54, length 4: 10.10.88.97
        Requested-IP Option 50, length 4: 10.10.88.103
08:12:25.806037 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    10.10.88.97.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x11ab33c, secs 18, Flags [Broadcast] (0x8000)
      Your-IP 10.10.88.103
      Server-IP 10.10.88.100
      Client-Ethernet-Address c8:60:00:9a:64:b1
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: ACK
        Server-ID Option 54, length 4: 10.10.88.97
        Lease-Time Option 51, length 4: 542765
        Subnet-Mask Option 1, length 4: 255.255.255.224
        Default-Gateway Option 3, length 4: 10.10.88.97
        Domain-Name-Server Option 6, length 4: 8.8.8.8
        BF Option 67, length 13: "undionly.kpxe"
08:12:25.865590 IP (tos 0x0, ttl 64, id 1028, offset 0, flags [none], proto UDP (17), length 70)
    10.10.88.103.9369 > 10.10.88.100.tftp: [udp sum ok] 42 RRQ "default.ipxe" octet blksize 1432 tsize 0
Older rEFInd (0.11.0) gives the same result, just a flashing cursor:
10.10.88.103 - - [30/Nov/2020:08:20:36 +1100] "POST /fog/service/ipxe/boot.php HTTP/1.1" 200 2806 "-" "iPXE/1.20.1+ (g4bd0)"
10.10.88.103 - - [30/Nov/2020:08:20:36 +1100] "GET /fog/service/ipxe/bg.png HTTP/1.1" 200 21280 "-" "iPXE/1.20.1+ (g4bd0)"
10.10.88.103 - - [30/Nov/2020:08:20:39 +1100] "GET /fog/service/ipxe/refind.conf HTTP/1.1" 200 29731 "-" "iPXE/1.20.1+ (g4bd0)"
10.10.88.103 - - [30/Nov/2020:08:20:39 +1100] "GET /fog/service/ipxe/refind_x64.efi HTTP/1.1" 200 221896 "-" "iPXE/1.20.1+ (g4bd0)"
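To check which files iPXE actually pulled during an exit attempt, a small Python sketch like the following could summarise the access log. It assumes one entry per line in the combined log format shown here and filters on the "iPXE/" user agent; the function name is hypothetical and this is not part of FOG.

```python
import re

# Combined log format: host ident user [time] "METHOD path proto" status size ...
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-)'
)


def ipxe_fetches(log_text):
    """Return (method, path, status) for each request made by an iPXE client."""
    rows = []
    for line in log_text.splitlines():
        match = LOG_RE.match(line)
        # Only keep lines whose user agent mentions iPXE.
        if match and "iPXE/" in line:
            rows.append((match.group(3), match.group(4), int(match.group(5))))
    return rows
```

A 200 for refind_x64.efi only shows that the download succeeded. It says nothing about whether the firmware could execute the file afterwards.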
-
@blankzapper If you could post the actual pcap file, or upload it to a file share site and post the link here, I would like to look at it in Wireshark, because it’s actually saying “Hey, I’m a BIOS computer.” So it’s either fibbing or there is something else going on here. Also, creating these symlinks makes understanding what is going on super confusing, so let’s reset all of the files the way they were delivered by the FOG installer.
Once that is done, and since you are using the ISC DHCP server, let’s modify the configuration a bit to make it more dynamic, so that the ISC DHCP server sends the right boot file based on the target computer. Use this example to set up the isc-dhcp server configuration to support dynamic boot files. https://wiki.fogproject.org/wiki/index.php/BIOS_and_UEFI_Co-Existence#Example_1
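The idea behind a dynamic configuration can be modelled in a few lines of Python: read the architecture field out of the vendor class identifier (option 60) and pick the boot file from it. This is only a sketch of the decision, not dhcpd syntax. The file names undionly.kpxe and ipxe.efi are the ones discussed in this thread; the 00007 and 00009 codes are the RFC 4578 values for x86-64 UEFI clients and are an assumption here, since the capture in this thread only ever shows 00000.

```python
def arch_from_vendor_class(vci):
    """Extract the arch field from e.g. 'PXEClient:Arch:00000:UNDI:002001'."""
    parts = vci.split(":")
    if len(parts) >= 3 and parts[0] == "PXEClient" and parts[1] == "Arch":
        return parts[2]
    return None  # not a PXE client identifier


# Illustrative mapping only; a real configuration may cover more arch codes.
BOOT_FILES = {
    "00000": "undionly.kpxe",  # BIOS / legacy PXE
    "00007": "ipxe.efi",       # x86-64 UEFI (EFI BC)
    "00009": "ipxe.efi",       # x86-64 UEFI
}


def boot_file_for(vci, default="undionly.kpxe"):
    """Pick a boot file for a vendor class string, falling back to the default."""
    return BOOT_FILES.get(arch_from_vendor_class(vci), default)
```

For the client in this thread, `boot_file_for("PXEClient:Arch:00000:UNDI:002001")` selects undionly.kpxe, which is the same file the static configuration was already sending.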
Also go into the FOG web UI under FOG Configuration -> FOG Settings -> Expand All and make sure the BIOS exit mode is SANBOOT and the UEFI exit mode is rEFInd. Set the global settings to these values, and if you modified the target host’s exit modes, make sure they match the global values; if you have not set them, leave them at default. Once we get things back to a known state, we can try to debug this.
As for resetting rEFInd back to an older release, you can get the releases from here: https://sourceforge.net/projects/refind/files/
We have not tested 0.12.0. The current version of FOG ships with 0.11.4. FOG used 0.11.0 for a very long time and that worked well. FOR NOW let’s hold off on messing with rEFInd until we can identify what mode this target computer is booting in. I’m including the link here for completeness.
-
https://drive.google.com/file/d/1TBNrP2vtHEWjQ_96rWaJIB73On7O4Wia/view?usp=sharing
Changed subnet IP addressing, but same exchange. Symlinks reverted. DHCPD config updated with VCI switches. BIOS/UEFI exit modes are default global/host. Refind restored to FOG supplied.
-
@blankzapper Well, as they say, the pcap doesn’t lie. Your computer IS saying that it’s a BIOS-based x86 computer. Your DHCP server is sending the proper file, and I see it booting into iPXE, getting an IP address again, and then pulling the default.ipxe script that leads to the iPXE menu. So your computer is working perfectly. Not how you want, but working as designed. Now I assume things go bad when it exits from the iPXE menu, because the disk structure is EFI and we are PXE booting in BIOS mode. So the BIOS default exit mode, SANBOOT, should be called to mount the disk and boot.
So now in your bios (firmware) do you need to enable the uefi network stack? On the dual boot systems they don’t typically come with the uefi network stack enabled in the firmware. I would focus on the firmware settings because outside of your computer everything is working as intended.
-
@george1421 Thanks for your help. There is no UEFI network stack option; that would be the obvious switch. There is just the RealTek Option ROM, which is BIOS boot. I think ASUS 100% cut that corner.
-
@blankzapper That is an interesting mobo then. Does it have a boot manager like the Dells where when booting you can hit F12 to pick the boot device? On the Dells there is a bios and uefi section on the systems that support both formats. You can tell it to pxe boot bios or pxe boot uefi.
Something else to check, make sure that mobo is running the latest bios version. They might have corrected that behavior in a later firmware update.
Lastly FOG will image a uefi computer while running in bios mode. Just don’t have the target computer boot through pxe to the hard drive. Only enable pxe booting when you need to reimage the computer. In a way that is how I do it at my office. I only have it pxe boot when the IT Tech is specifically going to reimage a computer. This keeps them from accidentally deploying an image to the wrong computer if everything was setup automatically.
-
@george1421 said in Cannot boot hard drive from PXE menu:
@blankzapper Does it have a boot manager like the Dells where when booting you can hit F12 to pick the boot device?
No. Also there is no option to “Enable F12 Boot Menu”.
Something else to check, make sure that mobo is running the latest bios version. They might have corrected that behavior in a later firmware update.
It’s the latest BIOS, released in 2014. There is nothing newer. I am seeing if it is possible to hack this AMI BIOS. I updated the RealTek option ROM to the 2016 release, but it didn’t fix anything. I think a UEFI network boot option must be added. I don’t know if this can be done without big changes (and likely a bricked motherboard!).