UEFI: NBP downloaded successfully - then blackscreen
-
Hello there!
I recently deployed a FOG Server onto my network, which only worked for legacy-bios hosts.
When trying to network boot a UEFI device I first get the following screen:
Then blackscreen. And this repeats every few minutes.
I tried .efi files as boot files, but that did not work (or maybe I did something wrong, I don’t know).
A possible solution would be this:
https://wiki.fogproject.org/wiki/index.php?title=BIOS_and_UEFI_Co-Existence#Using_ProxyDHCP_.28dnsmasq.29
but unfortunately I have to use dnsmasq as a DHCP proxy because there is an existing DHCP server in my network that I am not allowed to alter.
I am happy to provide my dnsmasq conf (I am sorry, it’s a mess; suggestions for improvement are welcome) and a .pcap file (from the failed “connection” attempt) at this link:
https://github.com/gabrielzeit/fog-files.git
I would really appreciate the help!
Thanks!
-
@gabrielzeit said in UEFI: NBP downloaded successfully - then blackscreen:
Then blackscreen. And this repeats every few minutes.
Ensure that secure/safe boot is disabled on the target computer.
-
@george1421 Secure boot was already disabled (I had already stumbled across this solution), so that made no difference.
-
@gabrielzeit I have to run to a meeting but your pcap is interesting in that you have 3 dhcp servers and 2 of them are pointing at a WDS deployment server.
.201 and .1 are pointing to WDS and .49 appears to be the fog server.
This configuration will cause intermittent issues.
-
@george1421 You are right, there are multiple dhcp servers. But I disabled the isc-dhcp-service on my FOG server, so I am a bit puzzled why it still acts as a dhcp server.
Also, my System-Admin tells me that we do not have a WDS server, which puzzles me even more.
But here comes my newbie question: shouldn’t dnsmasq be able to handle this? Or is there any option for my case to get this right without changing anything about the dhcp servers?
-
@gabrielzeit Remember dnsmasq configured this way is also a “dhcp server”
If we look at the packet from the .205 dhcp server, you will see the next server points to .113 (assumed to be the WDS server) and the boot file is surely a WDS boot loader (akin to undionly.kpxe). The .1 dhcp server (looks like it may be a router) is telling the same story.
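To make the conflict easier to see, here is a minimal Python sketch of the check one would do by hand on a capture: group the replies by the boot server and boot file they advertise. The `Offer` tuple, the function name, and the addresses (taken from the 192.0.2.0/24 documentation range) are all made up for illustration and are not the values from the pcap.

```python
from collections import namedtuple

# One DHCP or proxyDHCP reply as read off a capture (hypothetical field names).
Offer = namedtuple("Offer", ["server", "next_server", "boot_file"])


def group_boot_sources(offers):
    """Group replying servers by the (next_server, boot_file) pair they advertise.

    More than one key in the result means the client can be sent to
    different boot servers depending on which reply it acts on.
    """
    sources = {}
    for offer in offers:
        # Replies that carry no PXE information do not compete for the boot.
        if not offer.next_server and not offer.boot_file:
            continue
        key = (offer.next_server, offer.boot_file)
        sources.setdefault(key, []).append(offer.server)
    return sources


# Placeholder data only: two servers pointing at one boot server, one at another.
offers = [
    Offer("192.0.2.1", "192.0.2.113", "wdsnbp.com"),
    Offer("192.0.2.205", "192.0.2.113", "wdsnbp.com"),
    Offer("192.0.2.49", "192.0.2.49", "ipxe.efi"),
]

for (next_server, boot_file), servers in group_boot_sources(offers).items():
    print(f"{boot_file} from {next_server}: advertised by {', '.join(servers)}")
```

If the grouping returns more than one entry, the client is receiving competing boot instructions.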
-
@george1421 Okay that’s unfortunate, so you are telling me there is already another instance in the network trying/stealing my network boot? I am having a really hard time understanding this issue.
Do you have any concrete advice for me? I am a bit overwhelmed.
PS. I tried network booting a completely different device, and it worked. However, the device I primarily need to network boot is the one that fails.
-
@george1421 Additionally, I want to point out again that the whole process works with legacy BIOS, just not with UEFI. So maybe that is where the problem lies? (I already tried using .efi files)
-
@george1421 Sorry for the spam today.
I figured that the problem might be with the NBP file (that I am serving the wrong one), because it’s working just fine with legacy.
My dnsmasq config is the one that you have provided: https://forums.fogproject.org/topic/12796/installing-dnsmasq-on-your-fog-server?_=1693476360006
If it’s of any use: the machine I am trying to network boot is a Zotac Zbox (and thus has its own little BIOS distro, or at least it looks that way).
-
@gabrielzeit said in UEFI: NBP downloaded successfully - then blackscreen:
so you are telling me there is already another instance in the network trying/stealing my network boot
In a word, yes. The problem comes from two different sets of pxe boot instructions being sent to your computer; whichever offer packet gets to the target computer first is the one it will follow. So in something like 18 out of 20 pxe boots (made-up numbers to explain the point) dnsmasq may respond first, but the other 2 times this WDS service will respond first, making your boot into FOG fail. It’s something to look into but, as you said, it is probably not what stops that device from pxe booting.
-
@gabrielzeit said in UEFI: NBP downloaded successfully - then blackscreen:
the machine I am trying to network boot is a Zotac Zbox
This device may simply not be compatible with iPXE. We can get you set up to USB boot into FOS Linux to capture/deploy to this device if iPXE will not work.
-
@george1421 said in UEFI: NBP downloaded successfully - then blackscreen:
This device may simply not be compatible with iPXE.
But this would also mean that I couldn’t PXE boot my device at all. However, the whole thing works with legacy BIOS, so my educated guess would be that there is some problem with the NBP file that I am serving. Any suggestions?
Sadly USB boot is not suitable for my use case
Concerning the “stealing” part: I will ask a system admin to look into this, because we actually shouldn’t have any WDS in our network.
-
@gabrielzeit iPXE is third-party software. When we say the device may not be compatible with iPXE, I believe @george1421 was referring specifically to UEFI.
There are three (main) things to attempt:
snp.efi - SNP standard driver plus a few “custom normal drivers” from iPXE developers.
snponly.efi - SNP standard driver, and that’s it. It’s using the SNP stack on the network card itself and nothing more.
ipxe.efi - iPXE with the standard iPXE-built drivers, which seem generally good.
As none of these files seem to be working for you, you could attempt to build a firmware specific EFI file such as realtek.efi or intel.efi, but this is outside the scope of direction we could possibly give you.
If pxe booting works with legacy, and you don’t mind having a configuration of dhcp that doles out the correct boot file for a specific boot type (UEFI vs Legacy), why not just keep that device in Legacy BIOS mode?
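The idea of doling out the boot file by boot type can be sketched roughly like this. In a real setup the selection lives in the DHCP or dnsmasq configuration; the Python below only illustrates the decision. It assumes the client announces its architecture in DHCP option 93 (RFC 4578), that undionly.kpxe is the legacy boot file, and that one of the UEFI binaries listed above is the one to try. The function name and defaults are hypothetical.

```python
# DHCP option 93 client architecture codes (RFC 4578), limited to the ones
# relevant here. x64 UEFI firmware may announce either 7 or 9.
LEGACY_BIOS = 0
UEFI_X64_CODES = (7, 9)


def boot_file_for(arch_code, uefi_file="ipxe.efi", bios_file="undionly.kpxe"):
    """Pick a boot file name for the architecture a PXE client announced."""
    if arch_code == LEGACY_BIOS:
        return bios_file
    if arch_code in UEFI_X64_CODES:
        return uefi_file
    # Anything else is left unhandled on purpose in this sketch.
    raise ValueError(f"no boot file configured for architecture {arch_code}")


for code in (0, 7, 9):
    print(code, boot_file_for(code))

# Trying a different UEFI binary is a one-argument change:
print(boot_file_for(7, uefi_file="snponly.efi"))
```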
-
@gabrielzeit said in UEFI: NBP downloaded successfully - then blackscreen:
Sadly USB boot is not suitable for my use case
You are not getting to the FOG iPXE menu, so this is either an iPXE issue or, as I mentioned before, your pxe booting is getting hijacked by that suspicious WDS server. Your firmware doesn’t really indicate what is wrong, it just reboots. So that’s not really helpful in debugging.
You could load ipxe onto a usb stick (yes I know you said that USB wasn’t an option) to see if ipxe will boot from the usb drive for testing. If it does then you know the problem is somewhere along the pxe booting chain and not ipxe itself.
I don’t have the pcap anymore, but looking over the thread, I see this WDS server is responding with a bios boot file (wdsnbp.com) that, if sent to a uefi computer, would cause it to download the file and just reboot. I wanted to see if the DHCP DISCOVER from the client was saying it was a uefi computer, and then this rogue dhcp server was responding with a bios boot loader. That would be strange but would also cause the symptoms.
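For anyone who wants to run the same check on their own capture, here is a small Python sketch that walks the options field of a DHCP packet and reads the client architecture from option 93. It assumes you have already extracted the raw option bytes (everything after the DHCP magic cookie) from the DISCOVER; the function names and the sample bytes are made up for illustration, and only a subset of the EFI architecture codes is recognised.

```python
import struct

# Subset of RFC 4578 architecture codes that denote EFI clients (assumption:
# x86 only; other EFI platforms are ignored in this sketch).
EFI_ARCH_CODES = {6, 7, 9}


def parse_dhcp_options(raw):
    """Return {option_code: value_bytes} for a DHCP options field."""
    options = {}
    i = 0
    while i < len(raw):
        code = raw[i]
        if code == 255:  # End option
            break
        if code == 0:  # Pad option, single byte with no length
            i += 1
            continue
        if i + 1 >= len(raw):  # truncated option, stop parsing
            break
        length = raw[i + 1]
        options[code] = raw[i + 2:i + 2 + length]
        i += 2 + length
    return options


def client_is_uefi(options):
    """True/False from option 93, or None if the client did not send it."""
    arch = options.get(93)
    if arch is None or len(arch) < 2:
        return None
    (code,) = struct.unpack("!H", arch[:2])  # 16-bit big-endian arch code
    return code in EFI_ARCH_CODES


# Hand-built sample: message type DISCOVER, option 93 = 7, vendor class id.
vendor_class = b"PXEClient:Arch:00007:UNDI:003016"
raw = (bytes([53, 1, 1, 93, 2, 0, 7])
       + bytes([60, len(vendor_class)]) + vendor_class
       + bytes([255]))

opts = parse_dhcp_options(raw)
print("UEFI client:", client_is_uefi(opts))
print("Vendor class:", opts[60].decode("ascii"))
```

If the DISCOVER says UEFI and an OFFER still carries a bios boot file such as wdsnbp.com, that mismatch is the thing to chase.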
-
@george1421 @Tom-Elliott
Luckily I could schedule a meeting with my System-Admin today, and he tried turning off the serving of the WDS boot file. It seems that options 66 and 67 of our company’s DHCP server still pointed to an old Windows Server that served a meaningless WDS file. We changed them to point to my FOG Server and the issue is now solved.
I am very thankful for your first-class support and I apologize for the inconvenience! You guys are my heroes!