VMWare Workstation 16 - Some VMs Won't PXE Boot From FOG Server
About six months ago, I set up a FOG Server Virtual Machine for deploying images to my other virtual machines.
For the past month, I’ve noticed that if I create a new Virtual Machine and try to PXE-Boot off of my network, sometimes it works and connects to the FOG Server without any issues. Other times, when it PXE Boots, it grabs an IP address and says “Downloading NBP File”, but it then automatically goes back to the boot menu. Subsequent attempts say “Downloading NBP File…” for half a second and then go back to the boot menu.
I’ve looked at the differences between the VMs that can PXE Boot and the ones that can’t and the only differences are the amount of RAM and the size of the HDD/SSD, everything else is exactly the same.
Here is my network setup:
PFSense - Router & DHCP Server | Network Booting is enabled and points to my Ubuntu VM, which hosts FOG
The default BIOS filename I’m using is “ipxe.efi” as all my virtual machines are set to UEFI
Ubuntu - FOG Project Server | Hosts the FOG Project
Server 2019 - Domain Controller | My Domain Controller for my environment
I’ve also attached a video demonstrating the “Downloading NBP File” issue as well.
If anyone has any suggestions, please feel free to respond!
Please note that during the “Page Swipe”, I shut down the VM and then turned it back on
Video Link - https://i.imgur.com/tJfQ0AM.mp4
@thecount1829 It might be interesting to see if there are any differences in the packet captures between working and not working for pxe booting.
I have a tutorial here on how to use the FOG server to capture the pxe booting process. You will get the best picture of the issue/problem if the pxe booting computer and the fog server are on the same subnet when tcpdump is used. https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue
While I don’t have a reason to suspect this, it kind of sounds like you might have 2 dhcp servers on your network, possibly a primary and a backup where the backup may not be configured for pxe booting. Then at random times the backup dhcp server wins and the target computer doesn’t pxe boot. The pcap will show the complete DORA process (Discover, Offer, Request, Ack/Nak). One of the first things would be to look to see if there is more than one offer packet generated. Offer packets are generated by any dhcp server in earshot of the PXE booting computer’s discover packet.
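As a rough illustration of the "more than one offer" check, here is a minimal Python sketch. It assumes you have already copied the server address and boot file name out of each Offer packet in the pcap by hand; the record shape, the function names, and the addresses are all made up for the example.

```python
from collections import defaultdict


def offers_by_server(offers):
    """Group DHCP Offer records by the server that sent them.

    Each record is a (server_ip, boot_file) tuple. This shape is
    hypothetical; fill it in from the Offer packets in your capture.
    """
    grouped = defaultdict(set)
    for server_ip, boot_file in offers:
        grouped[server_ip].add(boot_file)
    return dict(grouped)


def second_dhcp_server_suspected(offers):
    """Return True if more than one distinct server answered the Discover."""
    return len(offers_by_server(offers)) > 1


# Made-up example: the router offers a boot file, a second server offers none.
offers = [
    ("192.168.1.1", "ipxe.efi"),
    ("192.168.1.10", ""),
]

for server, files in offers_by_server(offers).items():
    print(server, sorted(files))
print("second DHCP server suspected:", second_dhcp_server_suspected(offers))
```

If only one server ever shows up in the Offers, the two-DHCP-server theory can be set aside and the boot file itself is the next thing to look at.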
@TheCount1829 Are all the VMs in the same mode - UEFI or legacy BIOS?
@sebastian-roth Thanks for the reply!
All VMs are set to the same - UEFI with “Secure Boot” turned off. However, I did notice that if I change them to “BIOS” and then change the BIOS filename on PFSense to undionly.kpxe, they boot every time
Thanks for the reply! All VMs are on the same /24 subnet. I did have a Server 2016 DHCP server that was handling DHCP before, but that has since been turned off, although it’s not entirely erased from the disk. I’ll give that a shot tonight!
@thecount1829 Within pfsense in the netboot section (I think) of the dhcp server there are actually 3 fields you need to populate.
That should be all you need. DNSMASQ running on pfsense will detect the pxe booting computer type and then send out the proper boot file name.
Just to confirm, does the “i386/” need to be included in the “UEFI32” filename?
@thecount1829 Yes. To prove me right or wrong (I have been both at one time before) On the fog server look in the tftpboot directory you will see ipxe.efi in there as well as a folder called i386. Inside the i386 directory confirm that ipxe.efi is in there too.
I do have to say you don’t see too many 32 bit uefi systems unless they are really cheap devices.
You’re 100% correct! iPXE.efi and a folder called “i386” are in the “tftpboot” directory!
@thecount1829 Well, I was close to correct. I guess the directory is called “i386-efi” and not just “i386”. So you will need to update your pfsense install to “i386-efi/ipxe.efi”. Close only counts in horseshoes and hand grenades.
@george1421 So, some great news and some still good (but not great) news. The VM that wasn’t PXE Booting now is after adding the “i386-efi/ipxe.efi”, but I am now getting this error instead, and once again only on this VM
OK I think we might be onto something here. I want you to go onto the fog server and rename /tftpboot/i386-efi/ipxe.efi to ipxe.sav, just temporarily. PXE boot the target VM and it should error out (not what you posted in the last screen shot). If that is the case rename the file back. This will tell us that everything is working correctly (well not really, just the mechanics are working) with the dhcp server.
Now what we need to have is either a witness computer running wireshark or use the fog server if the target computer is on the same subnet as the fog server. For wireshark on a witness computer use the capture filter of “port 67 or port 68”. If you use the fog server here is the tutorial: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue
Now those capture filters will only show us the DORA dhcp process. (Discover, Offer, Request, Ack/Nak). Specifically look at the Discover packet the VM sends out to say “hello world”. Within that discover packet dhcp option 94 or 93 (sorry I can’t remember) is where the client tells what type of computer it is. It would be interesting to know what kind of computer it is announcing as.
Now look at the Offer packet from the dhcp server. In the bootp header there will be a boot-file field as well as dhcp option 67. Does the boot file match the type the computer claims to be?
If you have difficulties reading the pcap, upload the pcap to a public file share (dropbox, google drive, etc) and post the link here and we will take a look at it.
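For anyone who would rather check those fields in a script than by eye, here is a hedged Python sketch that pulls them out of a raw DHCP message. It assumes you feed it only the UDP payload (no Ethernet/IP/UDP headers), and it relies on the client architecture living in option 93, which is how RFC 4578 defines it. The function names and the synthetic test packets are invented for the example and are not part of FOG or Wireshark.

```python
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options


def parse_dhcp(payload: bytes) -> dict:
    """Parse a DHCP message given as the raw UDP payload."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")

    # Walk the type-length-value options that follow the magic cookie.
    options = {}
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:      # End option
            break
        if code == 0:        # Pad option: a single byte, no length
            i += 1
            continue
        if i + 1 >= len(payload):
            break            # truncated option, stop quietly
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length

    def text(raw):
        # Strings may be NUL padded or NUL terminated.
        return raw.split(b"\x00", 1)[0].decode("ascii", "replace")

    arch = options.get(93)
    return {
        "message_type": options[53][0] if options.get(53) else None,
        "arch": struct.unpack("!H", arch[:2])[0] if arch and len(arch) >= 2 else None,
        "header_file": text(payload[108:236]),  # 128-byte 'file' field
        "option_67": text(options[67]) if 67 in options else None,
    }


def build_sample(message_type, arch=None, boot_file=""):
    """Build a synthetic DHCP payload, for demonstration only."""
    header = bytearray(236)
    name = boot_file.encode("ascii")
    header[108:108 + len(name)] = name
    opts = bytes([53, 1, message_type])
    if arch is not None:
        opts += bytes([93, 2]) + struct.pack("!H", arch)
    if name:
        opts += bytes([67, len(name)]) + name
    return bytes(header) + MAGIC_COOKIE + opts + b"\xff"


discover = parse_dhcp(build_sample(1, arch=6))
offer = parse_dhcp(build_sample(2, boot_file="i386-efi/ipxe.efi"))
print("client announces arch type:", discover["arch"])
print("server offers:", offer["header_file"], "/", offer["option_67"])
```

With a real capture you would replace the build_sample() calls with the payload bytes exported from the Discover and Offer packets.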
What I’m suspecting is that if the target computer is a uefi 64 bit, but for some reason the 32 bit uefi version of iPXE is loaded, it might have a problem running bzImage (fos linux) trying to start a 64 bit image running in a 32 bit ipxe environment. This is only a guess, your discovery will give us more clues to why.
@george1421 Sorry for the late reply!
I have just renamed my “ipxe.efi” to “ipxe.sav” and then rebooted the FOG Server to make sure the changes were applied, and the strange thing is that the virtual machines are still booting to FOG without an issue:
“ipxe.efi” to “ipxe.sav”
Did you do this in the i386-efi directory? If yes, then I expect them to pxe boot OK. It was just this one VM, where things started working after you said you fixed the pfsense setting, that drew me down this path. FWIW none of your vms should ever use the UEFI 32 bit boot loader unless you created the VM very wrong.
@thecount1829 OK you need to look into that VM a bit more because it’s announcing itself as a 32 bit computer. See in the discover packet (when the client says hello world)
This should be type 7 or 9 for a 64 bit computer.
Here is the hardware type table.
Type  Architecture Name
----  -----------------
0     Intel x86PC
1     NEC/PC98
2     EFI Itanium
3     DEC Alpha
4     Arc x86
5     Intel Lean Client
6     EFI IA32
7     EFI BC (EFI Byte Code)
8     EFI Xscale
9     EFI x86-64
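To make the comparison concrete, here is a small Python sketch that turns the table above into a lookup and checks whether an offered boot file fits the announced type. The UEFI file names are the ones used in this thread (“ipxe.efi” for 64 bit, “i386-efi/ipxe.efi” for 32 bit); the function names are hypothetical, and types without an entry simply return None rather than guessing a file.

```python
ARCH_NAMES = {
    0: "Intel x86PC",
    1: "NEC/PC98",
    2: "EFI Itanium",
    3: "DEC Alpha",
    4: "Arc x86",
    5: "Intel Lean Client",
    6: "EFI IA32",
    7: "EFI BC (EFI Byte Code)",
    8: "EFI Xscale",
    9: "EFI x86-64",
}

# Boot files as discussed in this thread; other types are left out on purpose.
UEFI_BOOT_FILES = {
    6: "i386-efi/ipxe.efi",  # 32 bit UEFI
    7: "ipxe.efi",           # 64 bit UEFI
    9: "ipxe.efi",           # 64 bit UEFI
}


def expected_boot_file(arch_type):
    """Return the boot file expected for a UEFI arch type, else None."""
    return UEFI_BOOT_FILES.get(arch_type)


def offer_matches(arch_type, offered_file):
    """True if the offered boot file is the one expected for this type."""
    expected = expected_boot_file(arch_type)
    return expected is not None and expected == offered_file


for arch_type in (6, 9):
    name = ARCH_NAMES[arch_type]
    print(arch_type, name, "->", expected_boot_file(arch_type))
```

A VM that was meant to be 64 bit but shows up as type 6 here is a sign that the VM itself was created as a 32 bit guest, not that the dhcp server is misbehaving.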
@george1421 I am very embarrassed to be typing this, but I believe I’ve figured out the issue. When I go to create a VM in VMWare Workstation 16, I have two options of “Typical” which uses Easy Install and does most of the installation in the background, as well as “Custom” install, where I choose all of the settings manually.
I also have several different ISO versions of Windows 10, including “Windows 10 (Version 1809)”, “Windows 10 (Version 20H2)”, and “Windows 10 (Enterprise Edition)”. Both “Windows 10 (Version 1809)” and “Windows 10 (Version 20H2)” are 64-bit while “Windows 10 (Enterprise Edition)” is 32-bit
Now when I was creating a new VM just about an hour ago to do some more testing, I was using both “Easy Install” and the “Windows 10 (Enterprise Edition)” ISO, which Easy Install listed just as “Windows 10”, while using “Windows 10 (Version 20H2)”, Easy Install says “Windows 10 x64”
After seeing this, I continued with creating the “Windows 10 (Enterprise Edition)” VM and once again, I couldn’t get it to PXE-Boot, also correlating to Wireshark detecting a 32-bit OS. However, when I went to create a “Windows 10 (Version 20H2)” and/or a “Windows 10 (Version 1809)” VM, it would PXE-Boot no problem at all, so I believe the issue was with the actual ISO and is now resolved!
Thanks so much for your help!
@thecount1829 Well I don’t see this as a total waste of time. Mainly because you now know more than you did this AM. You also learned how to debug pxe booting issues. Knowing where to look is the key.
As I tell the youngins I mentor, “If it would have worked the first time, what would you have learned?”. It’s all good here. In the end you solved the problem and can keep moving with your project. Well done!!
issue was with the actual ISO and is now resolved!
With easy mode, vmware must tailor the VM config to match the settings on the install ISO. I have vmware workstation 15 on my office computer, so I’ll have to see if there is a way to tweak the VM this way.