FOG 1.5.2 TFTP OpenTimeout
-
@jvenus Is your FOG server also running under VirtualBox? It’s rare that tcpdump would mess up the pcap like this.
You can use Wireshark on a witness computer instead, with a little loss of clarity on what is going on. With Wireshark you want to use the capture filter of
port 67 or port 68
We won’t see any TFTP transfers, but at least we can see what the DHCP server is telling the target computer.
-
Yes. The FOG server is a Debian 10 VM under the same VirtualBox that runs CentOS.
-
I used Wireshark on the Win10 host and this is all that was captured - win10_pcap.pcap
-
-
@jvenus OK I see the problem and have a solution for you.
The issue is that PXE booting involves two locations where the PXE boot info needs to exist, and many routers acting as DHCP servers don’t get this right. PXE booting sometimes involves two protocols, BOOTP and DHCP. In the pcap you provided the BOOTP fields are not filled out, but DHCP options 66 and 67 are. The BOOTP fields live in the fixed BOOTP header of the DHCP packet as {next-server} and {boot-file}. It’s kind of a toss-up which fields a PXE ROM will look at, and most DHCP servers just automatically fill out both. So look at your DHCP server settings, see if there is something that mentions BOOTP, and turn it on. If there isn’t an option, then let’s install dnsmasq on your FOG server to supplement the missing PXE booting information.
I have a tutorial on installing dnsmasq here: https://forums.fogproject.org/topic/12796/installing-dnsmasq-on-your-fog-server
TBH I have not tried to install dnsmasq under Debian, but the flow and configuration should be the same.
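To make the "two locations" point concrete, here is a minimal Python sketch that pulls the boot info out of both places in a raw DHCP payload: the fixed BOOTP header fields (siaddr for {next-server}, file for {boot-file}) and DHCP options 66 and 67. The function names, the synthetic offer, and the address 192.0.2.10 are illustrative assumptions, not values taken from the pcap in this thread.

```python
import socket
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the options area


def build_offer(siaddr, boot_file, options):
    """Build a synthetic DHCP payload (hypothetical helper, for the demo only)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        2, 1, 6, 0,                      # op=BOOTREPLY, htype=Ethernet, hlen, hops
        0x12345678, 0, 0,                # xid, secs, flags
        socket.inet_aton("0.0.0.0"),     # ciaddr
        socket.inet_aton("192.0.2.50"),  # yiaddr (documentation address)
        socket.inet_aton(siaddr),        # siaddr = BOOTP next-server
        socket.inet_aton("0.0.0.0"),     # giaddr
        b"\x08\x00\x27\x00\x00\x01",     # chaddr, null-padded to 16 bytes
        b"",                             # sname, null-padded to 64 bytes
        boot_file.encode("ascii"),       # file = BOOTP boot-file, padded to 128
    )
    opts = b"".join(bytes([code, len(val)]) + val for code, val in options)
    return header + DHCP_MAGIC_COOKIE + opts + b"\xff"


def parse_pxe_fields(payload):
    """Return the boot info from both places a PXE ROM may look."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    result = {
        "siaddr": socket.inet_ntoa(payload[20:24]),
        "file": payload[108:236].split(b"\x00", 1)[0].decode("ascii", "replace"),
        "option_66": None,
        "option_67": None,
    }
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:          # end option
            break
        if code == 0:            # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break                # truncated option, stop parsing
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code in (66, 67):
            result[f"option_{code}"] = value.rstrip(b"\x00").decode("ascii", "replace")
        i += 2 + length
    return result


# An offer shaped like the problem case: options 66/67 set, BOOTP fields empty.
offer = build_offer("0.0.0.0", "", [(53, b"\x02"),
                                    (66, b"192.0.2.10"),
                                    (67, b"undionly.kpxe")])
fields = parse_pxe_fields(offer)
print(fields)
```

A PXE ROM that only reads the BOOTP header would see no server and no file in an offer like this, even though options 66 and 67 are populated.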
-
The ‘boot-file’ DHCP option in the Sophos UTM 9 is now set. The ‘next-server’ is not wanting to play nice. I’ll give that a kick and try to get that set and then run another capture. I’ll install the dnsmasq if that doesn’t work. It may be a few minutes before I know.
Thanks
-
Ok. Moderate success with the ‘next-server’. I set it to 192.100.
-
PXE boot doesn’t time out, so yay! But now it gives this. (I hit “s” to enter the PXE shell.) I set the ‘boot-file’ option to the same file name as option 67 (bootfile-name), which is ‘undionly.kpxe’. Those should be the same, no?
-
Wireshark capture - no_confi_method.pcap
-
-
@jvenus OK, give me a minute to digest the pcap. At this point, where we typically see this fail (not in your case with VB) is when spanning tree is enabled and the switch port is not forwarding data yet.
So the DHCP process is working, because iPXE makes it to the target computer and then starts up. It’s seeing the network adapter, because we see the MAC address, but it’s not receiving the DHCP packets in iPXE. At the command prompt, key in
dhcp net0
and it should query again.
Running
dhcp net0
gives the same result as in the screengrab. I’ll wait for more info when you get the time. Thanks so much.
@jvenus I’m still looking into this, but looking at the pcap I can see what it’s doing.
The PXE part is 100% good and the TFTP is OK. iPXE starts up just fine, then issues a DHCP request. Now if you look at the pcap you see this cycle of discover, offer, discover, offer and so on. What is going on here is that the discover asks for certain fields from the DHCP server, and the fields returned are not sufficient, so iPXE queries again and the cycle repeats. What’s missing I don’t know yet. I’m getting a side-by-side setup going.
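One way to spot that discover/offer cycle without eyeballing the whole capture is sketched below in Python. It assumes you have already extracted the DHCP message types (option 53) for one client, in capture order, into a list; the function name and the sample sequences are hypothetical and not taken from the pcap here.

```python
# DHCP message type values carried in option 53
DISCOVER, OFFER, REQUEST, ACK = 1, 2, 3, 5


def count_unanswered_cycles(message_types):
    """Count DISCOVER/OFFER pairs that are not followed by a REQUEST."""
    cycles = 0
    for i in range(len(message_types) - 1):
        if message_types[i] == DISCOVER and message_types[i + 1] == OFFER:
            # Look at whatever came right after the offer, if anything did.
            following = message_types[i + 2] if i + 2 < len(message_types) else None
            if following != REQUEST:
                cycles += 1
    return cycles


healthy = [DISCOVER, OFFER, REQUEST, ACK]
looping = [DISCOVER, OFFER, DISCOVER, OFFER, DISCOVER, OFFER]
print(count_unanswered_cycles(healthy), count_unanswered_cycles(looping))
```

A healthy handshake yields zero unanswered cycles; a client that keeps rejecting offers shows one per discover/offer pair.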
-
@jvenus I’m not finding a smoking gun here, but something is missing from the dhcp request that iPXE feels it needs.
-
@george1421 Whatever information I can give, just let me know.
-
@jvenus Well, I don’t know the answer. The ONLY thing I can see different between one that works and yours is that you are sending DHCP option 28 (broadcast address) to the target computer, where normally it’s not sent.
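The side-by-side comparison boils down to a set difference over the option codes in each offer. A small Python sketch, assuming the option codes have already been pulled out of each capture; the two option lists below are made up for illustration and are not the actual contents of either pcap.

```python
def diff_option_codes(working, failing):
    """Report DHCP option codes that appear in only one of the two offers."""
    working_set, failing_set = set(working), set(failing)
    return {
        "only_in_working": sorted(working_set - failing_set),
        "only_in_failing": sorted(failing_set - working_set),
    }


# Illustrative option sets (assumed, not read from the captures in this thread)
working_offer = [1, 3, 6, 51, 53, 54, 66, 67]
failing_offer = [1, 3, 6, 28, 51, 53, 54, 66, 67]
difference = diff_option_codes(working_offer, failing_offer)
print(difference)
```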
-
@george1421 I’m going to try to get it working on Ubuntu 20.04 LTS this weekend. I’ll keep an eye on this post as well as let you know if there is any difference in this attempt.
-
@jvenus A different host OS isn’t the issue; it’s the DHCP server that seems to be the root of the issue. I just can’t find the difference.
The discrepancy is between iPXE and your DHCP server. BUT what I might do is try a physical machine instead of VB. We have seen some strangeness in the way VB does things, especially in regards to PXE booting. BUT I’m still conflicted: at this point iPXE should be in control of the network and VB should not be blocking things.
-
@george1421 I have an old machine I can throw linux on. I’ll do that this weekend instead of the Ubuntu VM.
-
@george1421 Interesting find…
I decided to attempt a pxe boot from that spare physical machine to the fogserver VM before installing / configuring FOG server on the physical machine and continuing testing to it instead of the current VM. Attached is the Wireshark output from the Win10 host while booting from a physical machine to the Debian ‘fogserver’ VM. Again, all three are in the same subnet. It seems like it passes all of the information just fine. The physical machine will boot into the FOG menu.
-
@jvenus See, this is what I thought: sometimes VB can be flaky when PXE booting. This is why I asked to test with a physical machine. From the pcap it appears the PXE booting process is normal. You see one set of DHCP requests for the PXE ROM and then a second set for iPXE. If you were to boot into FOS Linux (by picking one of the registration functions) you would have seen a third DHCP sequence.
I remembered there was something we had to do to get around VB’s strangeness. A quick google-fu search of the forums suggests that you use ipxe.kpxe instead of undionly.kpxe with VirtualBox VMs. undionly.kpxe is a shim driver that uses the network card’s built-in UNDI driver, so it’s very small and fast. If the built-in UNDI driver in the network card is not behaving well, FOG has the iPXE boot loaders that have all of the common network drivers built in: in this case ipxe.kpxe, and the (recommended for general use) ipxe.efi for UEFI systems. For BIOS mode you can typically use undionly.kpxe, but if that driver doesn’t work then the larger ipxe.kpxe boot loader will work.
https://forums.fogproject.org/topic/10160/virtualbox-pxe-boot-no-configuration-methods-succeeded
-
@george1421 I can’t thank you enough for all of your help. I’ll switch over to the ipxe.kpxe bootfile instead for general use from now on. Should I have any problems, I’ll definitely let you know. But for now, many many thanks.
-
Well, I spoke too soon. To more thoroughly test it, I’m trying it on a newer laptop (Latitude 5510) on the same subnet. After switching the bootfile-name and option 67 to ‘ipxe.kpxe’, it’s not working on the 5510 but it does work on the original old machine (Optiplex 990 MicroTower) that I used last night for testing. There is no displayed error on the 5510 screen. It starts to pxe boot, says it’s “downloading NBP file…” and then goes straight into the Dell Support Assist (where it goes if it doesn’t find a boot device).
In the 5510 BIOS, I’ve got Network UEFI stack w/ PXE boot enabled. I’ve tried it with Secure Boot on/off; UEFI Boot Path Security on/off; POST behaviour set to ‘thorough’ (suggestions from https://www.dell.com/support/article/en-us/sln317555/bios-settings-to-allow-pxe-boot-on-newer-model-dell-latitude-laptops?lang=en). It’s not using a docking station but rather using the on-board NIC so no need for the USB options in that post. Although, they are enabled by default in this BIOS. But looking at the 5510.pcap, I can see it finds the ipxe.kpxe bootfile. The 5510 just doesn’t do anything after that. That makes me think it’s specific to that machine NIC or BIOS, especially since it works on the Opti990. That being the case, I don’t know what my options are given that the majority of my inventory items are newer Dell laptops.
Here are the pcaps.
5510.pcap
original_old.pcap
Another thread with the same issue but different laptop model - https://forums.fogproject.org/topic/14685/dell-7000-series-laptops-pxe-booting. I reference it because he’s tried the same settings from that Dell KB website and has a video that shows the error.
And, of course, you already know this b/c you replied to the poster. Forgive my lack of sleep.
-
@jvenus Let’s be clear: ipxe.kpxe is only for fixing the booting issues with VirtualBox. In general use (physical machines) use undionly.kpxe for BIOS and ipxe.efi for UEFI systems. Understand these two files are only used to get into the FOG iPXE menu (period). Also understand that these boot loaders are firmware specific. You CANNOT boot a UEFI-based system with undionly.kpxe; conversely, you cannot boot a BIOS system with ipxe.efi. If you have both types of hardware in your environment you cannot use static DHCP settings. If you have a Windows 2012 or later DHCP server, or a Linux DHCP server, then you can set up dynamic PXE booting based on the target computer.
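The firmware rule can be written down as a small decision function. The Python sketch below assumes the DHCP server tells clients apart by the client architecture code in DHCP option 93 (RFC 4578: 0 is BIOS, 7 and 9 are 64-bit UEFI variants); that mechanism, the function name, and the is_virtualbox flag are assumptions for illustration, and only the file names come from this thread.

```python
BIOS_ARCH = 0           # Intel x86PC (legacy BIOS)
UEFI_ARCHES = {7, 9}    # EFI BC and EFI x86-64


def choose_boot_file(client_arch, is_virtualbox=False):
    """Pick the iPXE boot loader that matches the client's firmware type."""
    if client_arch in UEFI_ARCHES:
        return "ipxe.efi"
    if client_arch == BIOS_ARCH:
        # ipxe.kpxe carries its own NIC drivers; per the thread it is only
        # needed where the built-in UNDI driver misbehaves (VirtualBox VMs).
        return "ipxe.kpxe" if is_virtualbox else "undionly.kpxe"
    raise ValueError(f"unhandled client architecture code: {client_arch}")


print(choose_boot_file(0))                      # physical BIOS machine
print(choose_boot_file(7))                      # UEFI machine
print(choose_boot_file(0, is_virtualbox=True))  # BIOS-mode VirtualBox VM
```

A real DHCP server would express the same branching in its own policy or class syntax rather than in Python; the point is only that the file handed out has to depend on the firmware the client reports.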
Once imaging starts, FOS Linux takes over and iPXE is history. For this, make sure you have the 5.6.x series of kernels installed. Understand these are not the FOG server host OS kernels but the kernels sent to the target computer for imaging.