NBP filesize is 0 Bytes; PXE-E18: Server response timeout
-
@foggymind OK, with the fog server and clients on the same subnet we can get the fog server to listen in on the dhcp process. https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue
This will capture the pxe booting process. Once captured, you can load it into wireshark or upload it so one of us can look at it. What you want to look for is multiple OFFER packets. You should have one from each DHCP server that hears the client’s DISCOVER packet. Look to see if you have more OFFER packets than known dhcp servers.
Also look into each OFFER packet; the bootp settings should be in the bootp header. You should see a {next-server}, which should point to your fog server’s IP, as well as a {boot-file}, which should be the name of the boot loader (undionly.kpxe for bios and ipxe.efi for uefi systems). Then down in the dhcp options make sure 66 is the IP address of the fog server and 67 is the proper boot loader. Do this for every OFFER packet, making sure they ALL point to the same settings.
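To make the checks above concrete, here is a rough Python sketch that pulls the same fields out of a raw DHCP payload (the UDP data of one packet from the capture). The field offsets come from the fixed BOOTP header layout; the function names and the idea of feeding it one payload at a time are my own assumptions, not part of FOG or wireshark.

```python
import socket

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
MESSAGE_TYPES = {1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 5: "ACK", 6: "NAK"}


def parse_dhcp_payload(payload: bytes) -> dict:
    """Extract the PXE-relevant fields from a raw BOOTP/DHCP UDP payload."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")

    info = {
        # siaddr (bytes 20-23) is the {next-server} value in the bootp header
        "next_server": socket.inet_ntoa(payload[20:24]),
        # file (bytes 108-235) is the {boot-file} value, NUL padded
        "boot_file": payload[108:236].split(b"\x00", 1)[0].decode("ascii", "replace"),
        "message_type": None,
        "server_id": None,
        "option_66": None,
        "option_67": None,
    }

    i = 240  # options start right after the magic cookie
    while i < len(payload):
        code = payload[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option, no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break  # truncated option
        length = payload[i + 1]
        value = payload[i + 2 : i + 2 + length]
        i += 2 + length
        if code == 53 and length == 1:
            info["message_type"] = MESSAGE_TYPES.get(value[0], str(value[0]))
        elif code == 54 and length == 4:
            info["server_id"] = socket.inet_ntoa(value)
        elif code == 66:
            info["option_66"] = value.rstrip(b"\x00").decode("ascii", "replace")
        elif code == 67:
            info["option_67"] = value.rstrip(b"\x00").decode("ascii", "replace")
    return info


def mismatched_offers(offers: list, fog_ip: str, boot_file: str) -> list:
    """Return the server IDs of OFFERs that do not point at the fog server."""
    bad = []
    for offer in offers:
        if offer["message_type"] != "OFFER":
            continue
        if (
            offer["next_server"] != fog_ip
            or offer["boot_file"] != boot_file
            or offer["option_66"] != fog_ip
            or offer["option_67"] != boot_file
        ):
            bad.append(offer["server_id"])
    return bad
```

If `mismatched_offers` returns anything, or if you see more distinct `server_id` values than you have known dhcp servers, that is the packet to look at first. Note the comparison is strict: a dhcp server that only sets the bootp header fields and not options 66/67 (or the other way round) will be flagged too, so treat a hit as "go look at this OFFER", not as proof of a rogue server.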
-
@foggymind Did some more testing and legacy PXE boot doesn’t work either, with error E-23. So I think it might be something related to the TFTP service. Not sure how to verify the status but I’ll do some digging to see what I can find.
-
@foggymind IMO the first step is to ensure that the clients are being told the proper information to boot. So wireshark/tcpdump is the first step. If the pxe clients are being told the right thing then focus on the tftp server or something getting in the way of tftp booting.
What devices are your dhcp servers? Are they windows dhcp servers?
I have seen where someone plugged a home wifi router into a company business network, and it caused this kind of random pxe boot behavior because it was giving out bad data. SOHO routers will often give their LAN IP address out as the {next-server} value. Depending on which DHCP server answered the client first (rogue dhcp or company dhcp), it caused pxe booting issues.
-
@george1421 They’re Windows servers but I confirmed my suspicions because it turns out the TFTP service wasn’t running for some reason. It’s been restarted now so I’m going to test again. I’ll let you know how it goes. Thanks for the input though, it’s been very helpful!
-
@foggymind said in NBP filesize is 0 Bytes; PXE-E18: Server response timeout:
Just started randomly receiving this error while attempting to UEFI boot
We probably focused too much on the word “random” here… If you had told us this stopped working altogether, then we might have considered the TFTP service to be an issue, but not if it’s random or if legacy BIOS clients still work. Anyhow. Let us know if you need further help with this.
-
@george1421 Thanks! The TFTP service definitely wasn’t running so that was an issue, but it is now and as a result of attempting to capture the traffic it seems as though the FOG server isn’t receiving any TFTP requests. I’m at a loss because I didn’t change any of the DHCP settings and confirmed they’re correct. I did follow this guide to set it up a while back: https://wiki.fogproject.org/wiki/index.php/BIOS_and_UEFI_Co-Existence but it had been working fine until now.
-
@sebastian-roth Thanks for the input, yeah I realize that you could interpret that as randomly occurring - should have rephrased it! Looks like the fog server isn’t receiving the TFTP requests for some reason.
-
It could be interpreted that way because you said this:
Just started randomly receiving this error while attempting to UEFI boot
That is where we made a left turn in the discussion (right out of the gate).
Is the remote pxe booting client on the same subnet or beyond a router? If beyond a router, what is the MTU of the link?
-
@foggymind said in NBP filesize is 0 Bytes; PXE-E18: Server response timeout:
Looks like the fog server isn’t receiving the TFTP requests for some reason.
Either a firewall is in the way (though then it would either work or not work at all), or some random DHCP server in your network is answering as well and leading the PXE booting machines to a different TFTP server address.
Did you attempt to capture DHCP traffic in your network to see what’s going on?
-
@sebastian-roth Yeah that could definitely be it, I haven’t captured the DHCP traffic, is that something I would use wireshark for?
-
@foggymind Oh yeah, one more thing I just remembered. I noticed that ALL the hosts show up as a red unknown status in the fog interface. They used to show a green status if they were on. I know that they are on but they still show that status. I’m not sure if that happened at the same time, but it definitely wasn’t like that before. The weird thing is that if I apply a computer name change, the fog client picks it up and applies the change.
-
@foggymind I did this (below) and captured via wireshark but it doesn’t seem to show anything interesting, just the DHCP server that’s serving the IP address. Is there something in particular I should be looking for?
-
@foggymind You need to capture the pxe booting process, not the dhcp release/renew. The release/renew is handled by windows, whereas the pxe booting process is handled by the uefi/pxe rom. The process is to start wireshark with a capture filter of
port 67 or port 68
to capture the dhcp process. If your fog server and target computers are on the same subnet then you can use tcpdump on the fog server to get the best details into what is not working.
Again, are the clients that you are trying to pxe boot beyond a router? I have seen (and again just recently) that if the MTU of the link is set below the block size for tftp, the file will not transfer because the packets will be fragmented. tftp doesn’t handle fragmented packets. So everything will look like it’s working up to the point where it tries to send the file to the target computer. If everything is on a LAN and your MTU is the default of 1500 then we should look elsewhere.
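The MTU point is just arithmetic, so here is a small Python sketch of it. It assumes plain IPv4 with no IP options (20 byte header), an 8 byte UDP header and the 4 byte tftp DATA header; the default tftp block size is 512 bytes, and a larger one is only used if client and server negotiate the blksize option. The MTU values in the usage note are made-up examples, not numbers from this thread.

```python
IPV4_HEADER = 20       # bytes, assuming no IP options
UDP_HEADER = 8         # bytes
TFTP_DATA_HEADER = 4   # 2 byte opcode + 2 byte block number


def tftp_data_packet_size(blksize: int = 512) -> int:
    """Size of the IP packet that carries one full tftp DATA block."""
    return IPV4_HEADER + UDP_HEADER + TFTP_DATA_HEADER + blksize


def will_fragment(mtu: int, blksize: int = 512) -> bool:
    """True if a full DATA block does not fit into one packet on this link."""
    return tftp_data_packet_size(blksize) > mtu


def max_blksize_for_mtu(mtu: int) -> int:
    """Largest block size that still fits into a single unfragmented packet."""
    return mtu - IPV4_HEADER - UDP_HEADER - TFTP_DATA_HEADER
```

For example, `will_fragment(1500, 1468)` is false because 1468 + 32 is exactly 1500, while `will_fragment(1400, 1468)` is true, which is the situation where everything looks fine until the file transfer starts.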
-
I’m taking another look at this and basically have no idea where to go from here, any help would be greatly appreciated!
-
@foggymind Another note, I seem to be able to rename systems so communication with the fog client seems to be working.
-
@george1421 I ran tcpdump on the fog server and got the following results.
-
@foggymind What was your capture filter on the FOG server? Was it according to the tutorial? https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue
From the screen shot it looks strange in that there was a discover, then 3 offers from the dhcp server, and then after 4 seconds the client sends the request. What, is the target computer playing hard to get?
I would expect to see after that ACK, a query from the target computer to the FOG server asking for the file on udp port 69.
Make sure you have the proper capture filter. Once you have a good pcap upload it to a file share site and post the link here or IM me the link. I need to look into each packet to find out what is going wrong. From the picture it should be working.
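If you want to check the udp port 69 part separately from the pxe client, a tftp read request is simple enough to send by hand. This is only a rough Python sketch: the packet layout is the standard tftp read request (opcode 1, file name, transfer mode), but the function names are mine, and the server address you pass in is whatever your fog server’s IP is. It never acknowledges the first block, so the server will just retry a few times and give up, which is harmless.

```python
import socket
import struct


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a tftp read request: opcode 1, file name, NUL, mode, NUL."""
    return (
        struct.pack("!H", 1)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def describe_reply(packet: bytes) -> str:
    """Turn the first reply from the tftp server into a readable line."""
    if len(packet) < 4:
        return "malformed reply"
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == 3:  # DATA: value is the block number
        return f"DATA block {value}, {len(packet) - 4} bytes"
    if opcode == 5:  # ERROR: value is the error code
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {value}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(server: str, filename: str, timeout: float = 3.0) -> str:
    """Send one read request to udp port 69 and report what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            reply, _ = sock.recvfrom(2048)
        except socket.timeout:
            return "no reply (service down, firewall, or wrong address?)"
    return describe_reply(reply)
```

Calling `probe_tftp("<fog server IP>", "undionly.kpxe")` from another machine on the same subnet should come back with a DATA block if the tftp service is reachable. A "no reply" from a machine on the client’s subnet, while the request also never shows up in tcpdump on the fog server, points at something in between rather than at the dhcp settings.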
-
I am wondering why we see three offers all coming from the very same IP. Is this the way Windows DHCP servers do this when they are in a sync pool?
@foggymind Would be helpful if you could save and upload a PCAP file from Wireshark so we can have a deeper look as well to be able to help. If you don’t want to post this to the public then send George and/or me a private message in the forum.
-
@george1421
Thanks for the quick reply!
“From the screen shot it looks strange in that there was a discover, then 3 offers from the dhcp server, and then after 4 seconds the client sends the request. What, is the target computer playing hard to get?”
Not sure!
Yes, I followed that tutorial, is there something else I need to do with the filters or just type in the command?