File Not Found
-
Is this “faulty” client the same hardware as all the others? Have you tried connecting it to a different port on the switch? Is PortFast set on the switch port? If none of this helps, it would be interesting to see a packet dump.
Open a root terminal/console on your FOG server and install the tcpdump package. Then issue the following command:
tcpdump -i eth0 -w file_not_found.pcap
Let this command just sit there and start up that client. After you see the error on the screen, go back to your server and stop tcpdump with Ctrl+C. Please upload the PCAP file to the forum.
-
What’s running DHCP? Any DHCP classes/policies?
-
Thanks for the replies guys, in answer to some questions:
Wayne: Options 66 and 67 in Windows DHCP, as per the book. No other policies in place.
Frank: The hardware is different in this one; no other machine on this site is the same. I’ve tried different ports on the switch, different switches, etc. I’ve got some 80 PCs registered in FOG from different points in the organisation, many booting fine from the Cisco switch that it’s plugged into. Spanning-tree PortFast is enabled, as is BPDU guard, etc. My laptop boots to FOG just fine in the same port, if that helps anything?
Attached are the tcpdump results: file_not_found.pcap
Thanks for the help so far.
-
@bennewell Thanks for uploading the packet dump. I am really glad you did. This seems to be a pretty special PC and I would never have thought about something like this happening (I have never seen this before).
First is a screenshot of TFTP transfer from one of your working clients (also in the packet dump):
The client asks for the file undionly.kpxe and gets an answer with the full file size (tsize). The client knows that it cannot receive a file that big in one go and aborts. It then requests the file again with the block size option to receive the file in chunks. Works like a charm.
Now let’s look at the “faulty” client trying to get the file:
Pretty similar at first. But on the second request it asks for a different file, \003, which does not make any sense to me. Looking at the DHCP packets leading up to the TFTP transfer, I can’t see a problem. It all looks pretty solid. The only thing I wonder about is that the “faulty” client seems to wait about 8 seconds between DHCP discover and DHCP request, although your DHCP server provides the offer and ack within milliseconds.
Has anyone else ever seen something like this? I kind of doubt that this particular NIC is incapable of booting via PXE properly. Searching the web didn’t reveal much so far, but I guess we will stumble upon a solution at some point.
Edit: Just found this thread, which I had forgotten about: https://forums.fogproject.org/topic/4528/1-2-0-upgrade-tftp-error-file-not-found (Interestingly, it seems to be a similar NIC, but probably not the same problem, as you are able to PXE boot other machines on that switch/port.)
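For anyone comparing the two requests in their own capture, here is a minimal Python sketch of how a TFTP read request (RRQ) with options is laid out on the wire: a 2-byte opcode, then NUL-terminated filename, mode, and option name/value pairs. The helper names `build_rrq` and `parse_rrq` and the block size value are assumptions made for illustration; they are not taken from the packet dump.

```python
import struct

RRQ = 1  # TFTP read-request opcode


def build_rrq(filename: bytes, mode: bytes = b"octet", options=None) -> bytes:
    """Build a read request: opcode, filename, mode, then option pairs."""
    packet = struct.pack("!H", RRQ) + filename + b"\x00" + mode + b"\x00"
    for name, value in (options or {}).items():
        packet += name.encode("ascii") + b"\x00"
        packet += str(value).encode("ascii") + b"\x00"
    return packet


def parse_rrq(packet: bytes):
    """Split a read request back into (filename, mode, options)."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != RRQ:
        raise ValueError("not a TFTP read request")
    fields = packet[2:].split(b"\x00")
    if fields and fields[-1] == b"":
        fields.pop()  # the final NUL terminator leaves one empty element
    if len(fields) < 2:
        raise ValueError("read request is missing filename or mode")
    filename, mode, rest = fields[0], fields[1], fields[2:]
    options = {
        rest[i].decode("ascii").lower(): rest[i + 1].decode("ascii")
        for i in range(0, len(rest) - 1, 2)
    }
    return filename, mode, options


# First request: the client asks only for the transfer size (tsize=0).
first = build_rrq(b"undionly.kpxe", options={"tsize": 0})
# Second request as seen from the faulty client: the filename is the
# single byte 0x03. The blksize value here is a hypothetical example.
second = build_rrq(b"\x03", options={"blksize": 1456})

print(parse_rrq(first))
print(parse_rrq(second))
```

Parsing the second packet makes it easy to see that the filename field itself holds the byte 0x03, rather than the display being a decoding quirk of the capture tool.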
-
That thread looks like a very similar issue, although this was a fresh 1.2 install rather than an upgrade.
The NIC is the same in this PC; it’s an HP machine with a Realtek GBE FE card onboard. I have other machines (different brand) with the same card (or at least, the same model number) that boot perfectly.
I’ve just tried another identical machine, and it’s got the same issue, so it’s a fault with that model of card, and not just this physical card specifically.
I’m all but stumped on this.
Thanks for the help so far.
-
Seems like PXE booting is not the only issue with Realtek PCIe FE NICs. Even if you can get it to load undionly.kpxe, you might still run into this: http://forum.ipxe.org/showthread.php?tid=7356
Quick fix: use undionly.kkpxe or ipxe.pxe, but you need to get iPXE to load first. You can try loading it from CD-ROM or USB key (https://rom-o-matic.eu/ -> output format iso/usb), or maybe you can try to burn iPXE into the NIC’s EEPROM (I’ve never done this myself and don’t know anything about it. From what I read, I think the PXE ROM is included in the BIOS if this is an onboard NIC, so maybe a BIOS update will help?)
-
Maybe this is of help to you: http://forum.ipxe.org/showthread.php?tid=7029
This is at your own risk! I haven’t done this myself and don’t know enough about flashing a patched BIOS. I won’t be liable if anything goes wrong!
-
@Uncle-Frank The error about the file being too big, aborting and asking again is how my environment works…
DHCP Option 003 is what defines the router. It’s also how you’d create vendor class filters in Windows Server 2008 and lower… wonder if someone fat fingered something somewhere?
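To illustrate where option 3 sits relative to options 66 and 67, here is a small Python sketch that walks a DHCP options field (code, length, value triples, ended by code 255). The function name `parse_dhcp_options` and all addresses in the sample are made up for illustration; this does not show what was actually in the capture, and it does not confirm that option 3 is the source of the \003 filename.

```python
import ipaddress


def parse_dhcp_options(raw: bytes) -> dict:
    """Walk a DHCP options field and return {option_code: value_bytes}."""
    options = {}
    i = 0
    while i < len(raw):
        code = raw[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option, single byte with no length
            i += 1
            continue
        length = raw[i + 1]
        options[code] = raw[i + 2 : i + 2 + length]
        i += 2 + length
    return options


def option(code: int, value: bytes) -> bytes:
    """Encode one option as code, length, value."""
    return bytes([code, len(value)]) + value


# Hypothetical sample: router (3), TFTP server name (66), bootfile name (67).
sample = (
    option(3, ipaddress.IPv4Address("192.168.1.1").packed)
    + option(66, b"192.168.1.10")
    + option(67, b"undionly.kpxe")
    + bytes([255])
)

opts = parse_dhcp_options(sample)
print("router:", ipaddress.IPv4Address(opts[3]))
print("tftp server:", opts[66].decode("ascii"))
print("bootfile:", opts[67].decode("ascii"))
```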
-
@bennewell Any progress on this? I am really keen to know what is causing this!!
-
Marking this solved here as the issue seems clear.
@bennewell Still, it would be great to hear if you found what is causing it and could write a short summary for other users.