File Not Found
Reasonably new install of FOG here; it’s fine with most machines and still works as normal. However, a certain HP tower with an onboard Realtek GbE network card is starting to get on my wick. When booting, I simply get a “file not found” error. It works fine on other machines, so I’m a bit reluctant to start changing things.
Is there anything obvious that I should be looking at?
Thanks in advance. Ben
Marking this solved here as the issue seems clear.
@bennewell Still, it would be great to hear if you found what is causing it and could write a short summary for other users.
@bennewell Any progress on this? I am really keen to know what is causing this!!
@Uncle-Frank The error about the file being too big, aborting and asking again is how my environment works… :-\
DHCP Option 003 is what defines the router. It’s also how you’d create vendor class filters in Windows Server 2008 and lower… wonder if someone fat fingered something somewhere?
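For reference, DHCP options travel as tag-length-value triples: tag 3 is the router, and tags 66 and 67 carry the TFTP server name and the boot file name. Whether a stray option 3 has anything to do with the \003 file request is only a guess. The following is a minimal Python sketch of how such an option field is read; the function name and the sample addresses are made up for illustration and do not come from the packet dump.

```python
OPTION_NAMES = {3: "router", 66: "tftp-server-name", 67: "bootfile-name"}

def parse_dhcp_options(data):
    """Walk the tag-length-value option field of a DHCP packet (RFC 2132)."""
    options = {}
    i = 0
    while i < len(data):
        tag = data[i]
        if tag == 255:      # end option
            break
        if tag == 0:        # pad option has no length byte
            i += 1
            continue
        length = data[i + 1]
        options[tag] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options

# Hypothetical option field; the addresses are made up for illustration.
sample = (
    bytes([3, 4, 192, 168, 1, 1])
    + bytes([66, 11]) + b"192.168.1.5"
    + bytes([67, 13]) + b"undionly.kpxe"
    + bytes([255])
)

for tag, value in parse_dhcp_options(sample).items():
    print(tag, OPTION_NAMES.get(tag, "unknown"), value)
```

In a real capture the same fields can be inspected in the DHCP offer and ack packets to confirm that options 66 and 67 hold what the server was configured to send.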
Maybe this is of help to you: http://forum.ipxe.org/showthread.php?tid=7029
This is at your own risk! I haven’t done this myself and don’t know enough about flashing a patched BIOS. I won’t be liable if anything goes wrong!
Seems like PXE booting is not the only issue with Realtek PCIe FE NICs. Even if you can get it to load undionly.kpxe you might still run into this: http://forum.ipxe.org/showthread.php?tid=7356
Quick fix: use undionly.kkpxe or ipxe.pxe, but you need to get iPXE to load first. You can try loading it from CD-ROM or USB key (https://rom-o-matic.eu/ -> output format iso/usb), or maybe you can try to burn iPXE into the NIC’s EEPROM (I’ve never done this myself and don’t know anything about it; from what I read, I think the PXE ROM is included in the BIOS if this is an onboard NIC, so maybe a BIOS update will help?)
That thread looks to be about a very similar issue, although this was a fresh 1.2 install rather than an upgrade.
The NIC is the same in this PC; it’s an HP machine with a Realtek GBE FE card onboard. I have other machines (different brand) with the same card (or at least, the same model number) that boot perfectly.
I’ve just tried another identical machine, and it’s got the same issue, so it’s a fault with that model of card, and not just this physical card specifically.
I’m all but stumped on this.
Thanks for the help so far.
@bennewell Thanks for uploading the packet dump. I am really glad you did. This seems to be a pretty special PC and I would never have thought of something like this happening (never saw this before).
First is a screenshot of TFTP transfer from one of your working clients (also in the packet dump):
The client asks for the file undionly.kpxe and gets an answer with the full file size (tsize). The client knows that it cannot receive a file that big in one go and aborts. It then requests the file again with the block size option to receive the file in chunks. Works like a charm.
Now let’s look at the “faulty” client trying to get the file:
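The two-step exchange described above matches the TFTP option extension (RFC 2347, with tsize from RFC 2349 and blksize from RFC 2348). As a rough sketch of what the two read requests look like on the wire, assuming a helper named build_rrq and a block size of 1456 (both are illustrative and not taken from the packet dump):

```python
import struct

OPCODE_RRQ = 1  # TFTP read request (RFC 1350)

def build_rrq(filename, mode="octet", options=None):
    """Build a TFTP read request, optionally with RFC 2347 options."""
    packet = struct.pack("!H", OPCODE_RRQ)
    packet += filename.encode("ascii") + b"\x00"
    packet += mode.encode("ascii") + b"\x00"
    for name, value in (options or {}).items():
        packet += name.encode("ascii") + b"\x00"
        packet += str(value).encode("ascii") + b"\x00"
    return packet

# First request: tsize=0 asks the server to report the file size.
size_probe = build_rrq("undionly.kpxe", options={"tsize": 0})

# Second request: same file, this time asking for a specific block size.
# 1456 is an illustrative value, not taken from the capture.
transfer = build_rrq("undionly.kpxe", options={"blksize": 1456})

print(size_probe)
print(transfer)
```

The point to note is that the file name is the first NUL-terminated string after the two-byte opcode in both requests, so it should be identical in the first and second request.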
Pretty similar at first. But on the second request it asks for a different file, \003, which does not make any sense to me. Looking at the DHCP packets leading up to the TFTP transfer, I can’t see a problem. It all looks pretty solid. The only thing I wonder about is that the “faulty” client seems to wait for about 8 seconds between DHCP discover and DHCP request, although your DHCP server provides offer and ack within milliseconds.
Has anyone else ever seen something like this? I kind of doubt that this particular NIC is incapable of booting via PXE properly. Searching the web didn’t reveal much so far, but I guess we will stumble upon a solution at some point.
Edit: Just found this, which I had forgotten about: https://forums.fogproject.org/topic/4528/1-2-0-upgrade-tftp-error-file-not-found (Interestingly it seems to be a similar NIC, but probably not the same problem, as you are able to PXE boot other machines on that switch/port)
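To make the odd second request concrete, here is a small Python sketch that splits a TFTP read request into file name, mode and options. The function name parse_rrq and both payloads are hypothetical; the bytes are written by hand to illustrate the idea and are not copied from the capture.

```python
import struct

def parse_rrq(payload):
    """Split a TFTP read request into filename, mode and options."""
    (opcode,) = struct.unpack("!H", payload[:2])
    if opcode != 1:
        raise ValueError("not a read request")
    fields = payload[2:].split(b"\x00")
    # A well-formed request ends with a NUL, so the last split item is empty.
    if fields and fields[-1] == b"":
        fields.pop()
    filename, mode, *rest = fields
    options = dict(zip(rest[0::2], rest[1::2]))
    return filename, mode, options

# Hypothetical payloads for illustration; these are not bytes from the capture.
good = b"\x00\x01undionly.kpxe\x00octet\x00blksize\x001456\x00"
bad = b"\x00\x01\x03\x00octet\x00blksize\x001456\x00"

print(parse_rrq(good)[0])
print(parse_rrq(bad)[0])
```

A request whose file name field holds the single byte 0x03 would be shown as \003 by a packet analyser, and the server would answer with “file not found” because no such file exists.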
Thanks for the replies guys, in answer to some questions:
Wayne: Options 66 & 67 in Windows DHCP as per the book. No other policies in place.
Frank: The hardware is different in this one; no others the same on this site. I’ve tried different ports on the switch, different switches etc. I’ve got some 80 PCs registered in FOG from different points in the organisation, many booting fine from the Cisco switch that it’s plugged into. Spanning-tree portfast is enabled, as is BPDU guard etc. My laptop boots to FOG just fine in the same port, if that helps anything?
Attached are the tcpdump results: file_not_found.pcap
Thanks for the help so far.
What’s running DHCP? Any DHCP Classes/policies ?
Is this “faulty” client the same hardware as all the others? Have you tried connecting it to a different port on the switch? Portfast setting on the switch port? If none of this helps, then it would be interesting to see a packet dump.
Open a root terminal/console on your FOG server and install the tcpdump package. Then issue the following command:
tcpdump -i eth0 -w file_not_found.pcap
Let this command just sit there and start up that client. After you see the error on the screen, go back to your server and stop tcpdump with Ctrl+C. Please upload the PCAP file to the forum.
Just tried the realtek one, same error; please see the attached photo. Other machines boot fine. This one has network access when booted and is plugged into the same switch too.
At what point do you get the error? Can you take a video showing the error?
I’m going to assume you are currently booting undionly.kpxe, so I would first try booting realtek.kpxe and seeing if that makes a difference.