Unable to boot to Fog
-
Server
- FOG Version: 1.3.0-RC-23
- OS: Debian
Client
- Service Version: 6017
- OS: Windows 10 Pro
Description
About a week ago we updated from SVN. After the update we started getting a chainloader error. To keep our users from having to hit Esc or unplug the network cable, I removed the DHCP entry on our DHCP servers. After selecting options 066 and 067 in our DHCP, FOG now does not come up on our systems at all. It errors out with PXE-e53 No Boot File Found. Our original boot image was ipxe.efi, but in the forums I have seen that the boot image is undionly.kpxe. Can you clarify this as well?
-
No boot file found means it really couldn’t find the file.
Turn off your Windows Firewall temporarily on your desktop and try:
tftp -i x.x.x.x get undionly.kpxe
where x.x.x.x is your fog server IP. There’s an entire article on troubleshooting TFTP here:
https://wiki.fogproject.org/wiki/index.php?title=Troubleshoot_TFTP
-
Please update to RC-25 first.
From what I’m reading, you removed the DHCP entry, so DHCP isn’t handing out the information the hosts need to PXE boot properly. You should re-add this entry.
If your systems are using EFI/UEFI firmware you will need to use EFI/UEFI binaries (ipxe.efi, snp.efi, etc…) unless you have your NICs set to boot in legacy mode.
If your systems are booting using Legacy/CSM/BIOS booting modes, you will need to use any of the *pxe labelled files (ipxe.pxe, undionly.kpxe, undionly.kkpxe, etc…).
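The rule above can be sketched as a small lookup. This is only an illustration, not FOG code: it assumes the client announces itself with the usual PXE vendor class string (DHCP option 60, of the form PXEClient:Arch:NNNNN:UNDI:...), and the mapping table and function names are made up for the example. Only the two file names come from this thread.

```python
import re

# Hypothetical mapping from PXE client architecture code to boot file.
# 0 is legacy BIOS; 7 and 9 are the codes commonly seen from 64-bit UEFI.
# Other architectures are deliberately left out of this sketch.
BOOT_FILE_BY_ARCH = {
    0: "undionly.kpxe",
    7: "ipxe.efi",
    9: "ipxe.efi",
}


def arch_from_vendor_class(vendor_class: str) -> int:
    """Extract the architecture code from a PXE vendor class string."""
    match = re.match(r"PXEClient:Arch:(\d{5})", vendor_class)
    if match is None:
        raise ValueError(f"not a PXE vendor class: {vendor_class!r}")
    return int(match.group(1))


def boot_file_for(vendor_class: str) -> str:
    """Return the boot file a client with this vendor class should get."""
    arch = arch_from_vendor_class(vendor_class)
    if arch not in BOOT_FILE_BY_ARCH:
        raise ValueError(f"no boot file mapped for architecture {arch}")
    return BOOT_FILE_BY_ARCH[arch]
```

A client booting in legacy mode would typically report Arch:00000 and so get undionly.kpxe, while a UEFI client would get ipxe.efi.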
Please update and maybe describe more what issue you’re seeing.
-
Updated to RC-25, which also required a schema update. The BIOS is currently set to legacy boot. Re-added the entries into DHCP. (My test machines in my office are a Dell Optiplex 990 and an HP EliteDesk 880)
I also ran tftp -i 10.0.0.17 get for both undionly.kpxe and ipxe.efi, which appears to have been successful.
Unfortunately, neither of these steps has corrected the issue: when the system boots it still goes to PXE-e53 - No boot file found.
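For anyone who wants to repeat the TFTP check without the Windows tftp client, here is a rough sketch of the same test in Python. It only asks for the first block of the file and then aborts the transfer; the function names are invented for this example, and it is a probe, not a full TFTP client.

```python
import socket
import struct
from typing import Tuple

OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5  # TFTP opcodes


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request; 'octet' is binary mode, like tftp -i."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def parse_reply(packet: bytes) -> Tuple[bool, str]:
    """Interpret the server's first reply to a read request."""
    opcode = struct.unpack("!H", packet[:2])[0]
    if opcode == OP_DATA:
        return True, "first data block received"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return False, message
    return False, f"unexpected opcode {opcode}"


def probe(server: str, filename: str, timeout: float = 3.0) -> Tuple[bool, str]:
    """Ask the TFTP server for a file and report whether it starts sending it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            packet, peer = sock.recvfrom(4 + 512)
        except socket.timeout:
            return False, "no reply (timeout)"
        ok, detail = parse_reply(packet)
        if ok:
            # Send a TFTP ERROR packet so the server stops the transfer.
            sock.sendto(struct.pack("!HH", OP_ERROR, 0) + b"probe done\x00", peer)
        return ok, detail
```

Calling probe("10.0.0.17", "undionly.kpxe") would mirror the command above. A successful result only shows that the server can hand out the file; the client still has to be told by DHCP where to ask and what to ask for.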
-
@SharonLampe Which file is currently set in the DHCP options?
Legacy BIOS can’t boot .efi files and UEFI BIOS can’t boot anything BUT .efi files.
-
You should do a brief Wireshark capture using the bootp filter to record the network booting process. Start Wireshark with that filter on one computer, then try to network boot another computer. Save the capture file. You can look through it yourself or we can look through it. What we will look for is which machines identify themselves as DHCP servers, and whether those servers hand out the correct information in their responses.
-
Not to send you in two different directions at once, but if your fog server, dhcp server, and pxe booting clients are on the same subnet I would suggest installing tcpdump on your FOG server and then do the following.
- Start tcpdump on the console of the FOG server, running as root, with this command
tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4100
- PXE boot the client until you get the error
- Press control-c to exit the tcpdump program.
- You can either review the output.pcap file with wireshark or post it here and we can/will look at it.
The capture will show whether, and why, you are getting the file not found error.
The reason why I like the tcpdump approach is that once the dhcp part of the conversation is complete, we will never see any unicast messages between the fog server and the target computer if you use an external (wireshark) monitor. Running the tcpdump program on the FOG server gives us some insight on what the FOG server is being asked to do (i.e. send the iPXE kernel file).
-
@george1421 I have the file on my Linux machine. I am not a Linux guru so I am not sure how to pull the file from Linux to get it to you. The Linux server is not GUI based. How would I retrieve these files to send them to you?
-
@SharonLampe I use pscp, which is a putty component. But I’m a bit command line oriented. While I haven’t used it, WinSCP should also work to copy the file from your linux server to your windows computer. From there you should be able to upload the pcap file to the FOG Forum servers.
-
@george1421 Attached is the file. Thank you so much for your help! 0_1479855472664_output.pcap
-
@SharonLampe well this one is pretty interesting cause I’m seeing double vision on this one.
A quick diagram would be this.
- Packets 31 to 35 are the same client saying hello I’m here world I need configuring. This is remarkable since this should only happen in one packet. It’s the client sending out 4 dhcp discover requests rapid fire.
- I can see from this discover packet that this is a Dell computer running in bios mode, so undionly.kpxe needs to be sent for dhcp option 67 {boot-file}. As a wild guess, it is one of the older Dells (pre o7010).
- Packets 35 to 50 are replies from your dhcp server. This again is remarkable since there should only be one reply from your dhcp server. Looking at the time codes I can see these are in 3 groups of 4 responses each (which seems to align with the 4 packets sent in the first step).
- Looking at the dhcp server reply I can see that dhcp server 10.0.0.8 is replying to your client. The address it is offering the client is 10.0.20.92, with a next server of 10.0.0.8 (your dhcp server!! which should be your FOG server if you want pxe booting to work). The subnet mask is 255.255.0.0 so both the dhcp server and the target computer are in the same subnet (this is good). Your upstream router is 10.0.0.1 (not really relevant).
- In that same conversation ID you have a second dhcp server, 10.0.0.9, offering the target computer the address 10.0.21.9 (two servers answering is not typically a good thing). It’s telling the target computer that its next server (dhcp option 66) is itself, 10.0.0.9 (still not good for pxe booting, this next server must be the FOG server).
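The same-subnet observation in the list above is easy to double check with Python’s standard ipaddress module. This is just a quick sketch using the addresses quoted from the capture; the helper name is made up.

```python
import ipaddress


def same_subnet(host_a: str, host_b: str, netmask: str) -> bool:
    """Return True if both hosts fall inside the network host_a/netmask."""
    network = ipaddress.ip_network(f"{host_a}/{netmask}", strict=False)
    return ipaddress.ip_address(host_b) in network


# Offered client address versus each dhcp server, with the mask from the offer.
print(same_subnet("10.0.20.92", "10.0.0.8", "255.255.0.0"))
print(same_subnet("10.0.21.9", "10.0.0.9", "255.255.0.0"))
```

With a 255.255.0.0 mask everything in 10.0.x.x is one broadcast domain, which is why the client hears both dhcp servers directly.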
OK from here I’ve seen enough to know why it’s not working.
Let’s identify what servers 10.0.0.8 and 10.0.0.9 are. Once that is figured out, we need to ensure they have dhcp option 66 {next-server} (should be your FOG server’s IP address) and dhcp option 67 {boot-file} (should be undionly.kpxe for bios based computers) set. Right now they are not sending the client the proper information to boot via pxe.
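To make the check concrete, here is a small sketch that compares what each dhcp server offered against what a pxe client needs. The Offer class and function name are invented for this example. The FOG server address 10.0.0.17 is an assumption taken from the tftp command earlier in the thread, and the boot-file values are left unset because they are not quoted from the capture.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    dhcp_server: str
    offered_ip: str
    next_server: str
    boot_file: Optional[str] = None  # None means "not recorded", not "empty"


def offer_problems(offer: Offer, fog_server: str, boot_file: str) -> List[str]:
    """List the ways an offer differs from what pxe booting to FOG needs."""
    problems = []
    if offer.next_server != fog_server:
        problems.append(
            f"{offer.dhcp_server}: next-server is {offer.next_server}, "
            f"expected {fog_server}"
        )
    # Only judge the boot file when its value is actually known.
    if offer.boot_file is not None and offer.boot_file != boot_file:
        problems.append(
            f"{offer.dhcp_server}: boot-file is {offer.boot_file!r}, "
            f"expected {boot_file!r}"
        )
    return problems


# The two offers described in the capture analysis.
OFFERS = [
    Offer("10.0.0.8", "10.0.20.92", "10.0.0.8"),
    Offer("10.0.0.9", "10.0.21.9", "10.0.0.9"),
]
```

Running offer_problems over OFFERS with fog_server="10.0.0.17" and boot_file="undionly.kpxe" flags the next-server value on both servers, which matches the diagnosis.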
And while this is only a suspicion, I question why the target computer is sending out multiple dhcp discover requests. This is either because of a network misconfiguration, or something is not right with that target computer. There should be just one discover, then an offer from the dhcp server, then the target computer requesting the full list of parameters it needs, and finally the dhcp server acknowledging (OK got it). That should be 4 packets in total, not 30 ish.