Cannot PXE boot on Client PCs
-
@zguo Just to be clear, what device is your dhcp server? Is it the fog server or something else?
At this point FOG isn’t involved if FOG isn’t your dhcp server. The error message is telling you that the dhcp server isn’t answering your computer.
-
@george1421 Thanks for the reply! I only have one server, so that should be my DHCP server. I’ve set up FOG on this server. Both FOG and this server are assigned the same IP, which is the one you see in the screenshot. So what should I do to track down what is breaking connectivity from the client PC to the server? Is there anything else that I should set up?
-
@zguo In your screenshot you have listed the full path to the boot file. That should be only the boot file name, relative to the /tftpboot directory, which is the root directory as far as tftp is concerned.
I think you have something missing in your setup. You can use the FOG server or a witness (third computer) with wireshark installed to monitor the communications. If you use wireshark you can use a capture filter of
port 67 or port 68
Or follow the guide here to use the fog server to monitor the dhcp/pxeboot process: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue Then load the output file into wireshark.
So what you are looking for is the DORA process (DISCOVER, OFFER, REQUEST, ACK).
So the pxe booting computer will send out a DISCOVER packet (please configure me).
The dhcp server will respond with an OFFER packet that should contain the next-server and boot-file fields as well as dhcp options 66 and 67, which should match the next-server and boot-file fields. My guess is that there is something wrong with the OFFER packet or your dhcp server.
-
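To make the OFFER check above concrete, here is a minimal Python sketch that pulls the next-server and boot-file header fields plus options 66 and 67 out of a raw DHCP payload (the UDP payload of a packet from a capture). The function names and the sample packet builder are illustrative assumptions, not part of FOG or Wireshark; the byte offsets follow the standard BOOTP/DHCP layout.

```python
import socket

MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the DHCP options
DHCP_OFFER = 2                           # value of option 53 in an OFFER


def _text(raw: bytes) -> str:
    """Decode a NUL-padded ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", "replace")


def parse_boot_info(packet: bytes) -> dict:
    """Extract the PXE-relevant fields from a raw BOOTP/DHCP payload."""
    if len(packet) < 240 or packet[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    info = {
        "next_server": socket.inet_ntoa(packet[20:24]),  # siaddr header field
        "boot_file": _text(packet[108:236]),             # file header field
        "message_type": None,                            # option 53
        "option_66": None,                               # TFTP server name
        "option_67": None,                               # boot file name
    }
    pos = 240
    while pos < len(packet):
        code = packet[pos]
        if code == 255:          # end option
            break
        if code == 0:            # pad option has no length byte
            pos += 1
            continue
        if pos + 1 >= len(packet):
            break
        length = packet[pos + 1]
        value = packet[pos + 2:pos + 2 + length]
        if code == 53 and value:
            info["message_type"] = value[0]
        elif code == 66:
            info["option_66"] = _text(value)
        elif code == 67:
            info["option_67"] = _text(value)
        pos += 2 + length
    return info


def build_sample_offer(next_server: str, boot_file: str) -> bytes:
    """Build a synthetic OFFER payload for trying out the parser (hypothetical helper)."""
    header = bytearray(236)
    header[0] = 2                                        # op = BOOTREPLY
    header[20:24] = socket.inet_aton(next_server)
    name = boot_file.encode("ascii")
    header[108:108 + len(name)] = name
    options = bytes([53, 1, DHCP_OFFER])
    options += bytes([66, len(next_server)]) + next_server.encode("ascii")
    options += bytes([67, len(name)]) + name
    options += bytes([255])
    return bytes(header) + MAGIC_COOKIE + options


if __name__ == "__main__":
    offer = build_sample_offer("172.20.4.25", "undionly.kpxe")
    print(parse_boot_info(offer))
```

In a healthy setup the next_server value and option 66 both point at the FOG server, and boot_file and option 67 both carry the bare boot file name.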
@george1421 Thanks for your reply! I tried Wireshark and saw no packets on ports 67, 68, 69, or 4011. But if I tried any other random ports then I could see some packets. So I guess, like you said, maybe it’s because the boot file name in the router settings is not correct?
We are using the Sophos router, and their instructions say the boot file name field requires the full path to the boot file. So my question is: universally for FOG, should the boot file name always be a single file name, like undionly.kpxe or ipxe.efi? If not, then how do I set it up specifically on the Sophos router? Also, I saw plenty of boot files under the /tftpboot directory. Which boot file do you recommend I try?
Currently, my access to the router settings is restricted, so I may have to adjust the boot file name on Monday.
-
@zguo Not seeing packets regarding port 67 or 68 is suspicious, but also follows what you see on the pxe booting client. So I see a connection.
One other comment is that many soho routers (ones you might find in your home) do not support pxe booting properly. They work perfectly for dhcp but fail to send out the right boot file names and often define the router as the pxe boot server.
dhcp works by sending broadcast messages to all computers on the local subnet, so a computer running wireshark or tcpdump should record these packets. If you ran tcpdump on the fog server and it did not record any dhcp packets that means that something is blocking these broadcast messages from getting to your fog server. Understand at the moment your problem is your network infrastructure and not specifically with FOG.
I do have to say that if your fog server and pxe booting computer are on a different subnet than your dhcp server, you will need to configure your router to forward dhcp broadcast messages across the router (a dhcp relay), because that is not enabled by default.
Yes, only the file name and not the path with FOG (there is an exception if you have a 32-bit uefi computer, but I don’t think that applies here). The tftp server program uses the tftpboot directory as its root directory. This keeps the tftp service from being able to serve arbitrary files from your fog server. So it does a change root to /tftpboot. That is probably more than you care about at the moment. But in this case having a full path or not isn’t your current issue; getting the client and fog server to see the dhcp DISCOVER request is the first step.
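As a rough illustration of why the bare file name matters, the following Python sketch models how a requested name maps onto the real filesystem once the tftp service treats /tftpboot as its root. It is a simplified model under that assumption, not the actual tftpd-hpa logic, and the function name is made up for the example.

```python
import posixpath


def resolve_in_tftp_root(root: str, requested: str) -> str:
    """Map a requested TFTP file name to the real path, with `root` acting as "/"."""
    # Inside the change root, every request is interpreted relative to "/",
    # and ".." cannot climb above it.
    inside = posixpath.normpath("/" + requested.lstrip("/"))
    return root.rstrip("/") + inside


if __name__ == "__main__":
    # Bare file name: lands on the real boot file.
    print(resolve_in_tftp_root("/tftpboot", "undionly.kpxe"))
    # Full path: the directory name is applied twice, so the file is not found.
    print(resolve_in_tftp_root("/tftpboot", "/tftpboot/undionly.kpxe"))
```

With the full path configured in the dhcp server, the request ends up pointing at /tftpboot/tftpboot/undionly.kpxe, which does not exist on a standard FOG install.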
-
@george1421 I updated the boot file name. Now the DHCP should be okay. From the attached image, you can see the Client PC can see the DHCP IP address, but then there’s a TFTP server response timeout error. I still couldn’t see DHCP packets in either Wireshark or tcpdump, but now the client PC can see the DHCP IP. I am not sure if that’s okay? If not, then how can I configure DHCP packets on my network?
tftpd-hpa on my server is active and running, and ufw firewall is blocked, but I still cannot access TFTP from the client PC. What should I do in this case?
tftpd-hpa config:
# /etc/default/tftpd-hpa
# FOG Modified version
TFTP_USERNAME="root"
TFTP_DIRECTORY="/tftpboot"
TFTP_ADDRESS=":69"
TFTP_OPTIONS="-s"
netstat -antup | grep ":69"
udp 0 0 0.0.0.0:69 0.0.0.0:* 739/in.tftpd
udp6 0 0 :::69 :::* 739/in.tftpd
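One way to tell whether the tftp service itself answers, independent of what the dhcp server hands out, is to send it a read request from another machine on the same subnet. The sketch below is a minimal Python probe built on the standard TFTP packet formats; the function names are hypothetical, and the server address and file name in the usage line are taken from this thread as assumptions about the setup.

```python
import socket
import struct


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode 1, file name, NUL, mode, NUL."""
    return (struct.pack("!H", 1) + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def describe_reply(reply: bytes) -> str:
    """Summarise the first packet a TFTP server sends back."""
    if len(reply) < 4:
        return "malformed reply"
    opcode, number = struct.unpack("!HH", reply[:4])
    if opcode == 3:   # DATA: number is the block number
        return f"DATA block {number}, {len(reply) - 4} bytes"
    if opcode == 5:   # ERROR: number is the error code
        message = reply[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {number}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(server: str, filename: str, timeout: float = 3.0) -> str:
    """Send one read request and report the first reply (no full download)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            reply, _ = sock.recvfrom(2048)
        except socket.timeout:
            return "no reply (timeout)"
    return describe_reply(reply)


if __name__ == "__main__":
    # Assumed values: FOG server address and boot file name from this thread.
    print(probe_tftp("172.20.4.25", "undionly.kpxe"))
```

A DATA reply means the service is reachable and can read the file; a timeout here points at the network path or a firewall rather than at the boot file name. The probe never acknowledges the data, so the server simply gives up on the transfer after its own retries.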
-
@zguo Ok this is a good start. Look at the “server” ip address, that should be your FOG server’s IP address. I’m guessing it’s your router’s IP address. As I mentioned before most soho routers put themselves as the boot-server / next-server.
We have to be missing something here, because you should be able to see the dhcp process with both tcpdump and wireshark, because the pxe booting computer is now getting the info.
-
@george1421 I powered on another PC, and here the DHCP IP is 172.20.4.1. So this is like the previous one, with the “server” IP shown as 172.20.4.1?
I didn’t adjust the server IP address. It remains the same, it’s 172.20.4.25. Same with FOG and “next-server” in the router settings. Not sure why it shows 172.20.4.1. I guess maybe because the client PC cannot find the server, then it goes to the default gateway? Then back to the question, how can I let my client PC detect the server?
-
@george1421 Also, we confirmed that both the client and server are on the same subnet. Nothing is blocked between them
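For anyone wanting to double-check a same-subnet claim like this one, here is a small Python sketch using the standard ipaddress module. The /24 prefix length is an assumption (the thread never states the netmask), and the second address in the usage example is a made-up off-subnet host.

```python
import ipaddress


def same_subnet(server_cidr: str, other_ip: str) -> bool:
    """Return True if other_ip falls inside the server's network."""
    network = ipaddress.ip_interface(server_cidr).network
    return ipaddress.ip_address(other_ip) in network


if __name__ == "__main__":
    # FOG server from this thread, with an assumed /24 netmask.
    print(same_subnet("172.20.4.25/24", "172.20.4.1"))   # router / dhcp server
    print(same_subnet("172.20.4.25/24", "172.20.5.10"))  # hypothetical other subnet
```

If the netmask on the real interfaces differs from the assumed /24, substitute the actual prefix length before drawing conclusions.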
-
@zguo said in Cannot PXE boot on Client PCs:
Not sure why it shows 172.20.4.1. I guess maybe because the client PC cannot find the server, then it goes to the default gateway? Then back to the question, how can I let my client PC detect the server?
99% of the time, your dhcp server is not telling the pxe booting computer the right thing. And your dhcp server’s IP address is at .1
When you run wireshark, are you running it as administrator AND are you picking the wired network interface to monitor? It should see the dhcp broadcasts on any computer connected to the same subnet.
For wireshark, launch it as Administrator, then your wireshark startup screen should look like this before launching capture. If you don’t see any data or the ethernet port, then you did not run wireshark as admin. Make sure you enter a capture filter like this (1), and then select the network adapter where you see traffic (2).
-
@george1421 I followed your instructions and launched Wireshark as admin. Click this for my Wireshark file. The IP of the host running Wireshark is 172.20.4.54, and the FOG server IP is 172.20.4.25. I only see them communicating on port 80 (HTTP), not 67 or 68. If I run tcpdump on the FOG server, there’s more for port 22 (SSH), but still nothing for port 67 or 68. I don’t know why DHCP packets don’t exist. My router is Sophos. Anything could block the DHCP packets?
And there’s a very small bit of progress. I manually entered the command
chain http://172.20.4.25/fog/service/ipxe/boot.php
on the network boot page of the client host, so the client host can at least do the registration and inventory, and I can see the client host show up in the FOG web UI. But since the client host still cannot find the FOG server automatically through the boot file, I can’t capture the OS image.
So what is the very first thing I should do now? Figure out the network problem? And regarding the network, is that something I’m able to configure by myself, or do I have to contact the router provider?
Click here for the Sophos instructions. Their settings need both a DHCP server and a TFTP server. And it looks like the DHCP server defaults to the .1 IP. The IP that can be changed is the one for the “TFTP server”, which is the setup I don’t have. It seems like there’s a mismatch between the setup that Sophos expects and the one FOG expects. How can we figure out this mismatch? It seems like I have to make the FOG server the “TFTP server”, and then set up another one as the DHCP server?
-
@zguo said in Cannot PXE boot on Client PCs:
I don’t know why DHCP packets don’t exist. My router is Sophos. Anything could block the DHCP packets?
I agree looking at the pcap there are no broadcast messages period. That is abnormal for a busy network. I can’t understand how your network is working, but obviously it is because you have clients picking up dhcp based addresses, right?
When we have situations where soho routers are used, or where the routers are managed by a third party or can’t be changed, we would typically install dnsmasq on the FOG server to supply the pxe boot info. This normally solves the pxe boot issue, but your pxe booting clients are not even getting a dhcp address, and that happens before pxe booting in the startup process.
I have a tutorial for installing dnsmasq on your fog server in under 10 minutes, but my confidence level is at 40% that it will solve your issue. Once your pxe booting clients can get a dhcp address, dnsmasq should take care of sending the right boot file to your pxe booting computers. https://forums.fogproject.org/topic/12796/installing-dnsmasq-on-your-fog-server
This may be a question for your networking admins, but are you running dhcp snooping on your network? This would be a feature of your network switches to block and/or limit rogue dhcp servers on your network. I don’t know if that would restrict broadcast messages. You have something strange (something I haven’t seen before) going on in your environment.
-
@george1421 Bravo! dnsmasq works. Now I can capture the OS image. Thank you very much!
-
@george1421 I am running into a PXE boot problem on an ASUS motherboard, but other PCs work well for capturing and deploying the OS images. In the attached picture, you can see that the server IP is correct, along with the message “NBP file downloaded successfully”. But after that, it’s not booting into FOG but just booting from the hard drive. Secure boot is disabled. I found that this ASUS machine can only PXE boot with UEFI, not with legacy. Should I change something in dnsmasq (ltsp.conf in /etc/dnsmasq.d), or should I make other changes?
Edit: The main issue might be “NBP file size is 0 Bytes”, so it didn’t technically download the boot file.
-
@george1421 The size of ipxe.efi is 0 on the FOG server. So the boot screen on the ASUS is not wrong. Is that normal?
What I did with dnsmasq was just follow your instructions. In /etc/dnsmasq.d/ltsp.conf, only <fog_server_IP> was edited.
-
@zguo This issue is not related to dnsmasq. Something has zeroed out the byte size of that boot file. If you can, the quickest way to fix this is to just rerun the FOG installer; it will recreate/fix files that were changed since its last run. It will not delete anything you added or changed in the UI. It will not touch dnsmasq either.
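To see whether other boot files were zeroed out as well, a quick scan of the tftp directory can help. This is a minimal Python sketch with a made-up function name; it assumes the boot files live directly in /tftpboot, as in the tftpd-hpa config earlier in the thread.

```python
from pathlib import Path


def find_empty_files(directory: str) -> list:
    """Return the names of zero-byte regular files directly inside `directory`."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.stat().st_size == 0
    )


if __name__ == "__main__":
    # Assumed location of the FOG boot files.
    for name in find_empty_files("/tftpboot"):
        print("zero-byte boot file:", name)
```

Running it again after the installer finishes is a simple way to confirm the files were recreated with real content.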
-
@george1421 The attached image is from the other PC, which is luckily working. This one is using undionly.kpxe. It shows the iPXE boot screen that we are familiar with, and it’s the same as the one on iPXE’s official website. But the ASUS PXE boot screen looks so different that I have no idea.
Can you clarify what you mean by “rerun the FOG installer”? Do you mean reboot or reinstall FOG? I rebooted the FOG server, the ipxe.efi size is still 0, and the ASUS PC is still not booting into FOG.
-
@zguo How did you install fog on this server? Did you use the tarball file, or git and pull from the repo? Either way there should be a fogproject folder, and I think the bin/installfog.sh bash script in it is what you run to rerun the fog installer.
-
@george1421 Cool! After reinstalling FOG, ipxe.efi has a nonzero file size now, so the ASUS PC can PXE boot with ipxe.efi. Thank you very much!