When the DHCP-PXE booting process goes bad and you have no clue

  • Moderator

    Sometimes we can’t understand why the PXE booting process is going sideways, even after we’ve checked the common causes. In that case the easiest solution is to set up the FOG server to eavesdrop on the DHCP exchange between the DHCP server and the PXE booting client.

    This process works best if the FOG server, DHCP server, and PXE booting client are all in the same broadcast domain (IP subnet, network, VLAN). It will also work if only your FOG server and PXE booting client are on the same subnet; it’s just not as clean a trace. The last option is to use a second computer running Wireshark on the same subnet as the PXE booting computer. Setting up the Wireshark filters is similar to setting up the tcpdump filters, but that is a bit beyond the scope of this tutorial.

    This is going to be a pretty low-impact test. We just want to capture a packet trace of the PXE booting process up to the point of the error.

    First, a little background. The DHCP protocol is broadcast based. That means that during a PXE boot the discover, offer, request and ack messages are typically all sent as broadcasts (because the client doesn’t have an IP address during this process). Knowing this, we can eavesdrop on the communication between the DHCP server and the PXE client from the FOG server, as long as all three are in the same broadcast domain (subnet, VLAN, etc.).

    So what we need to do is this:

    1. Install tcpdump on your FOG server from your Linux distribution’s repository.
    2. Start tcpdump on the FOG server’s Linux console with this command: tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4011
    3. PXE boot the target computer until you see the error or the FOG iPXE menu.
    4. Wait about 5 seconds, then hit Ctrl-C on the FOG server’s Linux console.
    5. You can review the pcap with Wireshark or upload it to a developer/moderator for their review.
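    The capture and review steps above can be sketched as a pair of shell functions. This is only a sketch: the function names and the FILTER and OUTFILE variables are made up for illustration, and it assumes tcpdump is already installed and that you run it as root.

```shell
#!/bin/sh
# Illustrative sketch of steps 2-5; the names below are not part of FOG.
FILTER="port 67 or port 68 or port 69 or port 4011"
OUTFILE="output.pcap"

# Step 2: start the capture (normally needs root).
# Stop it with Ctrl-C once the client shows the error or the iPXE menu (step 4).
start_capture() {
    tcpdump -w "$OUTFILE" "$FILTER"
}

# Step 5: print a summary of each captured packet,
# with -nn so addresses and ports are shown as numbers.
review_capture() {
    tcpdump -nn -r "$OUTFILE"
}
```

    If you source this file into your shell session, run start_capture, PXE boot the client, hit Ctrl-C, and then run review_capture for a quick look before opening the pcap in Wireshark.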

    Just a quick sidebar: we are telling tcpdump to write the output of the packet capture to output.pcap. We have also set up some filters, because we only care about DHCP (ports 67 and 68), TFTP (port 69), and ProxyDHCP (port 4011). One thing you should do is keep the time between starting tcpdump and starting the PXE boot on the client as short as possible, because on a busy DHCP network we may otherwise key in on the wrong DHCP boot process. So start tcpdump and then start PXE booting the target right away.
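    Since each port in the filter belongs to one stage of the boot, the same port numbers can be reused to read the saved capture back one stage at a time. The sketch below assumes the capture file from the steps above; stage_filter and show_stage are hypothetical helper names, not FOG or tcpdump commands.

```shell
#!/bin/sh
# Map a boot stage name to the port filter described in the sidebar.
stage_filter() {
    case "$1" in
        dhcp)      echo "port 67 or port 68" ;;
        tftp)      echo "port 69" ;;
        proxydhcp) echo "port 4011" ;;
        *)         echo "unknown stage: $1" >&2; return 1 ;;
    esac
}

# Show only one stage of a saved capture, with verbose decoding (-v)
# and numeric addresses and ports (-nn).
# Usage: show_stage <pcap file> <dhcp|tftp|proxydhcp>
show_stage() {
    filter=$(stage_filter "$2") || return 1
    tcpdump -nn -v -r "$1" "$filter"
}
```

    For example, show_stage output.pcap dhcp lists only the DHCP exchange, and show_stage output.pcap tftp shows the client’s TFTP requests. Keep in mind that only the initial TFTP request goes to port 69; the file transfer itself continues on other ports, so it will not appear in this capture.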