PXE-E23: client received tftp error from server (linux)



  • Hello,

    I’m trying to pxe boot a linux server. I get the following error: pxe error. I have my edge router functioning as a DHCP server. Here’s how I have that configured: DHCP server config. I tried referencing these threads: https://forums.fogproject.org/topic/11749/uefi-pxe-not-downloading-ipxe-efi-file , https://forums.fogproject.org/topic/10919/pxe-can-t-find-correct-bootfile-size but they don’t seem to relate. This is part of an ongoing project, so we already have some machines working with the pxe boot, but I can’t seem to set up new machines. I want to be able to push some of the images I’ve collected to the other machines for testing. Last semester we had an individual familiar with FOG but he’s gone now, so it’s up to me, and any help is appreciated. I’ve collected some tcpdump info as well if that might help.

    -chanstag-


  • Moderator

    @chanstag A proper dhcp/pxe boot process should be this.

    1. PXE client - Discover
    2. DHCP Server - Offer
    3. PXE client - Request
    4. DHCP Server - ACK.

    That second pcap is discover, offer 1, offer 2, discover. Meaning that the pxe client didn’t get what it needed to even get an IP address.
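
As a rough illustration of that reasoning, here is a minimal Python sketch that checks whether a list of DHCP message types (as you might read them off a pcap by hand) contains the four-step exchange in order. The function name and the idea of feeding it a hand-typed list are assumptions made for this example; it does not read pcap files.

```python
from typing import List

# Expected DHCP handshake as listed above: Discover, Offer, Request, ACK.
EXPECTED = ["discover", "offer", "request", "ack"]


def handshake_completed(messages: List[str]) -> bool:
    """Return True if the four expected message types appear in order.

    Extra messages (for example a second Offer from another server) are
    tolerated; only the relative order of the expected types matters.
    """
    position = 0
    for message in messages:
        if position < len(EXPECTED) and message.lower() == EXPECTED[position]:
            position += 1
    return position == len(EXPECTED)


# A clean exchange, and the sequence described for the second pcap.
good = ["Discover", "Offer", "Request", "ACK"]
stalled = ["Discover", "Offer", "Offer", "Discover"]

print(handshake_completed(good))
print(handshake_completed(stalled))
```

The second sequence never reaches a Request, which matches the observation that the client went back to Discover instead of accepting an offer.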

    So a couple things I’ve gleaned from looking at the pcap.

    1. If your dhcp server is running on your router, it’s not responding to the dhcp discover from the pxe booting computer.
    2. You have a non-fog created dhcp server running on the fog server. It’s responding with an IP address, but it’s not sending the boot file or next server information. The FOG created dhcp server will send the proper boot information.
    3. You have dnsmasq installed on the FOG server, and it IS responding correctly. But because there is a dhcp server also running at the same IP address, the target computer is getting confused and taking the dhcp offer and not the proxy dhcp offer.
    4. Your target pxe booting computer is a uefi type BC (00007). Not really important, but dnsmasq should be sending ipxe.efi to the target computer so it can boot.

  • Moderator

    @chanstag This is very strange. I see 2 dhcp responses from the fog server, but none from your router. What is the dhcp server for the 192.168.1.0/24 subnet?

    What is device 192.168.1.2?

    Also why am I seeing two responses from your FOG server? Do you have dhcp services running there as well as dnsmasq?

    Check to see if you have a dhcp server running on the fog server. ps aux|grep isc-dhcp

    Also did you by chance replace the original pcap with this new one. Now they both look the same. I was trying to see if I missed something the first time.



  • Ok, I setup dnsmasq v2.76 as you said but no luck. I’m getting this error: PXE-E18: server response timeout. Perhaps I’m supposed to do something on the router? Here’s what that looks like now: https://imgur.com/a/ha7zbZH. Also here’s another tcpdump: https://drive.google.com/open?id=1IMd2Mg_Iwlvrg_Lj_8WtOmu1XNOnO3Ur. Here’s the logging from dnsmasq if that helps any: https://imgur.com/a/EEQTtzV.


  • Moderator

    @chanstag Well I think I found the issue, but there is not much we can do about it. The issue is in the structure of the response from your dhcp server. Basically, strings need to be terminated with a null character to define end of string. I can’t post pictures at the moment, but in the dhcp server response, dhcp option 67 (ipxe.efi) is being terminated with 0xFF (which also happens to be 377 octal).

    If you look at the pcap with wireshark you can see the very last line is the tftp request for ipxe.efi\377 which lines up with the dhcp server Offer and ACK packets dhcp option 67. If you look at the hex codes after ipxe.efi there is an 0xFF, which should have been 0x00 to signal end of string. While what the dhcp server is sending is not incorrect because there is a byte count for that parameter, some PXE implementations don’t follow the byte count variable, but instead rely on the null character.
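
To make the byte-count versus null-terminator point concrete, here is a small Python sketch. It assumes the usual DHCP option layout of one code byte, one length byte, then the value, with 0xFF marking the end of the option list. The sample bytes are made up to mirror what is described above and are not copied from the pcap, and `read_until_null` is a hypothetical model of a client that ignores the length byte.

```python
def find_option(options: bytes, wanted: int) -> int:
    """Return the index of the code byte for option `wanted`, or -1."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == 0xFF:  # END option: nothing follows
            return -1
        if code == 0x00:  # PAD option: a single byte with no length
            i += 1
            continue
        if code == wanted:
            return i
        i += 2 + options[i + 1]  # skip code, length and value
    return -1


def read_by_length(options: bytes, wanted: int) -> bytes:
    """Read the option value using its length byte."""
    i = find_option(options, wanted)
    if i < 0:
        return b""
    length = options[i + 1]
    return options[i + 2 : i + 2 + length]


def read_until_null(options: bytes, wanted: int) -> bytes:
    """Hypothetical faulty client: ignore the length, read up to 0x00."""
    i = find_option(options, wanted)
    if i < 0:
        return b""
    start = i + 2
    end = options.find(b"\x00", start)
    return options[start:] if end == -1 else options[start:end]


# Option 67 (boot file name), length 8, value "ipxe.efi",
# then the END marker 0xFF and some zero padding.
sample = bytes([67, 8]) + b"ipxe.efi" + b"\xff" + b"\x00\x00"

print(read_by_length(sample, 67))
print(read_until_null(sample, 67))
```

The second reader picks up the 0xFF byte as part of the file name, which is the same shape as the ipxe.efi\377 request seen in the tftp packet.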

    So what do we do?

    Use dnsmasq on the fog server to supply the pxe boot information.
    The quick steps are this.

    1. Remove the pxe boot information from your router.
    2. Install dnsmasq service from your linux distribution’s repo
    3. Make sure it’s at least version 2.76 by issuing this command at the fog server’s linux command prompt: sudo dnsmasq -v
    4. Create a configuration file called ltsp.conf in /etc/dnsmasq.d directory.
    5. Paste this content into that file.
    # Don't function as a DNS server:
    port=0
    
    # Log lots of extra information about DHCP transactions.
    log-dhcp
    
    # Set the root directory for files available via TFTP.
    tftp-root=/tftpboot
    
    # The boot filename, Server name, Server Ip Address
    dhcp-boot=undionly.kpxe,,<fog_server_ip>
    
    # Disable re-use of the DHCP servername and filename fields as extra
    # option space. That's to avoid confusing some old or broken DHCP clients.
    dhcp-no-override
    
    # inspect the vendor class string and match the text to set the tag
    dhcp-vendorclass=BIOS,PXEClient:Arch:00000
    dhcp-vendorclass=UEFI32,PXEClient:Arch:00006
    dhcp-vendorclass=UEFI,PXEClient:Arch:00007
    dhcp-vendorclass=UEFI64,PXEClient:Arch:00009
    
    # Set the boot file name based on the matching tag from the vendor class (above)
    dhcp-boot=net:UEFI32,i386-efi/ipxe.efi,,<fog_server_ip>
    dhcp-boot=net:UEFI,ipxe.efi,,<fog_server_ip>
    dhcp-boot=net:UEFI64,ipxe.efi,,<fog_server_ip>
    
    # PXE menu.  The first part is the text displayed to the user.  The second is the timeout, in seconds.
    pxe-prompt="Booting FOG Client", 1
    
    # The known types are x86PC, PC98, IA64_EFI, Alpha, Arc_x86,
    # Intel_Lean_Client, IA32_EFI, BC_EFI, Xscale_EFI and X86-64_EFI
    # This option is first and will be the default if there is no input from the user.
    pxe-service=X86PC, "Boot to FOG", undionly.kpxe
    pxe-service=X86-64_EFI, "Boot to FOG UEFI", ipxe.efi
    pxe-service=BC_EFI, "Boot to FOG UEFI PXE-BC", ipxe.efi
    
    dhcp-range=<fog_server_ip>,proxy
    
    6. Be sure to replace <fog_server_ip> exactly with the IP address of your fog server. Be aware that <fog_server_ip> appears multiple times in the config file.
    7. Save and exit your text editor.
    8. Issue the following command to restart the dnsmasq service: sudo systemctl restart dnsmasq
    9. Ensure that the dnsmasq service is running in memory by issuing this command: ps aux|grep dnsmasq. You should see more than one line in the response. If it’s running then go to step 10.
    10. Ensure that dnsmasq starts when the system reboots with: sudo systemctl enable dnsmasq
    11. PXE boot a target computer. With skill (luck) you should see the fog iPXE menu.
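
Since the placeholder appears several times, it is easy to miss one. The following Python sketch fills in the placeholder and lists any lines where it is still present. The function names are made up for this example, the sample text is a shortened excerpt rather than the full ltsp.conf, and matching is case-insensitive as a precaution in case the placeholder is capitalised differently from line to line. The address 192.168.1.2 is the FOG server IP mentioned in this thread; substitute your own.

```python
import re
from typing import List, Tuple

# Placeholder used in the configuration; matched regardless of letter case.
PLACEHOLDER = re.compile(r"<fog_server_ip>", re.IGNORECASE)


def fill_placeholder(config_text: str, server_ip: str) -> str:
    """Replace every occurrence of the placeholder with the server IP."""
    return PLACEHOLDER.sub(server_ip, config_text)


def leftover_placeholders(config_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that still contain the placeholder."""
    return [
        (number, line)
        for number, line in enumerate(config_text.splitlines(), 1)
        if PLACEHOLDER.search(line)
    ]


# Shortened excerpt for illustration only.
sample = """dhcp-boot=undionly.kpxe,,<fog_server_IP>
dhcp-boot=net:UEFI,ipxe.efi,,<fog_server_IP>
dhcp-range=<fog_server_ip>,proxy
"""

filled = fill_placeholder(sample, "192.168.1.2")
print(filled)
print(leftover_placeholders(filled))
```

An empty list from `leftover_placeholders` means no placeholder was left behind before you restart dnsmasq.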

  • Moderator

    @chanstag Ok looking at it now. I do see what looks like octal 377 after ipxe.efi. That is the unprintable character. I’m looking at the pcap more in depth at the moment.




  • Moderator

    @chanstag Well I guess the next step is to grab a pcap of what is going on the network. Please follow this procedure and upload the pcap file here: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue

    There is something unexpected going on here. If we can’t find out what your router is doing, we can move the pxe boot setting into dnsmasq running on the fog server. But lets see what the pcap file tells us.



  • @george1421 I noticed that as well. So I tried to clean up any white space around it and re-entered the name multiple times, but it still produces that extra character. Also that is correct, the FOG server’s IP is 192.168.1.2.


  • Moderator

    @chanstag Your picture is interesting, in that after ipxe.efi there is an unprintable character shown as a block character. Make sure there aren’t any spaces or other junk after .efi in your router’s configuration.

    Is it also safe to assume the fog server is at 192.168.1.2?

    If you are still not having luck after checking your router settings, we’ll get a pcap of the pxe booting process and look at what is flying down the network wires.



  • That’s actually what I had it set to before. After changing the bootfile-name setting in my router back to “ipxe.efi”, it gives me the same error. Here’s the screen shot: https://imgur.com/a/5GZUh2K. Thanks for the info though, I didn’t know what the difference between the two files was.


  • Moderator

    When you see NBP in the pxe boot loader, that tells us your target computer is in UEFI mode. The error picture indicates you are sending undionly.kpxe to the target computer. undionly.kpxe is a bios mode boot loader; for a UEFI system you need to send the ipxe.efi boot loader. From the screen shot of your router’s dhcp settings, it appears that it doesn’t support dynamic pxe boot file names, that is, sending a bios boot loader name to a bios system and a uefi boot loader name to a uefi based system.
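
What "dynamic pxe boot file names" means can be sketched in a few lines of Python: the server looks at the architecture code in the client’s vendor class string and picks a boot file to match. The codes and file names below are the ones used in the dnsmasq configuration posted in this thread; the function names and the sample vendor class strings are made up for illustration, and falling back to undionly.kpxe for unknown clients is an assumption modelled on that configuration’s default dhcp-boot line.

```python
from typing import Optional

# Architecture code -> boot file, as in the dnsmasq config in this thread.
BOOT_FILES = {
    "00000": "undionly.kpxe",      # BIOS
    "00006": "i386-efi/ipxe.efi",  # 32-bit UEFI
    "00007": "ipxe.efi",           # UEFI (BC)
    "00009": "ipxe.efi",           # 64-bit UEFI
}

DEFAULT_BOOT_FILE = "undionly.kpxe"  # assumed fallback for unknown clients


def client_arch(vendor_class: str) -> Optional[str]:
    """Extract the architecture code from 'PXEClient:Arch:NNNNN...'."""
    parts = vendor_class.split(":")
    if len(parts) >= 3 and parts[0] == "PXEClient" and parts[1] == "Arch":
        return parts[2]
    return None


def boot_file(vendor_class: str) -> str:
    """Pick the boot file name for a client's vendor class string."""
    return BOOT_FILES.get(client_arch(vendor_class), DEFAULT_BOOT_FILE)


# Hypothetical vendor class strings for a UEFI and a BIOS client.
print(boot_file("PXEClient:Arch:00007:UNDI:003016"))
print(boot_file("PXEClient:Arch:00000:UNDI:002001"))
```

A router with a single static boot file field cannot make this choice, so it hands the same file to BIOS and UEFI clients alike.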

