
Some computer models adding garbage bytes to undionly.kpxe tftp filename, causing failure to PXE boot.

FOG Problems
  • E
    EBCF
    last edited by Nov 11, 2019, 4:26 PM

    Hi all,

    I’ve recently got a FOG server up and running and am having problems with some models of computer failing to PXE boot (specifically the Dell Vostro 220, which the wiki reports as working). Other models reach the iPXE menu without any problem.

    On the affected PC the error is

    PXE-T01: File not found
    PXE-E3B: TFTP Error - File Not Found
    PXE-M0F: Exiting PXE ROM

    Doing some digging, I ran a pcap on the FOG server’s host, and discovered that for the affected computer model, the source file name in the TFTP read request has a bunch of garbage bytes appended to it, which are not there for the computers that behave normally. I have attached an excerpt of the packet capture, showing first the TFTP traffic for a “good” computer, trimmed after the first data block, and after that the traffic for the “bad” computer.
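For readers who want to check their own capture for this symptom, here is a minimal Python sketch that splits a TFTP read request into its fields and flags non-printable bytes in the filename. The two payloads are hypothetical stand-ins, not bytes taken from the attached pcap, and `parse_rrq` and `is_clean` are illustrative names.

```python
import struct

def parse_rrq(payload: bytes):
    """Split a TFTP read request (opcode 1) into filename, mode and options."""
    (opcode,) = struct.unpack("!H", payload[:2])
    if opcode != 1:
        raise ValueError(f"not a read request (opcode {opcode})")
    fields = payload[2:].split(b"\x00")
    # A well-formed request ends with a NUL, so the last split element is empty.
    if fields and fields[-1] == b"":
        fields.pop()
    filename, mode, *rest = fields
    # Remaining fields are option name/value pairs (for example tsize).
    options = dict(zip(rest[0::2], rest[1::2]))
    return filename, mode, options

def is_clean(filename: bytes) -> bool:
    """True when every byte of the filename is printable ASCII."""
    return all(0x20 <= b < 0x7F for b in filename)

# Hypothetical payloads: a normal request and one with trailing garbage bytes.
good = b"\x00\x01undionly.kpxe\x00octet\x00tsize\x000\x00"
bad = b"\x00\x01undionly.kpxe\xff\x8a\x91\x00octet\x00tsize\x000\x00"

for payload in (good, bad):
    name, mode, _ = parse_rrq(payload)
    print(name, mode, "clean" if is_clean(name) else "garbage in filename")
```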

    Our environment is DHCP provided by a Unifi Security Gateway, FOG running in an Ubuntu 18.04 LXC container on Proxmox.

    I updated the affected PC to the latest available BIOS and that didn’t help.

    How can I resolve this problem? The idea that comes to my mind is to make a copy of undionly.kpxe named to match what the computers are requesting, but that seems like a kludge and I’m wondering if there’s any better way.

    excerpt.pcap

    (PS: It seems like every packet is quadruplicated in the capture. I think that’s just a quirk of running tcpdump on Proxmox; I probably used the wrong tcpdump options.)

      • G
        george1421 Moderator
        last edited by Nov 11, 2019, 7:50 PM

        @EBCF said in Some computer models adding garbage bytes to undionly.kpxe tftp filename, causing failure to PXE boot.:

        Our environment is DHCP provided by a Unifi Security Gateway

        Without looking at the pcap just yet, my bet is that the DHCP server is at fault. We have seen some non-mainstream DHCP servers that don’t end the file name string with an ASCII null character but rely on the byte count to signal the text length. Let me look at the pcap to see how close I am.

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

        • G
          george1421 Moderator
          last edited by Nov 11, 2019, 7:52 PM

          Well, I got shut down before I could get started. Your pcap only contains the TFTP packets; I/we need to see the DHCP packets. If your FOG server, DHCP server, and PXE booting computer are on the same subnet (or really, if your FOG server and PXE booting client are on the same subnet), please follow this tutorial and run it on the FOG server. This will capture the DHCP/BOOTP traffic as well as the TFTP transfer: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue

          • S
            Sebastian Roth Moderator
            last edited by Nov 11, 2019, 8:06 PM

            @EBCF From what I see in the PCAP the NIC firmware is at fault. It seems to send those characters that are not allowed in the specification. Have you tried to update the firmware on those machines yet?

            Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

            Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

            • E
              EBCF
              last edited by Nov 11, 2019, 11:31 PM

              67_68_69_4011.pcap

              Since the pcap I ran earlier captured everything, I’ve filtered it to cover ports 67, 68, 69, 4011, and uploaded accordingly. (I can send the entire capture privately if you need it; since I don’t know what information might be in it I don’t wish to post it publicly). It looks like you’re right that option 67 in the DHCPOffer and DHCPAck isn’t null-terminated though.

              The Unifi Security Gateway runs EdgeOS, which is a fork of Vyatta, and can run either ISC DHCPD or dnsmasq as the DHCP server. There’s an option to select which, but I can’t find it at the moment. I mention it in case any of that helps.

              For the PC, I updated to the latest BIOS. On Dell’s support site there’s no specific NIC firmware for this model that I can find. I guess I can see if Windows update turns up anything.

              • G
                george1421 Moderator @EBCF
                last edited by Nov 12, 2019, 1:47 AM

                @EBCF I’m pretty sure it’s related to not having a null-terminated string.

                broken_dhcp.png

                Some PXE boot firmware will honor the byte count (in this case 0x0d), while other firmware needs a null-terminated string and just ignores the byte count. It’s kind of a toss-up.

                So what can you do? Well, you can install dnsmasq on your FOG server to supply the PXE boot information and then just ignore your DHCP server for PXE boot information. That is what I do at home with my SOHO ISP router, which advertises itself as the BOOTP server for some stupid reason. I use dnsmasq on the FOG server to override it. I have a tutorial on installing dnsmasq on the FOG server if you need it.

                • D
                  Daniel Miller
                  last edited by Nov 12, 2019, 7:59 PM

                  @EBCF Another route you might be able to try if dnsmasq gets problematic is to use a map file for the TFTP server. Documentation for the format is here, and I recall that you edit the xinetd service specification at /etc/xinetd.d/tftp (at least on my distribution) to get the service to use it. There is a write-up on how to do this in the workarounds for the Acer Iconia Tab w500.

                  • S
                    Sebastian Roth Moderator
                    last edited by Sebastian Roth Nov 13, 2019, 3:38 AM Nov 13, 2019, 9:35 AM

                    @EBCF From what we see in the PCAP file it’s very likely the firmware is not handling the DHCP information properly. Here is the information handed out by the DHCP server:

                    01_dhcp_ack.jpg

                    See how the DHCP packet itself is terminated by 0xFF (DHCP option 255 marks the end). Now when we look at the TFTP request, we see that it requests the filename plus 0xFF and a couple more non-ASCII characters.

                    02_tftp_request.jpg

                    My guess is that it’s not properly handling the length information (length: 13) given in the DHCP ACK response.
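To make the length-versus-terminator difference concrete, here is a minimal Python sketch of the two ways a client might read option 67. It is only an illustration of the suspected behavior, not the actual firmware logic; the bytes after the 0xFF end marker are made up, and the function names are hypothetical.

```python
def find_option(options: bytes, code: int):
    """Walk the tag/length/value list and return (value_offset, length) for `code`."""
    i = 0
    while i < len(options):
        tag = options[i]
        if tag == 255:          # End option
            break
        if tag == 0:            # Pad option is a single byte with no length
            i += 1
            continue
        length = options[i + 1]
        if tag == code:
            return i + 2, length
        i += 2 + length
    raise KeyError(code)

def bootfile_by_length(options: bytes) -> bytes:
    """Read option 67 using its length byte."""
    start, length = find_option(options, 67)
    return options[start:start + length]

def bootfile_until_nul(options: bytes) -> bytes:
    """Suspected buggy reading: ignore the length and scan for a NUL byte."""
    start, _ = find_option(options, 67)
    end = options.find(b"\x00", start)
    return options[start:] if end == -1 else options[start:end]

name = b"undionly.kpxe"                      # 13 bytes, so the length byte is 0x0d
# Option 67, its length, the name, the End marker, then hypothetical leftover bytes.
options = bytes([67, len(name)]) + name + b"\xff" + b"\x8a\x91\x00"

print(bootfile_by_length(options))
print(bootfile_until_nul(options))
```

A client that scans for a terminator runs past the 13 name bytes and picks up the 0xFF end marker plus whatever follows, which matches the shape of the bad TFTP request.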

                    So what can you do about it, given that a firmware upgrade doesn’t seem to be available? One important thing to know is that the filename information is present twice in the DHCP answers:

                    03_dhcp_boot_info.jpg

                    First in the DHCP “header” (highlighted in blue) and second as DHCP option. I don’t know the DHCP spec well enough to tell you why this is the case. Though I know that some clients handle this perfectly fine and others just don’t.
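As a rough illustration of the two locations, the following Python sketch builds a stripped-down DHCP reply and reads the boot file name from each place. It assumes the standard BOOTP layout (a 236-byte fixed header whose last 128 bytes are the `file` field, followed by a 4-byte magic cookie and the options); the helper names are hypothetical and all address fields are left zeroed.

```python
import struct

# Fixed BOOTP header: op, htype, hlen, hops, xid, secs, flags,
# ciaddr, yiaddr, siaddr, giaddr, chaddr, sname, file (236 bytes total).
BOOTP_FIXED = struct.Struct("!BBBBIHH4s4s4s4s16s64s128s")
MAGIC_COOKIE = b"\x63\x82\x53\x63"

def build_reply(header_file: bytes = b"", option_file: bytes = b"") -> bytes:
    """Build a minimal reply; an empty argument leaves that location unset."""
    fixed = BOOTP_FIXED.pack(2, 1, 6, 0, 0, 0, 0,
                             bytes(4), bytes(4), bytes(4), bytes(4),
                             bytes(16), b"", header_file)  # 's' pads with NULs
    options = b"\x35\x01\x05"                # option 53, message type 5 (DHCPACK)
    if option_file:
        options += bytes([67, len(option_file)]) + option_file
    return fixed + MAGIC_COOKIE + options + b"\xff"

def header_file_name(packet: bytes) -> bytes:
    """The header field is fixed width and NUL padded, so cut at the first NUL."""
    return packet[108:236].split(b"\x00", 1)[0]

def has_option_67(packet: bytes) -> bool:
    opts, i = packet[240:], 0
    while i < len(opts) and opts[i] != 255:
        if opts[i] == 0:
            i += 1
            continue
        if opts[i] == 67:
            return True
        i += 2 + opts[i + 1]
    return False

both = build_reply(b"undionly.kpxe", b"undionly.kpxe")
header_only = build_reply(b"undionly.kpxe")
print(header_file_name(both), has_option_67(both), has_option_67(header_only))
```

Because the header field is padded with NUL bytes, a client that expects a terminated string would read it correctly, which is one reason dropping the option may be worth trying.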

                    So one thing you can try is adjusting the DHCP config file to remove the DHCP option if that’s possible with your Unifi Security Gateway. In ISC-DHCPD there are two different parameters for this: filename (DHCP header) and option bootfile-name (DHCP option) (reference).

                    If that doesn’t work out, I think @Daniel-Miller’s suggestion of TFTP maps is the best route to go! Try out this map rule set:

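                    # if the requested file name consists only of allowed ASCII
                    # characters, leave it alone and stop processing
                    e ^[a-zA-Z0-9/.\-]*$
                    # otherwise (non-ASCII characters present) send undionly.kpxe
                    # as default to fix the Dell Vostro 220 issue
                    r .* undionly.kpxe
                    

                    This regular expression should allow all common filenames, i.e. those made up only of letters, digits and the characters /, . and - (ref). Note that the character class has to be repeated up to the end of the name; a single class followed by .* would also accept a name with garbage appended. If that doesn’t work, try e ^[[:ascii:]]*$ instead (ref).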
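A rule set like this only protects clean names if the character class is repeated all the way to the end of the name (`*$`). A single class followed by `.*` accepts anything that merely starts with an allowed character. Here is a minimal Python sketch of the two-rule logic, assuming Python's `re` module behaves like the TFTP server's regex engine for these simple patterns and using made-up garbage bytes; `apply_map` is an illustrative name.

```python
import re

def apply_map(filename: bytes, clean_pattern: bytes,
              default: bytes = b"undionly.kpxe") -> bytes:
    """Simplified model: keep names matching clean_pattern, else send the default."""
    if re.search(clean_pattern, filename):
        return filename          # models the 'e' rule: match, stop processing
    return default               # models 'r .* undionly.kpxe'

FIRST_CHAR_ONLY = rb"^[a-zA-Z0-9/.\-].*$"   # checks only the first character
WHOLE_NAME = rb"^[a-zA-Z0-9/.\-]*$"         # checks every character

garbled = b"undionly.kpxe\xff\x8a\x91"      # hypothetical garbage suffix

print(apply_map(garbled, FIRST_CHAR_ONLY))  # garbage passes through unchanged
print(apply_map(garbled, WHOLE_NAME))       # falls through to the default
print(apply_map(b"ipxe.efi", WHOLE_NAME))   # clean names are left alone
```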

                    • E
                      EBCF
                      last edited by Nov 13, 2019, 3:05 PM

                      Thanks everyone for the support.

                      @Daniel-Miller and @Sebastian-Roth I opted for the approach you suggested of making TFTPD ‘correct’ the filenames. The affected PC now goes to the FOG menu. I missed Sebastian’s map file and ended up writing my own which is:

                      # Workaround for PXE clients that misinterpret the DHCP options
                      # because they expect a null-terminated string
                      
                      # match the extensions followed by any characters and replace it with
                      # just the extensions
                      
                      r \.pxe.* \.pxe
                      r \.ipxe.* \.ipxe
                      r \.kpxe.* \.kpxe
                      r \.kkpxe.* \.kkpxe
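The effect of these four rules can be previewed with a small Python sketch. It is a simplified model that assumes each `r` rule replaces the first match and that all rules are tried in order; the garbage bytes are made up and `remap` is an illustrative name.

```python
import re

# (pattern, replacement) pairs mirroring the four 'r' rules above.
RULES = [
    (rb"\.pxe.*", b".pxe"),
    (rb"\.ipxe.*", b".ipxe"),
    (rb"\.kpxe.*", b".kpxe"),
    (rb"\.kkpxe.*", b".kkpxe"),
]

def remap(filename: bytes) -> bytes:
    """Apply every rule in order, replacing only the first match of each."""
    for pattern, replacement in RULES:
        filename = re.sub(pattern, replacement, filename, count=1)
    return filename

for requested in (b"undionly.kpxe\xff\x8a\x91",   # hypothetical garbled request
                  b"undionly.kpxe",               # normal request
                  b"snponly.efi"):                # extension not covered by the rules
    print(requested, b"->", remap(requested))
```

Names that are already clean pass through unchanged, and files with other extensions are not touched at all, so this approach only repairs the listed extensions.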
                      
                      

                      @george1421 I wouldn’t even have guessed that proxy DHCP was a thing. I’ve decided not to go for it this time but I’ll keep it in mind for the future.

                      Copyright © 2012-2024 FOG Project