PXE-E78 Cannot locate boot server

  • Moderator

    @george1421 OK, I sit here a bit red-faced, since my server does the same thing AND I have it set up to use DHCP. AND my server is working correctly. I need to look into this now.

  • Moderator

    @mkstreet This is very strange indeed. If you manually set the values in resolv.conf and then stop and start dnsmasq, does it change the settings? I’m going to check on my Pi server to see if it does the same thing. In my experience it does not.


  • @george1421

    Ok I changed /etc/network/interfaces and did a reboot (to be sure my changes were implemented):

    compteach@iepcomlabsrv:/etc/network$ cat interfaces
    # This file describes the network interfaces available on your system
    # and how to activate them. For more information, see interfaces(5).
    
    # The loopback network interface
    auto lo
    iface lo inet loopback
    
    # The primary network interface
    auto em1
    iface em1 inet static
    address 10.0.253.24
    netmask 255.255.255.0
    gateway 10.0.253.1
    dns-nameservers 110.164.252.222 8.8.8.8
    
    

    However, I still seem to have unknown hosts and the resolv.conf just has the loopback address:

    compteach@iepcomlabsrv:/etc/network$ sudo service dnsmasq status
     * Checking DNS forwarder and DHCP server dnsmasq                                                                                                                       * (running)
    compteach@iepcomlabsrv:/etc/network$
    compteach@iepcomlabsrv:/etc/network$
    compteach@iepcomlabsrv:/etc/network$ cat /etc/resolv.conf
    # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
    #     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
    nameserver 127.0.0.1
    compteach@iepcomlabsrv:/etc/network$ ping google.com
    ping: unknown host google.com
    compteach@iepcomlabsrv:/etc/network$
    
    

    Do I need to do anything with the /etc/dnsmasq.conf file ?

  • Moderator

    @george1421 The PXE error PXE-E78 means that the boot server returned by DHCP either doesn’t exist or that the value is not being returned at all. Let’s fix the IP address on the FOG server first. You can keep it at this address; it just needs to be configured in Linux as static. Once you do that, inspect the contents of the /etc/resolv.conf file to make sure it is configured the way you need it.
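
    One quick way to do that inspection is to list the nameserver entries and flag any loopback address. This is only a sketch: the function name list_nameservers is made up for this example, and it assumes the usual "nameserver <address>" line format.

    ```shell
    # Hypothetical helper: print each nameserver found in a resolv.conf-style
    # file and mark loopback entries (127.x.x.x or ::1), which mean the host
    # is asking itself for DNS.
    list_nameservers() {
        awk '$1 == "nameserver" {
            if ($2 ~ /^127\./ || $2 == "::1")
                print $2 " (loopback)"
            else
                print $2
        }' "$1"
    }
    ```

    Call it as list_nameservers /etc/resolv.conf. If the only entry comes back marked as loopback, name lookups depend entirely on a DNS service running on the server itself.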

  • Moderator

    @mkstreet Well, the first issue is that the FOG server MUST be at a static IP address (period). The problem, even if you use DHCP reservations, is just as you noted: the resolv.conf file will change based on outside forces. Once the network mode is set to static, resolv.conf should stay put. I don’t know whether dnsmasq is forcing a DHCP renew or not, but what you have is not what I see on my FOG-Pi server, which IS running dnsmasq.


  • After sudo service dnsmasq stop, then /etc/resolv.conf has

    compteach@iepcomlabsrv:/etc$ cat /etc/resolv.conf
    # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
    #     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
    nameserver 110.164.252.222
    nameserver 8.8.8.8
    

    But when I start it again with sudo service dnsmasq start, /etc/resolv.conf has, as you said:

    compteach@iepcomlabsrv:/etc$ cat /etc/resolv.conf
    # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
    #     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
    nameserver 127.0.0.1
    
    

    I am not sure how to fix this because /etc/resolv.conf is dynamic and gets overwritten (as the file’s own comments state). Do I list the internal DNS in /etc/network/interfaces?

    compteach@iepcomlabsrv:/etc/network$ cat interfaces
    # This file describes the network interfaces available on your system
    # and how to activate them. For more information, see interfaces(5).
    
    # The loopback network interface
    auto lo
    iface lo inet loopback
    
    # The primary network interface
    auto em1
    iface em1 inet dhcp
    
    

    RE: LTSP.CONF
    I’ve removed the two lines at the end of ltsp.conf
    and did a sudo service dnsmasq restart.
    The boot results are the same… meaning I still get the PXE-E78 error, etc.

    RE: BIOS.
    I looked in the BIOS settings. It says Boot Mode Auto, and Boot Priority Legacy First.
    I haven’t changed these BIOS settings in probably over a year and it was working before with this FOG server…

  • Moderator

    @george1421 Just for clarity, the device you are attempting to PXE boot is in BIOS (legacy) mode?

    FWIW: Your ltsp.conf will only respond to requests from BIOS (legacy) based computers.

  • Moderator

    @mkstreet You have a few things going on here, so let’s take the easy ones first.

    Unable to resolve internet names.

    What does /etc/resolv.conf point to? My guess: 127.0.0.1. Your FOG server’s resolv.conf should point to your building’s internal DNS server, not to itself.

    As for dnsmasq, I want you to remove (comment out) the dhcp-host and dhcp-ignore lines at the end of the ltsp.conf file and then restart dnsmasq. From a sanity standpoint, dnsmasq only answers broadcast requests, so by default it will not cross a router unless you set something up on the DHCP relay. So running it on your single subnet is not an issue. Removing those settings ensures that the right information gets passed out. I confirmed that your ltsp.conf file matches my example file exactly. (I added the code block around your config file so it was a bit more readable.)

    NO change is needed to FOG for dnsmasq since all settings are external to FOG.
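
    The comment-out step could be done by hand in an editor, or roughly like the sketch below. The function name disable_host_filter is invented for this example; it prefixes every active dhcp-host= and dhcp-ignore= line with # and leaves lines that are already commented alone.

    ```shell
    # Hypothetical helper: comment out every active dhcp-host= and
    # dhcp-ignore= line in a dnsmasq config file. sed keeps the original
    # file next to it with a .bak suffix.
    disable_host_filter() {
        sed -i.bak \
            -e 's/^[[:space:]]*dhcp-host=/#&/' \
            -e 's/^[[:space:]]*dhcp-ignore=/#&/' \
            "$1"
    }
    ```

    Run it from a root shell as disable_host_filter /etc/dnsmasq.d/ltsp.conf, then sudo service dnsmasq restart so the change takes effect.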


  • I’ve also just noticed, when I tried sudo apt-get update, that 10.0.253.24 cannot resolve hostnames: apt-get update fails because it cannot locate the update mirror.
    Further, something like ping google.com cannot find the host either.

    But if I do…
    sudo service dnsmasq stop

    Then everything is golden. Can apt-get update, can ping google.com, etc.


  • I should make clear that before this network change that led to me attempting DNSMASQ,
    this Fog server was working fine. So, is my problem a Fog configuration setting that needs to change to recognize working with DNSMASQ?

    In other words, DNSMASQ is newly added to a Fog server that had been working before…


  • @george1421

    10.0.253.24 is your FOG server with DNSMASQ running on it.
    *** Correct.

    10.0.253.1 is your router on the 10.0.253.0 subnet. It is also running a dhcp relay/helper service that forwards dhcp requests to your site dhcp server.
    *** I accept your description as correct. I have not specifically configured 10.0.253.1 … so where/how it is setup, I am not certain.

    172.16.1.1 is your corporate dhcp server that is untouchable (and a bit suspicious at the 1.1 address but…)
    *** Yes…

    Here is the ltsp.conf. The very last line references the MAC addr of the computer I want to load. Because long term I was nervous about this DNSMASQ answering requests from the wider network, I planned to list the roughly 50 hosts in this file by MAC Addr. My proof of concept was to just put one MAC Addr in to start with…


    compteach@iepcomlabsrv:/etc/dnsmasq.d$
    compteach@iepcomlabsrv:/etc/dnsmasq.d$ cat ltsp.conf

    # Don't function as a DNS server:
    port=0
    
    # Log lots of extra information about DHCP transactions.
    log-dhcp
    
    # Dnsmasq can also function as a TFTP server. You may uninstall
    # tftpd-hpa if you like, and uncomment the next line:
    # enable-tftp
    
    # Set the root directory for files available via FTP.
    tftp-root=/tftpboot
    
    # The boot filename, Server name, Server Ip Address
    dhcp-boot=undionly.kpxe,,10.0.253.24
    
    # rootpath option, for NFS
    #dhcp-option=17,/images
    
    # kill multicast
    #dhcp-option=vendor:PXEClient,6,2b
    
    # Disable re-use of the DHCP servername and filename fields as extra
    # option space. That's to avoid confusing some old or broken DHCP clients.
    dhcp-no-override
    
    # PXE menu.  The first part is the text displayed to the user.  The second is the timeout, in seconds.
    pxe-prompt="Press F8 for boot menu", 3
    
    # The known types are x86PC, PC98, IA64_EFI, Alpha, Arc_x86,
    # Intel_Lean_Client, IA32_EFI, BC_EFI, Xscale_EFI and X86-64_EFI
    # This option is first and will be the default if there is no input from the user.
    pxe-service=X86PC, "Boot from network", undionly
    
    # A boot service type of 0 is special, and will abort the
    # net boot procedure and continue booting from local media.
    #pxe-service=X86PC, "Boot from local hard disk", 0
    
    # If an integer boot service type, rather than a basename is given, then the
    # PXE client will search for a suitable boot service for that type on the
    # network. This search may be done by multicast or broadcast, or direct to a
    # server if its IP address is provided.
    # pxe-service=x86PC, "Install windows from RIS server", 1
    
    # This range(s) is for the public interface, where dnsmasq functions
    # as a proxy DHCP server providing boot information but no IP leases.
    # Any ip in the subnet will do, so you may just put your server NIC ip here.
    # Since dnsmasq is not providing true DHCP services, you do not want it
    # handing out IP addresses.  Just put your servers IP address for the interface
    # that is connected to the network on which the FOG clients exist.
    # If this setting is incorrect, the dnsmasq may not start, rendering
    # your proxyDHCP ineffective.
    dhcp-range=10.0.253.24,proxy
    
    # This range(s) is for the private network on 2-NIC servers,
    # where dnsmasq functions as a normal DHCP server, providing IP leases.
    # dhcp-range=192.168.0.20,192.168.0.250,8h
    
    # For static client IPs, and only for the private subnets,
    # you may put entries like this:
    # dhcp-host=00:20:e0:3b:13:af,10.160.31.111,client111,infinite
    dhcp-host=f8:0f:41:a0:04:75,net:allow
    dhcp-ignore=#allow
    

    MOD Edit: Added code block for readability

  • Moderator

    @mkstreet OK then let me see if I can put the bits together now.

    10.0.253.24 is your FOG server with DNSMASQ running on it.
    10.0.253.1 is your router on the 10.0.253.0 subnet. It is also running a dhcp relay/helper service that forwards dhcp requests to your site dhcp server.
    172.16.1.1 is your corporate dhcp server that is untouchable (and a bit suspicious at the 1.1 address but…)

    As long as the dnsmasq server is in the same subnet as the PXE booting client, this setup should work with no problem. What Sebastian noted is that your dnsmasq server is handing out the name of the boot file, but it’s not handing out the IP address of the next-server (so the PXE client is listening to what it’s getting from the 172.16.1.1 DHCP server).

    Can you post the complete ltsp.conf file from your dnsmasq setup? Somewhere you are missing a command. Hint: There should be two spots where you have to enter the IP address of your FOG server in the dnsmasq configuration.
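
    Presumably the two spots are the dhcp-boot line (the server IP is its last field) and the proxy dhcp-range line; that reading is an assumption based on the ltsp.conf posted in this thread. A small sketch to pull out just those lines for a by-eye comparison, with show_server_ip_lines being a made-up name:

    ```shell
    # Hypothetical helper: show the active (uncommented) dhcp-boot and
    # dhcp-range lines of a dnsmasq config file, in file order.
    show_server_ip_lines() {
        grep -E '^(dhcp-boot|dhcp-range)=' "$1"
    }
    ```

    Called as show_server_ip_lines /etc/dnsmasq.d/ltsp.conf, both lines it prints should carry the FOG server's address. Commented-out examples are skipped because they start with #.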


  • @Sebastian-Roth 10.0.253.24 is my server. This is the server that has Fog as well as my newly setup DNSMASQ.

    I set up DNSMASQ following the instructions for:
    “Using FOG with an unmodifiable DHCP server”
    https://wiki.fogproject.org/wiki/index.php?title=Using_FOG_with_an_unmodifiable_DHCP_server/_Using_FOG_with_no_DHCP_server

    I went this route because I don’t have access to the 172.x.x.x server, and it’s not easy (if possible at all) to change it.

    I thought this was the intent of the DNSMASQ setup for “Using Fog with an unmodif…” ?

    What can I send you from 10.0.253.24? I can send the DNSMASQ.conf and the ltsp.conf files?

    If it’s essential now or going forward operationally, I can disconnect from the 172.x.x.x network and run isolated from that while loading with Fog.

  • Moderator

    @mkstreet The PCAP info is very helpful. I see two DHCP servers answering the initial DHCP discovery packet sent by the client. One server (10.0.253.1) hands out an IP (10.0.253.29) and next-server info (172.16.1.1), while the other one (10.0.253.24) hands out the filename info (undionly.kpxe) but no next-server info. After that I see that second DHCP server sending a DHCP request to 172.16.1.1. I guess we need to see the DHCP configuration of both your servers to be able to help you.


  • The forum says I cannot upload the pcap file because I don’t have enough privilege.

    I have put the pcap file here instead:
    http://s000.tinyupload.com/?file_id=94259635495204452092


  • Hi,

    As far as I know, they are on the same subnet. The FOG server and target computer are all on the same routers. I have two routers for the lab, and the routers are joined together with an Ethernet cable. One of the two routers has the Ethernet cable to the LAN outside the lab.

    I ran the tcpdump and the file is attached below.

    [0_1475551540689_output.pcap](Uploading 100%)

    I also ran TFTP under Windows to see if I could access the file from the server:

    C:\Users\User>
    C:\Users\User>tftp -i iepcomlabsrv GET undionly.0
    Transfer successful: 88751 bytes in 1 second(s), 88751 bytes/s

    C:\Users\User>ls -l *.0
    -r--r--r-- 1 User 0 88751 2016-10-04 10:26 undionly.0

    C:\Users\User>

  • Moderator

    @mkstreet Are the FOG server, DHCP server and target computer in the same subnet?

    If so, then let’s grab a pcap of the PXE booting process. On your FOG server, install the tcpdump program. Then start the capture with the following syntax:

        sudo tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4011

    After you start tcpdump, PXE boot your target computer. Once you get the error, press ctrl-c to stop tcpdump. Upload the pcap file here. That will let us see what is going on in your PXE boot environment.
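
    For reference, ports 67 and 68 are the DHCP server and client ports, 69 is TFTP, and 4011 is the port used by proxyDHCP/PXE boot servers. If you want to add or drop ports later, the filter string can be built from a list; build_port_filter below is a made-up helper for illustration only.

    ```shell
    # Hypothetical helper: turn a list of port numbers into a tcpdump
    # filter expression of the form "port A or port B or ...".
    build_port_filter() {
        filter=""
        for p in "$@"; do
            if [ -z "$filter" ]; then
                filter="port $p"
            else
                filter="$filter or port $p"
            fi
        done
        printf '%s\n' "$filter"
    }
    ```

    Usage would be sudo tcpdump -w output.pcap $(build_port_filter 67 68 69 4011). Afterwards, tcpdump -nn -v -r output.pcap prints the captured packets back as text if you want a quick look before uploading.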


  • re: boot file and its extension
    I did do this using ln. Here is the listing of the directory:

    compteach@iepcomlabsrv:/tftpboot$ ls -l
    total 3088
    -rw-r--r-- 1 fog root 840 Mar 18 2015 boot.txt
    -rw-r--r-- 1 root root 400 Oct 4 07:20 default.ipxe
    lrwxrwxrwx 1 root root 18 Oct 4 09:46 ipxe.0 -> /tftpboot/ipxe.efi
    -rw-r--r-- 1 fog root 903232 Mar 18 2015 ipxe.efi
    -rw-r--r-- 1 fog root 334980 Mar 18 2015 ipxe.kkpxe
    -rw-r--r-- 1 fog root 335028 Mar 18 2015 ipxe.kpxe
    -rw-r--r-- 1 fog root 334508 Mar 18 2015 ipxe.krn
    -rw-r--r-- 1 fog root 334929 Mar 18 2015 ipxe.pxe
    -rw-r--r-- 1 fog root 25340 Mar 18 2015 memdisk
    -rw-r--r-- 1 fog root 16794 Mar 18 2015 pxelinux.0.old
    -rw-r--r-- 1 fog root 171488 Mar 18 2015 snp.efi
    -rw-r--r-- 1 fog root 171680 Mar 18 2015 snponly.efi
    lrwxrwxrwx 1 root root 13 Oct 3 10:36 undionly.0 -> undionly.kpxe
    -rw-r--r-- 1 fog root 88703 Mar 18 2015 undionly.kkpxe
    -rw-r--r-- 1 root root 88751 Apr 22 2015 undionly.kpxe
    -rw-r--r-- 1 fog root 88751 Mar 18 2015 undionly.kpxe_orig
    -rw-r--r-- 1 fog root 88786 Mar 18 2015 undionly.pxe
    -rw-r--r-- 1 fog root 147728 Mar 18 2015 vesamenu.c32
    compteach@iepcomlabsrv:/tftpboot$

    re: George’s tutorial.
    I’ve rechecked and everything matched/matches what the tutorial shows.

    Other suggestions, please?


  • Follow George’s tutorial, it’s good.

    But just guessing, you didn’t copy your desired boot file, and change the copy’s extension to .0
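
    A rough sketch of that step is below. Note that it makes a relative symlink rather than a copy, which is a substitution for illustration; use cp instead if your TFTP server will not serve symlinks. The function name make_dot0_alias is hypothetical.

    ```shell
    # Hypothetical helper: inside a TFTP root directory, create NAME.0 as a
    # relative symlink to an existing boot file such as NAME.kpxe.
    # Returns non-zero if the boot file does not exist.
    make_dot0_alias() {
        dir=$1
        bootfile=$2
        alias_name="${bootfile%.*}.0"
        [ -f "$dir/$bootfile" ] || return 1
        ln -sf "$bootfile" "$dir/$alias_name"
    }
    ```

    For example, make_dot0_alias /tftpboot undionly.kpxe (run as root) would leave /tftpboot/undionly.0 pointing at undionly.kpxe.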

  • Moderator

    This seems to be dnsmasq day today.

    I wrote a tutorial a while ago for setting up dnsmasq on CentOS 7. Please compare the configuration file there to what you have in your setup. https://forums.fogproject.org/topic/6376/install-dnsmasq-on-centos-7

    If your FOG server, DHCP server and PXE booting clients are in the same subnet, then that is all you need to do. If any of them is on a different subnet, then you need to do a few more things.
