Yet Another TFTP/PXE Issue?!?!



  • Hey Guys,

    I’m definitely not a Linux guru, as proven by the countless hours I’ve spent trying to work out why my clients can’t boot to the FOG server via iPXE.

    I’m running CentOS 7 with FOG build 6064. I’ve ensured that my DHCP is configured with Options 66 (IP of FOG server) and 67 (undionly.kpxe). I’ve ensured all passwords (except database) are set to the same thing.

    When I boot from the NIC, the following is displayed and then immediately tries to load the OS on the box:

          Initializing and establishing link…
          CLIENT MAC ADDR: XXXXXXXXXXX GUID XXXXXXXXXX
          CLIENT IP : 192.168.13.36     MASK: 255.255.255.0     DHCP IP: 192.168.13.203
          GATEWAY IP: 192.168.13.1
    
          PXE-E32: TFTP open timeout
          PXE-M0F: Exiting PXE ROM
    

    I’ve tried to tftp via Windows but it also fails. I know the tftp service is running as well (systemctl status tftp).

         C:\Users\USERNAME>tftp 192.168.13.236 GET undionly.kpxe.
         Timeout occurred
           Connect request failed
    

    ANY HELP would be appreciated. I love FOG and would love to use it in this new environment.

    Thanks!

    EDIT: I do have Spanning Tree Protocol enabled on my switch stack if that matters


  • Developer

    @cschneider-tech said:

    When I run command tftp <192.168.13.236> GET undionly.kxpe it returns “The system cannot find the file specified”

    After reading the whole lot it all boils down to TFTP not delivering files properly. This can be caused by firewall settings on the FOG server, a misconfigured TFTP service, permissions on /tftpboot and many more things. Please follow this: https://wiki.fogproject.org/wiki/index.php/Troubleshoot_TFTP
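
To make the permissions point concrete, here is a minimal Python sketch that lists anything under /tftpboot that is not readable by "other" users. It assumes the TFTP daemon serves files as an unprivileged user, so the world-read bit on files (and the world-execute bit on directories) matters. The helper names `is_world_readable`, `is_world_searchable` and `find_unreadable` are made up for illustration and are not part of FOG.

```python
import os
import stat


def is_world_readable(mode: int) -> bool:
    """True if the 'other' read bit is set in an st_mode value."""
    return bool(mode & stat.S_IROTH)


def is_world_searchable(mode: int) -> bool:
    """True if the 'other' execute bit is set (needed to enter a directory)."""
    return bool(mode & stat.S_IXOTH)


def find_unreadable(root: str = "/tftpboot") -> list:
    """Return paths under root that an unprivileged user could not read."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # A directory without o+x cannot be traversed by the daemon.
        if not is_world_searchable(os.stat(dirpath).st_mode):
            problems.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            # A file without o+r cannot be served to clients.
            if not is_world_readable(os.stat(path).st_mode):
                problems.append(path)
    return sorted(problems)
```

Running `find_unreadable()` on the FOG server and getting an empty list back would suggest permissions are not the problem, which leaves the firewall and the TFTP service configuration to check.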


  • Moderator

    @cschneider.tech OK good now we are getting to the details.

    Your DHCP server is a Windows 2012 box. You set up options 66 and 67 on your DHCP server. What precisely did you enter into options 66 and 67?

    The Brix units are not relevant quite yet, but it’s good to know what critter we are dealing with.

    Understand STP == OK, RSTP == better, FSTP == better, Port Fast == better. Disabling STP/RSTP should be done as a test only. In a production network where you have uncontrolled ports, not having STP turned on is a recipe for disaster.

    I just wanted to make sure with the router / dhcp server questions that the dhcp server was capable of setting options 66 and 67. Some people will try to use a home router for a dhcp server, and these devices typically aren’t capable enough to support boot server and boot file options.
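
Since the exact contents of options 66 and 67 matter here, a small Python sketch of how they travel on the wire may help. DHCP options are sent as a one-byte code, a one-byte length and then the value; 66 is the TFTP server name and 67 is the boot file name. The function name `encode_option` is hypothetical, and the values are simply the ones quoted in this thread.

```python
def encode_option(code: int, value: str) -> bytes:
    """Encode one DHCP option as code, length, value (RFC 2132 layout)."""
    data = value.encode("ascii")
    if not 1 <= len(data) <= 255:
        raise ValueError("option value must be 1 to 255 bytes long")
    return bytes([code, len(data)]) + data


# Values taken from this thread: FOG server address and boot file name.
boot_options = encode_option(66, "192.168.13.236") + encode_option(67, "undionly.kpxe")
print(boot_options)
```

Because the value is sent byte for byte, a stray space or a transposed letter in the boot file name ends up in the request the PXE ROM makes, so it is worth reading the entries back character by character.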



  • @george1421 Again, sorry for that! I’m usually one who measures twice and cuts once!

    We’re using Gigabyte Brix machines in our environment. The models range but all have Realtek unfortunately.

    The DHCP is Windows Server 2012 R2. Also a domain controller, though not Primary. It’s NOT the firewall, as we use a WatchGuard appliance.

    I suppose instead of wasting the devs’ time, I can try to disable RSTP or enable PortFast (FastLink) to see if that resolves the issue.


  • Moderator

    @cschneider.tech said:

    @george1421

    Sorry George! Turns out I made a typo. Of course that’s confusing! (I’ve made the change in the original post)

    Yeah, please don’t do that. It will confuse the heck out of the devs when they look at this thread. Because I say one thing and your post says something else. They’ll think I’ve gone a bit daft.

    OK that image tells me a bunch and now I understand the IP assignment. (Side note: hopefully your FOG server has a static address, or bad things will happen if it moves in the future.)

    Something else that strikes me is that you have a Realtek NIC. This may not be important quite yet, but it is worth noting because there are Realtek-specific boot files for iPXE. But let’s not go there quite yet.

    What is your DHCP server? Is it a Microsoft Windows DHCP server, or are you using FOG as your DHCP server (not likely since your FOG server and DHCP server have different IP addresses), or is your DHCP server your firewall (also not likely because of the IP addresses)?



  • @george1421

    Sorry George! Turns out I made a typo. Of course that’s confusing! (I’ve made the change in the original post)

    The DHCP server is 192.168.13.203
    The FOG server is 192.168.13.236
    The Gateway (firewall) is 192.168.13.1

    I tried to boot once more a little bit ago and took a screenshot. Here it is (only change is the client IP had a different lease):

    [Screenshot: 0_1453512650585_IMG_1637.PNG]


  • Moderator

    OK let’s back up a bit.

    What is the address of your DHCP server? (Ignoring what the screen says, I need to know what is physically there.)
    Are you using FOG as your DHCP server or some other device? If it is some other device, what is it?


  • Moderator

    @cschneider.tech Just to be clear don’t include the greater than, less than symbols. That was just for the logical name.

    Please try it again with just the IP address there.

    Now you have me 100% confused here. The FOG server is 192.68.13.236? How the heck can the FOG server be at that address and DHCP hand out that same address to a different computer (the computer you are trying to PXE boot)?



  • @george1421 said:

    From your screen shot above the PXE boot ROM is picking up the IP address [192.68.13.236] but it’s not able to open the file from the TFTP (FOG) server.

    Your second attempt to tftp is confusing me a bit. What is the address of the FOG server? It can’t be [192.68.13.236] because that is what the PXE boot ROM picked up from the [192.168.13.203] DHCP server.

    This command should be like this
    tftp <fog_server_ip> GET undionly.kpxe

    If this fails then you have something messed up in the fog server configuration.

    192.168.13.236 is in fact the IP of the FOG server. It’s not explicitly reserved in DHCP, however. I wonder why the client IP is 192.68?! I don’t understand why it’s pulling .68 and not .168?!

    When I run command tftp <192.168.13.236> GET undionly.kxpe it returns “The system cannot find the file specified”

    Is it trying to find the file at the root directory? Shouldn’t I be telling the command WHERE the file lives, e.g. the full path to the ipxe folder?


  • Moderator

    @cschneider.tech said:

    Thanks for your guys’ replies!

    Spanning tree is not required but I would like to keep it there as it’s incredibly helpful. I’m using RSTP not STP. Port Fast (Fast Link in Netgear terms) is definitely an option. I can try to enable this on the port that the FOG server is connected to but I think it’s all or nothing.

    It really does sound like STP is the culprit here! Getting closerrrr!!!

    It could be STP or something else. It is a good idea to use RSTP anyway; I’ve seen MS Windows 7 boxes not pick up the IP address because of the timing with STP. RSTP has no real value on the FOG server (in this issue); the target computer NEEDS this turned on because the link is dropped several times during booting. The link stays up on the FOG server 100% of the time.



  • Thanks for your guys’ replies!

    Spanning tree is not required but I would like to keep it there as it’s incredibly helpful. I’m using RSTP not STP. Port Fast (Fast Link in Netgear terms) is definitely an option. I can try to enable this on the port that the FOG server is connected to but I think it’s all or nothing.

    It really does sound like STP is the culprit here! Getting closerrrr!!!


  • Moderator

    From your screen shot above the PXE boot ROM is picking up the IP address [192.68.13.236] but it’s not able to open the file from the TFTP (FOG) server.

    Your second attempt to tftp is confusing me a bit. What is the address of the FOG server? It can’t be [192.68.13.236] because that is what the PXE boot ROM picked up from the [192.168.13.203] DHCP server.

    This command should be like this
    tftp <fog_server_ip> GET undionly.kpxe

    If this fails then you have something messed up in the fog server configuration.
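
For anyone who would rather test from a script than from the Windows tftp client, the same check can be sketched in Python. This is only an illustration, not a FOG tool: it sends a single TFTP read request (opcode 1) and reports whether the server answers with data, an error, or nothing at all. The function names `build_rrq`, `describe_reply` and `probe_tftp` are made up for this example, and the probe never acknowledges the data block, so it should not be mistaken for a real client.

```python
import socket
import struct


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", 1) + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def describe_reply(data: bytes) -> str:
    """Turn the first reply packet from a TFTP server into a short summary."""
    if len(data) < 4:
        return "malformed reply"
    opcode, number = struct.unpack("!HH", data[:4])
    if opcode == 3:  # DATA: the server found the file and started sending it
        return f"ok: received DATA block {number} ({len(data) - 4} bytes)"
    if opcode == 5:  # ERROR: the server is reachable but refused the request
        message = data[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"error {number}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(server: str, filename: str, port: int = 69,
               timeout: float = 5.0) -> str:
    """Send one read request and describe what, if anything, comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, port))
        try:
            data, _addr = sock.recvfrom(516)
        except socket.timeout:
            return "timeout: no reply from server"
        except OSError as exc:
            return f"network error: {exc}"
    return describe_reply(data)
```

Calling `probe_tftp("192.168.13.236", "undionly.kpxe")` from a machine on the same subnet separates the two failure modes seen in this thread: a timeout points at the network path, firewall or service, while an error reply means the server is reachable but cannot serve that file name.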


  • Moderator

    @cschneider.tech Just for reference, as the iPXE kernel and the FOG kernel boot, they drop the Ethernet link several times. If port fast or RSTP is not enabled it may take up to 20-30 seconds for the port to go into a forwarding state. By that time the FOG kernel has already booted and missed picking up the IP address. This is one reason why the device isn’t getting the IP. There are a few others, but this is a likely suspect.
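
The timing argument can be sketched as a tiny model in Python. It assumes classic STP defaults (a 15 second forward delay spent once in listening and once in learning, so about 30 seconds before the port forwards) and a purely hypothetical client retry schedule; real PXE ROMs and iPXE builds use their own timers, so treat the numbers as placeholders.

```python
from itertools import accumulate

# Classic STP default: 15 s listening + 15 s learning before forwarding.
STP_FORWARD_DELAY = 15
STP_TIME_TO_FORWARDING = 2 * STP_FORWARD_DELAY


def first_successful_send(time_to_forwarding, retry_intervals):
    """Return the time of the first request sent once the port forwards.

    The client is modelled as sending at t=0 when the link comes up and then
    again after each interval in retry_intervals (seconds). Returns None if
    every attempt is made while the port is still blocking traffic.
    """
    send_times = [0.0] + list(accumulate(retry_intervals))
    for t in send_times:
        if t >= time_to_forwarding:
            return t
    return None


# Hypothetical retry schedule: attempts at 0, 4, 12 and 28 seconds.
retries = [4, 8, 16]
print(first_successful_send(STP_TIME_TO_FORWARDING, retries))  # plain STP
print(first_successful_send(0, retries))                       # port fast
```

With these placeholder numbers every attempt under plain STP lands before the port starts forwarding, while a port that forwards immediately lets the very first request through, which is the behaviour described above.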


  • Senior Developer

    @cschneider.tech Getting the IP isn’t the issue. It’s the fact that PXE gets the IP, then iPXE has to re-acquire its own IP address. This dropping and re-enabling of the forwarding state is heavily slowed down, which causes the timeout issue.



  • Senior Developer

    Since you have spanning tree on, are you required to have it? If so, can you enable Rapid STP or port-fast?

