    PXE boot problems, TFTP, No configuration methods succeeded

    Scheduled Pinned Locked Moved Solved FOG Problems
    20 Posts 3 Posters 2.6k Views
    • C
      c70m83
      last edited by

      Hello,
      I have some problems that are driving me crazy.
      My setup:
      UCS Univention server: DHCP, DNS with some virtual machines (10.0.2.2)
      Debian-based FOG Server 1.5.8 running as a VM on the UCS server (10.0.32.180)
      Spanning Tree disabled on all switches.

      My task: Capture images from ~40 Lenovo ThinkCentre M92p (Windows 7) and deploy Windows 10 images to all of them.

      I have been working on this for several days and have read a lot on this forum, but things are getting confusing.
      It works easily on some machines, such as PC01, PC04, PC08, PC12, PC18 and PC43. I captured an image, deployed another one, and let them join the domain.
      But there is some strange behavior:

      1. PC24: This machine is stuck on “TFTP open timeout”. I tested TFTP on the same machine under Windows 7 and it couldn’t connect either. At the same time, PXE boot works perfectly on PC43, and so does PC08 under Windows 10. I had this problem before and a BIOS update mostly fixed it, but not here (nor on 2 other computers). I tried to use a USB boot drive to get into iPXE, but this ended with “No configuration methods succeeded”. The USB boot drive worked perfectly on PC43.

      2. PC08: As mentioned before, PXE boot, capture and deploy with FOG worked perfectly, but now this machine is stuck on “No configuration methods succeeded”. I had this problem before and got rid of it by using another network jack or disabling STP.

      3. PC12: As mentioned before, PXE boot, capture and deploy with FOG worked perfectly. Then I captured a new image for the machine and the task finished successfully. On reboot it now always gets “TFTP open timeout”. I didn’t change anything in the meantime.

      For troubleshooting I restarted the FOG server VM, the clients and the services, and checked the TFTP and DHCP config files and the file permissions on /tftpboot. Firewalls are disabled. I compared the BIOS settings and tested different ones.

      Does anyone have a clue what is going wrong with my setup?
      Thank you for any advice.

      • george1421G
        george1421 Moderator
        last edited by

        Some of your post is not clear so I have a few questions.

        1. You have ~40 computers that are M92p.
        2. They are all the same model?
        3. If they are all the same model, do I read it correctly that some work and some don’t?

        General observations
        Let’s make it clear: PXE booting is a separate process from booting into Windows. So having Windows 7 or Windows 10 installed has no impact on PXE boot.

        If you have both BIOS and UEFI computers on your network, your DHCP server must be configured for dynamic boot file names. If it isn’t and you have a static DHCP option 67, you can boot either BIOS or UEFI computers, but not both. To boot the other type, you will need to update DHCP option 67.
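The idea of a dynamic boot file name can be illustrated with a small Python sketch. It is not a DHCP server configuration, only the decision such a configuration has to make: the PXE client reports its architecture in DHCP option 93 (codes per RFC 4578), and the server picks the value for option 67 from it. The BIOS file name `undionly.kpxe` is the one seen later in this thread; the UEFI file name `ipxe.efi` and the function name `boot_file_for` are assumptions made for illustration.

```python
# Client architecture codes as reported in DHCP option 93 (RFC 4578).
BIOS_ARCH = 0            # Intel x86PC, legacy BIOS
UEFI_X64_ARCHES = {7, 9}  # EFI BC and EFI x86-64


def boot_file_for(arch_code: int) -> str:
    """Return the boot file name to hand out in DHCP option 67."""
    if arch_code == BIOS_ARCH:
        return "undionly.kpxe"  # BIOS boot file named in this thread
    if arch_code in UEFI_X64_ARCHES:
        return "ipxe.efi"       # assumed UEFI boot file name
    raise ValueError(f"unhandled client architecture code {arch_code}")


if __name__ == "__main__":
    for code in (0, 7, 9):
        print(code, boot_file_for(code))
```

With a static option 67, only one of these branches can ever be served, which is why a mixed BIOS/UEFI network needs the lookup.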

        Since you mentioned Lenovo computers, I would suggest that you upgrade the computers to the latest bios release. This will do 2 things for you: 1. We have seen some very bad early release bios on Lenovo computers. Updating the bios usually corrects random behavior. 2. You will then be sure that you are on the same bios version so if one computer acts bad we can rule out the bios as the cause. Right now if your bios is all on different versions one version may work and the other not.

        Lastly, on your network switches, make sure you have one of the fast spanning tree protocols enabled (portfast, RSTP, MSTP, fast-stp). The symptom shows up during PXE booting: PXE itself works and iPXE starts on the target computer, but iPXE cannot get an IP address. Or, if you get past iPXE, the imaging environment won’t get an IP address. You can test whether it’s a spanning tree issue by placing a dumb (cheap, unmanaged) switch between the target computer and the building network switch. If that dumb switch fixes the problem, then it’s probably a spanning tree issue.

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

          • C
            c70m83
            last edited by

            Thanks for your reply!

            1. Yes, all ~40 are M92p.
            2. No. 3 different models, 5 different BIOS versions (so far; I have only touched 21 of 40).
            • I had PXE boot working with all 3 models and all 5 BIOS versions.
            • PC12 and PC08 were PXE booting, but now they don’t PXE boot anymore.

            I will only use BIOS boot for now.
            Tomorrow I will try to update all machines to the latest BIOS version. It is the same BIOS for all three models.

            To avoid spanning tree issues I have already used a dumb switch at times and disabled all STP features, but I will investigate this a little more tomorrow.

            • S
              Sebastian Roth Moderator
              last edited by

              @c70m83 Just to try and see why exactly those three (PC24, PC08 and PC12) seem problematic you might try to connect those to other ports on the switch where you know that other PCs work.

              As George said, PXE boot does not depend on Windows or any operating system. No matter what you have installed on disk PXE does work!! So if it used to work then something within your network / DHCP setup changed!

              Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

              Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

              • C
                c70m83 @george1421
                last edited by

                @george1421
                I updated to the latest BIOS version. --> No change in the problems.

                I enabled RSTP on all switches. PC12 and PC08 now work properly. The “No configuration methods succeeded” error is gone, and so is the “TFTP open timeout” on PC12.

                But PC24, PC05 and PC21 still have the “TFTP open timeout” problem. PC01-PC28 are all the same model and now all have the same BIOS version. I tested them on ports where other machines of the same type (PC13, PC18) work. I got the same error.

                And now the interesting part: when I put a dumb switch in between, the PCs that normally work when attached directly to the building switch, like PC08, PC12, PC13 and PC18, get the “No configuration methods succeeded” error.

                I think I have a network problem.

                • S
                  Sebastian Roth Moderator
                  last edited by

                  @c70m83 said in PXE boot problems, TFTP, No configuration methods succeeded:

                  When i put a dumb switch in between. The normally working and direct to the building switch attached PCs like PC08,12,13,18 get the “No configuration methods succeeded” error.

                  What model is the building switch? Maybe it detects an intermediate switch and shuts down the port??? Don’t think I have seen this before but you never know.

                  Make sure you don’t have any loops in your network setup!!!

                  • george1421G
                    george1421 Moderator @c70m83
                    last edited by

                    @c70m83 said in PXE boot problems, TFTP, No configuration methods succeeded:

                    But PC24 and PC05, PC21 has still the “TFTP open timeout” problem.

                    OK good you have eliminated quite a few variables in your network. So lets focus on PC05, PC21, and PC24, with the understanding that PC01-PC28 are all the same model.

                    Are these computers in the same firmware mode (BIOS vs UEFI)? If the FOG server and PC05 are plugged into known good network ports (ones where one of the other computers PXE booted successfully), can you grab a pcap of the booting process?

                    • C
                      c70m83 @george1421
                      last edited by

                      @george1421
                      All computers are set to BIOS mode. I applied the same BIOS settings everywhere.
                      I ran Wireshark during the boot of PC05.
                      44:37:e6:b8:8b:7c with IP 10.0.5.5 is PC05, which is trying to PXE boot.
                      10.0.2.2 is the DHCP server
                      10.0.32.180 is the FOG server
                      PC05PXEboot_faild.pcap

                      @Sebastian-Roth said in PXE boot problems, TFTP, No configuration methods succeeded:

                      What model is the building switch? Maybe it detects an intermediate switch and shuts down the port??? Don’t think I have seen this before but you never know.
                      Make sure you don’t have any loops in your network setup!!!
                      We use three Enterasys B5G124-48P2 and one Netgear GS724Tv4. I checked every network jack in the building but couldn’t find any loop. I hope I didn’t miss one.

                      • george1421G
                        george1421 Moderator @c70m83
                        last edited by

                        @c70m83 Well looking at the packet capture I see 4 dhcp servers giving offers, and 3 giving acks. I haven’t done a deep dive on the pcap yet, but I would say that is a bit suspicious.

                        dhcp servers
                        10.0.2.1, 10.0.2.2, 10.0.2.3, 10.10.0.1

                        • george1421G
                          george1421 Moderator @c70m83
                          last edited by

                          @c70m83 looking a bit deeper.

                          10.0.2.1, 10.0.2.2, 10.0.2.3 appear to be the same device. They are telling the client the same story. If I had to guess, 10.0.2.1 is a VIP address shared by 10.0.2.2 and 10.0.2.3. This cluster is giving out a next server of 10.0.32.180 and a boot file name of undionly.kpxe. This is what I would normally expect.

                          10.10.0.1, on the other hand, is giving out possibly bad information for your network. It’s giving out a next server {boot-server} of 172.23.56.254 with no boot file name. It’s saying the DHCP server is 172.23.56.254, with itself as the default router. If I had to guess, this is a SOHO router.

                          While I can’t say whether it’s 10.10.0.1 causing the problem, I would try unplugging it from your business network to see if that DHCP server is confusing your clients. I would start with that.
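The comparison described above can be sketched in Python. The offer values below are hand-transcribed from this post, not parsed from the pcap, and the names `Offer` and `suspicious_offers` are hypothetical helpers for illustration. The check simply flags any DHCP server whose next-server or boot file differs from what the FOG setup is expected to hand out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    server: str        # address the DHCP offer came from
    next_server: str   # boot server announced in the offer
    boot_file: str     # boot file name, empty if none was given


def suspicious_offers(offers, expected_next_server, expected_boot_file):
    """Return offers that disagree with the expected PXE boot information."""
    return [
        o for o in offers
        if o.next_server != expected_next_server
        or o.boot_file != expected_boot_file
    ]


# Values as described in the post above (transcribed by hand).
OFFERS = [
    Offer("10.0.2.1", "10.0.32.180", "undionly.kpxe"),
    Offer("10.0.2.2", "10.0.32.180", "undionly.kpxe"),
    Offer("10.0.2.3", "10.0.32.180", "undionly.kpxe"),
    Offer("10.10.0.1", "172.23.56.254", ""),
]

rogue = suspicious_offers(OFFERS, "10.0.32.180", "undionly.kpxe")
print([o.server for o in rogue])
```

A flagged server is only a candidate for investigation. Whether it actually breaks booting depends on which offer the client accepts.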

                          • C
                            c70m83 @george1421
                            last edited by

                            @george1421
                            You are right, 10.0.2.1, 10.0.2.2 and 10.0.2.3 are the same server/device.

                            I have no clue what device 10.10.0.1 is, but I will find out.
                            Thank you for diving deep into the pcap.

                            • C
                              c70m83 @george1421
                              last edited by

                              @george1421
                              I found the 10.10.0.1 device. It was a badly configured WiFi access point.
                              PC24, PC05 and PC21 still have the “TFTP open timeout” problem.

                              This problem no longer occurs today:

                              And now the interesting part: When i put a dumb switch in between. The normally working and direct to the building switch attached PCs like PC08,12,13,18 get the “No configuration methods succeeded” error.

                              • george1421G
                                george1421 Moderator @c70m83
                                last edited by george1421

                                @c70m83 Were you able to remove 10.10.0.1 from your environment?
                                PC05 is still giving the timeout?

                                • C
                                  c70m83 @george1421
                                  last edited by

                                  @george1421
                                  Yes, I have removed that WiFi AP, but the timeout is still there.

                                  • george1421G
                                    george1421 Moderator
                                    last edited by

                                    Ok I should have given you this tutorial to follow to do the pcap from the FOG server. This will give us the unicast communications between the target computer and the FOG server: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue

                                    If you could grab another pcap with one of the impacted computers that would help with the next debug.

                                    • C
                                      c70m83 @george1421
                                      last edited by

                                      @george1421
                                      I followed the instructions and captured on the FOG server with the command tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4011

                                      pcap from power on till “TFTP open timeout” on PC05 (it is empty)
                                      output05.pcap

                                      pcap from power on till “TFTP open timeout” on PC24
                                      output24.pcap
                                      2020-04-08 18.10.31.jpg

                                      pcap from power on till FOG menu on PC18 (this is a working one)
                                      output18a.pcap

                                      10.0.2.5 is also the same device as 10.0.2.1, 10.0.2.2 and 10.0.2.3.

                                      • george1421G
                                        george1421 Moderator @c70m83
                                        last edited by

                                        @c70m83 Well initially I thought that you did something wrong, the FOG server should have seen “SOMETHING!!”.

                                        Your pcap from the working one is typical of what I would expect to see: Discover, Offer, Request, ACK (we aren’t seeing the ACK for some reason; it may be sent out unicast), then a TFTP request asking for the size, then a TFTP request asking for the file. The DHCP requests that follow are iPXE starting up and then asking for default.ipxe to load the FOG iPXE menu.

                                        So now what does this tell us? The first 2 systems are not sending out a DHCP Discover, or if they are sending out a Discover, it’s not making it to the network. Because if the FOG server isn’t seeing it, your DHCP servers aren’t either.
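The expected DHCP sequence described above can be checked with a few lines of Python. This is a minimal sketch that assumes the message types have already been read off the capture by hand (for example from Wireshark's info column). It does not parse a pcap file, and `first_missing_step` is a hypothetical helper name.

```python
# The DHCP exchange a PXE client is expected to complete, in order.
EXPECTED = ["DISCOVER", "OFFER", "REQUEST", "ACK"]


def first_missing_step(seen):
    """Return the first expected DHCP message missing from `seen`, or None.

    `seen` is the list of DHCP message types in capture order. Extra
    messages (such as several OFFERs) are tolerated.
    """
    pos = 0
    for step in EXPECTED:
        try:
            pos = seen.index(step, pos) + 1
        except ValueError:
            return step
    return None


if __name__ == "__main__":
    # Empty capture, as reported for PC05.
    print(first_missing_step([]))
    # Working capture where the ACK was not seen on the FOG server.
    print(first_missing_step(["DISCOVER", "OFFER", "REQUEST"]))
```

An empty capture fails at the very first step, which matches the conclusion that the Discover never reached the FOG server.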

                                        So if you move PC24 (or PC05) to the network connection for PC18 and compare the results, you will know whether it’s the network or the PC. If PC24 works with the network cable for PC18, then the problem is with the PC24 network port or cable itself. If PC24 doesn’t work plugged into the PC18 cable, then it’s the computer not sending out the DHCP request.

                                        I can say that we’ve seen issues with spanning tree (but not usually at this point in booting), as well as green ethernet settings in the network switch causing Realtek network adapters to not work as expected.

                                        • george1421G
                                          george1421 Moderator @c70m83
                                          last edited by

                                          @c70m83 I was thinking about this picture on the drive home. It didn’t hit me until I was almost home. How is this screen shot possible?

                                          1. This host should have been captured by the FOG server. It has an IP address, so it has to have talked with 2.1, so the FOG server should have seen that dialog. At the very least it should have seen the DISCOVER sent out by the target computer.

                                          So this means either 2.1 is not telling the target computer the proper netboot information or the network is going away after the initial dhcp transaction.

                                          Also, your DHCP server setup has me a bit confused: you have 5 IP addresses all on the same DHCP server, and they all respond to the PXE booting client. They are all saying the same thing, but I wonder why.

                                          • C
                                            c70m83 @george1421
                                            last edited by

                                            @george1421 I found a solution. I am not totally sure why the problem occurs, but now I know what I have to do to get PXE boot running. There is a self-built captive portal running on 10.0.2.2 that uses iptables to allow access to the internet. When a client logs on to the captive portal, another rule is added to the iptables PREROUTING nat chain:

                                            ACCEPT	all	--	anywhere	anywhere	MAC	44:37:E6:B8:85:78
                                            

                                            In this chain there are some standard rules above and below it.

                                            DNAT	tcp	--	anywhere	anywhere	tcp	dpt:domain to:10.0.2.2
                                            DNAT	tcp	--	anywhere	anywhere	tcp	dpt:domain to:10.0.2.2
                                            
                                            ACCEPT	all	--	anywhere	anywhere	MAC	44:37:E6:B8:85:78
                                            
                                            ACCEPT	all	--	anywhere	10.255.255.255				
                                            ACCEPT	all	--	anywhere	224.0.0.252		
                                            NFLOG	all	--	anywhere	anywhere		
                                            DOCKER	all	--	anywhere	anywhere	ADD	RTYPE match dst-type LOCAL
                                            ACCEPT	tcp	--	anywhere	anywhere	tcp	dpt:ssh
                                            DNAT	tcp	--	anywhere	anywhere	tcp	dpt:https to:10.0.2.2:443
                                            DNAT	tcp	--	anywhere	anywhere	to:	10.0.2.2:80
                                            DNAT	udp	--	anywhere	anywhere	to:	10.0.2.2:42
                                            

                                            Usually it is not a problem to reach IP addresses in the LAN if you are not logged in to the captive portal. I explicitly tested it today. If I am in Windows and am not logged in to the captive portal, I can access all other websites in the LAN in the browser, except the FOG management portal on 10.0.32.180.

                                            The boot process behaves the same way. If the computer is logged on to the captive portal, PXE boot works without any problems.
                                            If the computer is not registered on the captive portal, I always get the “TFTP open timeout” message.

                                            I just didn’t expect the captive portal to play a role in this.

                                            Many thanks to george1421 and Sebastian Roth for the help.
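One possible reading of the chain above, sketched as a toy model in Python. It assumes that rules are evaluated top to bottom, that the first match wins, and that the catch-all `DNAT udp` rule applies to TFTP traffic from clients without a per-MAC ACCEPT rule. The thread does not confirm that this is the actual cause, and the pasted listing is incomplete, so the two-rule chain here is a deliberate simplification. `Packet`, `Rule` and `traverse` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src_mac: str
    proto: str
    dst: str
    dport: int


@dataclass(frozen=True)
class Rule:
    matches: Callable[[Packet], bool]
    target: str                           # "ACCEPT" or "DNAT"
    to: Optional[Tuple[str, int]] = None  # new destination for DNAT


def traverse(chain, pkt):
    """Return (address, port) the packet is sent to after the first match."""
    for rule in chain:
        if rule.matches(pkt):
            if rule.target == "DNAT" and rule.to is not None:
                return rule.to
            return (pkt.dst, pkt.dport)   # ACCEPT: destination unchanged
    return (pkt.dst, pkt.dport)           # no rule matched


# MAC address with a captive portal rule, as shown in the listing.
REGISTERED = {"44:37:E6:B8:85:78"}

# Simplified chain: per-MAC ACCEPT first, catch-all UDP DNAT after it.
CHAIN = [
    Rule(lambda p: p.src_mac.upper() in REGISTERED, "ACCEPT"),
    Rule(lambda p: p.proto == "udp", "DNAT", ("10.0.2.2", 42)),
]

tftp_registered = Packet("44:37:E6:B8:85:78", "udp", "10.0.32.180", 69)
tftp_unregistered = Packet("44:37:e6:b8:8b:7c", "udp", "10.0.32.180", 69)

print(traverse(CHAIN, tftp_registered))
print(traverse(CHAIN, tftp_unregistered))
```

Under these assumptions a TFTP request from a registered MAC keeps its destination of 10.0.32.180:69, while one from an unregistered MAC is rewritten and never reaches the FOG server, which would look like a TFTP timeout on the client.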

                                            Copyright © 2012-2025 FOG Project