    DHCP lease timeout issue

      Adam Taylor

      Hey guys,

      I am having a problem where some machines do not receive a DHCP address within the allowed time once the kernel boots and tries to run a task.

      PXE boot gets a lease and starts iPXE (DHCP is given enough time).
      iPXE then gets a lease and loads the correct action (DHCP is given enough time).
      The kernel starts, a bunch of info goes by, and then the following happens in exactly 10 seconds:

      Starting network…
      udhcpc (v1.22.1) started
      Sending discover…
      Sending discover…
      Sending discover…
      No lease, failing
      ssh-keygen…etc, etc, etc

      This step doesn’t give DHCP enough time!

      Is there a way to increase the timeout on the DHCP requests? Our server runs the whole campus, wireless, etc. It always responds, just never within the 10 seconds that FOG gives it on this step. If the timeout were 20 seconds, I’m pretty sure this issue would go away.

      Any thoughts?

      Adam
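
As a rough check on the 10-second figure: the log shows three "Sending discover" lines before "No lease, failing". BusyBox udhcpc documents defaults of up to 3 discover packets (-t) with a 3-second pause between them (-T). Whether the FOG init runs udhcpc with those defaults is an assumption, since the thread does not show the command line. A minimal Python sketch of the arithmetic (the function name is hypothetical):

```python
def discover_window_s(retries: int = 3, pause_s: int = 3) -> int:
    """Approximate seconds udhcpc spends on discovery before giving up.

    The defaults mirror BusyBox udhcpc's documented -t (retries) and
    -T (pause between packets) defaults. That the FOG init uses them
    unchanged is an assumption, not something the thread confirms.
    """
    return retries * pause_s


print(discover_window_s())           # window with the assumed defaults
print(discover_window_s(retries=7))  # a hypothetical longer window
```

With the assumed defaults this comes to 9 seconds, in line with the roughly 10 seconds observed. Raising either value lengthens the window; 7 retries at 3 seconds would give 21 seconds.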

        Adam Taylor

        Well… I found the issue.

        When I updated FOG from SVN to fix the resize partition issue, it also installed a new kernel. This kernel seems to be the source of the issue here. I reverted to the original 1.2.0 package kernel and now it is able to boot correctly.

        What was changed between the 1.2.0 default kernel and the current SVN kernel?

        Thanks,

        Adam

          Adam Taylor

          We found out the main issue. The kernel switch alleviated it somewhat, but the issue remains.

          Spanning-tree on the network here is interfering with DHCP in FOG. It isn’t bringing the port up fast enough for the latest kernel/tools, so DHCP dies at discovery.

          Can the timeout please be lengthened for the DHCP discover requests in the latest tools? Is there any way that I can set it myself? Can you point me to where to look?

          Thanks,

          Adam

            Tom Elliott

            DHCP timeouts aren’t really something FOG controls in any form. If you’re getting iPXE menu screens, then it seems (to me) that you’re booting systems using USB NICs?

            Is this issue occurring on all of your systems or specific systems?

            Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

            Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

            Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

              Adam Taylor

              I am getting iPXE screens (but had to tweak spanning-tree even for that) because iPXE was not waiting long enough for the DHCP request to complete (iPXE waits 15 seconds; with full spanning-tree, it takes 13-20 seconds to respond). We enabled fast learning for spanning-tree and it now responds in about 9-12 seconds, which makes iPXE happy. But once it boots the kernel and starts (for a task or a registration), it does the above with the discover statements and then gives up. We timed it: that process gets 10 seconds at most, then gives up, and the task just hangs with a blinking cursor which slowly moves everything off the screen.

              I am the network person on our campus, so I can test, but we cannot turn off spanning-tree (nor would we want to; it keeps people from incorrectly connecting network cables and taking down the entire network). I tried turning it straight off for a test and it was happy and all worked. That last “Sending discover” process, however, just will not give the network enough time to “turn on”.

              Any enterprise-type network would have this issue, and I can’t see myself being the only one here.

              On .32 though, the PXE part would wait as long as needed, and when the kernel booted and did its thing, it also waited as long as needed. It just seems to be something with this kernel/tools combo that refuses to wait more than 10 seconds…

              Also, no, the NICs are PCIe x1 based, not USB, and are standard Intel-branded 1G chips.

              As for models, Dell Optiplex 755, 760, 780, 790, 990, 9020, and 9030 all show this problem. The fewer switches between the computer and the network core, the better the chance of success; the further away, the more likely spanning-tree won’t start fast enough, because there are more switches for which spanning-tree must calculate before allowing traffic on that port.

              Does that help…lol
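
The timings in this post can be lined up against each wait budget. Below is a small Python sketch that uses only the numbers quoted above: iPXE waits 15 seconds, the kernel stage about 10, and the network responds in 13-20 seconds with full spanning-tree or 9-12 seconds with fast learning. The function and variable names are made up for illustration, and treating each budget as a hard cutoff is a simplification.

```python
def outcome(budget_s, ready_min_s, ready_max_s):
    """Classify whether a wait budget covers the range of network response times."""
    if ready_max_s < budget_s:
        return "always"      # even the slowest response beats the budget
    if ready_min_s >= budget_s:
        return "never"       # even the fastest response is too late
    return "sometimes"       # depends on where in the range the response falls


# Response-time ranges (seconds) and wait budgets (seconds) as quoted in the post.
STP_MODES = {"full spanning-tree": (13, 20), "fast learning": (9, 12)}
BUDGETS = {"iPXE": 15, "kernel stage": 10}


def report():
    """Return the outcome for every (boot stage, spanning-tree mode) pair."""
    return {
        (stage, mode): outcome(budget, *ready)
        for stage, budget in BUDGETS.items()
        for mode, ready in STP_MODES.items()
    }


for (stage, mode), result in report().items():
    print(f"{stage:12} | {mode:18} | gets a lease: {result}")
```

Under this simplified model, fast learning makes the iPXE stage reliable but leaves the 10-second kernel stage succeeding only some of the time, which matches machines closer to the core doing better than those further away.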

                Tom Elliott

                If you need more time, I can put a delay after the kernel loads but before DHCP starts. Maybe that’ll help?

                  Adam Taylor

                  I’m willing to give that a try.

                    Tom Elliott

                    I’m specifying a 60-second timeout value. Really it shouldn’t take any more than 30 seconds, but I’ve seen a few cases where it may take 45 seconds. I’ll let you know when the inits are built.

                      Tom Elliott

                      SVN 2485 released.

                      This should set the timeout value to 60 seconds. It will continue on when it receives a DHCP lease or times out.
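
The behavior described here (continue as soon as a lease arrives, otherwise give up at the timeout) can be sketched as a deadline loop. This is an illustrative Python sketch, not the actual init script. The names `wait_for_lease` and `has_lease` and the one-second poll interval are assumptions; only the 60-second limit comes from the post.

```python
import time
from typing import Callable


def wait_for_lease(
    has_lease: Callable[[], bool],
    timeout_s: float = 60.0,   # the 60-second limit mentioned in the post
    poll_s: float = 1.0,       # assumed poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as has_lease() succeeds, False once timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if has_lease():
            return True        # lease obtained: continue booting immediately
        if clock() >= deadline:
            return False       # timed out: continue without a lease
        sleep(poll_s)
```

Passing `clock` and `sleep` as parameters is a design choice that lets the loop be exercised with a fake clock instead of really waiting a minute.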

                        Adam Taylor

                        This fixed it. It now gives our network enough time to actually grab a DHCP address.

                        I really appreciate your help!

                          axel12

                          Speaking of switches - I’ve noticed that the delay also lengthens with managed switches. We have Dell PowerConnect switches at my school. The floor switches are unmanaged (i.e. the plug-and-go variety), and some of the bigger switches are managed, which means they have a static IP address that you can change and can be configured with LAGs or VLANs. I’ve noticed that the more managed switches there are between the FOG client and the DHCP server, the more the waiting time increases.

                            Adam Taylor

                            That is spanning-tree making sure there are no loops before allowing data to pass on the port. All managed switches have it turned on by default (which is a good thing). All our switches on campus have it, so if you have 3-4 switches in a row that each have to learn the new path, the port takes longer to come up than it would on unmanaged switches, hence the issue we were having.

                            Copyright © 2012-2024 FOG Project