    SVN 4380 Cloud 5419 (on Ubuntu 14.04.3) Fog not consistently tftp booting from location

    • Tom Elliott
      last edited by

      Are you using the location plugin? The "Tftp enabled" option applies only to locations, and it does not truly redirect TFTP. All that option does is tell the host to get its bzImage and init.xz from its specified location. Which plugins do you have installed, and what are their relevant entries?

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

      • Malos @Tom Elliott
        last edited by Malos

        @Tom-Elliott Yes, the location plugin is installed. Having bzImage and init.xz pulled from the specified location is exactly what I want, so I’m not off my rocker (yet), even if my initial understanding of what Tftp enabled does was a bit off!

        Each storage node referenced by a given location is local to all clients in that location. Unfortunately, sometimes the client boots and pulls down bzImage and init.xz (and then the image) from the master server, which is not local to that client.

        Location plugin is the only one installed, and it lists its location as “…/lib/plugins/location/”

        • Sebastian Roth Moderator
          last edited by

          @Malos said:

          … sometimes … sometimes … sometimes …

          Are you able to reproduce under which circumstances clients boot from the right/wrong server? To me this sounds like there are several DHCP servers offering information to the clients. Sometimes they get the “correct” info first but sometimes not.
          Are you willing and able to hook a hub in front of one of your clients and capture the traffic using wireshark/tcpdump? I’d be really interested to see the packet dump. Hopefully we can figure things out this way.

          • Malos @Sebastian Roth
            last edited by

            @Sebastian-Roth

            Got something concrete that I picked up on finally!

            When a host with a pending task boots, it pulls down bzImage and init (and then the image) from the master server. If I then shut off the host halfway through the task and reboot it, it boots and pulls bzImage and init/image from the correct storage node location. Rinse and repeat: the next boot pulls from the master, the one after that from the node, and so on.

            This flipping action happens very consistently once the task has started.

            OK! Now, if I power off the host before the bzImage and init pulldown finishes (so, before the screen flashes and clears over to the imaging process itself), booting the host again pulls everything down from the same server as before. So it’s almost like something gets toggled on the database side of things, perhaps in the task itself right as the image kicks off, and that might be causing this?

            • Malos @Sebastian Roth
              last edited by

              @Sebastian-Roth
              It looks like whatever is causing the flip is toggling the taskNFSMemberID column of the task (in the tasks table) between 1 (the master server in my case) and 3 (the correct location storage node).

              I would be willing to capture a dump somehow if you feel it would be helpful, but I’m very certain that there are not multiple DHCP servers, as there’s no other noted issues in my environment.
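
One way to confirm a flip like this is to poll the column between reboots and record every change. The following is only a minimal sketch: it uses an in-memory SQLite table as a stand-in for the FOG database, the `tasks` table and `taskNFSMemberID` column come from the post above, and the `taskID` column plus both helper functions are hypothetical names chosen for illustration.

```python
import sqlite3


def record_member_id(conn, task_id, history):
    """Append the task's current taskNFSMemberID to history if it changed."""
    row = conn.execute(
        "SELECT taskNFSMemberID FROM tasks WHERE taskID = ?", (task_id,)
    ).fetchone()
    if row is not None and (not history or history[-1] != row[0]):
        history.append(row[0])
    return history


def is_flipping(history):
    """True if a value reappears after having been replaced, e.g. 1, 3, 1."""
    # history never holds the same value twice in a row, so any repeat
    # means the column went back to a node it had previously left.
    return len(history) > len(set(history))


# Stand-in database; the schema is an assumption, not FOG's real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (taskID INTEGER PRIMARY KEY, taskNFSMemberID INTEGER)"
)
conn.execute("INSERT INTO tasks VALUES (1, 1)")

history = []
# Simulate the value seen after each reboot: 1 = master, 3 = location node.
for member_id in (1, 3, 3, 1, 3):
    conn.execute(
        "UPDATE tasks SET taskNFSMemberID = ? WHERE taskID = 1", (member_id,)
    )
    record_member_id(conn, 1, history)

print(history, is_flipping(history))
```

Against a real installation the same query would run through a MySQL client instead, with the task looked up by whatever key the actual schema uses.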

              • Wayne Workman @Malos
                last edited by

                @Malos said:

                I would be willing to capture a dump somehow if you feel it would be helpful, but I’m very certain that there are not multiple DHCP servers, as there’s no other noted issues in my environment.

                https://wiki.fogproject.org/wiki/index.php/Troubleshoot_TFTP
                There are steps in there for doing a capture on the fog server.

                But, since we’re looking at DHCP specifically - you can simply do a capture with Wireshark using any computer that is connected to the same network that you’ll be booting the trouble-host on. The capturing computer will hear all of the broadcast messages on the network and that’s what Sebastian was wanting to look at.
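
Once such a capture exists, the question of whether more than one DHCP server is answering comes down to reading the offers. The sketch below assumes the UDP payloads of the DHCP packets have already been extracted from the capture (that step is not shown) and uses only the Python standard library. The option codes (53 message type, 54 server identifier) and the `siaddr` next-server field follow the DHCP/BOOTP message layout; the function names and the sample addresses are made up for illustration.

```python
import ipaddress

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
BOOTP_HEADER_LEN = 236  # fixed BOOTP fields that precede the magic cookie


def parse_dhcp_options(payload):
    """Return {option_code: raw_value} from a BOOTP/DHCP UDP payload."""
    start = BOOTP_HEADER_LEN
    if payload[start:start + 4] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP message (magic cookie missing)")
    options = {}
    i = start + 4
    while i < len(payload):
        code = payload[i]
        if code == 255:  # End option
            break
        if code == 0:  # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):  # truncated option
            break
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def offering_servers(payloads):
    """Map each DHCP server identifier seen in an OFFER to its next-server IPs."""
    servers = {}
    for payload in payloads:
        try:
            opts = parse_dhcp_options(payload)
        except ValueError:
            continue
        if opts.get(53) != b"\x02":  # message type 2 = DHCPOFFER
            continue
        server_id = opts.get(54)
        if server_id is None or len(server_id) != 4:
            continue
        next_server = ipaddress.IPv4Address(payload[20:24])  # siaddr field
        servers.setdefault(str(ipaddress.IPv4Address(server_id)), set()).add(
            str(next_server)
        )
    return servers


def fake_offer(server_ip, next_server_ip):
    """Build a minimal synthetic DHCPOFFER payload for the demo below."""
    header = bytearray(BOOTP_HEADER_LEN)
    header[0] = 2  # op = BOOTREPLY
    header[20:24] = ipaddress.IPv4Address(next_server_ip).packed
    options = (
        b"\x35\x01\x02"  # option 53, length 1, DHCPOFFER
        + b"\x36\x04" + ipaddress.IPv4Address(server_ip).packed  # option 54
        + b"\xff"  # End
    )
    return bytes(header) + DHCP_MAGIC_COOKIE + options


seen = offering_servers([
    fake_offer("10.0.0.1", "10.0.0.5"),
    fake_offer("10.0.1.1", "10.0.1.5"),
    fake_offer("10.0.0.1", "10.0.0.5"),
])
print(seen)
print("multiple DHCP servers answering:", len(seen) > 1)
```

More than one key in the result would support the multiple-DHCP-server theory; a single key with a single next-server address would point elsewhere.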

                Daily Clean Installation Results:
                https://fogtesting.fogproject.us/
                FOG Reporting:
                https://fog-external-reporting-results.fogproject.us/

                • Sebastian Roth Moderator
                  last edited by Sebastian Roth

                  @Wayne-Workman Thanks for explaining and pointing this out. You are right: we’d see all the broadcasts anyway and don’t really need a hub. But @Malos’s findings sound pretty reasonable (reproducible), and I think we’d better have a look down that alley before picking up the big gun. 😉

                  • Sebastian Roth Moderator
                    last edited by

                    I’ve poked through the code a little, and to me it seems like things might go wrong here: lib/reg-task/TaskQueue.class.php
                    But I don’t know enough about the PHP code, so I suppose @Tom-Elliott needs to have a look.

                    • Tom Elliott @Sebastian Roth
                      last edited by

                      @Sebastian-Roth I found and fixed the issue last night. Thanks for pointing that out, but this particular problem was related to the change items hook of the location plugin and the location association class. The first half of the problem was that I was trying to get the storage node from the association, which doesn’t maintain the node or group information. The other half was that the storage group was getting the list of all enabled nodes, not just the enabled nodes within its group. This should be fully fixed now.
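
The second half of that fix can be illustrated with a small sketch. This is not FOG's PHP code: the `StorageNode` class, both functions, and the group IDs are hypothetical, and only node IDs 1 (master) and 3 (location node) are taken from the thread. It shows how a group that lists every enabled node can hand a task to a node outside the group, while filtering by group membership cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNode:
    node_id: int
    group_id: int
    enabled: bool


def enabled_nodes_buggy(nodes, group_id):
    """Behaviour described as the bug: group_id is ignored entirely."""
    return [n for n in nodes if n.enabled]


def enabled_nodes_fixed(nodes, group_id):
    """Behaviour described as the fix: enabled nodes within the group only."""
    return [n for n in nodes if n.enabled and n.group_id == group_id]


# Hypothetical layout: node 1 is the master in group 1; nodes 3 and 4
# belong to the location's group 2, and node 4 is disabled.
nodes = [
    StorageNode(node_id=1, group_id=1, enabled=True),
    StorageNode(node_id=3, group_id=2, enabled=True),
    StorageNode(node_id=4, group_id=2, enabled=False),
]

buggy_ids = [n.node_id for n in enabled_nodes_buggy(nodes, 2)]
fixed_ids = [n.node_id for n in enabled_nodes_fixed(nodes, 2)]
print(buggy_ids)  # [1, 3]
print(fixed_ids)  # [3]
```

With the unfiltered list, whichever node gets picked for the location's group may be the master, which is consistent with the alternating behaviour reported earlier in the thread.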

                      • Malos @Tom Elliott
                        last edited by

                        @Tom-Elliott Confirmed, tasks are pulling down the boot files and image data from the correct node consistently, and updating nodes pulls from the newly set node as well. Awesome work, thanks!

                        Copyright © 2012-2024 FOG Project