    FOS checkin time

    • Wayne Workman

      This is a needed feature.

      The FOS check-in time should be a kernel argument that defines how often a host waiting in line to image checks in with the FOG server for an open slot.

      Right now, it’s set to 3 seconds. I recommend that the default be 30 seconds - but if this feature is implemented I’d gladly change whatever is default to 30 seconds.

      Here’s my reasoning for 30.

      Hosts boot and check in at different times; they usually don't all check in at exactly the same moment. A 30-second interval doesn't mean that when one computer finishes, it takes 30 seconds before the next one starts. It only means that COULD happen. Equally, the next computer COULD begin imaging within the next second!

      What I’m asking for is to allow the FOS checkin interval to be user-controlled from the web interface.
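A minimal sketch of how a boot script could pick up such a value from the kernel command line. The argument name `checkin_interval`, the function name, and the sample command line are all hypothetical; the thread does not define them, and this is not taken from the FOS source.

```shell
#!/bin/sh
# Hypothetical kernel argument name; the thread does not define one.
ARG_NAME="checkin_interval"
# Default proposed in the post above.
DEFAULT_INTERVAL=30

# get_checkin_interval CMDLINE
# Print the interval in seconds found in the given kernel command line,
# or the default when the argument is missing, non-numeric, or zero.
get_checkin_interval() {
    value=$(printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n "s/^${ARG_NAME}=\([0-9][0-9]*\)\$/\1/p" | head -n 1)
    if [ -n "$value" ] && [ "$value" -gt 0 ]; then
        echo "$value"
    else
        echo "$DEFAULT_INTERVAL"
    fi
}

# On a booted host the string would come from /proc/cmdline:
#   interval=$(get_checkin_interval "$(cat /proc/cmdline)")
get_checkin_interval "root=/dev/ram0 rw checkin_interval=45"
```

Falling back to a default keeps hosts that boot without the argument behaving predictably.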

      Thanks,
      Wayne

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
      Daily Clean Installation Results:
      https://fogtesting.fogproject.us/
      FOG Reporting:
      https://fog-external-reporting-results.fogproject.us/

      • Tom Elliott @Wayne Workman

        @Wayne-Workman Checkin “checks” every 5 seconds.

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

        • Tom Elliott

          I disagree that this should be a user-definable element. We keep a record of how long the queued tasks have been waiting, so that one system being unplugged or removed does not prevent the rest of the queue from moving along.

          I am working to fix the progress reporter, as I believe it is set to check in every three seconds. While it does put a bit of polling load on the server, this is typically very minimal even at 3-second intervals.

          • Wayne Workman @Tom Elliott

            @Tom-Elliott I don’t understand how the timekeeping couldn’t all be dynamically based on a user-adjustable number. How would one computer being unplugged keep others from continuing - and how is that tied to the check in time?

            I’m specifically talking about the FOS check-in time while a host waits for a slot. Even 5 seconds is a lot when there could potentially be 500 hosts in the queue. With 50 in the queue while replication to 14 nodes and one upload are happening simultaneously, MySQL is unable to keep up. Anything at all that reduces the load on MySQL would help, and making this user-definable would solve it for us.

            • ITSolutions (Testers)

              I guess I am not really understanding what this would solve. The check-in is a small packet that shouldn’t slow down the imaging process. Even with a hundred machines checking in, I don’t see where this would make much impact on the system.

              If I am wrong could you explain what purpose this would serve?

              • Wayne Workman @ITSolutions

                @ITSolutions It’s when the system is under very heavy load. The little things DO count. This is how Linux developers have been forever, little things count. Little efficiencies here and there add up. With the mindset of “Oh, that little thing, I don’t care”, soon with many of those you have a bloated inefficient system - like Windows.

                But anyway, yesterday we had 1 image capturing, 6 computers deploying, and replication of a snapin and a new image (uploaded earlier) to 14 storage nodes. To put the cherry on the cake, 50 hosts were queued waiting for image deployment.

                Those 50 hosts check for a slot every 5 seconds. The server was already under tremendous load. I think 5 seconds is excessive; I want to set a custom value of 30 (and I will). MySQL could not keep up. I was getting the “Update Schema” page left and right, intermittently, whenever I tried to do anything during this time. My custom status reporting script also quit working during this time; it kept erroring with “too many connections”. I raised the MySQL max connections to 500 and that seemed to help, but anything at all that reduces load and improves efficiency is a good thing.
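To put rough numbers on the polling rate described above, here is a small sketch. It assumes every queued host sends exactly one slot check per interval and counts nothing else; the function name is made up for illustration, and the actual cost of each check on the server is not measured anywhere in this thread.

```shell
# polls_per_minute HOSTS INTERVAL_SECONDS
# Integer number of slot checks the server receives per minute,
# assuming every queued host polls exactly once per interval.
polls_per_minute() {
    echo $(( $1 * 60 / $2 ))
}

polls_per_minute 50 5    # prints 600
polls_per_minute 50 30   # prints 100
```

Under that assumption, moving from a 5-second to a 30-second interval cuts the number of checks by a factor of six for any queue size.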

                And - I’d argue very strongly that 30 seconds is absolutely acceptable. It doesn’t mean there’s a 30 second wait between one host getting done and another starting. It means there will be a 0 - 30 second wait, and the more computers there are waiting at a time, the less chance the wait will be anywhere near 30 seconds. Plus 30 seconds is not a long time. I want a 30 second checkin time for the waiting phase.
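The 0 - 30 second argument can be sketched numerically. This assumes each waiting host's check-in phase is independent and uniformly spread over the interval, and that any waiting host may claim the free slot on its next check; under those assumptions the mean idle time is the interval divided by (hosts + 1). If the queue is strictly ordered so that only the host at the front can take the slot, the single-host figure applies instead. The function name is hypothetical.

```shell
# expected_idle_seconds WAITING_HOSTS INTERVAL_SECONDS
# Mean time until the first of N hosts polls after a slot frees up,
# assuming independent, uniformly distributed poll phases.
expected_idle_seconds() {
    LC_ALL=C awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", t / (n + 1) }'
}

expected_idle_seconds 1 30    # prints 15.0
expected_idle_seconds 12 30   # prints 2.3
expected_idle_seconds 50 30   # prints 0.6
```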

                • Wayne Workman

                  Server
                  • FOG Version: 1.3.0 RC-20 svn 6011
                  • OS: CentOS 7
                  Description

                  I’m again experiencing very high CPU usage due to Apache and MySQL being slammed.

                  I figured out this is due to a mere 12 hosts that are queued for imaging. The server doing the imaging isn’t even the main server; it’s a remote node. These 12 hosts reporting in over and over so often, with probably inefficient SQL and methodology, are killing the main server’s 8 cores.

                  Again, 12 hosts queued for imaging are doing this, maxing out an 8-core server that isn’t even the server doling out the images.

                  This sort of load makes the FOG system as a whole almost unusable. 12 hosts queued for imaging and waiting for an open slot are making a 15-server system almost unusable.

                  • Tom Elliott @Wayne Workman

                    @Wayne-Workman The issue isn’t the reporting in. There’s literally a delay of 3 seconds between check-ins. This is unlikely to be what’s causing your high load.

                    Progress is only updated for each host every 3 seconds. This is why there’s a 5 second delay on the task management page.

                    More likely, 12 hosts imaging means that you have 12 open connections to the db (by proxy of the node). The transfer of the data to the db is nearly instant (whatever the delay would be to run 12 individual SQL statements).

                    This is also unlikely to cause a high load.

                    The fact that you had a capture going (writing) and 6 deploys going (6 different reads) was a portion of what was causing the load on the server.

                    If you want to disable persistent connections (which should prevent your ‘too many connections’ issue), edit the file
                    /var/www/fog/lib/db/pdodb.class.php at line 64 and change the true to false. This would tell you quite quickly whether things are working properly. I use persistent connections in an attempt to help speed up connections to the server, as many times the data being requested is coming from a “continuous” source.

                    If you want to set a different timeout, feel free to edit the inits, specifically this file in the FOS filesystem:

                    /bin/fog.statusreporter

                    Line numbers 6 and 14.

                    Change them from:
                    usleep 3000000
                    to
                    sleep 30 or usleep 30000000, changing the 30 to whatever you want the value to be. I doubt it will help the scenario unless you disable the persistent connections, though.

                    (Either way, the connections will have to be made, but the load is not likely coming from the updating.)
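As a rough sketch of the edit described above: `usleep` takes microseconds, so 3000000 is the 3-second delay and 30000000 would be 30 seconds. The helper below is hypothetical and is demonstrated on a stand-in file with made-up contents, not on the real /bin/fog.statusreporter, whose contents are not shown in this thread. Getting at that file inside the FOS init image is also not covered here.

```shell
# set_poll_delay FILE SECONDS
# Rewrite each line ending in "usleep 3000000" in FILE to use "sleep SECONDS".
# The pattern is anchored at end of line so "usleep 30000000" is left alone.
set_poll_delay() {
    file="$1"
    seconds="$2"
    tmp="${file}.new"
    # Write via cat instead of mv so the file keeps its permissions.
    sed "s/usleep 3000000[[:space:]]*\$/sleep ${seconds}/" "$file" > "$tmp" \
        && cat "$tmp" > "$file" \
        && rm -f "$tmp"
}

# Demonstrate on a stand-in file with invented contents.
demo=$(mktemp)
printf 'while true; do\n    usleep 3000000\ndone\n' > "$demo"
set_poll_delay "$demo" 30
cat "$demo"
rm -f "$demo"
```

Matching on the text rather than on line numbers 6 and 14 is a deliberate choice for the sketch, since line numbers can shift between FOS versions.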

                    You can have a look at your /var/www/fog/service/progress.php file as well if you’d like.

                    • Wayne Workman @Tom Elliott

                      @Tom-Elliott said in FOS checkin time:

                      The fact that you had a capture going (writing) and 6 deploys going (6 different reads) was a portion of what was causing the load on the server.

                      That wasn’t the case yesterday.

                      Yesterday, no captures were going. 3 computers were imaging, 12 were waiting in queue to image. The 3 were imaging from a remote server, not even the main server. And the main server’s CPU was maxed out.

                      Copyright © 2012-2024 FOG Project