
Very slow cloning speed on specific model

    Duncan @Duncan
    last edited by Dec 5, 2019, 11:26 AM

    @Duncan

    I’ve built a few more laptops now.

    Two built straight out of the box, no kernels or inits needed.

    One had the slowness issue. On the host page I just added the kernel setting.

    Deployed the image and away it went. Full speed, building at about 8 GB/min.

      Sebastian Roth Moderator
      last edited by Dec 5, 2019, 12:42 PM

      @Duncan Can’t believe it, but if it’s the way you’re saying (and showing in the pictures) - what can I say… 🙂

      @Quazz gave me a good hint on kernel 4.10 or 4.11 introducing APST. We kind of expect this to be causing the problem. See some information on this here: https://wiki.archlinux.org/index.php/Solid_state_drive/NVMe#Power_Saving_APST

      He also just added the nvme cli tools to the FOS initrds so we could try to work on debugging more of this with more recent kernel versions.

      It’s all up to you. If you are happy with the old 4.9.51 kernel we can just leave it like that. Though I don’t think it’s a great solution.

      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

        Quazz Moderator @Duncan
        last edited by Quazz Dec 5, 2019, 8:13 AM Dec 5, 2019, 1:10 PM

        @Duncan This suggests that it is indeed a kernel issue. Interesting that it ran at all with the newer inits since they’re only slated for backwards compatibility to 4.14 I believe.

        My best guess at the moment is that the APST feature introduced in 4.10 is either the problem in its entirety or related to it somehow.

        It’s still building, but when it’s done, there will be an init available at https://dev.fogproject.org/blue/organizations/jenkins/fos/detail/master/107/artifacts

        EDIT: That build failed due to unrelated error, here is a different link https://drive.google.com/open?id=1u_HuN5NSpzb7YmQBAsrzDELteNmlWUWU

        This will include an NVME cli utility that will give some info and allow some management over the NVME device.

        I’d be interested in seeing a debug deploy on this init (use kernel 4.19 as well). If you could schedule one for a problematic host and run the following commands that would help a ton.

        sudo nvme get-feature -f 0x0c -H /dev/nvme0
        

        That will list out some info, the one I’m interested in is whether APST is enabled or not.

        If it’s enabled you can disable it by doing

        sudo nvme set-feature -f 0x0c -v=0 /dev/nvme0
        

        Then type

        fog
        

        Press enter (you’ll have to do this a couple more times until it starts partclone and such)

        I’m hoping that this will resolve the issue entirely, and if so we can add it to the inits whenever an NVMe device is detected. APST is unneeded in the FOS environment: we don’t care about the power consumption of the storage device, since it just needs to get captured or deployed and then the system takes it from there.
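
The check-then-disable sequence above could be wrapped in two small helpers. This is only a minimal sketch, not what FOS ships: the function names `current_value` and `apst_enabled` are hypothetical, and it assumes the `nvme get-feature` output contains a `Current value:0x...` field, which may differ between nvme-cli versions. It relies on bit 0 of the feature 0x0c value being the APST enable bit.

```shell
# Hypothetical helper: pull the hex value out of `nvme get-feature` output.
# ASSUMPTION: the output contains a field like "Current value:0x000001".
current_value() {
    sed -n 's/.*[Cc]urrent value: *\(0x[0-9A-Fa-f]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeeds (exit 0) when bit 0 (APST enable) is set.
apst_enabled() {
    [ $(( $1 & 1 )) -eq 1 ]
}

# Possible use in a FOS debug session (not executed here):
#   value=$(nvme get-feature -f 0x0c -H /dev/nvme0 | current_value)
#   if apst_enabled "$value"; then
#       nvme set-feature --feature-id=0x0c --value=0 /dev/nvme0
#   fi
```

The long options `--feature-id` and `--value` are used in the comment because they avoid any ambiguity about how a short option followed by `=` is parsed.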

          Duncan @Quazz
          last edited by Dec 5, 2019, 3:03 PM

          @Quazz

          Downloaded the new init_partclone and added it to the host. Set the kernel to 4.19.

          Ran the commands; APST was enabled. I disabled it and started to image.

          It then hung on Restoring Partition Tables GPT…

          Rebooted and tried again. It’s now building at 2.7 GB/min.

          One thing I did notice was that my storage nodes were on an old kernel, 4.11.0. I have now copied the latest ones over to the nodes.

            Quazz Moderator @Duncan
            last edited by Quazz Dec 5, 2019, 9:05 AM Dec 5, 2019, 3:05 PM

            @Duncan Thank you for trying it out. Very interesting results!

            Much better than before, though not quite the speed you’d expect either.

            @Sebastian-Roth What do you think? Should we investigate further?

              Duncan @Quazz
              last edited by Dec 5, 2019, 3:07 PM

              @Quazz

              I can live with these speeds. The image is only 70 GB.

              A lot faster than three weeks.

              I’m going to test this on my other sites now and see what speeds I get.

              It does seem to be that APST, though. I wonder if some have it enabled out of the box and others don’t. I’m going to run the command to check on a “working” laptop and see if it’s disabled by default.

                Quazz Moderator @Duncan
                last edited by Quazz Dec 5, 2019, 9:15 AM Dec 5, 2019, 3:14 PM

                @Duncan As far as I understand it’s only for specific drives on specific laptops (even amongst the same model), but it’s relatively widespread regardless.

                Potentially slight firmware differences or the like.

                Using that init file, you could add the command line that disables APST to images/dev/postinitscripts

                  george1421 Moderator @Quazz
                  last edited by george1421 Dec 5, 2019, 9:38 AM Dec 5, 2019, 3:38 PM

                  @Quazz Do you see any issue with just disabling it for all nvme drives? I don’t know the impact if we did. FOS Linux is not a general purpose OS so we don’t really want or need any sleep functions at all. We really want the OS and the hardware to run as fast as possible and not be concerned about any power savings.

                  You are right about the postinit scripts. If we had the raw data, I’m sure we could come up with a script to disable this function on certain detected drives or just turn it off altogether. Comments??
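
The "turn it off altogether" variant could look roughly like the sketch below. It is an illustration under assumptions, not an approved FOG script: the function name `disable_apst_all` and the `NVME_BIN` override are hypothetical, and it presumes an init that includes the nvme cli utility. It uses the same feature id (0x0c) and value (0) as the manual command earlier in the thread, written with nvme-cli's long options.

```shell
# Hypothetical helper: disable APST (feature 0x0c) on every NVMe controller
# node found in a directory (default: /dev).
# NVME_BIN lets a different nvme-cli path (or a stub) be substituted.
disable_apst_all() {
    dev_dir="${1:-/dev}"
    nvme_bin="${NVME_BIN:-nvme}"
    for ctrl in "$dev_dir"/nvme[0-9]*; do
        [ -e "$ctrl" ] || continue      # glob matched nothing
        case "${ctrl##*/}" in
            nvme*n*) continue ;;        # skip namespaces/partitions (nvme0n1, nvme0n1p1)
        esac
        "$nvme_bin" set-feature --feature-id=0x0c --value=0 "$ctrl" \
            || echo "Could not disable APST on $ctrl" >&2
    done
}

# In a postinit script this would be called once, with no arguments:
#   disable_apst_all
```

Failures are only reported, not fatal, so a drive that rejects the command would not stop the imaging task.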

                  Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

                    Duncan @Duncan
                    last edited by Dec 5, 2019, 3:42 PM

                    @Duncan

                    On a working laptop APST was enabled as well.

                    So I guess it is a firmware or slight hardware difference.

                    With APST disabled on this one again, I’m seeing speeds of 2.8 - 3.0 GB/min.

                      Quazz Moderator @george1421
                      last edited by Dec 5, 2019, 3:55 PM

                      @george1421 As far as I’m aware, all disabling APST does is lock the drive to its “highest power state”. Which for the purposes of FOS isn’t a bad choice if it would otherwise malfunction.

                      I don’t foresee a problem doing this for all NVMe devices, but of course there might be instances we are currently unaware of where it does matter for something.

                      That said, FOS only runs for a little while, so odds of it being bad are very low.

                        Sebastian Roth Moderator
                        last edited by Dec 5, 2019, 4:04 PM

                        @Duncan said in Very slow cloning speed on specific model:

                        Kernel 4.9.51 … Deployed the image and away it went. Full speed. building about 8gb/min

                        Is this all the way through or just top speed? Maybe it’s better you note down the full deploy time to compare the different situations more appropriately?!

                        latest kernel with APST disabled… Its now building at 2.7gb/min.

                        Does this really mean it’s that much slower than using the 4.9.51 kernel or is it more just a top speed thing? As I said, better we compare the time it takes to deploy the full drive.

                        @george1421 @Quazz I’d vote for disabling APST in FOS as we don’t need to save energy. The drive should go at full speed.

                          Duncan @Sebastian Roth
                          last edited by Duncan Dec 5, 2019, 10:18 AM Dec 5, 2019, 4:17 PM

                          @Sebastian-Roth

                          Definitely a difference in speeds.

                          Using bzImage 4.19 and init_partclone.xz I got an average of 3 GB/min.

                          Using bzImage-4.9.51 and init.xz it started at 7 GB/min, then dropped and hung around 6.6 GB/min (ish).

                          Both tests were on the same laptop.
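
To compare runs by total time rather than by the rate shown on screen, the rate can be converted into an approximate deploy duration. A small sketch, assuming the 70 GB image size mentioned earlier in the thread and assuming the rate holds for the whole deploy (which is exactly what is being questioned); the helper name `deploy_minutes` is hypothetical.

```shell
# Hypothetical helper: minutes needed to deploy SIZE_GB at RATE_GB_PER_MIN.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
deploy_minutes() {
    LC_ALL=C awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", size / rate }'
}

deploy_minutes 70 6.6   # prints 10.6
deploy_minutes 70 3     # prints 23.3
```

A measured wall-clock time for the full deploy remains the better comparison, since the displayed rate can change during the run.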

                            george1421 Moderator @Sebastian Roth
                            last edited by george1421 Dec 5, 2019, 11:03 AM Dec 5, 2019, 5:02 PM

                            @Sebastian-Roth So I’m wondering 2 things.

                            1. Before 1.5.8 comes out, could/should we create a post init script with the logic that might go into FOS Linux for 1.5.8, so we can test the impact of this proposed change? This way, if the change caused problems, deleting the script would fix it. (I know I worded that a bit funny, but the idea is to test it with an approved post init script before it’s coded into 1.5.8. So if people have this issue, we can say: place this script here and test. This would be for 1.5.7 and lower versions.)
                            2. Does the kernel parameter nvme_core.default_ps_max_latency_us=0 have any impact on shutting off this feature right at the disk level? Better/worse/nochange? If it had a positive impact then that could be integrated into the post init script and then into FOS Linux 1.5.8.

                              Sebastian Roth Moderator
                              last edited by Dec 6, 2019, 7:47 AM

                              @george1421 Yes, good points:

                              1. It’s a good idea to provide a post init script right now for people to test. I am not exactly sure which part is doing it. I think it’s nvme set-feature -f 0x0c -v=0 /dev/nvme0, right? @Duncan @Quazz - Would you like to help testing as well, @oleg-knysh?
                              2. I have thought about the nvme_core.default_ps_max_latency_us parameter as well. Not sure if it does the same thing?! Probably a bit different, but it might have the same outcome. The parameter is mentioned in the Arch Linux wiki page I posted earlier. @Duncan Would you please test this kernel parameter for us on that problematic laptop? Go to the host’s settings in the web UI and set nvme_core.default_ps_max_latency_us=0 as Kernel Parameter, using the default kernel (4.15.x). See what speed you get. Then try nvme_core.default_ps_max_latency_us=5500 (as described in the wiki), also using the default kernel. Thanks!

                                Duncan @Sebastian Roth
                                last edited by Duncan Dec 6, 2019, 6:45 AM Dec 6, 2019, 11:35 AM

                                @Sebastian-Roth

                                OK, so I ran some tests. I hope they make sense to you all.

                                These were all run on the same original slow laptop I have been using since the start.

                                Build 1:

                                Host Kernel: blank
                                Host Kernel Arguments: blank
                                Host Init: blank

                                Build speed: slow

                                Build 2:

                                Host Kernel: bzImage-4.9.51
                                Host Kernel Arguments: blank
                                Host Init: blank

                                Build speed: 6.5 - 7 GB/min (ish)

                                Build 3:

                                Host Kernel: bzImage
                                Host Kernel Arguments: nvme_core.default_ps_max_latency_us=0
                                Host Init: blank

                                nvme_core.default_ps_max_latency_us=0 - not a valid identifier
                                Build speed: fast, 6.5 - 7 GB/min (ish)

                                Build 4:

                                Host Kernel: bzImage-4.15.2
                                Host Kernel Arguments: nvme_core.default_ps_max_latency_us=0
                                Host Init: blank

                                nvme_core.default_ps_max_latency_us=0 - not a valid identifier
                                Build speed: fast, 6.5 - 7 GB/min (ish)

                                Build 5:

                                Host Kernel: bzImage-4.9.51
                                Host Kernel Arguments: nvme_core.default_ps_max_latency_us=0
                                Host Init: blank

                                nvme_core.default_ps_max_latency_us=0 - not a valid identifier
                                Build speed: fast, 6.5 - 7 GB/min (ish)

                                Build 6:

                                Host Kernel: bzImage
                                Host Kernel Arguments: nvme_core.default_ps_max_latency_us=5500
                                Host Init: blank

                                nvme_core.default_ps_max_latency_us=5500 - not a valid identifier
                                Build speed: slow

                                Build 7:

                                Host Kernel: bzImage-4.15.2
                                Host Kernel Arguments: nvme_core.default_ps_max_latency_us=5500
                                Host Init: blank

                                nvme_core.default_ps_max_latency_us=5500 - not a valid identifier
                                Build speed: slow

                                Build 8:

                                Host Kernel: bzImage-4.9.51
                                Host Kernel Arguments: nvme_core.default_ps_max_latency_us=5500
                                Host Init: blank

                                nvme_core.default_ps_max_latency_us=5500 - not a valid identifier
                                Build speed: fast, 6.5 - 7 GB/min (ish)

                                  george1421 Moderator @Duncan
                                  last edited by Dec 6, 2019, 12:26 PM

                                  @Duncan said in Very slow cloning speed on specific model:

                                  nvme_core.default_ps_max_latency_us=0 - not a valid identifier

                                  First of all, let me say: excellent matrix. It looks like the latency of 0 does the trick without having to use the nvme-cli command.

                                  Second, the above error message is not really an error; it’s a spurious message caused by the way FOG converts kernel parameters into variables. The kernel parameter apparently does its job, but throws that warning, which can be ignored.

                                  Again, well done with the truth table matrix. So it looks like you can go back to using the standard fog kernel but just place the kernel argument nvme_core.default_ps_max_latency_us=0 in the global kernel parameters in the FOG Configuration -> FOG Settings menu.
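
To confirm in a debug session that the argument actually reached the kernel, the boot command line can be inspected. The helper below is a hedged sketch (the name `kernel_arg_value` is hypothetical); it only parses a command-line string and prints the value of one `key=value` argument.

```shell
# Hypothetical helper: print the value of a key=value kernel argument found in
# a command-line string; returns 1 if the key is not present.
kernel_arg_value() {
    key="$1"
    cmdline="$2"
    for arg in $cmdline; do             # intentional word splitting on spaces
        case "$arg" in
            "$key"=*) printf '%s\n' "${arg#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Possible use on a booted FOS client (not executed here):
#   kernel_arg_value nvme_core.default_ps_max_latency_us "$(cat /proc/cmdline)"
#   cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
```

The second commented command reads the value the nvme_core driver is using; it assumes the kernel in use exposes that module parameter under /sys/module.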

                                    Duncan @george1421
                                    last edited by Duncan Dec 6, 2019, 6:58 AM Dec 6, 2019, 12:58 PM

                                    @george1421

                                    The setting is now in place, and my original laptop is building at the 6.5 GB/min I expected.

                                    Will set a load more off soon and report back.

                                    Again many thanks to everyone that has helped me out over the last few weeks.

                                      Sebastian Roth Moderator
                                      last edited by Dec 6, 2019, 5:42 PM

                                      @Duncan Many thanks to you too!! Great work on the testing you’ve done here, awesome. I think this has given us a great set of recipes we can give people in case they run into that issue. We might even think about sending the kernel parameter nvme_core.default_ps_max_latency_us=0 as default. @Tom-Elliott @Quazz @george1421 Do you see any issue with that?

                                      nvme_core.default_ps_max_latency_us=0 - not a valid identifier

                                      As George already said, this is not an issue but more a warning. I was hoping to find some time and fix that at some point. Will do so now.
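
The message itself is what a shell prints when asked to export a name that contains a dot, since dots are not allowed in variable names. The sketch below is not the actual FOS code; the function names are hypothetical. It illustrates one way a parser that turns kernel arguments into variables could skip names that cannot be variables, which would avoid the warning.

```shell
# Hypothetical helper: succeeds only for names usable as shell variables
# (letters, digits, underscore, not starting with a digit).
is_valid_identifier() {
    case "$1" in
        ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: export each key=value argument whose key is a valid
# identifier; arguments such as nvme_core.default_ps_max_latency_us=0 are
# left for the kernel and silently skipped here.
export_kernel_args() {
    for arg in $1; do                   # intentional word splitting on spaces
        case "$arg" in
            *=*)
                name="${arg%%=*}"
                if is_valid_identifier "$name"; then
                    export "$arg"
                fi
                ;;
        esac
    done
}
```

Skipping rather than renaming keeps the behaviour predictable: module parameters with a dot are consumed by the kernel anyway and are not needed as shell variables.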

                                        george1421 Moderator @Sebastian Roth
                                        last edited by Dec 6, 2019, 5:52 PM

                                        @Sebastian-Roth said in Very slow cloning speed on specific model:

                                        nvme_core.default_ps_max_latency_us=0

                                        I don’t see an issue with just adding it into sysctl inside FOS and not worrying about passing it. That way the variable conversion won’t have an issue. Also, since it’s an NVMe-specific kernel tweak, if NVMe isn’t used (i.e. a SATA disk) then the kernel “should” ignore it. I only say “should” because we don’t have a large enough sample population to say yes or no yet. But that is just my opinion.

                                        As I said before the OP did a great job helping us come up with a sound solution. Without having the troubled hardware in front of us it would have been impossible to find a solution.

                                        I still think adding the nvme-cli tool to FOS will add value in trying to debug issues later on too.

                                          Tom Elliott @george1421
                                          last edited by Dec 6, 2019, 6:04 PM

                                          @george1421 I agree with it all.

                                          I can’t imagine a need for latency being enabled by default. I added it to 1.6 for safety. Shouldn’t be hard to port to 1.5.x

                                          Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.
