    Extremely Slow Deploy to NVME drives

    • S
      Sebastian Roth Moderator
      last edited by george1421

      @george1421 @robbit As for the commands to get the very latest development version (called dev-branch), you need to add one more command to what was posted below:

      sudo -i
      cd /root/fogproject/bin
      git checkout dev-branch
      ./installfog.sh
      

      Leaving that one command out doesn’t break anything, but you’ll end up with the current master (1.5.7) again.

      About the slow speed. Do you have Toshiba drives? https://forums.fogproject.org/topic/13620/very-slow-cloning-speed-on-specific-model

      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

      • R
        robbit @Sebastian Roth
        last edited by

        @Sebastian-Roth

        Thank you for that. Our HP laptops were equipped with Western Digital PC SN520 NVMe SSD, but also tested with Toshiba NVMe THNSSN5256GPUK with same results. After the update, it still went to a crawl HOWEVER

        When I set the kernel parameter under FOG Configuration -> FOG Settings -> General -> Kernel args to nvme_core.default_ps_max_latency_us=5500, it started working. I was able to deploy the image to an NVMe drive via unicast.

        Test#1 w/ Fog Server 1.5.7.3

        • Same results with the NVMe drives where the rate goes to a crawl of about 10MB/min

        Test#2 w/ nvme_core.default_ps_max_latency_us=5500 + Fog 1.5.7.3

        • Deploying at a solid speed of ~5GB/min on an isolated network

        Whatever it is, the combination of both of those has fixed the issue.

        I want to thank both you @george1421 and @Sebastian-Roth for chiming in! It’s good to know there’s a very active community for this.

        • ?
          A Former User
          last edited by

          I am having this same issue. I posted about this a month ago. I tried the nvme_core.default_ps_max_latency_us=5500 kernel argument but it returned an error about it being not a valid identifier after booting. I’ve updated to 1.5.7.4. Any other suggestions?

          • george1421G
            george1421 Moderator @A Former User
            last edited by

            @Shad0wguy said in Extremely Slow Deploy to NVME drives:

            nvme_core.default_ps_max_latency_us=5500

            This needs to go into a kernel parameters field: either the global one under FOG Configuration -> FOG Settings, or the host-specific kernel parameters field.

            Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

            • K
              KSiig
              last edited by

              I’m beginning to run into the same problem, also with an HP 840 G6. I’ve tried the kernel arguments, but it just says ‘not a valid identifier’. I have a Samsung 850 EVO M.2 SATA 6Gb/s drive lying around, which I tested with, along with an HP 830 G5:

              830 + 850 EVO: Success
              830 + Original NVMe SSD: Success
              840 + Original NVMe SSD: Failure
              840 + 850 Evo: Success

              I compared the SSDs from the 840 G6 and the 830 G5, and they are the exact same model. So while it’s of course a very small sample size, it’s pretty clear that the failure has something to do with the 840 G6 combined with NVMe.

              One thing that also only happens with the 840 G6 is that it shows the following message:

              udevd[3088]: inotify_add_watch(6, /dev/nvme0n1p2, 10) failed: no such file or directory
              udevd[3089]: inotify_add_watch(6, /dev/nvme0n1p1, 10) failed: no such file or directory
              
              • S
                Sebastian Roth Moderator
                last edited by

                @KSiig said in Extremely Slow Deploy to NVME drives:

                however it just says ‘not a valid identifier’

                Please take a picture of the error and post here!!

                • M
                  Middle @Sebastian Roth
                  last edited by

                  We’re getting the same slow deployment speeds when deploying images to HP Elitebook 840 Gen 6 laptops. All other laptops are fine. Before the deploy starts (at very slow speeds), it will hang at the initial Partclone screen for 10 - 15 mins.

                  2019-10-03 12_11_02.jpg

                  @Sebastian-Roth & @george1421 - Below is a photo of the ‘not a valid identifier’ error we see. It will also hang at this stage from time to time.

                  2019-10-03 11_59_33-Untitled - Message (HTML).jpg

                  Below is the setting applied to the host in Fog.

                  2019-10-03 12_01_16-VMware Remote Console.jpg

                  We’re running CentOS 7.6 with Fog v1.5.7.4 and Kernel 5.1.16. I’ve tried with and without nvme_core.default_ps_max_latency_us=5500 set as the Host Kernel Arguments and Partclone 0.3.12 which @Quazz provided in another post.

                  The disks appear to be Toshiba NVMe kbg30zmv256g.

                  All suggestions are welcome, and I’m happy to provide any more info that might help troubleshoot.

                  • george1421G
                    george1421 Moderator @Middle
                    last edited by george1421

                    @Middle That error message (while it’s a valid error message) isn’t important in this case. After testing, the dot after nvme_core is the issue: the bash shell can’t use a name containing a dot for the variables used during image deployment. What IS important is that the Linux kernel sees that parameter and understands it. To test this, you can create a debug deployment (capture or deploy) by ticking the debug checkbox just before scheduling the task. PXE boot the target computer and, after a few screens of text, you will be dropped to the FOS Linux command prompt. Key in sysctl -a | grep nvme. If that nvme_core parameter is set, then its job is done.

                    Now with that said, in another thread (I need to locate and link here) the developers are testing a new init (FOS Linux virtual hard drive) with an updated version of partclone. The updated version of partclone along with the nvme_core kernel parameter seemed to fix the slow speeds with these specific NVMe drives.

                    Edit: here is the link in @Quazz post. Understand this is an experimental init that hasn’t been fully tested, but has shown promise on these nvme drives. https://forums.fogproject.org/topic/13620/very-slow-cloning-speed-on-specific-model/10

                    Edit2: Wait, I see from your picture you are already using the new inits because you have partclone 0.3.12. Hmmmm, there must be something else going on here.

                    • M
                      Middle @george1421
                      last edited by

                      @george1421 entered debug and the nvme_core parameter doesn’t appear to be set.

                      2019-10-03 13_28_23.jpg

                      • george1421G
                        george1421 Moderator @Middle
                        last edited by

                        @Middle make sure that sysctl exists on the FOS Linux system. Just run sysctl -a and that will print out all kernel parameters. You can pipe that to more if you want to see it a page at a time.

                        • M
                          Middle @george1421
                          last edited by

                          @george1421 sysctl exists and returns results. Using sysctl -a | more we still don’t see anything that’s setting the latency parameter.

                          • george1421G
                            george1421 Moderator @Middle
                            last edited by

                            @Middle OK, so sysctl exists. Let’s try this.

                            1. Setup a debug deploy (tick the debug checkbox when you go to schedule the task)
                            2. PXE boot the target computer.
                            3. After several screens of text, which you clear by pressing the enter key, you will be dropped to the FOS Linux command prompt.
                            4. At the FOS Linux command prompt, key in sysctl -w nvme_core.default_ps_max_latency_us=5500 and press enter
                            5. Confirm the setting is in place with sysctl -a | grep nvme
                            6. If everything checks out OK then start the FOG master script with fog

                            In debug mode you will need to press enter after every step, but imaging should proceed. Let’s see if it images correctly when the kernel parameter is entered manually. If it works, we can do an automated process later. Right now I want to see if there is a change.
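
A side note on the check in step 5, offered as a sketch rather than an official FOS procedure: nvme_core.default_ps_max_latency_us is a module parameter, and sysctl only lists entries under /proc/sys, so it may not show up there even when the kernel accepted it. Two places that should reflect it are the kernel command line (/proc/cmdline) and the module's parameter file under /sys/module. The helper names below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not part of FOS) for checking a kernel argument.

# cmdline_has_arg CMDLINE_FILE ARG
# Succeeds if ARG appears as a whole word in the given command-line file.
cmdline_has_arg() {
    # Put each whitespace-separated word on its own line, then match exactly.
    tr -s '[:space:]' '\n' < "$1" | grep -Fxq -- "$2"
}

# module_param_value SYSFS_ROOT MODULE PARAM
# Prints the current value of a module parameter if sysfs exposes it,
# and fails otherwise.
module_param_value() {
    param_file="$1/module/$2/parameters/$3"
    [ -r "$param_file" ] && cat "$param_file"
}
```

At the FOS debug prompt this could be used as `cmdline_has_arg /proc/cmdline nvme_core.default_ps_max_latency_us=5500 && echo present`, followed by `module_param_value /sys nvme_core default_ps_max_latency_us` to print the value the driver currently holds.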

                            • M
                              Middle @george1421
                              last edited by

                              @george1421 Back in the office and just tried this. We get unknown key for the latency parameter.

                              2019-10-04 12_05_38-Untitled - Message (HTML).jpg

                              grep nvme doesn’t return the value either and running fog gets stuck at the ‘Restoring Partition Tables (GPT)’ section, so didn’t get as far as Partclone. Note we’re using Kernel 5.1.16.

                              Thanks for helping out.

                              • Q
                                Quazz Moderator @Middle
                                last edited by Quazz

                                @Middle Try setting the kernel argument as a global setting instead of on the host page. (FOG Configuration -> FOG Settings -> General -> Kernel args)

                                This problem may also be resolved with SSD firmware updates if available.

                                I’d also be interested in the results of kernel args pcie_aspm=off and pcie_aspm=force (do not set the latter as global)

                                Only set one of the 3 kernel arguments.

                                This problem is caused by ASPM and how certain devices interact with it. The reason it’s a problem specifically for NVME devices is because of their PCIE connection. A lot of these drives have buggy implementations (sometimes fixed in firmware updates)
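
To see which ASPM policy the booted kernel is actually using after trying one of these arguments, the pcie_aspm module exposes a policy file in sysfs, with the active policy shown in square brackets. A minimal sketch, assuming the file is present (it may be missing on kernels built without ASPM support); the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the active ASPM policy from a policy file.
# The file lists all policies on one line and brackets the active one,
# for example: default performance [powersave] powersupersave

active_aspm_policy() {
    # Keep only the word enclosed in square brackets.
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$1"
}
```

On a live system the call would be `active_aspm_policy /sys/module/pcie_aspm/parameters/policy`.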

                                • M
                                  Middle @Quazz
                                  last edited by

                                  @Quazz No change, I’m afraid, with the pcie_aspm args (slow transfer rate). I didn’t spot any errors like we get with the latency one; however, I do receive ‘is an unknown key’ when trying to add them in debug mode. I’ve tried with both the 5.1.16 and 4.19.64 kernels.

                                  Incidentally, if I have a kernel arg set and I use debug mode, it always seems to stop at the ‘Restoring Partition Tables (GPT)’ section. Running a normal deploy at least moves on to the Partclone screen and eventually to a slow transfer rate.

                                  I’ve also installed the Sept 27th HP BIOS and Firmware pack. I’m still looking for a firmware update specifically for the disk.

                                  • Q
                                    Quazz Moderator @Middle
                                    last edited by

                                    @Middle Unfortunately, aside from the latency kernel argument there isn’t anything else we can do from our side as far as I’m aware.

                                    Unfortunately manufacturers don’t always check how their stuff works on linux…

                                    • D
                                      DeRo93 @Quazz
                                      last edited by

                                      @Quazz

                                      Same here with the “nvme_core.default_ps_max_latency_us=5500 not a valid identifier” problem…

                                      • S
                                        Sebastian Roth Moderator
                                        last edited by

                                        @DeRo93 said in Extremely Slow Deploy to NVME drives:

                                        Same here with the “nvme_core.default_ps_max_latency_us=5500 not a valid identifier” problem…

                                        The message is more a warning that the variable couldn’t be used in the FOS environment but it’s still properly setting the kernel parameter. So it should make a difference if that option is of any help in your case.
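
To illustrate why the shell complains while the kernel does not: shell variable names may contain only letters, digits and underscores, so a kernel argument whose name contains a dot cannot be exported as a variable. The sketch below is not the actual FOS code; the function names are made up, and it only shows one way a script could import kernel arguments while skipping names the shell cannot hold.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only (not taken from FOS).

# is_shell_identifier NAME
# Succeeds if NAME is a legal shell variable name.
is_shell_identifier() {
    case "$1" in
        ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# import_kernel_args "ARG ARG ..."
# Exports NAME=VALUE pairs whose names are legal, and reports the rest
# instead of failing with "not a valid identifier".
import_kernel_args() {
    for arg in $1; do
        case "$arg" in
            *=*) name=${arg%%=*}; value=${arg#*=} ;;
            *) continue ;;   # flags without a value are ignored here
        esac
        if is_shell_identifier "$name"; then
            export "$name=$value"
        else
            echo "skipped (kernel-only): $arg"
        fi
    done
}
```

Called as `import_kernel_args "$(cat /proc/cmdline)"`, this would export arguments such as pcie_aspm and only print a note for nvme_core.default_ps_max_latency_us, which the kernel still receives from the boot command line.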

                                        • D
                                          DeRo93 @Sebastian Roth
                                          last edited by

                                          @Sebastian-Roth

                                          Ah, okay. Unfortunately this did not help, and I’m not able to deploy images on the HP Elitebook 840 G6.

                                          Do you have any other suggestions :/?

                                          • Q
                                            Quazz Moderator @DeRo93
                                            last edited by

                                            @DeRo93 If available, install firmware updates, BIOS updates and such.

                                            @Developers Looking over FOS, it seems that the sector size is always assumed to be 512. Could this be involved in the slow speeds? (It could potentially cause misalignment.)

                                            Additionally, it seems the sector size isn’t always correctly reported by tools such as fdisk (possibly the hardware manufacturers’ fault; I don’t know). So even if software is generally clever enough to handle it on its own, if it assumes the wrong value, we can expect worse performance (even after deployment).
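
One way to check the alignment question on a deployed disk, as a rough sketch and not something FOS does today: sysfs reports each partition's start in 512-byte units regardless of the drive's own sector size, and reports the drive's physical block size separately, so the two can be compared directly. The function name is hypothetical, and the device names in the usage line are only examples.

```shell
#!/bin/sh
# Hypothetical helper: is a partition's start aligned to the drive's
# physical block size?

# partition_is_aligned SYSFS_BLOCK_DIR DISK PARTITION
partition_is_aligned() {
    # "start" is always expressed in 512-byte units in sysfs.
    start=$(cat "$1/$2/$3/start")
    phys=$(cat "$1/$2/queue/physical_block_size")
    # Aligned when the byte offset is a multiple of the physical block size.
    [ $(( (start * 512) % phys )) -eq 0 ]
}
```

For example, `partition_is_aligned /sys/block nvme0n1 nvme0n1p1 && echo aligned || echo misaligned`. If the drive itself reports a wrong physical block size, as suspected above, this check inherits that error.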
