    slow speed and timeout issues

    • mmoore5553

      I am having issues when I try to image a Dell 7070. It has this drive: model mz-9lq256a, part number MZ9LQ256HAJD-000D1. Imaging is slow and sometimes disconnects. If I switch the drive to a Kxg50znv256 NVMe Toshiba 256GB, everything is fast and works again. I have seen these posts, but the fix does not work.

      https://forums.fogproject.org/topic/13620/very-slow-cloning-speed-on-specific-model/20?lang=en-US&page=1

      https://forums.fogproject.org/topic/13777/extremely-slow-deploy-to-nvme-drives/17?lang=en-US&page=1

      I know this for a fact and can replicate the issue by putting back the drive it came with, the mz-9lq256a.

      Does anyone know how to fix this? I am working with Dell on getting the drives replaced but would love to find a way to work around it. We have a bunch of 7070s in stock.

      • george1421 Moderator @mmoore5553

        @mmoore5553 What version of FOG are you using to start with?

        Which of the steps from the two linked threads have you tried so far to debug this issue?

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

        • mmoore5553

          Sorry, I should have said: I tried nvme_core.default_ps_max_latency_us=5500 and nvme_core.default_ps_max_latency_us=0 in the settings.

          I updated to the latest version last night - 1.5.8.28.

          This is as far as I have gone, since it appears to be a drive issue or an issue with how the software sees the drive. I did not know what else to try.
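One way to confirm that a kernel parameter set in the FOG settings actually reached the booted kernel is to look at the kernel command line on the client (for example from a debug task). The following is a minimal sketch, not part of FOG: the function name `kernel_param_value` is hypothetical, and it assumes that when a parameter appears more than once, the last occurrence is the one that takes effect.

```shell
#!/bin/sh
# Print the value of a kernel parameter found in a command-line string.
# Usage: kernel_param_value NAME [CMDLINE]
# CMDLINE defaults to the running kernel's /proc/cmdline.
# Assumption: if NAME appears several times, the last occurrence wins.
kernel_param_value() {
    name=$1
    cmdline=${2-$(cat /proc/cmdline)}
    value=
    found=1
    for word in $cmdline; do
        case $word in
            "$name"=*) value=${word#*=}; found=0 ;;
        esac
    done
    if [ "$found" -eq 0 ]; then
        printf '%s\n' "$value"
    fi
    return "$found"   # 0 if the parameter was present, 1 otherwise
}
```

Running `kernel_param_value nvme_core.default_ps_max_latency_us` on the client prints the value the kernel was booted with, or prints nothing and returns non-zero if the parameter never made it onto the command line.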

          • george1421 Moderator

            I have an update on this. I will follow up a bit later with the status for the FOG devs.

            • Sebastian Roth Moderator

              @mmoore5553 Just to make sure we are hunting down the right rabbit… Do you have more than one drive of that mz-9lq256a model? Test on at least two or three of them to make sure this is reproducible on all of them. Also test on two or three different Dell 7070 machines to make sure!

              Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

              Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

              • george1421 Moderator

                Here is the executive brief on this issue.

                With the MZ9LQ256HAJD-000D1 installed, iPXE has trouble downloading the background image for the iPXE menu. With the Toshiba NVMe drive installed it works normally. iPXE seems to look at or need something from local storage on this 7070.

                TLDR version:
                I had an extended chat session with the OP. On the same computer, with nothing changed but the NVMe drive, the system was failing to download the background image for the iPXE menu. After about 20 seconds it had only downloaded 5% of the image (according to the video). I had the OP update the firmware to rule out the UEFI firmware being at fault.

                Next we set up booting FOS Linux from a USB flash drive. Once that was ready, the OP attempted to image with the MZ9LQ256 drive. FOS did image the drive, but the OP said it was slower than normal: sustained speed was about 4.2GB/min. I had the OP repeat the same steps with the Toshiba drive, which imaged at 8.3GB/min, what the OP called normal. I then had the OP add nvme_core.default_ps_max_latency_us=0 to the kernel parameters on the USB flash drive and reimage the slow drive. This time the slow drive imaged at the normal rate of 8.3GB/min.

                So in FOS Linux the slow drive needed the latency parameter where the Toshiba drive did not. These tests were done on the same hardware with only the NVMe drive changing. It appears that iPXE is trying to do something with that slow disk. Once FOS Linux boots, it images fine.
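To put the two sustained rates quoted above into more familiar units, here is a small conversion sketch. It assumes the reported figures are decimal gigabytes per minute (1 GB = 1000 MB); the function name `gb_per_min_to_mb_per_s` is hypothetical.

```shell
#!/bin/sh
# Convert a rate in GB/min to MB/s.
# Assumption: decimal units, 1 GB = 1000 MB.
gb_per_min_to_mb_per_s() {
    awk -v rate="$1" 'BEGIN { printf "%.1f\n", rate * 1000 / 60 }'
}

gb_per_min_to_mb_per_s 4.2   # slow drive without the latency parameter
gb_per_min_to_mb_per_s 8.3   # Toshiba drive, or slow drive with the parameter
```

Under that assumption the slow drive ran at roughly 70 MB/s against roughly 138 MB/s for the normal case, so the missing parameter cost about half the throughput.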

                • mmoore5553 @Sebastian Roth

                  @Sebastian-Roth This happened on every 7070 we had. They all had the same drive. If I switched to the bigger NVMe drive, everything worked as expected. I worked with george to try to track down more issues.

                  I am willing to test in any way that you want me to. I can be a guinea pig.

                  If you want to see the videos that I took, I will be more than happy to share them with you. I can take more if needed.

                  I am more than happy to make a donation.

                  • george1421 Moderator @Sebastian Roth

                    @Sebastian-Roth My initial thoughts were a bit off. Initially I thought, what the heck does iPXE care about the disk subsystem? It transfers files from the FOG server to memory and then executes them…

                    Then I looked at the code and “SANBOOT” popped out. iPXE needs to init the disk subsystem in case it has to chain to booting from the local hard drive. I don’t know what more that means, other than that the disk subsystem IS one of the things it inits.

                    • Sebastian Roth Moderator

                      @george1421 As far as I understand the iPXE code, I would expect the sanboot code to be used/initialized only when you actually use the sanboot command. The PXE environment is somewhat restricted in terms of memory, so I reckon iPXE would not init stuff that is not explicitly needed.

                      Searching the forums for 7070 we seem to have a few posts on that:
                      https://forums.fogproject.org/topic/13851/massive-packet-loss-nic-issues-with-new-dell-7070-ultra-in-fog/
                      https://forums.fogproject.org/topic/13933/issues-with-optiplex-7070/

                      @mmoore5553 Please take a close look at the last topic (issues with optiplex 7070) and give that a try!

                      • mmoore5553 @Sebastian Roth

                        @Sebastian-Roth @george1421

                        Good news. I had some time to experiment today and found a setting in the BIOS: under advanced configuration there is an ASPM option, and I had to disable it. It controls the handshake between the device and the PCI Express hub to determine the best ASPM mode supported by the device. Once it was disabled, everything was fast again and I could use the new drive and the onboard NIC.
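Once a Linux environment such as FOS has booted, the kernel's own ASPM policy can be read from sysfs, where the active entry is wrapped in square brackets. The sketch below only parses that string; the function name `active_aspm_policy` is hypothetical. Note the assumptions: this reflects the kernel's policy, not necessarily what the BIOS option configured, and it says nothing about the iPXE phase, which runs before any kernel is loaded.

```shell
#!/bin/sh
# Print the active policy from a pcie_aspm policy string, in which the
# active entry is wrapped in square brackets, for example:
#   [default] performance powersave powersupersave
# With no argument, read the running kernel's value from sysfs.
active_aspm_policy() {
    policies=${1-$(cat /sys/module/pcie_aspm/parameters/policy)}
    printf '%s\n' "$policies" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}
```

Calling `active_aspm_policy` with no argument on the client prints the policy currently in use. For the per-link state, the `LnkCtl` lines in `lspci -vv` output report whether ASPM is enabled or disabled on each device.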

                        • Sebastian Roth Moderator

                          @mmoore5553 Thanks for getting back to us with this information. I am sure this will be helpful for others as well. So the issue seems to be caused by some energy saving mode.

                          • mmoore5553 @Sebastian Roth

                            @Sebastian-Roth Yes. This setting is new in the upcoming models. I reached out to Dell once I found it and let them know about it. They said the newer BIOS versions will have it. They had no idea that this would cause an issue.

                            • Sebastian Roth Moderator

                              @mmoore5553 I suppose the Linux kernel developers are onto this issue as well. Possibly it’s already fixed in one of the more recent kernel lines (5.4.x or 5.6.x).

                              If you are really keen you could compile your own custom kernel using a newer version and see if it’s fixed upstream already. Just let us know if you need help with that.

                              • george1421 Moderator @Sebastian Roth

                                @Sebastian-Roth Here is a one-off kernel v5.5.3 that I created for some reason in Feb 2020… https://drive.google.com/open?id=1thopskSYJd7ueDQeFg_VT4eeNcrNHvIx

                                • Quazz Moderator @Sebastian Roth

                                  @Sebastian-Roth I mean, this seems to be the very same problematic code (or closely related to it) that has already given us various issues, primarily low speeds, which were subsequently partially (or, for some disks, fully) addressed by setting the latency kernel parameter to 0 by default.

                                  That said, as we then discovered, in some cases that is not sufficient and we had to disable ASPM using the NVMe CLI utility for the disks to work normally. I don’t believe that was ever integrated into FOS, since we don’t fully know whether it could cause issues on otherwise properly working drives.

                                  For NVMe disks specifically it’s APST (Autonomous Power State Transition), the drive-level counterpart of ASPM.

                                  sudo nvme set-feature -f 0x0c -v=0 /dev/nvme0
                                  

                                  That line should disable it (assuming disk name is nvme0)
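Before or after changing the feature, it can help to see what the running kernel is configured to do. The sketch below reads the `nvme_core` module parameter from sysfs; a value of 0 means the kernel does not enable APST. The function name `nvme_ps_max_latency` and the `SYSFS_ROOT` override are hypothetical and exist only so the function can be exercised without real hardware.

```shell
#!/bin/sh
# Print the APST latency limit (in microseconds) that the nvme_core module
# is using. A value of 0 means the kernel does not enable APST.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
nvme_ps_max_latency() {
    root=${SYSFS_ROOT-/sys}
    cat "$root/module/nvme_core/parameters/default_ps_max_latency_us"
}

# To inspect the drive's own APST setting (feature 0x0c) in readable form,
# nvme-cli offers get-feature; the device name nvme0 is an assumption:
#   sudo nvme get-feature /dev/nvme0 -f 0x0c -H
```

If `nvme_ps_max_latency` prints 0 and the drive is still slow, that would be consistent with the cases described above where the kernel parameter alone was not sufficient.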

                                  • Sebastian Roth Moderator

                                    @Quazz Good point, but it might not help in this case. If I remember correctly, @george1421 sent me a link to a video where we saw that the symptom here was that it took literally minutes to download the kernel binary when PXE booting a task, so before the kernel is even loaded.

                                    @Sebastian-Roth said:

                                    I suppose the Linux kernel developers are onto this issue as well. Possibly it’s already fixed in one of the more recent kernel lines (5.4.x or 5.6.x).

                                    Now that makes my last comment sound really stupid! 😄
                                    This is probably something we’d need to dig into with the iPXE developers. But they have seemed very busy and unresponsive in recent months, and I don’t think we’ll get very far with this.

                                    To really pin that down, one of us devs would need the mentioned hardware to test on. But I am wondering if it’s worth it, as we can’t promise to get it fixed. It could really be a firmware bug.

                                    • george1421 Moderator @Sebastian Roth

                                      @Sebastian-Roth Yes, it was on the iPXE side. During testing we booted FOS Linux off a USB flash drive and it imaged fine. Well, to be precise, it imaged fine once we added nvme_core.default_ps_max_latency_us=0 to the USB/GRUB boot parameters. So that kind of points to iPXE, the hardware, or the UEFI firmware having a conflict with this specific NVMe disk.

                                      Copyright © 2012-2024 FOG Project