Performance decrease using Hyper-V Win10 clients

    jkozee Testers
    last edited by Feb 21, 2016, 7:13 AM

    So, adding that line to 4.3.2 results in performance degradation and removing it from 4.4.2 results in performance increase. Guess all that’s left is to figure out (understand) what it actually does…

      jkozee Testers
      last edited by Feb 21, 2016, 7:47 AM

      Maybe this makes more sense: blk_queue_virt_boundary(sdevice->request_queue, sdevice->page_size - 1);

        Sebastian Roth Moderator
        last edited by Sebastian Roth Feb 21, 2016, 5:48 AM Feb 21, 2016, 11:16 AM

        @jkozee You are great man! This is what I love about open source and the people knowing how to go with it… 🙂

        I didn’t even know that there is a storage driver for Hyper-V (and VMware by the way) right in the linux kernel. And we have it enabled: CONFIG_HYPERV_STORAGE

        Looking at the scsi_device (sdevice) struct I don’t see page_size, so I don’t think your change is going to work. It probably wouldn’t compile at all. Looking through the driver code I see PAGE_SIZE used several times, so I feel like this seems OK, although I don’t know much about this particular driver!

        You just might want to get in contact with the authors of the storvsc_drv driver (line 18ff) and as well the author of the patch! Tell them that you bisected a major slowdown issue to that particular patch. I guess that their input is a lot more helpful than trying to understand the whole driver by yourself! Please keep us posted.

        Edit: Seems like the function blk_queue_virt_boundary was added only a few months back, intended for use with NVMe devices from what I can tell (I don’t think this has anything to do with you seeing slower speeds on the SSD backend though).

        Edit2: Further interesting things to read on this are here and here - infiniband driver using this as well, interesting part is:

        (Very) nice cleanup – so what’s the actual deal here? is that as long
        as we plant a slave alloc callback into out scsi host template which
        further invokes a
        blk_queue_virt_boundary(sdev->request_queue, ~MASK_4K) call, we’re
        100% safe/sure what all SGs we get meet the alignment criteria?

        Correct, the nvme driver has the same alignment constraints for its PRPs
        and uses the queue virt_boundary to have the block layer enforce the
        SG alignment.
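
To make the quoted exchange more concrete, here is a minimal user-space sketch of the kind of check a virt boundary mask implies. It is a model only, not the kernel's code: `struct sg_seg`, `gap_to_prev` and `count_requests` are hypothetical names, and the assumption is that two adjacent segments count as having a "gap" when the first does not end on the boundary or the second does not start on it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatter-gather element: a byte offset
 * into a page plus a length. This is not the kernel's struct bio_vec. */
struct sg_seg {
    unsigned long offset;
    unsigned long len;
};

/* True if placing `next` directly after `prev` would leave a gap with
 * respect to the boundary mask: either `prev` does not end on a boundary
 * or `next` does not start on one. A mask of 0 disables the check. */
static bool gap_to_prev(const struct sg_seg *prev, const struct sg_seg *next,
                        unsigned long mask)
{
    if (mask == 0)
        return false;
    return ((prev->offset + prev->len) & mask) || (next->offset & mask);
}

/* Count how many separate requests a list of segments would need in this
 * model, assuming every gap forces a new request. */
static size_t count_requests(const struct sg_seg *segs, size_t n,
                             unsigned long mask)
{
    if (n == 0)
        return 0;

    size_t requests = 1;
    for (size_t i = 1; i < n; i++) {
        if (gap_to_prev(&segs[i - 1], &segs[i], mask))
            requests++;
    }
    return requests;
}
```

In this model, three page-aligned 4096-byte segments stay in one request under a mask of 4095, while three 512-byte segments at offsets 0, 512 and 1024 end up in three requests. With a mask of 0 they stay in one. Whether that kind of splitting is what causes the slowdown discussed in this thread is not established here.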

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

          jkozee Testers @Sebastian Roth
          last edited by jkozee Feb 21, 2016, 4:33 PM Feb 21, 2016, 10:26 PM

          @Sebastian-Roth Oops, yes, page_size is not part of the sdevice struct. It would probably be more appropriate to roll back commit 81988a0e6b031bc80da15257201810ddcf989e64 anyhow. Leaving blk_queue_virt_boundary set to 0, rather than setting it to PAGE_SIZE-1, appears to fix the slowdown, but it would take some research to determine what other impact that might have. I’ll probably just revert to 4.3.2 for my VMs until I have more time to investigate the issue.

          Edit:
          In fact, it looks like “Linux Integration Services for Microsoft Hyper-V” also diverges from the bounce buffer commit: https://github.com/LIS/lis-next/blob/master/hv-rhel6.x/hv/storvsc_drv.c

          Looks like at least one of the authors of storvsc_drv.c is on the project, but not active.

            jkozee Testers
            last edited by Feb 21, 2016, 10:50 PM

            I reached out to the author of the patch. I’ll post if any new information becomes available.

              Tom Elliott @jkozee
              last edited by Tom Elliott Feb 24, 2016, 11:43 AM Feb 23, 2016, 11:03 PM

              @jkozee I’ve added a bit of code to the file that seems to be causing the issue.

              If you would like to try it, it can be downloaded at:

              http://mastacontrola.com/bzImage (64bit)

              I only built the 64 bit kernel.

              The adjusted code is:

              --- a/linux/drivers/scsi/storvsc_drv.c 2016-02-19 09:46:33.272075454 -0500
              +++ b/linux/drivers/scsi/storvsc_drv.c 2016-02-23 17:23:12.868518253 -0500
              @@ -1231,7 +1231,8 @@
                      blk_queue_rq_timeout(sdevice->request_queue, (storvsc_timeout * HZ));
               
                      /* Ensure there are no gaps in presented sgls */
              -       blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
              +    if (PAGE_SIZE - 1 < 0) blk_queue_virt_boundary(sdevice->request_queue, 0);
              +    else blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
               
                      sdevice->no_write_same = 1;
              

              This is all based on the theory that the boundary is being set to a negative number, and that this causes the slowdown. Again, it probably won’t work, but it would still be nice to know for sure.

              Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

                Tom Elliott
                last edited by Feb 24, 2016, 7:54 PM

                I’ve marked this thread as solved, as we now know this is a bug in the kernel and not a bug in FOG. Of course we can still document stuff here.

                  jkozee Testers @Tom Elliott
                  last edited by jkozee Feb 29, 2016, 10:29 AM Feb 29, 2016, 4:26 PM

                  @Tom-Elliott Sorry for not seeing this sooner. PAGE_SIZE is defined as 4096, so the mask is being set to 4095, which is the same value that iscsi_iser.c uses (~MASK_4K).
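
The arithmetic in this reply can be checked with a small user-space sketch. The `DEMO_` constants are assumptions for illustration: a 4 KiB page as stated above, and 4K constants modeled on how the iSER driver appears to define `MASK_4K`. The kernel headers are not used here.

```c
#include <stdbool.h>

/* Assumed values for illustration only: a 4 KiB page, as stated in the
 * post, and iSER-style 4K constants. All DEMO_ names are hypothetical. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_SIZE_4K   (1UL << 12)
#define DEMO_MASK_4K   (~(DEMO_SIZE_4K - 1))

/* The boundary mask the storvsc patch passes: PAGE_SIZE - 1. */
static unsigned long page_boundary_mask(void)
{
    return DEMO_PAGE_SIZE - 1;
}

/* The iSER-style spelling of the same mask: ~MASK_4K. */
static unsigned long iser_boundary_mask(void)
{
    return ~DEMO_MASK_4K;
}

/* Whether a "PAGE_SIZE - 1 < 0" test could ever be true for this page
 * size, even when the value is read as a signed number. */
static bool negative_branch_taken(void)
{
    return (long)(DEMO_PAGE_SIZE - 1) < 0;
}
```

Both spellings come out to 4095 (0xFFF), and the negative-value branch of the earlier test patch is never taken for a 4096-byte page, so under these assumptions that patch leaves the behaviour unchanged.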

                  From the notes in LIS, I suspect that setting blk_queue_virt_boundary is supposed to ensure that there are no gaps in the sg list, but they are still present, so either the bounce buffer needs to be put back in place or the gaps need to be eliminated elsewhere.

                  The patch author responded this morning and is looking into the slowdown report. I’ll post any updates as I hear them.

                    sudburr
                    last edited by Mar 11, 2016, 5:00 PM

                    Any update on this?

                    [ Standing in between extinction in the cold and explosive radiating growth ]

                      Sebastian Roth Moderator
                      last edited by Mar 14, 2016, 10:43 AM

                      @jkozee Yes, please let us know if you have any news on this! As well I’d be interested in general information on using FOG with Hyper-V! I started to work on improving the wiki documentation and it would be great if you would put in your knowledge on this topic.

                        jkozee Testers
                        last edited by Mar 17, 2016, 3:10 AM

                        No resolution on this issue yet. The author (or one of the authors) of the patch has confirmed the behavior and is investigating a kernel solution that doesn’t re-introduce the bounce buffers. No indication of how long this might take.

                          jkozee Testers @Sebastian Roth
                          last edited by Mar 17, 2016, 3:19 AM

                          @Sebastian-Roth Sure, I’ll help out if I can. Do you have links to the wiki pages you’re working on?

                            Sebastian Roth Moderator
                            last edited by Mar 17, 2016, 8:04 AM

                            @jkozee Awesome! Good to hear that you got a confirmation on this… please keep us posted. As well I sent you a chat message about the wiki stuff. Thanks!

                              Tom Elliott
                              last edited by Aug 17, 2016, 10:37 AM

                              Just tagging this once again. I realize there have been 5-6 months of “quiet” on this, but is there any news yet? My patch, as far as I can tell, isn’t working, so I’m wondering whether there has been any progress.

                                Tom Elliott
                                last edited by Tom Elliott Jan 19, 2017, 3:48 PM Jan 19, 2017, 9:39 PM

                                Replying to this topic as I too have seen a severe (at least in my eyes) degradation in the speed of resizing.

                                My original patch was just a guess at the problem. This time I decided to step outside my own train of thought and (I think) followed the 4096 rule more closely.

                                While I have no idea what the real page_size will be, it would seem to me that this SCSI storage control is intended more for NVMe and potentially the virtual SCSI spaces. On that idea, I decided to have the driver essentially run:

                                Adjusted patch work:

                                    if (PAGE_SIZE - 1 < 4096) {
                                        blk_queue_virt_boundary(sdevice->request_queue, 4096);
                                    } else {
                                        blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
                                    }
                                

                                Where my original patch work was:

                                    if (PAGE_SIZE - 1 < 0) {
                                        blk_queue_virt_boundary(sdevice->request_queue, 0);
                                    } else {
                                        blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
                                    }
                                

                                With the original patch, I never really saw an improvement in speed and chalked it up to NTFS just being a pain, or to VMWare. My working assumption was that the patch, while in place, wasn’t really helping or hurting anything.

                                To give some scope: using either the default file or the originally patched file, a VMWare system with a 50 GB Windows XP disk started taking nearly 2 minutes to resize (the device was being shrunk tenfold, from 50 GB to about 5 GB), so I figured, meh, not too bad I suppose. As this thread is about Hyper-V, I wasn’t focused on VMWare and just assumed my slowness was due to VMWare itself or the way the disk was laid out. (BOY WAS I WRONG.)

                                I decided to see if I could do anything to speed up the NTFS resize and thought about this thread for a bit. Throwing the whole idea of the original patch out the window, I asked myself which devices would really be impacted by this, from what I have seen: potentially NVMe (4k), and the SCSI volumes typically used by VMs (Hyper-V, VMWare, possibly others). On the idea that NVMe is far more important, I decided to use 4096 as the base page_size. With the “new” patch, the same system being imaged takes only about 10 seconds to resize.
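
One detail worth checking in the adjusted patch: with PAGE_SIZE at 4096 (as jkozee notes earlier in the thread), the condition `PAGE_SIZE - 1 < 4096` is always true, so the boundary is always set to 4096. If the second argument is used as a bit mask, which is an assumption based on how the `PAGE_SIZE - 1` and `~MASK_4K` values are described in this thread, then 4096 selects a single bit (0x1000) rather than the low twelve bits (0xFFF). A minimal sketch of the difference, with `crosses_mask` as a hypothetical helper:

```c
#include <stdbool.h>

/* True if `addr` has any bit set that the mask covers, i.e. the address
 * would count as "not on a boundary" for that mask. */
static bool crosses_mask(unsigned long addr, unsigned long mask)
{
    return (addr & mask) != 0;
}
```

For example, `crosses_mask(512, 4095)` is true while `crosses_mask(512, 4096)` is false, so a mask of 4096 no longer flags sub-page offsets. That may mean the adjusted patch effectively relaxes the check rather than tightening it, which would fit the speedup reported here, but that is an inference and not something confirmed in the thread.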

                                So I don’t know who we need to report this to (I’m pretty sure my assumptions aren’t very nice), but the problem is very much something in this blk_queue_virt_boundary thing.

                                  Sebastian Roth Moderator
                                  last edited by Sebastian Roth Jan 24, 2021, 5:32 PM Jan 24, 2021, 11:00 PM

                                  @jkozee @Tom-Elliott Sorry for bringing up such an old topic again. Working on moving towards the new 5.10.x kernel I was looking at the patches we still apply to our kernel. Most are part of the upstream kernel but not the fix discussed in this topic.

                                  The kernel code has changed a bit though, and I am wondering whether we would still see the slowness without our fix. Would either of you be able to replicate the issue with a 5.10.x kernel (with and without the fix)?

                                  Searching the web a little more I stumbled upon this patch that made it into the official kernel not long ago: https://patchwork.kernel.org/project/linux-input/patch/20200910143455.109293-12-boqun.feng@gmail.com/

                                  I’m not sure, but it could play a role in this case. Anyway, it would be great to see whether the issue can still be replicated with the newer kernel, without the fix.

                                  Copyright © 2012-2024 FOG Project