Performance decrease using Hyper-V Win10 clients

Bug Reports
  • T
    Tom Elliott @jkozee
    last edited by Feb 19, 2016, 8:12 PM

    @jkozee Kernel 3.3 is very old. I thought 4.3 worked and 4.4 doesn’t, so testing builds between those two should be enough to start narrowing it down.

    Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

    Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

    Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

    • J
      jkozee Testers @Tom Elliott
      last edited by jkozee Feb 19, 2016, 2:14 PM Feb 19, 2016, 8:14 PM

      @Tom-Elliott Um, yeah. That’s what I get for trying to multitask and trying to script the builds. Let me see what I’m actually doing. Sorry for wasting space here…

      • J
        jkozee Testers @Tom Elliott
        last edited by Feb 19, 2016, 8:20 PM

        @Tom-Elliott Ok, I think I scripted builds 4.3.2 to 4.3.5 and 4.4.1, but I’ll start over just to be sure. I see the config for 4.3 on the repo at r4316. Let me start there and see what I get…

        • T
          Tom Elliott @jkozee
          last edited by Feb 19, 2016, 8:24 PM

          @jkozee Look on the wiki for “build tomelliott kernel”.

          Follow the instructions there, and please test with the additional patches. You can speed up the build by adding -j $(nproc) to the make commands.

          • S
            Sebastian Roth Moderator
            last edited by Sebastian Roth Feb 19, 2016, 2:44 PM Feb 19, 2016, 8:42 PM

            @jkozee said:

            I see the config for 4.3 on the repo at r4316. Let me start there and see what I get…

            Don’t worry too much about getting the exact config Tom used for a particular version. I’d suggest using the newest config for all the builds. As far as I know (I hope this is correct), make oldconfig will prompt you on the console for any settings that are missing from the config. Obsolete ones will just be dropped.

            Also, using the same config as a base is wise for properly comparing the different kernel versions. Otherwise you end up wondering if a change in config made the difference!

            • J
              jkozee Testers @Sebastian Roth
              last edited by Feb 19, 2016, 9:23 PM

              @Sebastian-Roth
              Looks like my script wasn’t copying the .config file, so I was building with the defaults.

              I updated it, and 4.3.2 now builds and boots fine. I used the latest config, and my script runs "yes '' | make oldconfig", which accepts the default for any new option. I’ll let it build the versions I mentioned earlier and test them. I’m about out of time for now, so I’ll post the results later.

              @Tom-Elliott
              I did not have time to write a sed script to include the additional patches from the wiki, but I can do that later or apply them by hand, once I have a chance to test the scripted builds.

              Sound reasonable?

              • J
                jkozee Testers
                last edited by Feb 19, 2016, 9:37 PM

                Build script finished quicker than I expected. Looks like the slowdown was introduced between 4.3.5 and 4.4.1. I’ll look at git bisect when I can make the time.

                • S
                  Sebastian Roth Moderator
                  last edited by Feb 19, 2016, 10:14 PM

                  @jkozee Great work! I am sure you will find exactly what’s causing it and when it was introduced! bisect is your friend. 🙂

                  • J
                    jkozee Testers
                    last edited by Feb 21, 2016, 3:39 AM

                    @Sebastian-Roth and @Tom-Elliott

                    The change to the kernel is actually in the scsi driver.

                    The commit that introduced the delay is 81988a0e6b031bc80da15257201810ddcf989e64, which applies changes to drivers/scsi/storvsc_drv.c.

                    I can confirm that reverting the diff on 4.4.2 brings the performance on the hyper-v client on par with 4.3.2. I can’t speak to the commit itself, as I just blindly reverted it and didn’t spend any time on digesting the patch itself.

                    My timings on the patched 4.4.2 were 2:14 for the deploy and 18:20 for the capture. That means the deploy is 50% faster and the capture is 27% slower than in my tests for 4.3.2. @Tom-Elliott I did not include the additional patches you mentioned either, so I would need to retest both kernels under the same server conditions (and with the additional patches applied to 4.4.2) for more accurate results.

                    Hope this proves useful.

                    • J
                      jkozee Testers
                      last edited by Feb 21, 2016, 6:36 AM

                      Still seems more like the issue should be with the block device, rather than the scsi driver. Seems like it would be related to caching or block size/block alignment of the ssd.

                      • J
                        jkozee Testers
                        last edited by Feb 21, 2016, 6:55 AM

                        To me, these lines from the commit look most interesting:
                        /* Ensure there are no gaps in presented sgls */
                        blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);

                        • J
                          jkozee Testers
                          last edited by Feb 21, 2016, 7:13 AM

                          So, adding that line to 4.3.2 results in performance degradation and removing it from 4.4.2 results in performance increase. Guess all that’s left is to figure out (understand) what it actually does…

                          • J
                            jkozee Testers
                            last edited by Feb 21, 2016, 7:47 AM

                            Maybe this makes more sense: blk_queue_virt_boundary(sdevice->request_queue, sdevice->page_size - 1);

                            • S
                              Sebastian Roth Moderator
                              last edited by Sebastian Roth Feb 21, 2016, 5:48 AM Feb 21, 2016, 11:16 AM

                              @jkozee You are great man! This is what I love about open source and the people knowing how to go with it… 🙂

                              I didn’t even know that there is a storage driver for Hyper-V (and VMware, by the way) right in the Linux kernel. And we have it enabled: CONFIG_HYPERV_STORAGE

                              Looking at the scsi_device (sdevice) struct I don’t see page_size, so I don’t think your change is gonna work. It probably wouldn’t compile at all. Looking through the driver code I see PAGE_SIZE used several times, so I feel like this seems OK, although I don’t know much about this particular driver!

                              You just might want to get in contact with the authors of the storvsc_drv driver (line 18ff) and the author of the patch as well! Tell them that you bisected a major slowdown issue to that particular patch. I guess that their input is a lot more helpful than trying to understand the whole driver by yourself! Please keep us posted.

                              Edit: Seems like the function blk_queue_virt_boundary was added only a few months back, intended to be used with NVMe devices from what I can tell (I don’t think this has anything to do with you seeing slower speeds on the SSD backend though).

                              Edit2: Further interesting things to read on this are here and here - infiniband driver using this as well, interesting part is:

                              (Very) nice cleanup – so what’s the actual deal here? is that as long
                              as we plant a slave alloc callback into out scsi host template which
                              further invokes a
                              blk_queue_virt_boundary(sdev->request_queue, ~MASK_4K) call, we’re
                              100% safe/sure what all SGs we get meet the alignment criteria?

                              Correct, the nvme driver has the same alignment constraints for its PRPs
                              and uses the queue virt_boundary to have the block layer enforce the
                              SG alignment.
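
To make the "alignment criteria" in the quoted exchange concrete, here is a minimal userspace sketch of the kind of check a virt boundary mask implies. It is a simplified model, not the kernel's code: `struct seg`, `gap_to_prev` and `gap_examples` are hypothetical names, and the 4095 mask assumes 4 KiB pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatter-gather segment: where it starts
 * inside its page, and how many bytes it covers. */
struct seg {
    size_t offset;
    size_t len;
};

/* Returns true when placing `next` directly after `prev` would leave a gap
 * with respect to `mask` (for example PAGE_SIZE - 1 = 4095).
 * A mask of 0 disables the check entirely. */
bool gap_to_prev(const struct seg *prev, const struct seg *next, size_t mask)
{
    if (mask == 0)
        return false;
    /* The next segment must start on a boundary and the previous one must
     * end on a boundary; otherwise there is a hole between them. */
    return (next->offset & mask) != 0 ||
           ((prev->offset + prev->len) & mask) != 0;
}

/* A few worked cases, assuming 4 KiB pages. */
void gap_examples(void)
{
    struct seg full = { 0, 4096 };   /* one whole page */
    struct seg half = { 0, 2048 };   /* ends in the middle of a page */

    assert(!gap_to_prev(&full, &full, 4095)); /* page after page: no gap */
    assert(gap_to_prev(&half, &full, 4095));  /* previous ends mid-page */
    assert(!gap_to_prev(&half, &full, 0));    /* mask 0: never a gap */
}
```

As far as I understand it, when such a check reports a gap the block layer starts a new request instead of extending the current one, so a driver that sets the mask can end up seeing more, smaller requests than one that leaves it at 0.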

                              • J
                                jkozee Testers @Sebastian Roth
                                last edited by jkozee Feb 21, 2016, 4:33 PM Feb 21, 2016, 10:26 PM

                                @Sebastian-Roth Oops, yes, page_size is not part of the sdevice struct. It would probably be more appropriate to roll back commit 81988a0e6b031bc80da15257201810ddcf989e64 anyhow. Leaving blk_queue_virt_boundary set to 0, rather than setting it to PAGE_SIZE - 1, appears to fix the slowdown, but it would take some research to determine what other impact that might have. I’ll probably just revert to 4.3.2 for my VMs until I have more time to investigate the issue.

                                Edit:
                                In fact, it looks like “Linux Integration Services for Microsoft Hyper-V” also diverge from the Bounce buffer commit: https://github.com/LIS/lis-next/blob/master/hv-rhel6.x/hv/storvsc_drv.c

                                Looks like at least one of the authors of storvsc_drv.c is on the project, but not active.

                                • J
                                  jkozee Testers
                                  last edited by Feb 21, 2016, 10:50 PM

                                  I reached out to the author of the patch. I’ll post if any new information becomes available.

                                  • T
                                    Tom Elliott @jkozee
                                    last edited by Tom Elliott Feb 24, 2016, 11:43 AM Feb 23, 2016, 11:03 PM

                                    @jkozee I’ve added a bit of code to the found file that seems to be causing the issue.

                                    If you would like to try it, it can be downloaded at:

                                    http://mastacontrola.com/bzImage (64bit)

                                    I only built the 64 bit kernel.

                                    The adjusted code is:

                                    --- a/linux/drivers/scsi/storvsc_drv.c 2016-02-19 09:46:33.272075454 -0500
                                    +++ b/linux/drivers/scsi/storvsc_drv.c 2016-02-23 17:23:12.868518253 -0500
                                    @@ -1231,7 +1231,8 @@
                                            blk_queue_rq_timeout(sdevice->request_queue, (storvsc_timeout * HZ));
                                     
                                            /* Ensure there are no gaps in presented sgls */
                                    -       blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
                                    +    if (PAGE_SIZE - 1 < 0) blk_queue_virt_boundary(sdevice->request_queue, 0);
                                    +    else blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
                                     
                                            sdevice->no_write_same = 1;
                                    

                                    This is all based on the theory that the boundary value ends up as a negative number, and that this causes the slowdown. Again, it probably won’t work, but it would still be nice to know for sure.

                                    • T
                                      Tom Elliott
                                      last edited by Feb 24, 2016, 7:54 PM

                                      I’ve marked this thread as solved, as we now know this is a bug in the kernel and not a bug in FOG. Of course we can still document stuff here.

                                      • J
                                        jkozee Testers @Tom Elliott
                                        last edited by jkozee Feb 29, 2016, 10:29 AM Feb 29, 2016, 4:26 PM

                                        @Tom-Elliott Sorry for not seeing this sooner. PAGE_SIZE is defined as 4096, so the mask is being set to 4095, which is the same value that iscsi_iser.c uses (~MASK_4K).
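
The arithmetic can be checked in a few lines of userspace C. This is only a sketch under stated assumptions: 4 KiB pages, PAGE_SIZE having an unsigned long type, and the 4K constants defined the way I recall them from the iSER header. All `ASSUMED_*` macros and the function names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>

/* Assumptions: 4 KiB pages, and PAGE_SIZE being an unsigned long. */
#define ASSUMED_PAGE_SIZE (1UL << 12)

/* The 4K constants as recalled from the iSER header; treat as assumed. */
#define ASSUMED_SHIFT_4K 12
#define ASSUMED_SIZE_4K  (1ULL << ASSUMED_SHIFT_4K)
#define ASSUMED_MASK_4K  (~(ASSUMED_SIZE_4K - 1))

/* Mask in the storvsc style: PAGE_SIZE - 1. */
unsigned long storvsc_mask(void)
{
    return ASSUMED_PAGE_SIZE - 1;
}

/* Mask in the iSER style: ~MASK_4K. */
unsigned long long iser_mask(void)
{
    return ~ASSUMED_MASK_4K;
}

/* Mirrors the "< 0" condition from the test patch, evaluated on a signed
 * copy so that the comparison is meaningful. */
int mask_is_negative(void)
{
    return (long)storvsc_mask() < 0;
}

/* Both styles give the same mask, and it is not negative. */
void mask_examples(void)
{
    assert(storvsc_mask() == 4095UL);
    assert(iser_mask() == 4095ULL);
    assert(!mask_is_negative());
}
```

Under these assumptions the extra branch in the test kernel is never taken, so that build should behave the same as an unpatched 4.4 kernel.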

                                        From the notes in LIS, I suspect that setting blk_queue_virt_boundary is supposed to ensure that there are no gaps in the sg list, but they are still present, so the bounce buffer needs to be put back in place or the gaps need to be eliminated elsewhere.

                                        The patch author responded this morning and is looking into the slowdown report. I’ll post any updates as I hear them.

                                        • sudburrS
                                          sudburr
                                          last edited by Mar 11, 2016, 5:00 PM

                                          Any update on this?

                                          [ Standing in between extinction in the cold and explosive radiating growth ]

                                          Copyright © 2012-2024 FOG Project