@Sebastian-Roth Sure, I’ll help out if I can. Do you have links to the wiki pages you’re working on?
Posts made by jkozee
-
RE: Performance decrease using Hyper-V Win10 clients
No resolution on this issue yet. One of the patch’s authors (possibly the only one) has confirmed the behavior and is investigating a kernel solution that doesn’t re-introduce the bounce buffers. No indication of how long this might take.
-
RE: Performance decrease using Hyper-V Win10 clients
@Tom-Elliott Sorry for not seeing this sooner. PAGE_SIZE is defined as 4096, so the mask is being set to 4095, which is the same value that iscsi_iser.c uses (~MASK_4K).
From the notes in LIS, I suspect that setting blk_queue_virt_boundary is supposed to ensure that there are no gaps in the sg list, but they are still present, so either the bounce buffer needs to be put back in place or the gaps need to be eliminated elsewhere.
The patch author responded this morning and is looking into the slowdown report. I’ll post any updates as I hear them.
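To make the mask arithmetic above concrete, here is a minimal user-space sketch of what a "virt boundary" check is meant to catch. It is not the kernel's code: `struct sg_entry`, `virt_boundary_mask`, `has_gap`, and `count_gaps` are hypothetical names invented for illustration, and the gap rule (the previous segment must end on the boundary and the next must start on it) is my assumption about the intent of the comment in the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter-gather entry (not the kernel struct). */
struct sg_entry {
    uintptr_t addr; /* start address of the segment */
    size_t len;     /* segment length in bytes */
};

/* Mask for a power-of-two page size, e.g. 4096 gives 4095 (0xFFF). */
uintptr_t virt_boundary_mask(size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (uintptr_t)page_size - 1;
}

/*
 * Assumed gap rule: two adjacent segments leave a gap if the first does not
 * end on the boundary or the second does not start on it.
 * With mask == 0 both tests are always false, so no gap is ever reported.
 */
bool has_gap(const struct sg_entry *prev, const struct sg_entry *next,
             uintptr_t mask)
{
    uintptr_t prev_end = prev->addr + prev->len;
    return (prev_end & mask) != 0 || (next->addr & mask) != 0;
}

/* Count the adjacent pairs in the list that violate the boundary rule. */
size_t count_gaps(const struct sg_entry *sgl, size_t n, uintptr_t mask)
{
    size_t gaps = 0;
    for (size_t i = 1; i < n; i++) {
        if (has_gap(&sgl[i - 1], &sgl[i], mask))
            gaps++;
    }
    return gaps;
}
```

Calling `count_gaps` with `virt_boundary_mask(4096)` flags any pair where a segment stops or starts mid-page, while passing a mask of 0 never flags anything. That is only a model of the check, so it says nothing about where the slowdown itself comes from.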
-
RE: Performance decrease using Hyper-V Win10 clients
I reached out to the author of the patch. I’ll post if any new information becomes available.
-
RE: Performance decrease using Hyper-V Win10 clients
@Sebastian-Roth Oops, yes, page_size is not part of the sdevice struct. It would probably be more appropriate to roll back commit 81988a0e6b031bc80da15257201810ddcf989e64 anyhow. Leaving blk_queue_virt_boundary set to 0, rather than setting it to PAGE_SIZE-1, appears to fix the slowdown, but it would take some research to determine what other impact that might have. I’ll probably just revert to 4.3.2 for my VMs until I have more time to investigate the issue.
Edit:
In fact, it looks like “Linux Integration Services for Microsoft Hyper-V” also diverges from the bounce buffer commit: https://github.com/LIS/lis-next/blob/master/hv-rhel6.x/hv/storvsc_drv.c
Looks like at least one of the authors of storvsc_drv.c is on the project, but not active.
-
RE: Performance decrease using Hyper-V Win10 clients
Maybe this makes more sense: blk_queue_virt_boundary(sdevice->request_queue, sdevice->page_size - 1);
-
RE: Performance decrease using Hyper-V Win10 clients
So, adding that line to 4.3.2 results in performance degradation and removing it from 4.4.2 results in performance increase. Guess all that’s left is to figure out (understand) what it actually does…
-
RE: Performance decrease using Hyper-V Win10 clients
To me, these lines from the commit look most interesting:
/* Ensure there are no gaps in presented sgls */
blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
-
RE: Performance decrease using Hyper-V Win10 clients
It still seems more like the issue should be with the block device, rather than the SCSI driver. It seems like it would be related to caching or to the block size/block alignment of the SSD.
-
RE: Fatal Error: Unknown mode :: onlydebug 6303 (kernel 4.4.1)
@Tom-Elliott Thanks! Works as expected. And I see the confusion regarding this bug. I didn’t report it very clearly, and you took it to mean the result of the “fog” script within the debug session rather than the actual start of the task. I’ll take care to report more completely in the future. Thanks again for all your efforts here!
-
RE: Fatal Error: Unknown mode :: onlydebug 6303 (kernel 4.4.1)
@Tom-Elliott I’m not sure what you mean about replicating the “task cancel”, as it cancels fine for me. Unless it was from my quoting the Debug task: [Debug mode will load the boot image and load a prompt so you can run any commands you wish. When you are done, you must remember to remove the PXE file, by clicking on “Active Tasks” and clicking on the “Kill Task” button]. I’ll pull and confirm…
-
RE: Fatal Error: Unknown mode :: onlydebug 6303 (kernel 4.4.1)
@Tom-Elliott Sorry for my confusion. And I’m not trying to be dense, just confirming this is intended behavior, as it still seems counter-intuitive to me.
In version 5315, when I do a “Host->Basic Tasks->Advanced->Debug”, the client boots to a Linux shell. In the latest version the client reports the error and sits idle as it waits to reboot in 60 seconds. So, I never get to the Linux shell.
The “Debug” task still reports the same intended behavior “Debug mode will load the boot image and load a prompt so you can run any commands you wish. When you are done, you must remember to remove the PXE file, by clicking on “Active Tasks” and clicking on the “Kill Task” button.”, but doesn’t actually perform this task.
So, what should the debug task do, or is it being deprecated in favor of the new “Deploy-Debug”, “Capture-Debug”, or “Schedule task as a debug task” option for any other task? I assumed it was a separate option, when the intention is to poke around a bit and not to perform any other available task.
Thanks again and sorry if I’m being a PITA…
-
RE: Fatal Error: Unknown mode :: onlydebug 6303 (kernel 4.4.1)
@Tom-Elliott I think a variant of this bug may still exist in 6363. The error is now “Fatal Error: Unknown request type :: Null”
I get this error when booting the client after creating a “Debug” task for it. So, “Host->Basic Tasks->Advanced->Debug”, then start the client.
I don’t think “Capture-Debug” or “Deploy-Debug” has this problem (even in 6303), as I was able to get to a debug console by using those options.
-
RE: Performance decrease using Hyper-V Win10 clients
@Sebastian-Roth and @Tom-Elliott
The change to the kernel is actually in the scsi driver.
The commit that introduced the delay is 81988a0e6b031bc80da15257201810ddcf989e64, which applies changes to drivers/scsi/storvsc_drv.c.
I can confirm that reverting the diff on 4.4.2 brings the performance on the Hyper-V client on par with 4.3.2. I can’t speak to the commit itself, as I just blindly reverted it and didn’t spend any time digesting the patch.
My timings on the patched 4.4.2 were 2:14 for the deploy and 18:20 for the capture. That means the deploy is 50% faster and the capture is 27% slower than my tests for 4.3.2. @Tom-Elliott I did not include the additional patches you mentioned either, so I would need to retest both kernels under the same server conditions (and with the additional patches applied to 4.4.2) for more accurate results.
Hope this proves useful.
-
RE: Performance decrease using Hyper-V Win10 clients
Build script finished quicker than I expected. Looks like it was introduced between 4.3.5 and 4.4.1. I’ll look at git bisect when I can make the time.
-
RE: Performance decrease using Hyper-V Win10 clients
@Sebastian-Roth
Looks like my script wasn’t copying the .config file, so I was building with the defaults. I updated it, and it built 4.3.2, which boots fine now. I used the latest config and my script does “yes '' | make oldconfig”. I’ll let it build the ones I mentioned earlier and test them. I’m about out of time for now, so I’ll post the results later.
@Tom-Elliott
I did not have time to write a sed script to include the additional patches from the wiki, but I can do that later or apply them by hand, once I have a chance to test the scripted builds. Sound reasonable?
-
RE: Performance decrease using Hyper-V Win10 clients
@Tom-Elliott Ok, I think I scripted builds 4.3.2 to 4.3.5 and 4.4.1, but I’ll start over just to be sure. I see the config for 4.3 on the repo at r4316. Let me start there and see what I get…
-
RE: Performance decrease using Hyper-V Win10 clients
@Tom-Elliott Um, yeah. That’s what I get for trying to multitask and trying to script the builds. Let me see what I’m actually doing. Sorry for wasting space here…
-
RE: Performance decrease using Hyper-V Win10 clients
@Tom-Elliott Hmm. 3.3.2 built but wouldn’t boot. I got a kernel panic, not syncing VFS. I used the config from https://svn.code.sf.net/p/freeghost/code/trunk/kernel/TomElliott.config.64. Is there another one I should use for 3.3.2?
-
RE: Performance decrease using Hyper-V Win10 clients
@Tom-Elliott I kicked off a script to build the kernels. Assuming they build and boot, I’ll report my findings.