Unable to capture Windows 10 Image
-
@sudburr Thanks for the reply. I realise that I didn’t mention this, but I did move the fog server to a different virtual host and tried to capture the Windows 10 VM, and it had the same issue.
Also, one of my colleagues (about 20 minutes ago) copied the Windows 10 VM and tried to capture it to his physical fog box. I don’t have the specs for that server, but it did exactly the same as the virtual fog servers.
Also, we have been using fog for years on a virtual server (with limited speeds of 300mb/min, which is totally acceptable for our setup and how often we capture images; it’s much faster when deploying). We have only started having issues with Windows 10 Edu Pro.
-
@Notalot I’m trying to build a truth table in my head on this and I’m still not seeing the combinations.
During all FOG operations the target computer does (really) all of the work. It takes the image from the local disk, compresses it and sends it to the FOG server. The fog server takes the stream from the network adapter and writes it to disk and then manages the entire process.
For a 100Mb/s network, I would expect to see about 700MB/min transfer rates, and for 1GbE about 6GB/min to modern target hardware.
While this is a bit off point, I get about 1.2GB/min transfer rates to my FOG-Pi3 server at home. The point is the FOG server doesn’t need much horsepower during imaging.
I seem to remember a thread in the FOG forums a while ago that talked about cruddy hyper-v performance because of the disk controller selected on the target VM. It was something about IDE/SATA vs SCSI, and one gave a lot better performance than the other. But since I don’t use hyper-v, I only half remember it, and of course I can’t find it in the forums now.
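As a rough sanity check on the expected rates above, here is a minimal sketch that converts a link speed into a theoretical ceiling in MB/min. It assumes the two networks in question are 100 Mb/s and 1 Gb/s Ethernet and uses decimal units; the function name and the `efficiency` parameter are made up for illustration.

```python
def link_to_mb_per_min(link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Convert a link speed in megabits/s to megabytes/min.

    efficiency is a hypothetical fudge factor (0..1) for protocol overhead.
    """
    mb_per_s = link_mbit_per_s / 8      # 8 bits per byte
    return mb_per_s * 60 * efficiency   # 60 seconds per minute


fast_ethernet = link_to_mb_per_min(100)    # 100 Mb/s link
gigabit = link_to_mb_per_min(1000)         # 1 Gb/s link

print(f"100 Mb/s ceiling: {fast_ethernet:.0f} MB/min")
print(f"1 Gb/s ceiling:   {gigabit / 1000:.1f} GB/min")
```

The ceilings come out at 750 MB/min and 7.5 GB/min, so the roughly 700MB/min and 6GB/min expectations sit a little under the wire limit, which leaves room for overhead.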
-
For comparison, my stuff is running on a Hyper-V 2019 host, although it was the same when I had it running on Hyper-V 2016.
My development server sitting right beside me.
- a desktop i7-7700
- 32 GB RAM
- onboard Intel 1 Gb NIC
- 256 GB NVMe boot drive
- 2 TB SATA-III SSD
- 2 TB 7200rpm SATA-III HDD
- running Server 2019 Standard with Hyper-V
The FOG server VM
- disk 1 = 20 GB .vhdx which is stored on the NVMe boot drive above.
- disk 2 = the 2 TB HDD above (for /images storage)
- Gen1 machine
- 2 GB Memory
- 4 virtual processors (marginal differences going higher, noticeable difference going lower)
- Network adapter (not LEGACY, connected to the onboard Intel 1 Gb NIC)
- running CentOS 7.x minimal
- 1.4.4 Fog Server
- bzimage/32 4.15.2
The VMs I use to build images are Gen1 or Gen2, and built on the 2 TB SSD.
Captures are saved to the 2 TB HDD.
Our field servers for deployment are ancient: 11-year-old Lenovo M58s with a Pentium E2200 CPU, 2 GB memory, and 500-2000 GB HDDs, also running CentOS 7.x but on bare metal.
You really, really want to get off the legacy adapter; it’s only 100 Mb.
-
Hey sorry for the delay in getting back to you yesterday.
So I installed the same version of Windows 10 on a laptop this morning and captured it at 1.2gb/m, but it did still pause on the blocks (just not as often or for as long). It took about 10 mins for a 12gb fresh install of Windows.
My colleague lowered the CPU count on the Windows 10 VM from 12 to 4 and reran the job on his physical FOG server, and it captured in 2 hours at 250mb/m. I did the same on the virtual host, which didn’t make any change.
I’ve also just tested unteaming 1 network connection from each of the virtual hosts and setting up a dedicated switch for imaging. That also didn’t make any difference.
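The figures quoted above can be cross-checked with simple arithmetic. This is only a sketch: it assumes decimal units (1 GB = 1000 MB) and a steady rate for the whole capture, which the pausing on blocks makes unlikely, and both function names are made up for illustration.

```python
def capture_minutes(image_mb: float, rate_mb_per_min: float) -> float:
    """Time to move image_mb at a steady rate."""
    if rate_mb_per_min <= 0:
        raise ValueError("rate must be positive")
    return image_mb / rate_mb_per_min


def implied_size_mb(rate_mb_per_min: float, minutes: float) -> float:
    """Amount of data moved at a steady rate over the given time."""
    return rate_mb_per_min * minutes


# Laptop: 12 GB fresh install at 1.2 GB/min
print(f"laptop: {capture_minutes(12_000, 1_200):.0f} min")

# VM on the physical FOG server: 2 hours at 250 MB/min
print(f"VM: about {implied_size_mb(250, 120) / 1000:.0f} GB moved")
```

The laptop numbers are self-consistent (10 minutes). The two-hour VM capture at 250mb/m would correspond to about 30 GB if the rate really was steady.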
-
@Notalot said in Unable to capture Windows 10 Image:
My Colleague lowered the CPU count on the windows 10 VM from 12 to 4
How many physical cores does the host have on it?
On the vm host server where the fog server is running, what does the disk subsystem look like? Is it a raid array, ssd, nvme, hdd?
If I remember right the FOS Linux kernel is capped at 8 CPUs for some reason. So your capture/deploy will only use 8 (v)CPUs even if you give it more.
-
The virtual host has 2x 16 core processors (32 cores total). That applies both to the server hosting the VM and to the one hosting the fog server.
The disk setup is 2 raid5 arrays: the system is on 3x 15k 300gb SAS drives and the storage (where the VHD is) is on 3x 10k 1tb SAS drives.
Good to know about the CPU cap.
-
@Notalot I was concerned about over provisioning the vm host by promising more vCPUs to the vm client than the vm host had available. That is always a recipe for a crappy vm experience. But in your case that’s not it.
The 3 drive raid 5 on spinning disks is not the best solution, but at least it is better than a single spindle hdd for a vm host server. That 3 disk raid-5 probably isn’t your speed issue.
It’s still not clear if the pausing is on the target vm end or the fog server end, since they are both running under hyper-v.
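The over-provisioning concern above boils down to a simple comparison, sketched here with the numbers from this thread (12 vCPUs on the Windows 10 VM, 32 host cores). The 8 CPU cap is only what was recalled earlier in the thread and is not verified; the function and constant names are made up for illustration.

```python
FOS_CPU_CAP = 8  # recalled in the thread ("If I remember right"), not verified


def exceeds_host(vm_vcpus: int, host_cores: int) -> bool:
    """True if a single VM is promised more vCPUs than the host has cores."""
    return vm_vcpus > host_cores


def usable_cpus(vm_vcpus: int, cap: int = FOS_CPU_CAP) -> int:
    """CPUs the capture environment would actually use under the assumed cap."""
    return min(vm_vcpus, cap)


host_cores = 32   # 2x 16 core processors
win10_vcpus = 12  # original setting on the Windows 10 VM

print("over-provisioned:", exceeds_host(win10_vcpus, host_cores))
print("usable during capture:", usable_cpus(win10_vcpus))
```

With these inputs the VM is not over-provisioned, and if the cap holds, only 8 of the 12 vCPUs would be used during capture.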
-
So over the weekend I’ve been playing around. I cleared some space on our Hyper-V dev server, which has 2 SSDs in raid 0, and moved the Windows 10 VM across to it.
It has been capturing steady at 35mb/m for 10 hours. The stuttering on the block count is much better; it’s still happening, but not for very long each time.
-
@george1421 So I spoke too soon…
The image is still running but looks like it’s hung, see attached (I’ve blanked out the identifying info).
-
@Notalot Tell me about your hyper-v environment, what is the host OS for both the fog server as well as the target system? That performance is pretty bad no matter how you look at it.
-
The hosts are all Windows server 2012R2 Datacentre.
2x16 core 2.1ghz processors (AMD Opteron 6272)
64gb Ram
3x 300gb 15k sas drives in Raid 5 for the OS
3x 1tb 10k sas drives in raid 5 for the storage of the VM’s
4x 1gb network links teamed together, shared with the OS
The Dev server I’m working with:
Windows server 2012R2 Datacentre
2x 6 core 2.4ghz Processors (AMD Opteron 2431)
32gb Ram
2x 250gb SSD’s in Raid 0
2x 1gb network links teamed together, shared with the OS.
Below are the specs for the individual VM’s
-
@Notalot I have a hyper-v install on a system at our hot site for veeam replication. I’m in the process of spinning up a new Win10 install on that host. It’s a vm under vSphere running 2016 Datacenter. So this should be interesting: a Win10 vm inside a 2016 VM. I want to see what the capture rates are to my production FOG server running at my local site. I suspect the bottleneck should be the WAN link between the two servers.
Then I’ll spin up a fog server at the hot site and see what the capture rates are for the same image. I just can’t believe that hyper-v is only able to capture at less than 100MB/min. At that rate you might as well be using floppy disks…
-
@george1421 Hey, thanks. So today I accessed a known good Fog server running 1.5.4 (fog1.5.4 from now on) on hyper-v with the same specs as above (I used its settings as a guide for setting up the new ones, but it was a fresh install each time).
fog1.5.4 historically would capture at 300mb/m and would deploy images at 1.2gb/m (limited by the network connection). I’m currently uploading the image at 21mb/m (this is the same if I attempt to capture the last known good image; whilst writing this I got the rcu_sched self-detected stall error). I’ve updated its network connection to the synthetic connection and it will deploy images at 7-8gb/m now.
I’m about to upload a previously imaged computer to fog1.5.4 to see what it does.
When uploading a Windows 7 image from Hyper-V to fog1.5.4 (it’s running currently) I’m getting 220mb/m, but that’s fluctuating (between 210 and 230mb/m, due to the hang on the current block as described before).
-
So the image finished from the physical laptop at 2.1gb/m which would suggest that the issue is on the windows 10 VM side.
-
@Notalot I’m currently installing fog under a hyper-v instance. I have a second hyper-v instance with the disk connected to the scsi adapter instead of the ide adapter. I want to see if there is a speed difference on the fog side between the two adapters with all else being equal.
MDT just finished building the vm target system under hyper-v. So I’m ready to capture with my production fog server soon. I should be able to get benchmark numbers in the next few hours.
-
I understand this information will not help you with your capture performance, but it does give us a baseline to contrast and compare against.
Hardware:
Both virtualization servers are Dell R540 servers with 2x 14 core processors running on a Dell raid-10 8 disk array. The link between the primary site and the hot site is 1GbE. The hypervisor is vSphere 6.5 at both sites. The Hyper-V server is running as a VM at the hot site on 2016 Datacenter server. It has 4 vCPUs and 16GB of ram allocated with 2 virtual hard drives: one for the OS and one for the disks for hyper-v. For the hyper-v host I installed hyper-v alongside the existing Windows 2016 Datacenter. I did not install hyper-v on “bare metal”, so I don’t know if that would have any performance impact or not (sorry, not a hyper-v admin).
Disk image / compression setup: Windows 10, zstd level 6. I picked zstd because it’s a bit more cpu intensive than gzip.
Test #1
FOG Server: Main site running as a vSphere client.
FOG target: Hot site running a hyper-v client (inside a vSphere client).
Test #2
FOG Server: Main site running as vSphere client.
FOG Target: Main site running as vSphere client (on same server as fog server)
Test #3
FOG Server: Hot site running as hyper-v client (inside a vSphere client)
FOG target: Hot site running a hyper-v client (inside a vSphere client).
Test #4
FOG Server: Hot site running as hyper-v client (inside a vSphere client)
FOG Target: Main site running as vSphere client.
Conclusion:
It appears that the vm client has a bigger impact on the fog capture performance than the fog server. That is understandable because the target computer does all of the heavy lifting during image capture and deployment; the fog server does very little other than take the image stream from the network, write it to the local hard drive on the fog server, and manage the overall imaging process. So adding 12 vCPUs to your fog server will not make imaging go faster. For the hyper-v target I had to use the legacy network adapter, which allowed for pxe booting. The native network adapter would not pxe boot. I suspect some of the slowness was in the legacy network adapter, as @sudburr posted already. Nevertheless, I still can’t explain the 32MB/m transfer rates you are seeing. I’m almost half tempted to install virtual box here and see what kind of performance I get out of that slow poke type-2 hypervisor.
-
Sooooo, as stupid as this sounds, I have a solution / workaround.
Reduce the number of CPUs on the VM to 1
Enable “Migrate to a physical computer with a different processor version”
Changing either of these settings causes the speed to bottom out at 21mb/m; leaving them as above results in a 230mb/m capture speed.
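For anyone who would rather script the two settings than click through Hyper-V Manager, here is a minimal sketch that builds the equivalent PowerShell call. It assumes the Hyper-V PowerShell module's `Set-VMProcessor` cmdlet, whose `-CompatibilityForMigrationEnabled` switch corresponds to the “different processor version” checkbox. The VM name `Win10-Capture` and the helper function are made up for illustration, and the VM has to be powered off before the settings can be changed.

```python
def build_set_vmprocessor_command(
    vm_name: str, cpu_count: int = 1, compatibility: bool = True
) -> list[str]:
    """Build an argument list that applies the workaround via PowerShell."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # PowerShell single-quoted strings escape ' by doubling it
    quoted_name = "'" + vm_name.replace("'", "''") + "'"
    flag = "$true" if compatibility else "$false"
    ps_command = (
        f"Set-VMProcessor -VMName {quoted_name} "
        f"-Count {cpu_count} -CompatibilityForMigrationEnabled {flag}"
    )
    return ["powershell.exe", "-NoProfile", "-Command", ps_command]


# Hypothetical VM name; substitute the real one
cmd = build_set_vmprocessor_command("Win10-Capture")
print(cmd[-1])
```

On the Hyper-V host, from an elevated session, the list could be passed to `subprocess.run(cmd, check=True)`; the sketch only prints the command so nothing is changed by accident.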