Deploying images from a virtual FOG server to VMs on the same host takes forever
We have a setup where the FOG server and the machines we want to deploy are all virtual and run on the same Hyper-V host, and deployment takes forever.
The speed starts at 7 GB/min and steadily falls into the low hundreds of MB/min. Deploying one clean 15 GB Windows 10 machine takes a couple of hours. Another thing I have noticed, which I have not seen when using FOG on physical machines, is that Partclone stays on "Syncing" for a couple of minutes between every partition.
Does anyone have experience with deploying virtual machines with FOG?
Are there any known problems, or is any kernel preferred? Network throughput between the VMs is over 1 Gbit/s measured with iperf3, and the NIC on the server is a 2x 1 Gbit/s Multiplexor Adapter.
The host is brand new and has 2x AMD Epyc 7451 24-core CPUs, 320 GB RAM, and a total disk area for VM storage of 3.24 TB.
Happy for any suggestions!
Thanks in advance!
@f-an Is this still an issue or did you solve it?
It’s a RAID array of 6x 900 GB 15k SAS drives connected to an HP P408i-a SR Gen10 controller, all combined into a single drive in Windows.
@f-an what does your storage subsystem look like?
@george1421 I tried the init_partclone.xz and set the parameter in tftp settings. The boot dialog specifies that it is using that file, but now the deploy process is even slower. Still with FOG 1.5.7 tho.
So if you introduce a physical machine to capture or deploy to what are your transfer rates? The physical machine doesn’t need to boot for this test, only to deploy the image to get the speed.
Unfortunately I cannot get a physical machine connected to the FOG server. It is installed in a datacenter and I cannot connect a PC to that DHCP server from where I’m at.
As for your VM, is it a type 1 or type 2 (not the right words) Hyper-V VM?
They are Generation 2. I might try a Generation 1…
Is the VM in UEFI or BIOS mode?
UEFI. Generation 2 only uses UEFI. Generation 1 uses BIOS.
Also, just for reference, what OS is the Hyper-V host running on?
It is running on Windows Server 2016
Did you install FOG using the git method or the tarball method?
It was installed with the Git method.
Read the entire thread so you can see how to install the inits without breaking your fog deployment: https://forums.fogproject.org/topic/13620/very-slow-cloning-speed-on-specific-model/12 Upgrading to 220.127.116.11 is not required to test these inits.
Okay! Thank you for your help. I will give it a try. There is nothing to break right now so I might just upgrade to 18.104.22.168 as well.
@f-an Did you install FOG using the git method or the tarball method? If you used the git method, you can switch to the working branch and install FOG 22.214.171.124, which addresses some issues found since 1.5.7 GA was released. I’m not saying that solves anything. Also, @Quazz was working on updated inits (the FOS Linux virtual hard drive) with an updated version of partclone. As I recall, that updated version fixed a slow NVMe issue. Let me see if I can find the link he posted.
Edit: Read the entire thread so you can see how to install the inits without breaking your fog deployment: https://forums.fogproject.org/topic/13620/very-slow-cloning-speed-on-specific-model/12 Upgrading to 126.96.36.199 is not required to test these inits.
The FOG server is right now running with 48GB RAM and 8 vCPU
Just for reference, 4 GB RAM and 2 vCPUs are all that is required, a bit more if you have many FOG clients checking in.
With 1.5.7 your FOS Linux kernel and inits are pretty new, so that probably isn’t the issue. So if you introduce a physical machine to capture from or deploy to, what are your transfer rates? The physical machine doesn’t need to boot for this test; you only need to deploy the image to get the speed.
As for your VM, is it a type 1 or type 2 (not the right words) Hyper-V VM? Is the VM in UEFI or BIOS mode? If we find out it’s the VM (based on the results of the physical machine deploy), we can run some tests to see whether it’s the target machine’s disk or the network that’s causing the issue.
Also, just for reference, what OS is the Hyper-V host running on?
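As a starting point for the disk-versus-network tests mentioned above, here is a rough sketch of a sequential write check. It is not part of FOG or FOS; it assumes a Linux environment with Python 3 booted on the target VM, and the function name `write_throughput_mb_per_min` and its parameters are made up for illustration. It writes a temporary file, forces it to disk with `fsync`, and reports MB/min so the number can be compared loosely with what Partclone shows.

```python
import os
import tempfile
import time


def write_throughput_mb_per_min(directory: str, total_mb: int = 256, chunk_mb: int = 4) -> float:
    """Write about total_mb of data into `directory`, fsync it, and return MB/min.

    Hypothetical helper for illustration only; it is not a FOG tool.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # random data, so it does not compress away
    fd, path = tempfile.mkstemp(dir=directory)
    written = 0
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written < total_mb:
                f.write(chunk)
                written += chunk_mb
            f.flush()
            os.fsync(f.fileno())  # include the time needed to flush caches to the disk
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)  # clean up the temporary test file
    return written / elapsed * 60


if __name__ == "__main__":
    rate = write_throughput_mb_per_min(tempfile.gettempdir())
    print(f"sequential write: {rate:.0f} MB/min")
```

If this number also collapses over time on the VM's virtual disk, the disk path is the more likely suspect; if it stays high, the network path (already measured with iperf3) or the imaging pipeline deserves a closer look.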
Just had another try at it with the following results:
- Multiple Partition Image - Single Disk (Not Resizable)
- Partition - Everything
- Compression - 3
- PartClone Gzip
Avg. speed 2.3GB/min, Image size 20GB
- Tried the latest kernel - 5.1.16_mac-nvme-fix
- First partition (550 MB NTFS) took 3 sec at 8 GB/min
- PartClone froze at “Syncing…” after the first partition for 4-5 min
- PartClone goes on to the second (FAT32) and third (RAW) partitions at a consistent speed of 6-8 GB/min and does not freeze at “Syncing…”
- PartClone starts the main partition at 6 GB/min and keeps that for 50 seconds before it starts slowing down. After 3 min it is down past 1 GB/min, and now, as I’m typing this, 15 minutes have passed, 15% of the process is done, and we are down to 355 MB/min.
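To put the rates above in perspective, here is a small back-of-the-envelope sketch. It assumes decimal units (355 MB/min = 0.355 GB/min) and uses the full 20 GB image size as a stand-in for the data to write, so the results are only rough estimates; `minutes_to_transfer` is a made-up helper name.

```python
def minutes_to_transfer(size_gb: float, rate_gb_per_min: float) -> float:
    """Return the minutes needed to move size_gb at a constant rate."""
    if rate_gb_per_min <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_min


image_gb = 20.0  # image size reported above
rates = [
    ("average of 2.3 GB/min", 2.3),
    ("initial deploy rate of 6 GB/min", 6.0),
    ("degraded rate of 355 MB/min", 0.355),  # assumption: 1 GB = 1000 MB
]
for label, rate in rates:
    print(f"{label}: {minutes_to_transfer(image_gb, rate):.1f} min")
```

At the initial rate the whole image would take a few minutes; at the degraded rate it would take close to an hour, and longer still if the rate keeps dropping.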
I am really baffled by this to be honest…
Thank you both for your replies!
@Junkhacker The purpose of this is to set up one VM as a Golden Config that gets captured once the newest build of our software is installed, and then deploy 10-12 VMs for QA to test on.
@george1421 The FOG server is v1.5.7, installed fresh last week, and I’ve been troubleshooting ever since. The FOG server is right now running with 48 GB RAM and 8 vCPUs, just for error elimination, and the clients I have tested with are configured with 4 vCPUs and between 8 and 32 GB RAM. They are all on the same virtual switch, and the DHCP scope is on the host server.
What version of FOG are you using? I remember a while ago the FOS Linux engine would stall out during the partclone process then take off again.
As for the speed, I’ve heard about Hyper-V being slow before, but not from first-hand experience.
I can say that in our environment we use vSphere, and capture from a vSphere VM to a vSphere FOG server runs at about 6 GB/min and deployment at around 13 GB/min. I can push out a 25 GB (fat) golden image in about 2 minutes. I’m only saying this to show what is possible.
So what would be interesting to know in your setup is whether the Hyper-V FOG server is slow or whether it is the target computer (VM). I can say that during a FOG image deployment cycle the target computer does almost all of the work in imaging; the FOG server only manages the process. In my home lab I have a FOG server running on a Raspberry Pi 3 and I get 4.5 to 5.0 GB/min transfer rates, so the point is that the FOG server isn’t a major factor in imaging speed; it’s the target computer.
@f-an I know the benefits of creating master images on VMs and understand the use of VMs to test image deployments, but it sounds like you’re trying to use FOG as your method of creating pre-configured VMs. What’s the benefit of using FOG for this purpose instead of using a VM template?