Fog Deployments are Slow
-
Fog deployments slowly go from 14GB/min down to 100MB/min and end up freezing. How can I fix this?
-
@nerdstburns We need way more information to be able to help you!!
What you describe is not a general FOG issue and therefore we can’t provide a simple “do this” solution.
Speed depends on many things and you need to go through each part of the chain to find what is causing the slowness.
Start by deploying unicast to a single host. Make sure there are no other tasks running at the same time. If that is slow, I suggest you use a few different hosts (various makes and models, different network cards and hard drives) and do the same single unicast deploy test.
If they are all slow, you need to look at the network equipment used. If only some of them are slow, you should be able to narrow it down to a specific model or maybe even a single machine.
Get back to us with your findings. The more information you provide, the more we are able to help.
-
@sebastian-roth I’ve been having trouble deploying images using unicast too. They tend to start off fast (14GB/min) and just slowly drop until they freeze around 30%. The image size is 700GB. We can deploy small images about 30-60GB with no problem, but bigger images slow down and freeze up.
Image Type = Single Disk - Resizable
Partition = Everything
Compression = 6
Image Manager = Partclone Zstd
We have 5 different physical locations that are experiencing similar issues, 3 have one PC build and 2 have a different PC build. We use CAT6 cabling and the fog server + clients are on the same VLAN.
-
@nerdstburns Once again, you need to do more testing to rule out things possibly causing this.
You seem to have only two different models (PC builds), right? Take two of each and connect them to the very same switch your FOG server is on to rule out network issues. Do several tests with each PC as a single unicast deploy to make sure.
700 GB is an unusually huge image. While that's not a problem in general, it does increase the risk of a flaky network driver (Linux kernel) or some other component failing after a certain amount of time or data transferred. But once again, you need to try and rule out as many things as possible. That's not something we can help you with other than pointing the way.
Think about getting a couple different PCs (even borrowed) to do further tests.
-
@sebastian-roth the server and PCs are connected to the same unifi switch. The server is running Ubuntu 18 LTS within VMware ESXi 6.7. We run a LAN center so I can’t really test out other PCs. We were able to fog in the past, but now it doesn’t want to go through with unicast or multicast. I can try updating Ubuntu to see if it works. I’m on Fog version 1.5.9, which I think is the latest version. Any other tips or tricks? If this doesn’t work today I may just blow out the fog server and start it over again.
Again, capturing is no problem, it sits at a steady 7GB/min, but deploying causes the rate to constantly fall until it freezes.
The image needs to be that big since it contains PC games and apps (Both Call of Duty games alone are around 300-400GB)
-
@sebastian-roth now I updated the ubuntu version to 20 instead of 18 and I can’t access my fog web portal. Seems like the folders are still there, but not the web portal. I tried reinstalling it and it gave me an error
-
@Sebastian-Roth scratch that I got the fog server working. testing the multicasting again
-
@sebastian-roth updated to the latest version of ubuntu and fog. Same issue occurs. I highly doubt it’s the NIC on the PC since we were able to fog with no problems in the past. Not sure where else the problem can lie. I will try blowing out the server and starting again to see if anything changes
-
@nerdstburns said in Fog Deployments are Slow:
I highly doubt it’s the NIC on the PC since we were able to fog with no problems in the past.
When was that? Using an older version? While it's not likely, it's still possible that a NIC driver issue in a newer FOS Linux kernel causes what you describe. FOG 1.5.9 comes with a 4.19.x kernel but you could also try updating to an even newer 5.10.x kernel and see if that makes any difference.
In your posts you seem to be jumping between unicast and multicast. The latter can be more tricky as the slowest PC will cause all others to slow down even if they could go faster. If you want to find out which component causes the issue, I would stick to unicast for testing so that one machine can't cause the slowdown on multicast and wreck your test results.
-
@sebastian-roth I have been on fog 1.5.9 since the beginning. I was able to unicast a 1.2TB image 3 months ago, but now unicast slows down on all PCs just like multicast. Tested on multiple PCs. Nothing on the network should have changed. We have had VLANs set up since the beginning. No change in the switch or the firewall, just the ISP has changed, which shouldn't affect it
-
@nerdstburns If you are serious about debugging this (not implying you are not), it will just take some work on your part.
The speed rating you get during partclone is a composite throughput, so it can be a bit misleading when it's slow. During imaging the target computer does all of the heavy lifting. So if everything else is the same, a computer with a quad core gen 5 processor and a SATA SSD will have slower throughput than one with a quad core gen 8 processor and an NVMe drive. The performance of the FOG server (to a point) has very little impact on imaging speeds. Case in point: I can use a Raspberry Pi 4 as a FOG server and get about 5.6GB/min of transfer rate. The only thing the FOG server does during a unicast deploy is move the disk blocks from local storage to the network interface using the NFS protocol and monitor the overall imaging process. All of the data decompression, checksum calculations and writing to local media are done by the target computer.
So the speed you see in partclone is actually made up of FOG server disk subsystem to network interface + network transit time + target computer image decompression + target computer writing the disk image to local storage + local storage write speed. Another variable may also be the FOS Linux kernel being used, 4.19.x vs 5.10.x. A major rewrite of the Linux kernel happened at 5.0.0 and then again at 5.5.x. Any one of those rewrites could have introduced a performance hit.
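To make the composite nature of that number concrete, here is a minimal Python sketch. It assumes the stages behave like a pipeline, so the sustained rate is roughly capped by the slowest stage. The stage names and GB/min figures are made-up illustration values, not measurements from FOG or from this thread's hardware.

```python
# Hypothetical per-stage sustained rates in GB/min (illustration only).
stages = {
    "server disk read": 30.0,
    "network transfer": 7.0,
    "client decompression": 20.0,
    "client disk write": 0.6,
}


def find_bottleneck(rates):
    """Return (stage_name, rate) for the slowest stage in the pipeline."""
    name = min(rates, key=rates.get)
    return name, rates[name]


def estimated_minutes(image_gb, rates):
    """Rough deploy time, assuming the slowest stage caps the whole pipeline."""
    _, rate = find_bottleneck(rates)
    return image_gb / rate


if __name__ == "__main__":
    name, rate = find_bottleneck(stages)
    print(f"slowest stage: {name} at {rate} GB/min")
    print(f"700 GB image: about {estimated_minutes(700, stages):.0f} minutes")
```

The point of the model is only that a fast server or network cannot make up for one slow stage, which is why each part of the chain has to be measured on its own.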
Several years ago I worked on benchmarking the FOG imaging process, testing FOG server, network throughput and target system write speed, here: https://forums.fogproject.org/topic/10459/can-you-make-fog-imaging-go-fast
If you are interested we can start to debug where this is falling down. We should test both fog server and target computer disk subsystems as well as network throughput.
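As one possible starting point for the disk side, here is a rough Python sketch of a sequential write test. It is an assumption on my part that Python 3 is available on the machine being tested (for example the FOG server, or a target PC booted into a regular OS); the function name and default sizes are hypothetical, and the benchmarking thread linked above may use different tools.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=256, block_mb=4):
    """Write about total_mb of zeros to a temp file in `directory`,
    fsync it, and return the observed throughput in MB/s."""
    block = b"\0" * (block_mb * 1024 * 1024)
    blocks = total_mb // block_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    return (blocks * block_mb) / elapsed


if __name__ == "__main__":
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"sequential write: {rate:.0f} MB/s ({rate * 60 / 1024:.1f} GB/min)")
```

Treat the result as a ballpark figure only: zeros can be compressed or cached by some filesystems and drives, and a small test file will not show the slowdown a drive may hit after many gigabytes of sustained writes, so a larger `total_mb` is more representative.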
Also we should collect some background information about your environment and health of the fog server.
How many clients do you have in your network with the FOG Client service installed?
On the fog server, when you run the top program and sort it by P (processor), what are the top 3 services running? Is the top one MySQL?
-
Hi,
Sorry to revive an old thread but it is still happening.
For example with ThinkPad T450, T460, T470 it goes very fast (6-8GB/min). With ThinkPad E14 G2 it only goes around 600MB/min.
Thank you