Performance testing slow FOG Imaging
The intended audience of this thread is FOG admins who have one or more models of computers with slow imaging rates, while the rest of their campus is imaging at a normal rate. The “normal rate” is a bit of a moving target because FOG imaging relies on a healthy network, a well managed FOG server, and modern target computers.
The easiest benchmarking method a FOG admin has access to is the speed rating listed on the blue partclone screen seen during imaging. This rating is measured in data volume per minute of transfer. We need to be mindful that this “speed rating” is a composite score of the entire imaging process and not specifically network throughput. This composite score is the combination of the FOG server moving data to and from the network interface (plus) network throughput (plus) target computer ingest and image decompression in memory (plus) the target writing the expanded image to local storage media. Any one of these components not functioning optimally will cause a lower than “normal” score in partclone.
<Editor note: add in screen shot of partclone screen during imaging>
Some FOG admins equate this partclone score directly to network throughput. This is an incorrect assumption. I’ve seen comments like “I’m getting partclone speeds faster than physically possible with my network, how is that possible?” In this case the poster was getting 8.2GB/min according to partclone over a 1GbE network, which has a theoretical throughput of 7.5GB/min. As I mentioned earlier, the partclone score is a composite score of the entire process, where network throughput is only one component. The image is decompressed on the target computer, so the amount of data written to disk per minute can be larger than what the wire carried. So that 8.2GB/min score is telling me two things: the poster’s network is running very well, in that the target computer is receiving the image fast enough to keep the input buffer full, and the target computer is performing well, in that it is ingesting and decompressing the image and writing it to local storage at top speed.
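As a rough sanity check on numbers like these, dividing the partclone score by the link's theoretical rate gives a lower bound on how much the image expanded after it crossed the wire. This is only a sketch: it assumes the partclone figure counts data after decompression, and the helper name `expansion_ratio` is made up for illustration.

```shell
#!/bin/sh
# expansion_ratio PARTCLONE_GB_PER_MIN LINK_GB_PER_MIN
# Prints how many times larger the partclone rate is than the best rate the
# link could carry (hypothetical helper, not part of FOG).
expansion_ratio() {
    awk -v p="$1" -v l="$2" 'BEGIN { printf "%.2f\n", p / l }'
}

# The forum example: 8.2 GB/min in partclone on a 1GbE (7.5 GB/min) link.
expansion_ratio 8.2 7.5
```

A result above 1 does not mean the network is faster than its rating, only that the image grew by at least that factor when it was expanded on the target.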
I can speak for the benchmarks I see on my campus.
I do have to mention a caveat here in that on my campus I don't use the FOG Client, so from this perspective my FOG server is only used for imaging and not for system management. The FOG Client adds its own overhead to the FOG server that may skew your results if you have a large campus with all target computers running the FOG Client.
For a well managed pure 1GbE network with a modern (contemporary) target computer, I typically see a 6.1GB/min score in partclone. Using our enterprise infrastructure with a 10GbE core network we typically see a 13GB/min score on the target computers. Just to contrast this, with my FOG-Pi3 server on a 1GbE network I’ve seen 5GB/min partclone scores. The point is the FOG server has minimal impact on FOG imaging rates. All of the heavy lifting (so to speak) during imaging is done by the target computer. To say it a different way, the target computer performance has a bigger impact on imaging than the FOG server.
One thing I need to point out here is that as you read through this thread be mindful of the unit of scale being used. Some tools report in bits per second, others in bytes per second, and others in MB or GB per minute. I will try my best to keep everything straight myself. For example, a 1GbE network runs at 1 gigabit per second, which is 125 megabytes per second or 7.5 gigabytes per minute. They all describe the same speed, just expressed in different units.
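To keep the units straight, the conversion can be scripted. This is a minimal sketch using decimal units and 8 bits per byte; the function name `link_rates` is my own and it needs `awk` on whatever machine you run it on.

```shell
#!/bin/sh
# link_rates GBIT_PER_SEC
# Prints the same link speed in MB/s and GB/min (decimal units, 8 bits per byte).
link_rates() {
    awk -v g="$1" 'BEGIN {
        mb_s   = g * 1000 / 8      # gigabits/s -> megabytes/s
        gb_min = mb_s * 60 / 1000  # megabytes/s -> gigabytes/min
        printf "%g Gbit/s = %g MB/s = %g GB/min\n", g, mb_s, gb_min
    }'
}

link_rates 1     # 1GbE
link_rates 10    # 10GbE
```

These are theoretical line rates. Protocol overhead means real transfers land somewhat below them.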
The remainder of this thread is going to assume your campus is imaging at your normal speed except one specific model of computer. We will go through the steps to try to determine which leg of imaging is causing the partclone score to be lower than expected.
This thread is based on the work I did several years ago in this thread: https://forums.fogproject.org/topic/10459/can-you-make-fog-imaging-go-fast We will take some of the lessons learned in that thread and apply them below.
Target system setup for testing
- Register your target system with the FOG server. If you can’t use the built in registration process, manually register the target computer with the FOG server.
- Connect the target computer’s configuration to an image definition.
- Schedule a debug capture/deploy (doesn’t matter which). Before you hit the Schedule Task button, tick the Debug checkbox then schedule the task.
- PXE boot the target computer into FOG. You will see several screens of text on the target computer that you need to clear using the Enter key. At this point you should be at the FOS Linux command prompt.
- Follow the testing procedures below.
For tips on remote debugging the target computer check out this link: <Editor note: add in link to remote debugging article when its written>
Target disk subsystem
In this section we are going to test how quickly the target computer can create a 1GB file on local storage using the linux dd command. The dd command will create this 1GB file and time the creation process for us. Just be aware that this is a data destructive test. The contents of your local storage device will be erased during the test. Don’t perform this storage bandwidth test on a disk where you cannot afford to lose the data.
The hardest step in the process is finding the local storage device name, removing all partitions on the disk, and then creating a new partition for our testing.
First let's find the name of your local storage disk. We will use the lsblk command to locate the linux device name. In the figure below you see the linux device name is sda for a SATA attached disk. It has 2 partitions.
# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 238.5G  0 disk
├─sda1   8:1    0   512M  0 part /boot/efi
└─sda2   8:2    0   238G  0 part /
sr0     11:0    1  1024M  0 rom
Below is an example of an NVMe disk. In this case the device name is nvme0n1 and the partition numbers are p1 p2 p3 p4.
# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0   477G  0 disk
|-nvme0n1p1 259:1    0   100M  0 part
|-nvme0n1p2 259:2    0    16M  0 part
|-nvme0n1p3 259:3    0 476.3G  0 part
|-nvme0n1p4 259:4    0   508M  0 part
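The difference between the two naming styles can be captured in a few lines. The sketch below builds the partition device path from the disk name shown by lsblk: a disk name that ends in a digit gets a "p" before the partition number. The function name `part_dev` is only for illustration.

```shell
#!/bin/sh
# part_dev DISK N
# Builds the partition device path for a disk name as shown by lsblk.
# Disk names ending in a digit (nvme0n1, mmcblk0) get a "p" before the number.
part_dev() {
    case "$1" in
        *[0-9]) echo "/dev/${1}p${2}" ;;
        *)      echo "/dev/${1}${2}" ;;
    esac
}

part_dev sda 1       # prints /dev/sda1
part_dev nvme0n1 1   # prints /dev/nvme0n1p1
```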
For the rest of this section we will assume you have an NVMe drive, so we will use that naming convention. We know the NVMe device name is nvme0n1. Let's use the fdisk utility to remove all of the existing partitions on the disk. Don’t forget I mentioned this is a data destructive test.
fdisk /dev/nvme0n1
Use the d command to remove all of the existing partitions on the disk. Then use the w command to write the blank partition table to disk. If fdisk exits after the w command, start it again before continuing. You can confirm the partitions are gone with the p command. Now finally create a new partition using the n command, choosing 1 for the first partition, and then pick the defaults for the remainder. Now use the w write command to write the partitions to disk and the q command to quit fdisk. Finally ensure the OS is in sync with the disk by keying in sync twice at the FOS Linux command prompt.
You can confirm your changes by once again using the lsblk command.
# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0   477G  0 disk
|-nvme0n1p1 259:1    0 477.6G  0 part
Now that we have our test partition we need to format it. Let's format the first partition on the NVMe disk using this command.
mkfs -t ext4 /dev/nvme0n1p1
The output of this command should look similar to this
# mkfs -t ext4 /dev/nvme0n1p1
mke2fs 1.45.6 (20-Mar-2020)
Discarding device blocks: done
Creating filesystem with 124866880 4k blocks and 31219712 inodes
Filesystem UUID: 5652bad-814c-4a2d-811a-fd5fb50a6dc4
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000
Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done
Hang on, we are almost done with the setup. The next step is to create a directory mount point and to connect the NVMe partition to the directory mount point.
mkdir /ntfs
mount -t ext4 /dev/nvme0n1p1 /ntfs
Issue the following command to confirm the partition is mounted.
df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       248M   97M  139M  42% /
/dev/nvme0n1p1  477G   26G  452G   6% /ntfs
The line we are looking for is this one. It shows that the device /dev/nvme0n1p1 is connected to the /ntfs directory.
/dev/nvme0n1p1  477G   26G  452G   6% /ntfs
Finally we’ve made it to the benchmarking point. Now we will use the dd command to create a 1GB file on the local disk.
dd if=/dev/zero of=/ntfs/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.546232 s, 2.0 GB/s
In this case the dd command created the 1GB file in about half a second, at a rate of 2.0 GB/s. This result is within the expected range for this NVMe disk.
I can give you a few numbers off the top of my head that are reasonable results.
SATA HDD (spinning disk) 40-90MB/s
SATA SSD 350-520MB/s
If your results are within the above ranges for the selected storage device this part of the test was successful.
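If you test several machines, the comparison against those ranges can be scripted. This is a minimal sketch: the ranges are only the rough figures quoted above, and the function name `check_disk_rate` and its "hdd"/"ssd" labels are my own invention.

```shell
#!/bin/sh
# check_disk_rate TYPE MB_PER_SEC
# TYPE is "hdd" or "ssd"; the ranges are the rough figures quoted above.
check_disk_rate() {
    case "$1" in
        hdd) lo=40;  hi=90  ;;
        ssd) lo=350; hi=520 ;;
        *)   echo "unknown disk type: $1"; return 2 ;;
    esac
    awk -v r="$2" -v lo="$lo" -v hi="$hi" 'BEGIN {
        if (r < lo)       print "below expected range"
        else if (r > hi)  print "above expected range"
        else              print "within expected range"
    }'
}

check_disk_rate ssd 410
check_disk_rate hdd 25
```

A result below the range points at the target's disk subsystem as a likely contributor to a low partclone score.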
In this section we are going to test the network bandwidth performance between the target computer and the FOG server. This test will involve both sending and receiving data to and from the FOG server from the target computer. The utility we will use for this test is called iperf3. The FOS Linux OS running on the target computer already has this utility installed, but you will need to install it on the FOG server because it's not installed by default.
For example, if your FOG server is running Ubuntu you would install iperf3 using this command:
sudo apt-get install iperf3
That command should work for any Debian/Ubuntu variant OS.
Once iperf3 is installed, let's set up the server process. On the FOG server, from a linux console key in the following command:
# iperf3 -s
-----------------------------------------------------------
Server listening on <fog_server_ip>
-----------------------------------------------------------
This will start the iperf3 server process on the FOG server. For the rest of the testing you will not need to interact with the FOG server until it's time to stop the iperf3 service on the FOG server using Ctrl-C.
If the target computer is not already in debug mode, put the target computer into debug mode following the process in the first post. Now let's proceed with testing the network connection.
On the target computer’s FOS Linux command prompt key in the following command:
# iperf3 -c <fog_server_ip>
The output of the command will be presented as in the following chart
# iperf3 -c <fog_server_ip>
Connecting to host <fog_server_ip>, port 5201
[  5] local <target_computer_ip> port 43302 connected to <fog_server_ip> port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   112 MBytes   935 Mbits/sec    0    362 KBytes
[  5]   1.00-2.00   sec   112 MBytes   936 Mbits/sec    0    362 KBytes
[  5]   2.00-3.00   sec   111 MBytes   935 Mbits/sec    0    362 KBytes
[  5]   3.00-4.00   sec   111 MBytes   933 Mbits/sec    0    362 KBytes
[  5]   4.00-5.00   sec   111 MBytes   935 Mbits/sec    0    362 KBytes
[  5]   5.00-6.00   sec   112 MBytes   936 Mbits/sec    0    362 KBytes
[  5]   6.00-7.00   sec   111 MBytes   933 Mbits/sec    0    362 KBytes
[  5]   7.00-8.00   sec   112 MBytes   937 Mbits/sec    0    362 KBytes
[  5]   8.00-9.00   sec   111 MBytes   934 Mbits/sec    0    362 KBytes
[  5]   9.00-10.00  sec   111 MBytes   934 Mbits/sec    0    362 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.09 GBytes   935 Mbits/sec    0             sender
[  5]   0.00-10.02  sec  1.09 GBytes   932 Mbits/sec                  receiver
The above chart is what I would expect a typical network flow to look like. The important columns to pay attention to are Bitrate and Retr.
The Bitrate shows the throughput speed. For a 1GbE network 1000Mb/s is the theoretical maximum speed. The Retr column shows the number of times a data packet needed to be retransmitted. Ideally you should have 0 retransmissions on a well designed network.
Below is an example you might see on a congested network.
iperf3 -c 192.168.1.205 -p 5202 -i 1 -t 30
Connecting to host 192.168.1.205, port 5202
[  5] local 192.168.1.210 port 56804 connected to 192.168.1.205 port 5202
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   112 MBytes   938 Mbits/sec   45    181 KBytes
[  5]   1.00-2.00   sec  80.3 MBytes   673 Mbits/sec  234   48.1 KBytes
[  5]   2.00-3.00   sec  54.3 MBytes   456 Mbits/sec  304   18.4 KBytes
[  5]   3.00-4.00   sec  55.9 MBytes   469 Mbits/sec  313   26.9 KBytes
[  5]   4.00-5.00   sec  56.1 MBytes   470 Mbits/sec  332   33.9 KBytes
[  5]   5.00-6.00   sec  60.2 MBytes   505 Mbits/sec  268   43.8 KBytes
[  5]   6.00-7.00   sec  70.5 MBytes   591 Mbits/sec  284   46.7 KBytes
[  5]   7.00-8.00   sec  63.7 MBytes   534 Mbits/sec  232   48.1 KBytes
[  5]   8.00-9.00   sec  49.5 MBytes   415 Mbits/sec  274   50.9 KBytes
[  5]   9.00-10.00  sec  63.4 MBytes   532 Mbits/sec  269   43.8 KBytes
[  5]  10.00-11.00  sec  69.2 MBytes   580 Mbits/sec  246    253 KBytes
[  5]  11.00-12.00  sec   111 MBytes   932 Mbits/sec    0    355 KBytes
[  5]  12.00-13.00  sec   111 MBytes   935 Mbits/sec    0    356 KBytes
[  5]  13.00-14.00  sec   111 MBytes   931 Mbits/sec    0    358 KBytes
[  5]  14.00-15.00  sec   111 MBytes   935 Mbits/sec    0    358 KBytes
[  5]  15.00-16.00  sec   112 MBytes   936 Mbits/sec    0    358 KBytes
[  5]  16.00-17.00  sec   111 MBytes   933 Mbits/sec    0    358 KBytes
[  5]  17.00-17.11  sec  12.6 MBytes   932 Mbits/sec    0    358 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-17.11  sec  1.38 GBytes   694 Mbits/sec 2801             sender
[  5]   0.00-17.11  sec  0.00 Bytes    0.00 bits/sec                  receiver
Note the Bitrate speed drops whenever data packets need to be retransmitted (the Retr column) during the test. Remember this process tests the entire data path between the target computer and FOG server. The transmitted data is created in memory and received into memory, so no part of the disk subsystem is being used here. While the above chart shows network congestion, it really doesn’t tell us where in the data path it's congested. For the context of this thread, the network congestion could be the cause of the slower than normal imaging performance.
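When comparing several runs it helps to pull just the two numbers that matter out of the output. The sketch below assumes the plain-text output format shown in the charts above and reads it on stdin; the function name `iperf_summary` is made up for illustration.

```shell
#!/bin/sh
# iperf_summary: reads iperf3 client text output on stdin and prints the
# bitrate and retransmit count from the "sender" summary line.
iperf_summary() {
    awk '$NF == "sender" {
        printf "bitrate=%s %s retr=%s\n", $(NF-3), $(NF-2), $(NF-1)
    }'
}

# Example with the summary line from the congested run above.
echo "[  5]   0.00-17.11  sec  1.38 GBytes   694 Mbits/sec  2801   sender" | iperf_summary
```

On the target computer this would be used as `iperf3 -c <fog_server_ip> | iperf_summary`. By default the iperf3 client sends to the server. Adding the `-R` flag reverses the direction so the FOG server sends to the target, which is the direction data flows during a deploy.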
In this section we will test the file copy performance over the network. We will use the FOG server to host a file for us to copy. Later in this section we will use a previously captured disk image to deploy to our test system.
To start off this section we will assume that you have already connected the /ntfs directory to your local hard drive partition. This step was carried out in the earlier section regarding [ Target Disk Subsystem ] so we will continue on from there.
Create a new directory off the root so we can connect the FOG server’s NFS share to our target test system.
mkdir /images
Now we will connect the FOG server’s share for image capture to the directory we just created.
mount -o nolock,proto=tcp,rsize=32768,wsize=32768,intr,noatime "<fog_server_ip>:/images/dev" /images
In this next step we will create a working directory on the FOG server and then use dd to create our 1GB test file, similar to what we did to test the local hard drive write speed.
mkdir /images/test
dd if=/dev/zero of=/images/test/test1.img bs=1G count=1 oflag=direct
The output should look similar to this:
# dd if=/dev/zero of=/images/test/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.64448 s, 140 MB/s
You might want to repeat this process a few times to confirm you have a consistent performance number.
Next we will turn around and read back in that 1GB file we created to see what our read performance is. The first thing we need to do is tell linux to not cache any reads then read in and time the reads. These two commands below need to be executed one right after the other.
echo 3 | tee /proc/sys/vm/drop_caches
time dd if=/images/test/test1.img of=/dev/null bs=8k
The output of these commands should look like this:
# time dd if=/images/test/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.56148 s, 235 MB/s
real 0m4.566s
user 0m0.072s
sys 0m1.092s
So for these two VMs I’m using, both running on the same proxmox host server, I have roughly 140MB/s write and 235MB/s read performance. Note the real time stat. It took about 4.6 seconds to read in that 1GB file from the FOG server.
The next step will be for us to time the copy rate between the FOG server and local hard drive. We will do that with these two commands. Again we need to tell linux not to cache the file copy or read ahead any.
echo 3 | tee /proc/sys/vm/drop_caches
time cp /images/test/test1.img /ntfs
The output should look like this:
# echo 3 | tee /proc/sys/vm/drop_caches
# time cp /images/test/test1.img /ntfs
real 0m7.445s
user 0m0.009s
sys 0m1.088s
While the copy results were not given to us in MB/s, we see the copy took about 7.4 seconds. That works out to roughly 144 MB/s, which is slower than the 235 MB/s we measured reading from the FOG server NFS share. For a quick comparison I ran the command to create the 1GB file on this test VM and these are the results.
# dd if=/dev/zero of=/ntfs/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.39809 s, 199 MB/s
So based on these two ratings we can tell our bottleneck is reading the file from the FOG server.
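Since cp does not report a rate, the real time from the time command can be converted into MB/s so it lines up with the dd numbers. This is a minimal sketch: the function name `copy_rate` is my own, and it assumes the `0m7.445s` style that time printed above.

```shell
#!/bin/sh
# copy_rate BYTES REAL
# REAL is the "real" value printed by time, e.g. 0m7.445s.
# Prints the average rate in MB/s (decimal megabytes, like dd reports).
copy_rate() {
    awk -v b="$1" -v t="$2" 'BEGIN {
        split(t, parts, "m")             # parts[1] = minutes, parts[2] = "7.445s"
        sub(/s$/, "", parts[2])          # drop the trailing "s"
        secs = parts[1] * 60 + parts[2]
        printf "%.1f MB/s\n", b / secs / 1000000
    }'
}

# The 1GB test file and the cp timing from above.
copy_rate 1073741824 0m7.445s
```

With the numbers from this thread the copy comes out at 144.2 MB/s, which can be set next to the 235 MB/s NFS read and the 199 MB/s local write.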
For the next test we will need a partition image from another captured image on your FOG server. We will test the download, expand and write process using partclone to send a partition to your test hard drive.
To do this
Summary of results