Can you make FOG imaging go fast?
-
After reading through this thread: https://forums.fogproject.org/topic/10456/performance-issues-with-big-images I started wondering whether any performance tuning could be done to the FOG environment to allow faster image deployments and captures. Maybe the linux distro’s system defaults ARE the most appropriate for FOG, maybe they are not.
That thread started me thinking:
- In the referenced thread the OP’s images were 160GB single disk raw images. That is one huge 160GB blob file. Would we see better deployment times by breaking that 160GB file into multiple 2GB files, or by keeping the single 160GB blob file? Remember we need to decompress this file as it’s deployed. Would there be any performance gain from having multiple 2GB files, where a 2GB file would typically fit in RAM and the 160GB file would not?
- Would NFS be happier with a smaller file? Is there any performance tuning we can do on the NFS side? In a typical FOG configuration 85% of the data is read from the FOG server and 15% of the data is written to the FOG server. Are there any NFS tuning parameters that suit this type of split between reading and writing?
- Is there anything we can do from a disk subsystem standpoint to allow the NFS server to read faster from disk? What type of disk configuration is better? Is disk caching (ram) an option? What about read ahead cache? What is the impact of a FOG server with a single sata disk versus a raid configuration? Are SSD drives a solid investment for the FOG server?
- Is there anything we can do from the networking side to improve performance? Will more than one network adapter help, and under what situations (<-- hint: I worked for many years as a network engineer, I already know the answer to this one)? Would increasing the MTU size from 1500 to 9000 really make an impact on deployment times?
My idea is to create a test setup in the lab and see if I can improve on the stock linux distribution in each of the 4 areas. I might find the magic bullet to make FOG go faster, or I might find that the linux distribution’s default settings are correct and tweaking this or that adds no real value. I can say that from my production FOG server running 2 vCPUs on a 24 core vSphere server I can achieve about 6.2GB/min transfer rates for a single unicast image (yes, I know this number is a bit misleading since it also includes decompression time, but it’s a relative number that we all can see). I know others are able to get 10GB/min transfer rates with their setups. My plan is to use 4 older Dell 790s for this test setup (1 as the FOG server and 3 as target computers). I want to remove any virtualization layers for this testing, so I will be installing Centos 7 on physical hardware.
My intent is to document what I find here.
{Day 2}
As you think about the FOG imaging process there are 3 performance domains involved here.
- Server
- Network
- Client computer
All three have a direct impact on the ability to image fast. For this thread I want to focus on the first two (server and network) because those are the ones we should have the most control over.
Within the server performance domain there are several sub classes that have an impact on imaging fast.
- Disk subsystem
- NFS subsystem
- RAM memory
- Network (to the boundary of the ethernet jack)
For fog imaging to achieve its top transfer rates each sub component must be configured to move data at its fastest rate.
For the first three sub components (disk, ram and nfs) I can see two different types of workloads we need to consider.
- Single unicast or multicast stream
- Multiple simultaneous unicast or multicast streams
The single unicast / multicast stream can take advantage of linear disk reads and onboard read ahead disk caching.
The multiple data streams are a bit more complex because of the randomness of the data requests for information on the disk.
Both workloads need to be taken into consideration.
{Day 3}
Well, I burned [wasted] a whole day trying to get a PoC RocketRaid 640 to work with Centos 7 by attempting to recompile the driver for the linux 3.10 kernel. I’ve given up on benchmarking the differences between a single disk and a 4 disk raid 0 array for now. I may circle back to this if I can dig up another “working” raid controller that fits into my test setup.
{Day 4}
Well, Day 4 was a bit more productive. While this isn’t a true representation of an actual workload, I set up several tests to baseline a few different server disk configurations. First let’s cover the materials used in the testing.
For the “FOG Server” I’m using a Dell Optiplex 790 with 8GB of ram. This is a mini desktop version so I can add full size expansion cards (like that PoC RocketRaid card). I also plan on testing the LOM network adapter as well as an intel dual port server adapter in a future test, so the desktop case is required. See the disk testing results here.
{Day 5}
Testing disk performance between hdd and ssd drives was not surprising. A single ssd is about 6 times faster than a hdd running on the same hardware. Because that PoC RocketRaid was a bust, I decided to use linux’s built in software raid for the raid configuration part of my testing. I was really surprised at how fast the linux software raid was, with the hdd array topping out at 380MB/s and the ssd array maxing out at 718MB/s (only about twice as fast). This is only speculation, but I could probably get a bit more performance out of the hdd software raid array by adding a few more disks to it. As for the ssd drives, I feel they are about maxing out the 3Gb/s sata channel on the 790, so I wouldn’t expect to see much better performance out of the ssd array by adding 1 or 2 more ssd drives. One might ask why disk speed is important, especially since a single GbE network adapter can only move 125MB/s (theoretical max). Remember we have 2 workloads to consider: a single unicast stream (linear read) and multiple unicast streams (random disk reads). A faster disk subsystem will allow faster data retrieval during a multiple unicast deployment. As we get into the networking part of the test we will see which is the better value, or has the greatest impact on speed for the money [ssd vs hdd]. I have a feeling we will find that the disk subsystem isn’t the choke point in our fog deployment server. I’m going to speculate that a full SSD array may not be of much value.
{Day 6 to 8}
Other activities kept my attention.
{Day 9}
Network performance testing. In this section I tried to find a suitable tool to measure total available bandwidth. I settled on iperf3. I compiled iperf3 from source with static linking to the libraries. This allowed me to copy the compiled binary to both the test fog server and the pxe target computers without needing to worry about library dependencies. On the test fog server I set up the receiver and then tested each pxe target computer one by one to ensure all had comparable bandwidth readings before testing in groups. My test setup is still a Dell 790 mini tower for the FOG server and Dell 790 SFF computers for the pxe target computers. The networking switch is an older Linksys/Cisco SRW2008 8 port switch. Just as a reminder, I’m picking older hardware to get realistic testing results. I’m sure I could get really impressive results with new hardware, but I want real numbers. The fog server disk subsystem is using the 3 constellation hdds in a linux software raid-0 configuration.
More to come
-
Part 4 NFS subsystem testing
This part builds on the baseline network settings from part 3. In this test I ran the same command used for local hard drive testing, this time on the pxe target computer and writing to the nfs share on the fog server (/images/dev).
[Wed Jul 26 root@fogclient /images]# dd if=/dev/zero of=/images/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.68355 s, 111 MB/s
[Wed Jul 26 root@fogclient /images]# dd if=/dev/zero of=/images/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.67013 s, 111 MB/s
The result of a single nfs sequential file write is 111 MB/s (6.66GB/m).
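For anyone following along, the MB/s to GB/m figures quoted in these results are just the dd rate multiplied by 60 and divided by 1000. A minimal sketch of that conversion, assuming decimal units (1 GB = 1000 MB); the helper name mbs_to_gbmin is made up for this example and is not part of FOG:

```shell
#!/bin/sh
# Convert a dd-style rate in MB/s to GB per minute.
# Assumption: decimal units (1 GB = 1000 MB), which matches 111 MB/s -> 6.66 GB/m.
# mbs_to_gbmin is a hypothetical helper name, not a FOG tool.
mbs_to_gbmin() {
    awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 60 / 1000 }'
}

mbs_to_gbmin 111   # the single nfs stream measured above
```

Keep in mind this only restates the raw transfer rate; it says nothing about decompression time on the target.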
I also performed the same commands for disk read over NFS
[Wed Jul 26 root@fogclient /images]# echo 3 | tee /proc/sys/vm/drop_caches
[Wed Jul 26 root@fogclient /images]# time dd if=/images/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.69505 s, 111 MB/s (6.66GB/m)
real 0m9.697s
user 0m0.025s
sys 0m0.352s
Again we had about 111MB/s image transfer rates.
For this test I started 2 of the pxe target computers creating the sequential file on the nfs share at the same time. Here are the results from each pxe target computer.
#1 host
[Wed Jul 26 root@fogclient /images]# dd if=/dev/zero of=/images/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 17.7051 s, 60.6 MB/s
#2 host
[Wed Jul 26 root@fogclient /images]# dd if=/dev/zero of=/images/test2.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 17.0664 s, 62.9 MB/s
As you can see the speed on each client dropped to about 61 MB/s (3.66 GB/m). So that is pretty linear.
Then I tried 3 pxe target computers creating the sequential image at the same time.
host #1
[Wed Jul 26 root@fogclient /images]# dd if=/dev/zero of=/images/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 26.362 s, 40.7 MB/s
host #2
[Wed Jul 26 root@fogclient /images]# dd if=/dev/zero of=/images/test2.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 27.1975 s, 39.5 MB/s
host #3
[Mon Jul 24 root@fogclient /images]# dd if=/dev/zero of=/images/test3.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 26.0602 s, 41.2 MB/s
Again the speed on each client dropped, this time to about 40MB/s (2.4GB/m), which is still pretty linear.
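To put a number on "pretty linear": if the clients split the 111 MB/s single-stream rate evenly, each one would get 111 divided by the client count. A rough sketch of that comparison; the helper name fair_share is invented for this example, and the even split is an idealization rather than something NFS guarantees:

```shell
#!/bin/sh
# Ideal per-client rate if N clients evenly share one saturated stream.
# Assumption: an even split; real transfers start and finish at slightly
# different times, so measured per-client rates can land above this figure.
# fair_share is a hypothetical helper name.
fair_share() {   # usage: fair_share <single_stream_MBps> <clients>
    awk -v rate="$1" -v n="$2" 'BEGIN { printf "%.1f\n", rate / n }'
}

fair_share 111 2   # compare with the measured 60.6 and 62.9 MB/s
fair_share 111 3   # compare with the measured 40.7, 39.5 and 41.2 MB/s
```

The measured rates sit a little above the ideal share, which would be expected if the dd runs did not overlap for their whole duration.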
-
Part 3 Network subsystem testing
IPerf test between single target computer and FOG server
[Wed Jul 26 root@fogclient /images]# ./iperf3 -c 192.168.1.205 -p 5201
Connecting to host 192.168.1.205, port 5201
[ 5] local 192.168.1.207 port 43302 connected to 192.168.1.205 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 112 MBytes 935 Mbits/sec 0 362 KBytes
[ 5] 1.00-2.00 sec 112 MBytes 936 Mbits/sec 0 362 KBytes
[ 5] 2.00-3.00 sec 111 MBytes 935 Mbits/sec 0 362 KBytes
[ 5] 3.00-4.00 sec 111 MBytes 933 Mbits/sec 0 362 KBytes
[ 5] 4.00-5.00 sec 111 MBytes 935 Mbits/sec 0 362 KBytes
[ 5] 5.00-6.00 sec 112 MBytes 936 Mbits/sec 0 362 KBytes
[ 5] 6.00-7.00 sec 111 MBytes 933 Mbits/sec 0 362 KBytes
[ 5] 7.00-8.00 sec 112 MBytes 937 Mbits/sec 0 362 KBytes
[ 5] 8.00-9.00 sec 111 MBytes 934 Mbits/sec 0 362 KBytes
[ 5] 9.00-10.00 sec 111 MBytes 934 Mbits/sec 0 362 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.09 GBytes 935 Mbits/sec 0 sender
[ 5] 0.00-10.02 sec 1.09 GBytes 932 Mbits/sec receiver
iperf Done
IPerf traffic test between 2 simultaneous pxe target computers and the fog server
[Wed Jul 26 root@fogclient /images]# ./iperf3 -c 192.168.1.205 -p 5202 -i 1 -t 30
Connecting to host 192.168.1.205, port 5202
[ 5] local 192.168.1.210 port 56804 connected to 192.168.1.205 port 5202
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 112 MBytes 938 Mbits/sec 45 181 KBytes
[ 5] 1.00-2.00 sec 80.3 MBytes 673 Mbits/sec 234 48.1 KBytes
[ 5] 2.00-3.00 sec 54.3 MBytes 456 Mbits/sec 304 18.4 KBytes
[ 5] 3.00-4.00 sec 55.9 MBytes 469 Mbits/sec 313 26.9 KBytes
[ 5] 4.00-5.00 sec 56.1 MBytes 470 Mbits/sec 332 33.9 KBytes
[ 5] 5.00-6.00 sec 60.2 MBytes 505 Mbits/sec 268 43.8 KBytes
[ 5] 6.00-7.00 sec 70.5 MBytes 591 Mbits/sec 284 46.7 KBytes
[ 5] 7.00-8.00 sec 63.7 MBytes 534 Mbits/sec 232 48.1 KBytes
[ 5] 8.00-9.00 sec 49.5 MBytes 415 Mbits/sec 274 50.9 KBytes
[ 5] 9.00-10.00 sec 63.4 MBytes 532 Mbits/sec 269 43.8 KBytes
[ 5] 10.00-11.00 sec 69.2 MBytes 580 Mbits/sec 246 253 KBytes
[ 5] 11.00-12.00 sec 111 MBytes 932 Mbits/sec 0 355 KBytes
[ 5] 12.00-13.00 sec 111 MBytes 935 Mbits/sec 0 356 KBytes
[ 5] 13.00-14.00 sec 111 MBytes 931 Mbits/sec 0 358 KBytes
[ 5] 14.00-15.00 sec 111 MBytes 935 Mbits/sec 0 358 KBytes
[ 5] 15.00-16.00 sec 112 MBytes 936 Mbits/sec 0 358 KBytes
[ 5] 16.00-17.00 sec 111 MBytes 933 Mbits/sec 0 358 KBytes
^C[ 5] 17.00-17.11 sec 12.6 MBytes 932 Mbits/sec 0 358 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-17.11 sec 1.38 GBytes 694 Mbits/sec 2801 sender
[ 5] 0.00-17.11 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
The notable output here is at 11 seconds: the retransmits drop to 0. That is when the first of the pair of target computers completed its run.
IPerf with 3 target computers
Connecting to host 192.168.1.205, port 5202
[ 5] local 192.168.1.210 port 56816 connected to 192.168.1.205 port 5202
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 112 MBytes 937 Mbits/sec 0 356 KBytes
[ 5] 1.00-2.00 sec 111 MBytes 934 Mbits/sec 0 356 KBytes
[ 5] 2.00-3.00 sec 111 MBytes 935 Mbits/sec 0 356 KBytes
[ 5] 3.00-4.00 sec 111 MBytes 933 Mbits/sec 0 356 KBytes
[ 5] 4.00-5.00 sec 111 MBytes 935 Mbits/sec 0 372 KBytes
[ 5] 5.00-6.00 sec 110 MBytes 925 Mbits/sec 62 70.7 KBytes
[ 5] 6.00-7.00 sec 51.3 MBytes 431 Mbits/sec 404 17.0 KBytes
[ 5] 7.00-8.00 sec 52.0 MBytes 436 Mbits/sec 261 28.3 KBytes
[ 5] 8.00-9.00 sec 56.1 MBytes 471 Mbits/sec 282 9.90 KBytes
[ 5] 9.00-10.00 sec 52.1 MBytes 437 Mbits/sec 301 21.2 KBytes
[ 5] 10.00-11.00 sec 71.8 MBytes 603 Mbits/sec 176 197 KBytes
[ 5] 11.00-12.00 sec 55.2 MBytes 463 Mbits/sec 271 29.7 KBytes
[ 5] 12.00-13.00 sec 47.9 MBytes 402 Mbits/sec 270 53.7 KBytes
[ 5] 13.00-14.00 sec 34.1 MBytes 286 Mbits/sec 264 5.66 KBytes
[ 5] 14.00-15.00 sec 39.1 MBytes 328 Mbits/sec 240 53.7 KBytes
[ 5] 15.00-16.00 sec 52.3 MBytes 439 Mbits/sec 229 49.5 KBytes
[ 5] 16.00-17.00 sec 60.6 MBytes 508 Mbits/sec 225 106 KBytes
[ 5] 17.00-18.00 sec 54.1 MBytes 454 Mbits/sec 336 26.9 KBytes
[ 5] 18.00-19.00 sec 50.9 MBytes 427 Mbits/sec 259 56.6 KBytes
[ 5] 19.00-20.00 sec 74.1 MBytes 622 Mbits/sec 209 198 KBytes
[ 5] 20.00-21.00 sec 75.1 MBytes 630 Mbits/sec 276 46.7 KBytes
[ 5] 21.00-22.00 sec 44.4 MBytes 372 Mbits/sec 282 29.7 KBytes
[ 5] 22.00-23.00 sec 103 MBytes 861 Mbits/sec 13 354 KBytes
[ 5] 23.00-24.00 sec 111 MBytes 934 Mbits/sec 0 358 KBytes
[ 5] 24.00-25.00 sec 111 MBytes 934 Mbits/sec 0 358 KBytes
[ 5] 25.00-26.00 sec 111 MBytes 934 Mbits/sec 0 359 KBytes
[ 5] 26.00-27.00 sec 111 MBytes 934 Mbits/sec 0 359 KBytes
[ 5] 27.00-28.00 sec 111 MBytes 934 Mbits/sec 0 359 KBytes
[ 5] 28.00-29.00 sec 111 MBytes 934 Mbits/sec 0 359 KBytes
[ 5] 29.00-30.00 sec 111 MBytes 934 Mbits/sec 0 359 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 2.36 GBytes 677 Mbits/sec 4360 sender
[ 5] 0.00-30.02 sec 2.36 GBytes 676 Mbits/sec receiver
Notable info here: I started the first target computer sending, waited 5 seconds and started the second, then waited about 5 more seconds and started the third. You can almost see in the per-second transfer rates when these target computers started and stopped.
So what did this tell us? Don’t try to run 3 simultaneous all out image transfers or you will saturate that single nic to the server. The above tests were done with the LOM network adapter.
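The server side of these runs is not shown above. The client commands used ports 5201 and 5202, and an iperf3 server process only services one test at a time, so one plausible setup is a separate listener per target computer. The sketch below only prints the commands rather than running them; the helper name iperf_listeners and the one-port-per-client layout are assumptions for illustration, not necessarily what was run on this test server:

```shell
#!/bin/sh
# Print one iperf3 listener command per client, each on its own port.
# Assumptions: base port 5201 (as in the client runs above) and one
# listener per client. iperf_listeners is a hypothetical helper name.
iperf_listeners() {   # usage: iperf_listeners <clients> [base_port]
    clients=$1
    base=${2:-5201}
    i=0
    while [ "$i" -lt "$clients" ]; do
        echo "iperf3 -s -p $((base + i))"
        i=$((i + 1))
    done
}

iperf_listeners 3
```

Each target computer would then point at its own port, the same way the client runs above use -p 5201 and -p 5202.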
-
Part 2 Disk subsystem testing
To start this off I wanted to do a simple baseline comparison between installing FOG on a single sata hdd using the onboard sata controller, the same single sata hdd on a raid controller as a JBOD disk, and then a 4 disk raid 0 on the raid controller. The next step is to replace the sata hdds with sata ssd drives and repeat the same steps.
For the simple disk baseline I’m using the following linux command to create a sequential 1GB file on disk and then read it back. This process is designed to simulate the single unicast workload. The command used to write the 1GB file is this:
dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
The command to read it back is:
echo 3 | tee /proc/sys/vm/drop_caches && time dd if=/tmp/test1.img of=/dev/null bs=8k
The echo command is intended to drop the cached file data (the linux page cache) so the read has to come from disk and we get a true read back value.
The disks I used are as follows
- (3) Dell Constellation ES 1TB server hard drives [hdd] (what I had in my magic box of extra bits).
- (3) Crucial MX300 275GB SSD
I used 3 because that is how many ssd drives I had in my not so magic box from amazon.
Test Process:
- Install the test drives into the 790 and install Centos 7 1611
- No updates were applied, the install image was straight off usb.
- Log in as root to the linux command prompt
- Run the sequential write command 3 times (avg results)
- Run the sequential read command 3 times (avg results)
- Shutdown and prep for next test.
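The "run 3 times and average" steps can be scripted instead of averaged by hand. This is only a sketch: avg_dd_rate and write_runs are invented names, and the parser assumes dd reports its rate in MB/s as the last two fields of its summary line (which is what the results in this thread show). The sample lines fed to it are the three write results from Test 1.

```shell
#!/bin/sh
# Average the MB/s figure from several dd summary lines read on stdin.
# Assumption: the summary line ends in "<rate> MB/s"; a line reported in
# GB/s or kB/s would be averaged incorrectly. Hypothetical helper name.
avg_dd_rate() {
    awk '/copied/ { sum += $(NF-1); n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Three sequential write runs. dd prints its summary on stderr, hence 2>&1.
# Defined but not called here, since it writes 1GB to /tmp on each pass.
write_runs() {
    for run in 1 2 3; do
        dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct 2>&1
    done
}

# On a test box: write_runs | avg_dd_rate
# Here the parser is fed the three write results recorded for Test 1:
printf '%s\n' \
    '1073741824 bytes (1.1 GB) copied, 13.9599 s, 76.9 MB/s' \
    '1073741824 bytes (1.1 GB) copied, 13.9033 s, 77.2 MB/s' \
    '1073741824 bytes (1.1 GB) copied, 13.7618 s, 78.0 MB/s' | avg_dd_rate
```

The same parser should work for the read runs, as long as the cache is dropped before each read as described above.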
Test 1: Single Constellation (hdd) attached to on board sata
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 13.9599 s, 76.9 MB/s
[root@localhost ~]#
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 13.9033 s, 77.2 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 13.7618 s, 78.0 MB/s
[root@localhost ~]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 13.6594 s, 78.6 MB/s
[root@localhost ~]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 13.5738 s, 79.1 MB/s
real 0m13.577s
user 0m0.040s
sys 0m0.888s
Average speed write 77MB/s (4.7 GB/m) read 78MB/s
Test 2: Single MX300 (ssd) attached to on board sata
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.24173 s, 479 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.24117 s, 479 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.24441 s, 478 MB/s
[root@localhost ~]# echo 3 | tee /proc/sys/vm/drop_caches
[root@localhost ~]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 2.10576 s, 510 MB/s
real 0m2.109s
user 0m0.018s
sys 0m0.664s
Average speed write 478MB/s and read 510MB/s
Test 3: 3 Constellations (hdd) in software raid-0 configuration to on board sata
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.90412 s, 370 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.78557 s, 385 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.75433 s, 390 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.802 s, 383 MB/s
[root@localhost ~]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 2.75442 s, 390 MB/s
real 0m2.967s
user 0m0.016s
sys 0m0.461s
Average speed write 380MB/s and read 390MB/s
* since this was a software raid, I feel the runs after the very first one may be tainted because of some buffering in the software raid driver in linux
Test 4: 3 MX300 (ssd) in software raid-0 configuration to on board sata
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.4921 s, 720 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.50214 s, 715 MB/s
[root@localhost ~]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.49913 s, 716 MB/s
[root@localhost ~]# echo 3 | tee /proc/sys/vm/drop_caches
[root@localhost ~]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 1.33486 s, 804 MB/s
real 0m1.343s
user 0m0.016s
sys 0m0.385s
[root@localhost ~]# echo 3 | tee /proc/sys/vm/drop_caches
[root@localhost ~]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 1.31937 s, 814 MB/s
real 0m1.323s
user 0m0.013s
sys 0m0.322s
Average speed write 718MB/s and read 800MB/s
* since this was a software raid, I feel the runs after the very first one may be tainted because of some buffering in the software raid driver in linux
Test 5: Dell PE2950 6i Raid with 6 x WD RE drives (hdd) in Raid-10 configuration. (just a comparison test)
[root@localhost /]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.96148 s, 363 MB/s
[root@localhost /]# dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 2.86738 s, 374 MB/s
[root@localhost /]# echo 3 | tee /proc/sys/vm/drop_caches
[root@localhost /]# time dd if=/tmp/test1.img of=/dev/null bs=8k
131072+0 records in
131072+0 records out
1073741824 bytes (1.1 GB) copied, 3.199 s, 336 MB/s
real 0m3.367s
user 0m0.024s
sys 0m0.861s
Average speed write 368MB/s and read 336MB/s
* performance values may be tainted by current workload on the server. The intent of this test was to identify a ball park number with production server vs Dell 790 desktop
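One way to read these disk numbers is to set them against the 125MB/s a single GbE adapter can move in theory (the figure from the Day 5 notes). A rough sketch, with links_fed as a made-up helper name; it uses the sequential write averages from the tests above and the theoretical link rate, so treat the output as a ceiling rather than a prediction:

```shell
#!/bin/sh
# How many GbE links could a disk setup keep full with sequential I/O?
# Assumptions: 125 MB/s per link (theoretical GbE maximum) and purely
# sequential access; multiple simultaneous streams turn into random reads
# and will deliver less. links_fed is a hypothetical helper name.
links_fed() {   # usage: links_fed <disk_MBps> [link_MBps]
    awk -v disk="$1" -v link="${2:-125}" 'BEGIN { printf "%.1f\n", disk / link }'
}

links_fed 77    # Test 1: single hdd
links_fed 478   # Test 2: single ssd
links_fed 380   # Test 3: 3 x hdd software raid-0
links_fed 718   # Test 4: 3 x ssd software raid-0
```

By this measure a single hdd falls short of filling even one GbE link, while the 3 disk hdd raid-0 array could in principle feed about three.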
-
@george1421 said in Can you make FOG imaging go fast?:
I can say from my production FOG server running 2 vCPUs on a 24 core vSphere server, I can achieve about 6.2GB/min transfer rates (yes I know this number is a bit misleading since it also include decompression times, but its a relative number that we all can see) for a single unicast image.
That figure is not network transfer speed or compression/decompression speed, nor is it an aggregate; it is simply the write speed to the host’s disk.
It doesn’t represent or reflect network transfer speed or decompression speeds. These things are very loosely related to the write speed just as the disk you’re using is related to the write speed - but this figure does not tell where any bottleneck is.
Trying to use this figure to gauge network transfer speed would be like trying to gauge the mail man’s speed based on how long it takes me to go check my mailbox (if the post office used that as their metric, the mailman would be fired because I check my mail every few days).
Further, your bottleneck is probably not the next person’s bottleneck. My experience with multiple FOG servers on multiple types of hardware has shown that tuning FOG is a matter of balancing network throughput with a host’s ability to decompress. We cannot speed up how fast a host’s disk can write; its maximum write speed is still its maximum write speed no matter what we do with CPU or network or compression or RAM. The idea is simply to always have data waiting to be written to disk without delay, and to balance the CPU’s ability to decompress with the network’s ability to transmit to many clients at once and the FOG server’s ability to serve many clients at once. This all comes back to two simple things I think: Max Clients and compression rate.
It’s a balancing act of these two things. Of course, ZSTD is the most superior compression algorithm, which is why it’s not one of the two simple things. But its compression rate is.
The FOG Server’s disk does play a role - but at my last job, I was clearly hitting the network’s maximum throughput bottleneck - so a solid state disk would not have helped.
At any rate, the script below is an example of how to automate the monitoring & collecting of things from FOS: https://github.com/FOGProject/fog-community-scripts/blob/master/fogAutomatedTesting/postinit.sh
That’s what I’d use to collect any custom metrics you want to monitor more quickly, instead of doing a debug every time and manually monitoring.
-