Include partclone 3.20?
-
It seems to me that when it comes to increasing the performance of FOG, server configuration makes a big difference.
The previous server was based on Debian 10 with FOG 1.5.9, and the /images directory was on an EXT4 partition. The server was virtualised with VMware ESXi and files were kept on an array of HP 7200RPM SAS drives. The configuration was the default, exactly as it was after the FOG installation. With that setup I was achieving 7-8GB/min on PCs (with NVMe drives) for capture and deployment and 12GB/min for multicast. We are talking about a Windows 10 image (NTFS). With an Ubuntu 20.04 image (EXT4), the speeds were definitely higher - 18GB/min with Multicast and 12GB/min with Unicast.
However, I wondered if it would be possible to squeeze more out of the whole thing, so I set up a virtual machine based on Fedora 36 with the XFS file system (which supposedly handles large files well, which disk images certainly are). I installed the latest development version of FOG. After installation, I manually compiled Udpcast version 28.03.2020 (the latest one doesn’t work with FOG - it crashes as soon as a multicast session starts). I added the following options to the sysctl.conf file:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.core.rmem_default = 312144
net.core.rmem_max = 312144
net.core.wmem_default = 312144
net.core.wmem_max = 312144
to disable IPv6 (which I do not use) and to increase the socket buffer size. Additionally, I set 128 threads in the NFS settings, as the default is a small number (and the server is running on a vSphere cluster). In FOG itself, in the Storage Node settings, I set the bitrate to 1000M. I’ve also compiled the latest kernel to make sure the latest patches are in place (I’ve disabled SPECULATION_MITIGATIONS as I’ve read that this can affect CPU performance, and sometimes I find myself imaging computers with really weak CPUs). I also prepared my own init.xz with partclone 0.3.20, added these patches to the scripts and updated Buildroot to avoid problems with BTRFS. So this is how my current FOG configuration looks, in case anyone is curious what exactly I changed to achieve the speeds I did. A bit off-topic, but maybe it will be of use to someone.
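For anyone who wants to confirm that settings like these are actually live after a reboot, here is a minimal Python sketch. It assumes a Linux host where each sysctl key is exposed as a file under /proc/sys (dots in the key become slashes). The helper names parse_sysctl, proc_path and check are made up for illustration and are not part of FOG.

```python
from pathlib import Path

# The settings quoted in the post above, in sysctl.conf syntax.
SETTINGS = """\
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.core.rmem_default = 312144
net.core.rmem_max = 312144
net.core.wmem_default = 312144
net.core.wmem_max = 312144
"""


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    wanted = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        wanted[key.strip()] = value.strip()
    return wanted


def proc_path(key):
    """Map a sysctl key to its file under /proc/sys (dots become slashes)."""
    return Path("/proc/sys") / key.replace(".", "/")


def check(wanted):
    """Return {key: (wanted, live)}; live is None if the file cannot be read."""
    report = {}
    for key, value in wanted.items():
        try:
            live = proc_path(key).read_text().strip()
        except OSError:
            live = None
        report[key] = (value, live)
    return report
```

Calling check(parse_sysctl(SETTINGS)) on the server gives one (wanted, live) pair per key, so any setting that did not stick shows up as a mismatch.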
-
@piotr86pl said in Include partclone 3.20?:
That’s when I was achieving 7-8GB/min on PCs (with NVMe drives)
This is what I would expect on a 1GbE network with a contemporary workstation and a well-managed server. A 1GbE network has a theoretical transfer rate of 1Gb/s. 1Gb/s == 125MB/s == 7.5GB/min, so 7-8GB/min is a reasonable number, meaning that you have great throughput on your network and that the network is the primary bottleneck.
So how are you getting 12GB/min? That is because the number in partclone is a bit unclear. That number is the total throughput as written to disk. So it is a score of the (FOG server) disk subsystem transferring to the network adapter, network transit times, the (client computer) moving the image from the network adapter into memory, decompressing the image in memory and then finally writing the image to the local media. Note the FOG server doesn’t do any computational work here; it just moves image files to and from local storage to the network adapter. The client computer does all of the heavy lifting during imaging.
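The conversion can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming decimal units (1Gb = 1000Mb, 1GB = 1000MB) and ignoring protocol overhead. The function name is made up for illustration.

```python
def link_rate_gb_per_min(gbit_per_s):
    """Convert a link speed in Gb/s to GB/min using decimal units."""
    megabytes_per_s = gbit_per_s * 1000 / 8  # 8 bits per byte
    return megabytes_per_s * 60 / 1000       # 60 seconds per minute


print(link_rate_gb_per_min(1.0))  # 7.5
```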
So what does 12GB/min tell me?
- Your network adapter and network infrastructure are working at optimal conditions.
- The target computer, CPU-wise, is not maxed out; the limitation is how fast the image can be written to local storage.
Remember the image coming across the LAN is a compressed image, so as it’s decompressed it grows in size, giving the impression of sending more data over a 1GbE network connection than is theoretically possible.
FWIW I can run FOG on a Raspberry Pi 4 and image over ethernet at 5GB/min from the onboard SD card. So the FOG server really doesn’t have a large impact on the imaging process as long as it can move data out the 1GbE network adapter at 100MB/s.
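To put a number on the decompression effect, here is a rough Python sketch. It assumes the wire is the only bottleneck and that the image expands by a constant factor when decompressed on the client. Both are simplifications, and the function names are made up for illustration.

```python
def implied_expansion(reported_gb_per_min, wire_gb_per_min):
    """Expansion factor needed for the client to report this write rate."""
    return reported_gb_per_min / wire_gb_per_min


def reported_rate(wire_gb_per_min, expansion):
    """Write rate shown on the client when the wire is the only bottleneck."""
    return wire_gb_per_min * expansion


# 12GB/min reported against a 7.5GB/min wire limit
print(round(implied_expansion(12.0, 7.5), 2))  # 1.6

# 100MB/s leaving the server is 6GB/min on the wire before decompression
print(100 * 60 / 1000)  # 6.0
```

Under these assumptions the 12GB/min figure corresponds to the image growing by a factor of about 1.6 on decompression. Real images will vary with content and compression level.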
-
Here you are right. Partclone shows the speed of writing to the disk, not the speed of downloading the image from the server. I wouldn’t look for the increase in write speed in improving the network parameters either, but more in the storage parameters of my machine. I switched /images to XFS and on top of that I made sure that in Fedora, no unnecessary background services were running. I also forgot to add that I tweaked the parameters of the machine itself - from 6 cores to 12 and from 8GB of RAM to 12GB. Previously, the machine may simply not have been able to keep up with writes and reads from the array.
I also don’t rule out that the changes to the kernel and init had an impact on the speed increase. It is likely that the SPECULATION_MITIGATIONS option had some bearing on the matter. The iperf tests showed the full 1Gbps speed from the FOG server to FOS, both with the old setup and the new one. So by saying that the server setup matters, I’m referring specifically to the server’s hardware configuration and disk configuration. I’ve made too many changes to FOG to say unequivocally what specifically caused such an increase in speed, but I’m no less pleased that I was able to squeeze out something more. I am happy with what I have and am unlikely to try to get more. The important thing for me is that the current speeds are stable and nothing has crashed yet.
-
@george1421 I would argue 3.20 has been tested quite a bit considering Clonezilla. Something that would be cool is if, during the FOG install, it asked whether the user would like to create an ISO/IMG for bootable media for devices that cannot PXE boot (Macs). It could also take the settings that were plugged in during the install, even add the T2 kernel, etc.
I digress, I can’t code and can barely build unless there is a ‘recipe’ containing the commands to build, so I have no idea how complicated (read time consuming) this would be. Maybe it is time for me to learn.
I’m just happy you guys made the inits with 3.20 available.
-
@fog_newb said in Include partclone 3.20?:
I would argue 3.20 has been tested quite a bit considering Clonezilla.
Understand my intent is to argue the point, but Clonezilla uses Debian as the base OS, which has been tested. What I’m referring to is that partclone 0.3.20 has not been tested with FOS Linux. FOS Linux is its own customized Linux distribution and is at the heart of FOG imaging. The developers have to make sure that it functions and is compatible with previous releases of partclone. When partclone moved from 0.2.89 (I think) to the 0.3.x branch, they changed the disk format of the captured image, and this caused issues deploying 0.2.x captured images with partclone 0.3.x. That became a problem for the developers. Anyway, the FOG Project needs more testers willing to help test new versions of FOG before they are made generally available. The last thing the developers want to do is release a broken version of FOG that stops people from imaging, they are not Microsoft…
The creating a bootable image bit is possible since I already created a script to do that. The issue is that using a USB boot stick is a fringe requirement AND in a way so are Mac computers. The developers have very limited resources, and for every new add-on they need to have someone willing to maintain it and support it. That is why there was some discussion about limiting the FOG supported host OS’ to just 3 or 4. Maintaining support for 9 different Linux variants is a bit much, IMO. Now what we might get them to do is create a script similar to the one they provided that recompiles iPXE. This script would build the USB boot stick on demand. And I digress too…
Please continue to test the inits with the 0.3.20 partclone and report back if you have issues. That is the best and quickest way to get 0.3.20 into the 1.5.10 (next) FOG release.
-
@Piotr86PL Thank you very much for your work on the FOS inits, your testing and reporting in the forums!! Let me try to answer all the points you made one by one:
commits of mine from github …
Great stuff, merged the two pull requests concerning BTRFS issues and added output messages.
The whole thing was built using Buildroot 2022.02.5, which fixes bugs related to udev (https://github.com/FOGProject/fos/issues/46).
While I can see your point about getting that fixed by updating buildroot, I am a bit worried about jumping to that new version for the planned next FOG release. I would say the same for updating partclone to 0.3.20. That is just my way of being a bit more conservative with such a step. On the other hand, you provided good evidence that this is not causing any harm - at least so far.
On the plus side with updating buildroot and partclone we have fixes for APFS and BTRFS - both not being used by the majority of users I reckon. Don’t get me wrong, I am not saying we should not do it. I just try weighing the pros and cons of this.
updated version of UDPcast to 20200328
That’s a good point you make here. Udpcast has not been updated in buildroot in a long time and I have not had that in mind. The FOG server side still comes with the kind of historic 20120424 version while the current FOS inits use 20200328 already I think. So we should definitely switch to 20200328 on the FOG server side as well. Thanks for bringing this up!
I’ve disabled SPECULATION_MITIGATIONS as I’ve read that this can affect CPU performance
See my comment here: https://forums.fogproject.org/topic/16508/enable-or-disable-speculation_mitigations-in-the-linux-kernel
Sooner or later we will update so it’s pretty much a question of getting more people to test it before an official release.
As we still find bugs in the code related to PHP 8 I am open to update buildroot and partclone, build and upload inits for everyone who runs the latest dev-branch (re-running the installer would pull those new inits).
Would be great to hear some more voices from the community on this. George is right that there was a pretty ugly situation with moving from 0.2.89 to 0.3.x as older images could not be restored anymore. Testing is needed to hopefully find out before a release.
-
There is no need to rush. The most important thing is that everything works stably - I will keep testing and monitoring my setup.
As for this partclone 0.3.20. The images I am operating on were still created by version 0.3.13 and there is no problem with that. The new ones also work. Here I can agree that APFS is somehow not a popular filesystem, but when it comes to BTRFS, I notice that more and more distributions are pushing this as the main filesystem, e.g. Fedora already offers BTRFS at the start. It may not be as widely used at the moment but it could be soon. “Soon” meaning not tomorrow, so there is still plenty of time to test partclone 0.3.20 before releasing it into production.
Regarding SPECULATION_MITIGATIONS, I’ve compiled a kernel with this option enabled and will test it a bit on different hardware platforms and see if it actually affects this performance that much or not. If not, I’ll leave it enabled - it’s always some sort of security layer, and also disabling this option displays a warning message when FOS starts up, and that doesn’t look too good.
-
@piotr86pl said in Include partclone 3.20?:
I will keep testing and monitoring my setup.
While I appreciate your testing a lot, I would still argue that more people need to use/test it to get more coverage (filesystems, different image types, really old images created years ago, hardware etc.).
So I think updating the inits used with dev-branch should hopefully give us some more results.
-
Of course - the more variety the better the test results will be. I will slowly start to deploy my kernels and inits in other computer labs at school, where there is also diversity in computers and configurations, so I will know from the lab supervisors if something on their hardware does not work.
Funny thing, because there are still some issues with each lab having a different hardware configuration, but at least now it’s useful for FOG testing.
-
@Fog_Newb @Piotr86PL The inits used with dev-branch now have partclone v0.3.20 (and buildroot 2022.02.6). Hope we get more people to test this and give feedback to see how well it works.
-
@sebastian-roth Thanks. I just updated to the latest. So far it is capturing an APFS drive on a hackintosh no problem.
-
@george1421 Not much but… I did a little testing at home. 4 different machines, HP Laptop hackintoshed (all drives, non resizable, Other OS) (one SSD NTFS the other APFS), 2 gaming PCs (win 10 single drive all partitions resizable), A 2018 mac mini all boot security turned off.
And… and!!! I finally migrated, well actually, I made a new FOG server from scratch using only the ltsc.conf from the previous FOG server. I installed Ubuntu Server 22.04.1 fresh in Virtual Station on a NAS (QNAP TS-451+), with the latest dev branch and the new inits built in, and was able to capture and deploy from/to 4 home computers no problem. I wish I still had older images and more things to test with.
I love this NAS, especially since it was given to me. I dropped in 16GB of RAM (from 2) and got cray stupid - configuring all the drives in a RAID 0 stripe. Hey, I live on the edge. I can’t believe this thing is running a couple of VMs no problem on that Celeron CPU.