Slowdown Unicast and Multicast after upgrading FOG Server
-
@Quazz In the other thread (with a similar condition) I have the OP trying the 5.5.3 one-off kernel and then he said he updated from 1.5.7 to 1.5.7.102. As part 2 of that test (assuming the kernel upgrade doesn’t fix the issue) I’m going to have him roll FOS Linux back to 1.5.7 by downloading the binaries for 1.5.7 to see if that restores the speed.
-
@mp12 said in Slowdown Unicast and Multicast after upgrading FOG Server:
Luckily the 1.5.3 binaries boost speed back up to 13 GB/min. That's the speed we had before.
For now we will stick to the 1.5.3 binaries running behind FOG 1.5.7.112.
If there are improvements please let us know so we can test them.

Thanks for testing and updating the topic. Can you please use the 1.5.4 kernel and see if you can deploy using that? What's it doing speed-wise then?
-
We are running several tests at the moment, using only the binaries which can be downloaded from https://fogproject.org/binaries1.5.x.zip.
FOG is running dev-branch 1.5.7.112.
Here are some results.
Binaries 1.5.7: deploy speed around 12GB/min.
bzImage-1.5.7: Linux kernel x86 boot executable bzImage, version 4.19.48 (jenkins-agent@Tollana) #1 SMP Sun Jul 14 13:08:14 CDT , RO-rootFS, swap_dev 0x7, Normal VGA
Binaries 1.5.6: deploy speed around 12GB/min.
bzImage-1.5.6: Linux kernel x86 boot executable bzImage, version 4.19.36 (jenkins-agent@Tollana) #1 SMP Sun Apr 28 18:10:07 CDT , RO-rootFS, swap_dev 0x7, Normal VGA
Binaries 1.5.5: deploy speed around 12GB/min.
bzImage-1.5.5: Linux kernel x86 boot executable bzImage, version 4.19.1 (sebastian@Tollana) #1 SMP Fri Feb 22 01:04:27 CST 2019, RO-rootFS, swap_dev 0x8, Normal VGA
Binaries 1.5.4: deploy speed around 12GB/min.
bzImage-1.5.4: Linux kernel x86 boot executable bzImage, version 4.16.6 (builder@4c3c12e8cfd6) #4 SMP Wed May 9 22:08:36 UTC 201, RO-rootFS, swap_dev 0x7, Normal VGA
Binaries 1.5.3: deploy speed around 12GB/min.
bzImage-1.5.3: Linux kernel x86 boot executable bzImage, version 4.15.2 (builder@c38bc0acaeb4) #5 SMP Tue Feb 13 18:30:08 UTC 20, RO-rootFS, swap_dev 0x7, Normal VGA
-
@mp12 Just for clarity, you should be downloading the zip file from each release (ONLY) and using that as part of your test. The version of the FOG server should stay at 1.5.7.102 or whatever the latest release is.
The developers suspect that something in FOS Linux (contained in bzImage and init.xz in each binary zip file) changed at some point, causing this speed issue. They need to narrow down where between one 1.5.x release and the next the speed changed.
Also based on the data you collected so far, you can skip 1.5.4. I (we) are most interested in the 1.5.7 results.
-
We are still using FOG Server dev-branch 1.5.7.112.
All the binaries we tested (1.5.3 up to 1.5.7) ship partclone 0.2.89. Maybe that's the problem? The dev-branch binaries were running partclone 0.3.12.
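For anyone who wants to verify this on their own server, the partclone version inside an init can be checked by unpacking it. A minimal sketch, assuming the FOS init is an xz-compressed cpio archive with partclone under usr/sbin (both assumptions about the init layout, not confirmed in this thread):

```shell
# Assumed inspection steps (paths and archive layout are assumptions):
#   mkdir /tmp/initroot && cd /tmp/initroot
#   xz -dc /var/www/html/fog/service/ipxe/init.xz | cpio -idm
#   ./usr/sbin/partclone.ntfs --version
# Helper to pull the bare version number out of a --version line:
pc_version() {
  sed -n 's/.*v\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}
# Example with a version line of the shape partclone prints:
echo "Partclone : v0.2.89" | pc_version   # prints 0.2.89
```

The sed expression uses only POSIX basic regular expressions, so it behaves the same with GNU and BSD sed.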
-
@mp12 Do I get this right? Whichever kernel/init you use from one of the last releases 1.5.3 through to 1.5.7 all show fast deploy speeds?
-
That is correct.
-
@mp12 Perhaps, though I believe the same issue does not occur in Clonezilla, which also uses partclone 0.3.
Their partclone commands are relatively similar to ours, though they include a specific read/write buffer of
-z 10485760
(which is 10 times the default). That said, the default didn't cause issues before, so it's unlikely to cause issues now.
A greater divergence between FOS and Clonezilla is that Clonezilla uses Debian as the base for its live ISO, whereas we build a filesystem using Buildroot and the kernel from source.
So more likely a problem was introduced in that area, whether a bug, a config issue, or otherwise.
That all said, thank you for helping us narrow it down significantly already.
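To make the buffer figure concrete: -z takes a size in bytes, and 10485760 is 10 MiB, ten times partclone's 1 MiB default. The restore command below is only an illustration of where the flag would sit (image and device names are hypothetical):

```shell
# 10485760 bytes is exactly ten times the 1 MiB (1048576-byte) default:
default_buf=1048576
cz_buf=$((default_buf * 10))
echo "$cz_buf"   # prints 10485760
# A restore with the Clonezilla-style buffer would look roughly like
# (hypothetical image and device names):
#   partclone.ntfs -r -z 10485760 -s sda1.img -o /dev/sda1
```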
-
-
@Sebastian-Roth My apologies if this is too forward, but would the latest build with 0.3.13 be able to be installed by myself as well? I would love to test 0.3.13 to see if it fixes my slowness issue as well. I would gladly report back my findings.
-
@Sebastian-Roth Also, if this doesn't lead us to a solution, we could hack the inits and "borrow" Clonezilla's partclone to see if there is any change in performance.
I have a 9020 here; let me dig it out and see if I can duplicate the results, if only to confirm I can create a broken system.
-
@rogalskij If you have an immediate need, you can install the 1.5.7 version of FOS (not FOG) to get the speed back today. Just grab the 1.5.7 binaries, extract the bzImage* and init*.xz files, drop them into the /var/www/html/fog/service/ipxe directory, and then PXE boot the target computer. Those binaries will run fine on FOG 1.5.7.102 or later. You can do that until the devs get things sorted out. Just be aware that if you captured an image using 1.5.7.102 you cannot deploy it with FOS 1.5.7.
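The swap above can be scripted. A sketch, assuming a default Apache-based FOG install path and handling only the 64-bit bzImage/init.xz pair (the post also mentions the 32-bit files, which would follow the same pattern):

```shell
# swap_fos: copy a bzImage/init.xz pair into FOG's ipxe directory,
# backing up the current files first so you can switch back later.
swap_fos() {
  src="$1"                                     # dir with the extracted 1.5.7 files
  ipxe="${2:-/var/www/html/fog/service/ipxe}"  # assumed default FOG path
  for f in bzImage init.xz; do
    # keep the current (dev-branch) file as a .bak
    [ -f "$ipxe/$f" ] && cp "$ipxe/$f" "$ipxe/$f.bak"
    cp "$src/$f" "$ipxe/$f"
  done
}
# Usage (the zip URL pattern is assumed from the earlier post):
#   curl -LO https://fogproject.org/binaries1.5.7.zip && unzip binaries1.5.7.zip
#   swap_fos /path/to/extracted/files
```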
-
@mp12 Ok, here we go. Please try this init: https://fogproject.org/inits/init_partclone_0.3.13.xz
Make sure you use the kernel delivered with the latest FOG
dev-branch
version. If you are unsure, manually re-download it here: https://fogproject.org/kernels/Kernel.TomElliott.4.19.101.64
If the speed is high then we have ruled out the kernel (from dev-branch) and partclone 0.3.12 is surely what caused the slowdown in your case. If it's still slower than expected I would ask you to stick with init_partclone_0.3.13.xz but go back to the kernel from binaries1.5.7.zip. Slow or fast?
My apologies if this is too forward, but would the latest build with 0.3.13 be able to be installed by myself as well?
Sure you can, but it's way more complicated to explain right now than to just build it for you. If it turns out to be the issue we might need to move ahead to 0.3.13 for the official binaries anyway.
-
@Sebastian-Roth Can confirm that after testing this in my environment with the new init (partclone 0.3.13) and the latest dev kernel specified, things are back to fast again. The average in my environment was somewhere around 9 GB per minute.
-
@Quazz said in Slowdown Unicast and Multicast after upgrading FOG Server:
though they include a specific read/write buffer of -z 10485760 (which is 10 times the default). That said, the default didn't cause issues before, so it's unlikely to cause issues now
It would be really interesting to know why they picked such a large write buffer. What problem were they trying to solve? Or was this a holdover from a previous release of Clonezilla using partclone 0.2.89?
-
We are back on track!
Here is a snapshot with the dev-branch kernel and the new init_partclone_0.3.13.xz. The image we are deploying was previously captured with the dev kernel and dev init.xz.
At the moment we are also running a multicast at around 12 GB/min - 13 GB/min. Maybe everything has even become a little faster.
Again, great support and excellent work! Thank you, guys! If there are further tests, let us know.
-
@mp12 Looks good! I will update the official binaries tomorrow. Thanks for all the testing! Marking as solved.