Slowdown Unicast and Multicast after upgrading FOG Server
-
@mp12 Just curious if you ever figured out the source of your slow down? I too experienced the same major slowdown after upgrading to a newer dev branch to fix some issues we had (I am now on 1.5.7.102). I went from roughly 13GB per minute to 2GB or slower per minute. Very frustrating. I tried capturing the image a couple different times with different compression and a bunch of things. Tried on multiple Dell Optiplex models: 9010, 9020, 7050 and all of them display the same slowness. Got this issue on two separate servers (we have 2 campuses at our College, so two different servers). Almost feels like the different kernels did this. We were on “5.1.16 mac nvmefix” but then upgraded to the “4.19.101” which came with the 1.5.7.102 install.
I am interested in any fix for this as my desktop support team is very frustrated at the moment. I am happy to test out any theories to help this along. Would hate for others to run into this as well.
-
@Sebastian-Roth
We have now done two re-captures: one with the 1.5.3 binaries and one with the current binaries from dev-branch 1.5.7.112. We see no improvement when deploying with 1.5.7. The speed with the 1.5.7.112 binaries is around 5 GB/min.
Luckily the binaries 1.5.3 boost up to 13 GB/min. That’s the speed we had before. For now we will stick to 1.5.3 binaries running behind FOG 1.5.7.112.
If there are improvements please let us know so we can test them.
-
@rogalskij said in Slowdown Unicast and Multicast after upgrading FOG Server:
Tried on multiple Dell Optiplex models: 9010, 9020, 7050 and all of them display the same slowness.
May I ask you to open a new topic yourself and post all your hardware specs (hosts, not the FOG server) there? While I am not exactly sure yet, this problem seems to be very specific to the SSD used by @mp12, and we should try not to put too much information in one topic, as it leads to major confusion and failure to find and fix the issues in the end. If it turns out to be the exact same issue (which I doubt), we can still cross-link the topics later on.
-
@Sebastian-Roth Absolutely. Starting new topic now. My apologies folks!
-
@mp12 Would it be possible to test the binaries between 1.5.3 and 1.5.7 (so 1.5.4, 1.5.5, 1.5.6)?
This will help us track down roughly when the problem was introduced (I believe there are about 2 years between 1.5.3 and the current dev-branch).
-
@Quazz In the other thread (with a similar condition) I have the OP trying the 5.5.3 one-off kernel and then he said he updated from 1.5.7 to 1.5.7.102. As part 2 of that test (assuming the kernel upgrade doesn’t fix the issue) I’m going to have him roll FOS Linux back to 1.5.7 by downloading the binaries for 1.5.7 to see if that restores the speed.
-
@mp12 said in Slowdown Unicast and Multicast after upgrading FOG Server:
Luckily the binaries 1.5.3 boost up to 13 GB/min. That’s the speed we had before.
For now we will stick to 1.5.3 binaries running behind FOG 1.5.7.112.
If there are improvements please let us know so we can test them.
Thanks for testing and updating the topic. Can you please use the 1.5.4 kernel and see if you can deploy using that? What’s it doing speed-wise then?
-
We are running several tests at the moment, using only the binaries that can be downloaded from https://fogproject.org/binaries1.5.x.zip.
FOG is running dev-branch 1.5.7.112.
Here are some results.
Binaries 1.5.7: deploy speed around 12GB/min.
bzImage-1.5.7: Linux kernel x86 boot executable bzImage, version 4.19.48 (jenkins-agent@Tollana) #1 SMP Sun Jul 14 13:08:14 CDT , RO-rootFS, swap_dev 0x7, Normal VGA
Binaries 1.5.6: deploy speed around 12GB/min.
bzImage-1.5.6: Linux kernel x86 boot executable bzImage, version 4.19.36 (jenkins-agent@Tollana) #1 SMP Sun Apr 28 18:10:07 CDT , RO-rootFS, swap_dev 0x7, Normal VGA
Binaries 1.5.5: deploy speed around 12GB/min.
bzImage-1.5.5: Linux kernel x86 boot executable bzImage, version 4.19.1 (sebastian@Tollana) #1 SMP Fri Feb 22 01:04:27 CST 2019, RO-rootFS, swap_dev 0x8, Normal VGA
Binaries 1.5.4: deploy speed around 12GB/min.
bzImage-1.5.4: Linux kernel x86 boot executable bzImage, version 4.16.6 (builder@4c3c12e8cfd6) #4 SMP Wed May 9 22:08:36 UTC 201, RO-rootFS, swap_dev 0x7, Normal VGA
Binaries 1.5.3: deploy speed around 12GB/min.
bzImage-1.5.3: Linux kernel x86 boot executable bzImage, version 4.15.2 (builder@c38bc0acaeb4) #5 SMP Tue Feb 13 18:30:08 UTC 20, RO-rootFS, swap_dev 0x7, Normal VGA
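In case it helps anyone comparing releases, here is a minimal sketch that pulls the kernel version out of each of the lines above. It assumes the lines come from running the `file` utility on each bzImage; the sample text is shortened to the part up to the builder name, and the function name `kernel_versions` is just made up for this illustration.

```python
import re

# Shortened copies of the lines posted above (everything after the
# builder name is dropped, as it is not needed for the version lookup).
FILE_OUTPUT = """\
bzImage-1.5.7: Linux kernel x86 boot executable bzImage, version 4.19.48 (jenkins-agent@Tollana)
bzImage-1.5.6: Linux kernel x86 boot executable bzImage, version 4.19.36 (jenkins-agent@Tollana)
bzImage-1.5.5: Linux kernel x86 boot executable bzImage, version 4.19.1 (sebastian@Tollana)
bzImage-1.5.4: Linux kernel x86 boot executable bzImage, version 4.16.6 (builder@4c3c12e8cfd6)
bzImage-1.5.3: Linux kernel x86 boot executable bzImage, version 4.15.2 (builder@c38bc0acaeb4)
"""

# "bzImage-<release>: ... version <kernel> ..."
PATTERN = re.compile(r"^bzImage-(?P<release>[\d.]+): .*?\bversion (?P<kernel>[\d.]+)")


def kernel_versions(text):
    """Map each FOG release to the kernel version reported for its bzImage."""
    versions = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            versions[match.group("release")] = match.group("kernel")
    return versions


for release, kernel in sorted(kernel_versions(FILE_OUTPUT).items()):
    print(f"FOG {release} ships kernel {kernel}")
```

This only restates what is already in the post in a form that is easier to diff against a later test run.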
-
@mp12 Just for clarity, you should be downloading (ONLY) the zip file from each release and using that as part of your test. The version of FOG Server should stay at 1.5.7.102 or whatever is the latest release.
The developers are suspecting something in FOS Linux (contained in bzImage and init.xz in each binary zip file) has changed somewhere at some time causing this speed issue. They need to narrow down when the speed changed between FOS Linux 1.5.x and 1.5.xn.
Also based on the data you collected so far, you can skip 1.5.4. I (we) are most interested in the 1.5.7 results.
-
We are still using FOG Server dev-branch 1.5.7.112.
All binaries (1.5.3 up to 1.5.7) used partclone version 0.2.89. Maybe that’s the problem? The binaries from dev-branch were running partclone 0.3.12.
-
@mp12 Do I get this right? Whichever kernel/init you use from one of the last releases 1.5.3 through to 1.5.7 all show fast deploy speeds?
-
That is correct.
-
@mp12 Perhaps, though I believe the same issue does not occur in Clonezilla which also uses partclone 0.3.
Their partclone commands are relatively similar to ours, though they set a specific read/write buffer with -z 10485760 (which is 10 times the default). That said, the default didn’t cause issues before, so it’s unlikely to cause issues now.
A greater divergence between FOS and Clonezilla is that Clonezilla uses Debian as a base for their live ISO, whereas we build a filesystem using Buildroot and a kernel from source.
So more likely there is a problem introduced in that area, whether bug, config issues, or otherwise.
That all said, thank you for helping us narrow it down significantly already.
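For reference, the buffer size mentioned above works out as follows. This is only arithmetic on the number quoted in the post; the 1 MiB default is an assumption inferred from the "10 times the default" remark, and `describe_buffer` is a made-up helper name.

```python
# Value Clonezilla passes to partclone via -z, as quoted in the post above.
CLONEZILLA_BUFFER_BYTES = 10485760

# Assumption: a default buffer of 1 MiB, inferred from "10 times the default".
ASSUMED_DEFAULT_BYTES = 1024 * 1024


def describe_buffer(size_bytes, default_bytes=ASSUMED_DEFAULT_BYTES):
    """Return (size in MiB, multiple of the assumed default)."""
    return size_bytes / (1024 * 1024), size_bytes / default_bytes


mib, ratio = describe_buffer(CLONEZILLA_BUFFER_BYTES)
print(f"{mib:.0f} MiB, {ratio:.0f}x the assumed default")
```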
-
@Sebastian-Roth My apologies if this is too forward, but would the latest build with 0.3.13 be able to be installed by myself as well? I would love to test 0.3.13 to see if it fixes my slowness issue as well. I would gladly report back my findings.
-
@Sebastian-Roth Also, if this doesn’t lead us to a solution, we could hack the inits and then “borrow” Clonezilla’s partclone to see if there is any change in performance.
I have a 9020 here, let me dig it out and see if I can duplicate the results. Only to confirm I can create a broken system.
-
@rogalskij If you have an immediate need, you can install the 1.5.7 version of FOS (not FOG) to get the speed back today. Just grab the 1.5.7 binaries, extract the bzImage* and init*.xz files, drop them into the /var/www/html/fog/service/ipxe directory and then PXE boot the target computer. Those binaries will run fine on FOG 1.5.7.102 or later. You can do that until the devs can get things sorted out. Just be aware that if you captured an image using 1.5.7.102 you cannot deploy it with FOS 1.5.7.
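If you would rather script that swap, the steps above could be sketched roughly like this. It assumes you have already downloaded the binaries zip to a local path; the function name `install_fos_binaries` and the `.bak` backup suffix are my own choices, not anything FOG provides, and keeping a backup of the existing files is an extra precaution that is not part of the steps above.

```python
import fnmatch
import shutil
import zipfile
from pathlib import Path


def install_fos_binaries(zip_path, ipxe_dir, backup_suffix=".bak"):
    """Copy bzImage* and init*.xz from a binaries zip into ipxe_dir.

    Existing files with the same name are copied to <name><backup_suffix>
    first, so the previous kernel/init can be restored by hand.
    Returns the sorted list of file names that were installed.
    """
    ipxe_dir = Path(ipxe_dir)
    installed = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue
            # Ignore any folder prefix inside the zip; only the file name matters.
            name = Path(member.filename).name
            wanted = (fnmatch.fnmatchcase(name, "bzImage*")
                      or fnmatch.fnmatchcase(name, "init*.xz"))
            if not wanted:
                continue
            target = ipxe_dir / name
            if target.exists():
                shutil.copy2(target, target.with_name(name + backup_suffix))
            with archive.open(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            installed.append(name)
    return sorted(installed)


# Example call (paths are assumptions; run with enough rights to write there):
# install_fos_binaries("binaries1.5.7.zip", "/var/www/html/fog/service/ipxe")
```

To go back later, copy the `.bak` files over the installed ones again.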
-
@mp12 Ok, here we go. Please try this init: https://fogproject.org/inits/init_partclone_0.3.13.xz
Make sure you use the kernel delivered with the latest FOG dev-branch version. If you are unsure, manually re-download it here: https://fogproject.org/kernels/Kernel.TomElliott.4.19.101.64
If the speed is high then we have ruled out the kernel (from dev-branch) and surely the partclone version 0.3.12 would cause the slowdown in your case. If it’s still slower than expected I would ask you to stick to the init_partclone_0.3.13.xz but go back to kernel from binaries1.5.7.zip. Slow or fast?
My apologies if this is too forward, but would the latest build with 0.3.13 be able to be installed by myself as well?
Sure you can, but it’s way more complicated to explain right now than to just build it for you. If it turns out to be the issue we might need to move to 0.3.13 for the official binaries anyway.
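To make the test plan for @mp12 above easier to follow, here is a rough sketch of the decision logic. The first two branches follow what is written in this post; how to read the result of the second test is my own assumption and is not stated above. The function name `next_step` is hypothetical.

```python
def next_step(dev_kernel_fast, release_kernel_fast=None):
    """Suggest a conclusion or next test when using init_partclone_0.3.13.xz.

    dev_kernel_fast:     result with the dev-branch kernel (4.19.101).
    release_kernel_fast: result with the kernel from binaries1.5.7.zip,
                         or None if that test has not been run yet.
    """
    if dev_kernel_fast:
        # From the post: kernel ruled out, partclone 0.3.12 is the suspect.
        return "partclone 0.3.12 suspected; dev-branch kernel ruled out"
    if release_kernel_fast is None:
        # From the post: keep the new init, go back to the 1.5.7 kernel.
        return "retest with kernel from binaries1.5.7.zip"
    if release_kernel_fast:
        # Assumption: same init, only the kernel changed, so the kernel is suspect.
        return "dev-branch kernel suspected"
    # Assumption: neither change alone restores the speed.
    return "inconclusive; keep investigating"


print(next_step(dev_kernel_fast=False))
```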
-
@Sebastian-Roth Can confirm that after testing this in my environment with the new init (partclone 0.3.13) and the latest dev kernel specified, things are back to fast again. The average in my environment was somewhere around 9 GB per minute.