Slowness after upgrade to 1.5.7.102 (dev branch)
-
We have experienced a major slowdown when deploying images after upgrading to a newer dev branch to fix some issues we had (I am now on 1.5.7.102). I went from roughly 13GB per minute to 2GB per minute or slower. Very frustrating. I tried capturing the image several times with different compression settings, among other things. I tried multiple Dell Optiplex models and all of them show the same slowness. The issue occurs on two separate servers (our college has two campuses, each with its own server). It almost feels like the kernel change did this: we were on “5.1.16 mac nvmefix” but then upgraded to “4.19.101”, which came with the 1.5.7.102 install.
I am interested in any fix for this as my desktop support team is very frustrated at the moment. I am happy to test out any theories to help this along. Would hate for others to run into this as well. Model information of the computers used is below:
Dell Optiplex 9010 - 8gb ram, core i5, 500gb crucial SSD drive
Dell Optiplex 9020 - 12gb ram, core i5, 500gb crucial SSD drive
Dell Optiplex 7050 - 16gb ram, core i7, 500gb crucial SSD drive
Dell Optiplex 7060 - 16gb ram, core i7, not sure about the hard drive
-
What version of FOG did you upgrade from? You are on the dev branch right now, but where did you start from?
I also have a new one-off kernel, version 5.5.3, if you want to test that.
I do have a 9020 that I can test against here.
-
@george1421 I upgraded from 1.5.7 (stable branch). I was having issues, and the dev version did indeed fix my issues but seemed to introduce new issues (the slowness). I would be very interested in testing the 5.5.3 kernel. Anything I can do to assist I am willing to. FOG is an amazing product and our desktop team lives by it here. Thank you!
-
@rogalskij https://drive.google.com/open?id=1thopskSYJd7ueDQeFg_VT4eeNcrNHvIx
Download this as bzImage553 and move it to the /var/www/html/fog/service/ipxe directory on the FOG server. Then go into the host configuration and add bzImage553 to the kernel field.
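The copy step can be sketched as a small shell function. This is only an illustration: the function name `install_kernel`, its arguments, and the `chmod 644` (so the web server can read the file) are assumptions, not part of FOG. Only the target directory and the name bzImage553 come from the post.

```shell
#!/bin/sh
# Hypothetical helper: copy a downloaded kernel into the iPXE directory
# under a distinct name, leaving the stock bzImage untouched.
install_kernel() {
    src=$1       # path to the downloaded kernel file
    dest_dir=$2  # e.g. /var/www/html/fog/service/ipxe
    name=$3      # e.g. bzImage553 (the name is case-sensitive)

    if [ ! -f "$src" ]; then
        echo "missing source: $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "missing directory: $dest_dir" >&2
        return 1
    fi
    # Refuse to overwrite an existing file of the same name.
    if [ -e "$dest_dir/$name" ]; then
        echo "already exists: $dest_dir/$name" >&2
        return 1
    fi

    # Assumption: world-readable is enough for the web server to serve it.
    cp "$src" "$dest_dir/$name" && chmod 644 "$dest_dir/$name"
}
```

On the server this might be called as `install_kernel ~/bzImage553 /var/www/html/fog/service/ipxe bzImage553`, with enough privileges to write to that directory. The download location is hypothetical.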
It’s interesting to hear you have this slowness going from 1.5.7 to 1.5.7.102.
If the kernel doesn’t fix the issue, let’s do something similar to the other thread. In your case, let’s download the 1.5.7 kernel and inits to see if the speed changes between 1.5.7 and 1.5.7.102, as you said.
For part 2 of this test download this zip file to your windows computer. https://fogproject.org/binaries1.5.7.zip
Extract bzImage and init.xz from the zip file (you only need these specific files) and move them to the /var/www/html/fog/service/ipxe directory on your FOG server. This will overwrite the FOS Linux files from 1.5.7.102. Now attempt to image one of the impacted systems. While imaging, pay attention to the blue screen that has the bar graph. This is the partclone screen. Please record the version of partclone you see. It may be 0.2.89 or 0.3.11 or something else. I’m curious to see if updating partclone did something unexpected. Does restoring FOS Linux back to 1.5.7 resolve the slowness issue?
To restore the 1.5.7.102 kernel and inits we’ll just rerun the installer. The 5.5.3 kernel will be safe since we called it bzImage553.
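Rerunning the installer restores the files, as noted above. A local copy taken before the overwrite is simply a quicker way back. A minimal sketch, assuming a hypothetical helper named `backup_fos_files` and a backup directory of your choosing (neither is part of FOG):

```shell
#!/bin/sh
# Hypothetical helper: copy the current FOS kernel and init to a backup
# directory before they are overwritten by the 1.5.7 binaries.
backup_fos_files() {
    ipxe_dir=$1    # e.g. /var/www/html/fog/service/ipxe
    backup_dir=$2  # any directory outside ipxe_dir

    mkdir -p "$backup_dir" || return 1
    for f in bzImage init.xz; do
        if [ ! -f "$ipxe_dir/$f" ]; then
            echo "missing: $ipxe_dir/$f" >&2
            return 1
        fi
        # -p keeps the mode and timestamps of the original file.
        cp -p "$ipxe_dir/$f" "$backup_dir/$f" || return 1
    done
}
```

For example, `backup_fos_files /var/www/html/fog/service/ipxe /root/fos-1.5.7.102` would leave copies of both files in `/root/fos-1.5.7.102` (a made-up path); copying them back undoes the test.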
-
@rogalskij Just want to add to George’s great instructions on testing that you might need to re-capture the image when using the 1.5.7 kernel/init, because they use an older version of partclone with a different image format. Newer partclone can deploy older images but not the other way round.
-
@george1421 Thank you for these great instructions. I am going to try it now. I did see that the partclone version that was on the screen was 0.3.12 and the image pushed at about half the speed it did previously. I am going to try this new Kernel as I haven’t tested it yet, and will report back.
-
@george1421 said in Slowness after upgrade to 1.5.7.102 (dev branch):
/var/www/html/fog/service/ipxe
Just tried the new Kernel, I moved it to: /var/www/html/fog/service/ipxe and for the Optiplex 7050 in question I changed “Host Kernel” to “bzImage533”. Is that all I had to do to make that machine image using that 5.3.3 kernel? I didn’t see a way for me to know if I was using the right kernel or not.
-
@rogalskij In the host definition for that specific computer (via the web UI), go to that target host. On the main page there is a field called kernel. In that field enter the custom kernel name
bzImage553
(watch your case). Then capture/deploy the image. When you PXE boot the computer, watch the iPXE screen: you should see it transfer bzImage553 to the target computer along with init.xz.
-
@george1421 I watched during the deploy but didn’t see anything relating to which bzimage version it is using. Odd. I have a feeling it is still using the other bzimage. I am going to test by backing up the current 4.19.101 bzimage and renaming the bzimage533 to “bzimage”. Just a temporary test.
-
@george1421 Oddly, when I deploy it doesn’t show any screen that mentions bzImage or init.xz at all. Is this because I am using ipxe.efi for booting? I am a bit ignorant of what these files actually do. Does the kernel boot the host client, or does ipxe.efi boot the client?
-
@rogalskij When you deploy an image in the web UI and then PXE boot the target computer, iPXE boots and then calls the boot.php script on the server. Right after boot.php is called it should transfer bzImage and init.xz. The transfer is super fast, so it’s possible to miss it.
So let’s take this route. Go to tasks and close out any open tasks, then go back to schedule another deploy, but this time tick the debug checkbox before you hit the schedule task button. Now PXE boot the target computer. After a few screens of text that you need to clear with the enter key, you will be dropped to a FOS Linux command prompt. At the command prompt key in
uname -a
That will print out the kernel version and name. The version number should be 5.5.3 if the proper kernel is booting. If that is the case, then key in
fog
and press enter at the command prompt. This is FOG in debug mode: you will need to press the enter key at each breakpoint in the program, but you will be able to single-step through the deployment. It will get to the partclone screen so you can see the transfer rates.
-
@george1421 Tried the debug method, it booted from bzImage533 but it failed to start. Below is the error I received while attempting to boot using that bzImage:
-
@rogalskij Please run
file /var/www/html/fog/service/ipxe/bzImage533
on your FOG server command line and post output here.
EDIT: Reading the earlier posts I think this is a typo:
bzImage533 vs. bzImage553
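A typo like this can be caught on the server before booting the client by checking that the name entered in the host's kernel field matches a file in the iPXE directory. A rough sketch follows; the function name `check_kernel_name` is made up for illustration and the name is passed by hand, since nothing here reads it from the FOG database.

```shell
#!/bin/sh
# Hypothetical helper: check that a kernel name, as typed into the host's
# kernel field, exists in the iPXE directory. If not, list the kernels
# that are present so a typo is easy to spot.
check_kernel_name() {
    ipxe_dir=$1  # e.g. /var/www/html/fog/service/ipxe
    name=$2      # name exactly as entered in the web UI

    if [ -f "$ipxe_dir/$name" ]; then
        echo "found: $name"
        return 0
    fi

    echo "not found: $name"
    echo "kernels present:"
    for f in "$ipxe_dir"/bzImage*; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            echo "${f##*/}"
        fi
    done
    return 1
}
```

Running `check_kernel_name /var/www/html/fog/service/ipxe bzImage533` on a server that only holds bzImage and bzImage553 would report the name as not found and list those two files.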
-
@Sebastian-Roth said in Slowness after upgrade to 1.5.7.102 (dev branch):
file /var/www/html/fog/service/ipxe/bzImage533
Good catch, found the typo and got it to deploy. The speed is the exact same however. I stepped through all the steps and verified the 5.5.3 kernel, and the speed is around 950MB per minute. Way slower than my usual. Partclone is 0.3.12. I tried using the old init.xz file from 1.5.7 but it claimed I didn’t have enough memory. Very, very odd. Do I just deal with it until stable version of 1.5.8 comes out maybe?
-
@rogalskij said in Slowness after upgrade to 1.5.7.102 (dev branch):
The speed is the exact same however.
Too bad!
I tried using the old init.xz file from 1.5.7 but it claimed I didn’t have enough memory.
Can you try using init.xz and bzImage from https://fogproject.org/binaries1.5.7.zip as well as from https://fogproject.org/binaries1.5.3.zip and see if any of these will speed things up again? Note that you have to re-capture the image before you can deploy, because those use an older version of partclone that is not able to deploy newer-style images.
Do I just deal with it until stable version of 1.5.8 comes out maybe?
We can’t fix this without knowing what exactly is causing this. We don’t have your hardware here and so we need your help to test things and report back.
-
@rogalskij In your initial post you said:
Dell Optiplex 9010 - 8gb ram, core i5, 500gb crucial SSD drive
Dell Optiplex 9020 - 12gb ram, core i5, 500gb crucial SSD drive
Dell Optiplex 7050 - 16gb ram, core i7, 500gb crucial SSD drive
Dell Optiplex 7060 - 16gb ram, core i7, not sure about the hard drive
Sounds pretty much like you have the same SSD in all those machines, right? Are those normal SATA SSDs or NVMe drives?
-
@Sebastian-Roth They are normal SATA SSD drives for the most part. I was told by a member of the desktop team today, though, that they tried to image an Optiplex 7070 with an M.2 solid state drive (the RAM-style drives) and it was around 1.4GB per minute.
I will certainly try to make some changes, but the fixes that came with the dev version far outweigh the speed issues I am having at the moment. Hoping some bug fixes in the 1.5.8 version will do the trick. I will keep testing as I can. Thank you all for your help for now!
-
@rogalskij While I understand the slowdown is problematic, some of it seems bound to network speeds. For your tests, are you only imaging a single machine, or multiple machines at the same time? Are they on a separate network or a congested one? No matter how fast an SSD you have, these things need to be considered as well. What speeds does the network actually deliver?
Please don’t take this as the only word. I just want to best understand the whole.
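To compare the imaging rates quoted in this thread with a network link speed, it can help to convert GB per minute into Mbit per second. The sketch below assumes decimal units (1 GB = 8000 Mbit), which may not match how partclone counts; the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert an imaging rate in GB/min to Mbit/s.
# Assumption: decimal units, i.e. 1 GB = 1000 MB = 8000 Mbit.
gb_per_min_to_mbit_per_s() {
    awk -v r="$1" 'BEGIN { printf "%.0f\n", r * 8000 / 60 }'
}

gb_per_min_to_mbit_per_s 13   # prints 1733
gb_per_min_to_mbit_per_s 2    # prints 267
```

Under that assumption, the original 13GB per minute is more than a single 1 Gbit/s link can carry uncompressed. The thread does not state the link speed, but if the clients are on gigabit links, the figure on the partclone screen likely counts data written to disk after decompression rather than bytes on the wire, so disk and decompression speed matter alongside the network.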
-
@Tom-Elliott It is possible it is something in my environment. Typically it is a single machine (imaged either via the console, or directly from the machine itself by a member of our desktop support team). It seemed related to the version because I first noticed it after I upgraded from 1.5.7 to the dev version of 1.5.8 and then later the dev version of 1.5.7.102. It went from like 12GB/min to 4GB/min or 1GB/min. The servers are production boxes so I can’t make too many changes without potentially affecting others but it still seems like it is related to the update I did. Perhaps the update of partclone? Also I noticed that when I pulled an image today for testing, the capture was super quick at 10GB/min but the deploys to the same machine and others like it were 4GB or less. I feel like I am missing something extremely mundane and obvious.
-
@rogalskij what’s the lowest version kernel you’ve tried?