Very slow cloning speed on specific model
-
@Sebastian-Roth I’m guessing that some programs/functionality got coded with a correct glibc version for that kernel version, thus allowing it to boot and such, but that certain ones for whatever reason didn’t??? (partclone perhaps?) I have no clue how or why that would happen though.
That’s the only thing I can think of anyway!
edit: perhaps here https://github.com/Thomas-Tsai/partclone/blob/master/fail-mbr/compile-mbr.sh
It might accidentally call system GCC which might use a newer glibc version, thus causing the mismatch?
-
@Sebastian-Roth said in Very slow cloning speed on specific model:
Is your image set to “Partclone zstd” or “Partclone gzip”??
Can confirm I’m using partclone gzip
-
@Quazz said in Very slow cloning speed on specific model:
edit: perhaps here https://github.com/Thomas-Tsai/partclone/blob/master/fail-mbr/compile-mbr.sh
It might accidentally call system GCC which might use a newer glibc version, thus causing the mismatch?
Hmm, interesting idea!! Looking at this for a bit, I can’t think of how this could make partclone hit the wall. The message “Broken pipe” usually means that the program at the other end of the pipe crashed, in this case pigz. Though this should actually compile against the proper buildroot glibc (older version): https://github.com/FOGProject/fos/blob/master/Buildroot/package/pigz/pigz.mk
-
@Quazz The error message is definitely thrown by glibc: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/dl-osinfo.h;hb=ef4e158c736d067304164c3daa763e4f425af248#l44
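For anyone wanting to check this on their own setup: glibc refuses to start when the running kernel is older than the minimum version it was built for. Below is a rough shell sketch of that kind of comparison, not the actual glibc code. The function names `pack_version` and `kernel_too_old` are made up for illustration, and `3.2.0` is only an assumed example minimum; the real value for a given binary can usually be read from the `for GNU/Linux x.y.z` part of the `file` output.

```shell
#!/bin/sh
# Turn "major.minor.patch[-suffix]" into one comparable integer.
# (Illustrative helper, not part of FOG or glibc.)
pack_version() {
    old_ifs=$IFS; IFS=.
    set -- $1                      # split the version string on dots
    IFS=$old_ifs
    major=${1%%[!0-9]*}            # strip anything after the leading digits
    minor=${2%%[!0-9]*}
    patch=${3%%[!0-9]*}
    echo $(( ${major:-0} * 1000000 + ${minor:-0} * 1000 + ${patch:-0} ))
}

# Succeeds when the running kernel ($1) is older than the required one ($2).
kernel_too_old() {
    [ "$(pack_version "$1")" -lt "$(pack_version "$2")" ]
}

running=$(uname -r)
required=3.2.0                     # assumed example, see `file /path/to/binary`
if kernel_too_old "$running" "$required"; then
    echo "kernel $running is older than $required"
else
    echo "kernel $running satisfies minimum $required"
fi
```

Run inside a debug session, this would show whether the booted kernel is below the minimum a binary in the init expects.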
-
@Sebastian-Roth Yes, glibc is generally the problem in kernel mismatches such as these, the question is where exactly? I’m not sure, but could it also say broken pipe if it gets interrupted unexpectedly by its piped data throwing a critical error such as say… kernel too old?
edit: Looking through build logs, ZSTD seems to be getting compiled against system gcc atm, at least from what I can tell. But OP is using pigz, so uhhhh
edit2: On the note of pigz, Buildroot provides a version these days, so we could remove the manually specified package unless we are attached to version 2.3.4 for some reason
edit3: Perhaps the workspace wasn’t cleaned properly for the build? Or alternatively, perhaps the GCC version has to be lowered.
-
@Quazz Thanks!! I will try to build another init for 4.9.x kernels when our build server is back to life.
-
@Sebastian-Roth I’m having the same issue. We are trying to image the new Dell OptiPlex 7070 Ultras and it has been hit or miss. It’s really slow to even register the host, and getting to the imaging process is slow too. I have one computer that has been running for 2 hours now and it’s just stuck on the first partclone screen.
-
@darkxeno Build server is back and I started a fresh 4.9.x init build. Will update here soon.
-
@darkxeno @dylan123 @Middle @oleg-knysh Build is done. Anyone keen to test?
sudo -i
cd /var/www/html/fog/service/ipxe
wget https://fogproject.org/kernels/bzImage-4.9.51
wget https://fogproject.org/inits/init-4.9.x.xz
chown apache:apache bzImage-4.9.51 init-4.9.x.xz
-
@Quazz said:
edit2: On the note of pigz, Buildroot provides a version these days, so we could remove the manually specified package unless we are attached to version 2.3.4 for some reason
Don’t think we are bound to use 2.3.4 - not that I know of. I just pushed on to buildroot 2019.04.8 and their official pigz version 2.4.
-
I’m noticing an issue as well. We just received some HP 840 G6 laptops and they came pre-loaded with bios version 01.03.00 and I was able to image them just fine. But then I decided to upgrade one of them to bios version 01.03.04 before imaging and now the imaging process is super slow. It also looks like bios version 01.03.00 isn’t available to download. So now I’m stuck waiting for a bios update I guess. I’ve talked to HP chat and they didn’t have 01.03.00 bios available to download.
-
@bberret said in Very slow cloning speed on specific model:
I’ve talked to HP chat and they didn’t have 01.03.00 bios available to download.
Probably good to keep on asking HP (email, telephone, …) about this to make them aware of the problem.
-
@bberret Will you try something for us? On the fog configuration -> fog settings -> general settings page there is a KERNEL ARGS parameter. Will you place this value in that field:
nvme_core.default_ps_max_latency_us=0
and then try imaging again. You may see a warning about that variable during imaging, but ignore it. It is a spurious error that is fixed in the yet unreleased FOG v1.5.8. This setting tells the nvme drive not to go into low power mode during imaging.
-
@george1421 Thanks for the quick reply. I tried what you suggested and it didn’t seem to help. And now my theory about it being an issue only with bios version 01.03.04 doesn’t seem to hold true. I just unboxed another laptop and it is having that exact same issue. This issue isn’t just slow imaging; it is slow loading pretty much everything after it loads the ipxe.efi file. When it’s trying to load the bzImage file (which usually takes 1 second), it takes 15 minutes before that file is loaded.
-
@bberret So two tests come to mind.
1. Does this computer have a bios mode? If so, as a test change it to bios mode to see if the bzImage and init.xz transfer speeds are normal.
2. Do you have an add-on card (PCIe) that has pxe booting capabilities? The idea is to take the onboard pxe firmware and network card out of the picture. See if the bzImage transfer is normal or not.
I thought about booting FOS linux from a usb stick, but you are telling me that imaging is slow too. If imaging was fast but pxe booting was slow, then I might point to iPXE as being the problem. But in this case booting from a usb stick will not isolate the issue.
Off the top of my head I would say either the network adapter or the pxe / uefi firmware.
-
@george1421 one option for eliminating the network card as the source of the problem (not completely, but mostly) is to boot with a usb adapter plugged in as well, both plugged into the network, and after ipxe loads unplug the network from the built in adapter.
-
@Sebastian-Roth said in Very slow cloning speed on specific model:
@darkxeno @dylan123 @Middle @oleg-knysh Build is done. Anyone keen to test?
sudo -i
cd /var/www/html/fog/service/ipxe
wget https://fogproject.org/kernels/bzImage-4.9.51
wget https://fogproject.org/inits/init-4.9.x.xz
chown apache:apache bzImage-4.9.51 init-4.9.x.xz
Thanks @Sebastian-Roth , I’ve just got back from leave. Ended up just manually setting up the device and don’t have another one to test with unfortunately. If I do, I’ll give this a test and see if it makes a difference. Thanks again for your assistance.
-
@Sebastian-Roth said in Very slow cloning speed on specific model:
@darkxeno @dylan123 @Middle @oleg-knysh Build is done. Anyone keen to test?
sudo -i
cd /var/www/html/fog/service/ipxe
wget https://fogproject.org/kernels/bzImage-4.9.51
wget https://fogproject.org/inits/init-4.9.x.xz
chown apache:apache bzImage-4.9.51 init-4.9.x.xz
Sorry for the late update. This didn’t change anything I’m afraid. I’ve been a little reluctant in updating during testing as the results I’ve had have been very inconsistent. I can’t explain it, but the first deployment of the day works without issues - it’s happened too many times now to be a coincidence.
I’ve currently changed back to the master branch to clean-up the server of the test kernels/inits we’ve been using. The only consistent deploy I can get is using the init_partclone.xy that @Quazz posted on the 5th Dec, entering debug mode and running the following:
nvme set-feature -f 0x0c -v=0 /dev/nvme0
This works every time. The average transfer speed is around 3GB/min, rather than >10GB/min that I get from the master branch build when it randomly works, but that’s good enough for us as the result is consistent and the image is small anyway.
This is the Google Drive link Quazz provided for the init: https://drive.google.com/open?id=1u_HuN5NSpzb7YmQBAsrzDELteNmlWUWU
I’ve had a look at the post init script option to see if I can automate this rather than entering debug, but I’m not really sure what I’m doing here.
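For what it’s worth, here is a rough sketch of how the debug-mode command above might be wrapped for a post init script. It is untested on real hardware and rests on a few assumptions: that the init in use ships the `nvme` tool (the one from Quazz evidently does), that feature 0x0c is the autonomous power state transition setting, and that the snippet is called from the post init hook (commonly `/images/dev/postinitscripts/fog.postinit`, but check your own install). `DEV_DIR` and `NVME_CMD` are made-up overrides for dry runs, not FOG settings.

```shell
#!/bin/sh
# Hypothetical post init helper: run the same set-feature command from the
# debug session on every NVMe controller before imaging starts.
DEV_DIR="${DEV_DIR:-/dev}"        # illustrative override, not a FOG setting
NVME_CMD="${NVME_CMD:-nvme}"      # illustrative override, not a FOG setting

disable_apst() {
    # nvme[0-9] matches controllers (nvme0), not namespaces (nvme0n1).
    for ctrl in "$DEV_DIR"/nvme[0-9]; do
        # The glob stays literal when no controller exists, so skip it.
        [ -e "$ctrl" ] || continue
        echo "Disabling APST on $ctrl"
        $NVME_CMD set-feature -f 0x0c -v=0 "$ctrl" \
            || echo "set-feature failed on $ctrl"
    done
}

disable_apst
```

Setting `NVME_CMD=echo` prints the commands instead of running them, which is a cheap way to check the loop before trusting it on a client.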
-
@Middle Can you try the KERNEL ARGS George suggested?
nvme_core.default_ps_max_latency_us=0
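To rule out the setting simply not reaching the client, something like the following could be run from a debug task. It is only a sketch: `has_kernel_arg` is a made-up helper name, and the sysfs path assumes the nvme_core driver exposes its parameters under `/sys/module` on the kernel in use.

```shell
#!/bin/sh
# Succeeds when the given parameter appears as a whole word on the kernel
# command line. $1: parameter, $2: cmdline file (defaults to /proc/cmdline).
has_kernel_arg() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

arg=nvme_core.default_ps_max_latency_us=0
if [ -r /proc/cmdline ] && has_kernel_arg "$arg"; then
    echo "kernel was booted with $arg"
else
    echo "$arg is not on the kernel command line"
fi

# Value the driver actually ended up with, if the module exposes it.
param=/sys/module/nvme_core/parameters/default_ps_max_latency_us
if [ -r "$param" ]; then
    echo "driver value: $(cat "$param")"
fi
```

If the argument shows up on the command line and the driver reports the same value but imaging is still slow, the power state setting is probably not the cause on that model.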
-
@Quazz We’ve tried that as well but it doesn’t help. I think it’s also included in the Dev branch by default now, which we’ve tried.