HP ProBook 640 G8 imaging extremely slowly
-
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
… HP ProBook 640 G8s …
Can’t find any valid information on the NIC built into this device. Please schedule a debug deploy (same as a normal deploy, but in the FOG web UI, just before you click the button, there is a checkbox for debug). Boot it up and hit ENTER twice to get to the command shell. Now run
lspci -nn | grep -i net
, take a picture and post that here in the forums.
Our storage nodes are a mix of versions, but we’ve tried one running 1.5.0 and others running 1.5.9 and it made no difference.
Are those storage nodes connected to each other? Replication would go wild if you use versions older and newer than 1.5.4 in one storage group.
-
@jacob-gallant FWIW: if you are running older versions of the inits they may be missing the NVMe kernel parameter that keeps NVMe drives from going into low power mode during imaging.
You really should have all storage nodes and the master node running the latest code. Then be sure to upgrade the FOS Linux kernel to 5.8.16 via the FOG Configuration -> Kernel Update page.
-
@sebastian-roth Couldn’t upload properly so hopefully this works: https://photos.google.com/share/AF1QipMf8sITfjDD3zQBcV1mXLrayWD2E6BfHazYGu8XSGWsQidx_KZ4L8em9k-yrTGMtA?key=NTluWlR3VDlWMjhBb2NtdUc5bVk1akRtZWY3Z1Jn
And no, we don’t replicate between storage nodes.
-
@george1421 Thanks George, I realize the different versions aren’t a great idea, but we just don’t have the cycles to get them all upgraded together all the time.
I couldn’t find a “5.6.18” version of the inits, so I copied over the latest from https://fogproject.org/inits but maybe that was the incorrect way to do it…
-
@jacob-gallant I don’t think I would mix kernel and inits. You might want to roll the changes back. Then set up a global kernel parameter (FOG Configuration -> FOG Settings) and set this parameter:
nvme_core.default_ps_max_latency_us=0
Then reimage and see if your imaging times improve.
-
@george1421 Thanks George. I’d prefer not to mix kernels and inits, but I couldn’t even get the devices to register until I upgraded to the 5.6.18 kernel, and I couldn’t find an init to match. It’s possible I just don’t know where to look.
I’ll give the global kernel parameter a shot.
-
@george1421 Kernel parameter didn’t make a difference unfortunately.
-
@jacob-gallant This one is a bit troubling.
5-6 MB/min is super slow; that’s less than one percent of what even a 100 Mbit network link can deliver.
I think I would deploy again, but tick the debug checkbox before submitting the task, then PXE boot the computer. You will be dropped to the FOS Linux command prompt on the target computer. Key in
cat /proc/cmdline
and make sure that parameter is listed. Also make sure that storageip is pointing to the proper storage node and not trying to image over your WAN network.
Also, we don’t know where the problem is right now, so use the latest kernel and inits appropriate for your installs. Put the target computer on the same subnet, ideally the same switch, as your master FOG server or storage node. Deploy from there and look at your speed. This will eliminate as much of your infrastructure as possible.
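If it helps, here is a quick way to pull just those two values out of the boot line from the FOS shell (a rough sketch; the grep pattern is mine, not anything FOG-specific):
# Print one kernel argument per line and keep only the ones we care about
cat /proc/cmdline | tr ' ' '\n' | grep -E 'nvme_core|storageip'
Both the nvme_core.default_ps_max_latency_us=0 entry and a storageip=... entry pointing at the local storage node should show up.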
-
@Jacob-Gallant In the picture we see PCI ID 8086:15fc for the ethernet adapter (Ethernet Connection (13) I219-V) - so you definitely need a kernel newer than 5.5.x to make this work. A quick search did not reveal anything obvious where people would report slow network speed with this card/driver.
So it’s probably good to take a step back from the kernel/driver causing the slowness (my first thought) and check other things. As George said, maybe it’s just running at 100 MBit/s? Can you check the status LEDs on the NIC and the switch port? I could also imagine some energy-saving issue on the NIC (EEE). Can you connect a dumb (unmanaged) switch in between and see if it’s still going slow?
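If ethtool happens to be included in your FOS init (an assumption on my part, it may not be), you can check the negotiated link speed and the EEE state right from the debug shell; eth0 is just a guess for the interface name:
ip link                                   # list interfaces to find the right name
ethtool eth0 | grep -iE 'speed|duplex'    # negotiated speed/duplex
ethtool --show-eee eth0                   # EEE (Energy Efficient Ethernet) status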
-
@george1421 I do get the parameter when entering that command, and the storageip is the one I was expecting. The device is not on the same subnet as the storage node or master server, but it’s within the same building and as mentioned other devices are deploying as expected.
Just in case, I switched over to the same subnet and switch as one of our storage nodes (a different subnet than the main server) and saw the same results.
-
@sebastian-roth Thanks for looking into that Sebastian. I can confirm that we’re connected at gigabit, and it is already connected to an unmanaged switch at my desk.
-
@jacob-gallant So just to clarify, it’s only the “HP ProBook 640 G8” computers that are behaving this way? Imaging another computer on the same network jack returns better performance? If so, then we can rule out infrastructure as the root of the problem.
Is the firmware up to date on that 640?
-
@george1421 That’s correct, just the ProBook 640 G8. The firmware is up to date.
-
@george1421 @Sebastian-Roth Currently building a completely separate new FOG environment to see if it’s any version/upgrade weirdness from our current environment that may be causing issues, I’ll let you know what I find out.
-
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
just the ProBook 640 G8
The debugging truth table points to the ProBook as the root of the fault.
So what is unique about this workstation compared to previous models? NVMe vs. SATA? A specific NVMe disk?
If you want to debug this hardware a bit more we can do that. The issue will be with either the network stack or the disk subsystem.
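A quick way to see exactly which NVMe drive is in there from the FOS shell (assuming it enumerates as nvme0; adjust if lsblk shows something else):
lsblk                              # find the block device, e.g. nvme0n1
cat /sys/class/nvme/nvme0/model    # drive model string reported by the NVMe controller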
Here is a link to iperf3 https://drive.google.com/file/d/1fLYGI-roYGongTVRS_4zQ7dWJ7osN4AW/view?usp=sharing
Place it in
/images
directory.
The concept of the testing comes from this post: https://forums.fogproject.org/post/98231
The setup for testing is pretty easy: just set up a debug deployment to this computer (tick the debug checkbox before submitting the deployment task).
Now PXE boot the target computer; after a few screens of text you will be dropped to the FOS Linux command prompt.
At the FOS Linux command prompt, key in
fog
You will need to press Enter after each breakpoint in the imaging code. After you see the first partclone copy complete, press Ctrl-C to break out of the deployment script.
The iperf command will be in the
/images
directory on the PXE-booted computer. Copy it over to the /tmp directory on the target computer:
cp /images/iperf3 /tmp
Then run the iperf command as outlined in this post. https://forums.fogproject.org/post/98230
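In rough terms the two ends look like this (the address is a placeholder for your FOG server’s IP, and this assumes iperf3 is also installed on the FOG server itself, e.g. via your distro’s package manager):
# On the FOG server (receiver):
iperf3 -s
# On the target computer, from the FOS shell (sender), run for 30 seconds:
/tmp/iperf3 -c 192.168.1.10 -t 30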
Set the iperf3 receiver up on the FOG server first, then run the client on the target computer. This test will show how fast the network is between the target computer and the FOG server.
Once the network bits have been tested, testing the hard drive is next.
This will be a bit more complicated to test, so see if the
hdparm
command comes back with something in FOS Linux.
-
@george1421 Sure, give me some time and I’ll run those tests and see what I find. Thanks again!
-
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
@george1421 @Sebastian-Roth Currently building a completely separate new FOG environment to see if it’s any version/upgrade weirdness from our current environment that may be causing issues, I’ll let you know what I find out.
Same results @george1421 @Sebastian-Roth
-
@george1421 said in HP ProBook 640 G8 imaging extremely slowly:
If you want to debug this hardware a bit more we can do that. The issue will be with either the network stack or the disk subsystem. …
Here are the iperf3 results: https://photos.app.goo.gl/xXFPLZFHAJT7dPEo9
-
@jacob-gallant Well, what I find troubling is the Retr (retransmits) column. On a stable network that should be zero. It makes me think networking (it could be loaded network infrastructure, or it could be the NIC in the computer). These retransmits would cause it to initially show pretty good performance but then start backing off right away until it found a happy result at a slower transfer rate. It would be interesting to see if you get the same retransmission results on different hardware.
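If you want to dig a bit deeper on the network side (just a suggestion; the flags are standard iperf3 options and the server address is a placeholder), you can watch the backoff happen second by second and also test the reverse direction to see whether it is the send or the receive path that suffers:
# Per-second throughput report for 30 seconds (watch it decay)
/tmp/iperf3 -c 192.168.1.10 -t 30 -i 1
# Same test with the FOG server sending and the target receiving
/tmp/iperf3 -c 192.168.1.10 -t 30 -i 1 -R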
If you are still at the FOS Linux command prompt, see if the
hdparm
command is installed. If so, then run
lsblk
to find the drive. It will be /dev/sda or /dev/nvme(something).
Once you’ve found the disk, run this:
hdparm -Tt /dev/sda
assuming the disk is /dev/sda, the first SATA disk.
There is another test where we need to use fdisk to remove the current partition, make a new partition the size of the disk, and then format the partition ext4. Then we can mount it and run the dd command from the article to test write speed to the disk, but I have to run off to a meeting, so I won’t have a chance to write down the testing procedure right now.
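Until then, here is a rough sketch of what that write test could look like (assumptions: the drive shows up as /dev/nvme0n1, the new partition ends up as /dev/nvme0n1p1, and mkfs.ext4 is present in the FOS init; this is destructive to whatever is on the disk):
fdisk /dev/nvme0n1            # delete existing partitions (d), create one new partition (n), write (w)
mkfs.ext4 /dev/nvme0n1p1      # format the new partition
mkdir -p /mnt/test
mount /dev/nvme0n1p1 /mnt/test
# Write ~1 GB sequentially and force it to disk so the cache does not hide the real speed
# (if your dd does not support conv=fsync, drop it and run sync afterwards)
dd if=/dev/zero of=/mnt/test/ddtest bs=1M count=1024 conv=fsync
umount /mnt/test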
-
@george1421 I ran iperf on a working device and here are the results: 0 retransmits, as you mentioned. You’re also right that that is exactly what I’m seeing on the 640 G8; it starts off with reasonable performance but quickly drops to a crawl:
https://photos.app.goo.gl/oVrtqpnhmYHh39LK9
Here are the results of the hdparm command.