HP ProBook 640 G8 imaging extremely slowly
-
@Jacob-Gallant In the picture we see PCI ID 8086:15fc for the ethernet adapter (Ethernet Connection (13) I219-V) - so you definitely need a kernel newer than 5.5.x to make this work. A quick search did not reveal anything obvious where people would report slow network speed with this card/driver.
So it’s probably good to take a step back from the kernel/driver causing the slowness (my thinking at first) and check other things. As George said, maybe it’s just running at 100 MBit/s? Can you check the status LEDs on the NIC and the switch port? I could also imagine an energy-saving issue on the NIC (EEE). Can you connect a dumb (unmanaged) switch in between and see if it’s still slow?
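For reference, a minimal sketch of how the negotiated link speed and the EEE state could be checked from a Linux prompt. It assumes ethtool is installed (it may not be part of FOS Linux) and uses eth0 as a placeholder interface name; neither comes from this thread.

```shell
#!/bin/sh
# Assumption: eth0 is a placeholder; find the real interface name with `ip link`.
IFACE="${IFACE:-eth0}"

# Build the commands first so they can be reviewed before anything is run.
speed_cmd="ethtool $IFACE"                      # reports negotiated Speed and Duplex
eee_show_cmd="ethtool --show-eee $IFACE"        # reports whether EEE is enabled
eee_off_cmd="ethtool --set-eee $IFACE eee off"  # turns EEE off for a test (needs root)

if command -v ethtool >/dev/null 2>&1 && [ -e "/sys/class/net/$IFACE" ]; then
    $speed_cmd | grep -E 'Speed|Duplex'
    $eee_show_cmd
else
    echo "ethtool or $IFACE not available; the commands would be:"
    printf '%s\n' "$speed_cmd" "$eee_show_cmd" "$eee_off_cmd"
fi
```

The sketch only reads the current state; the EEE-off command is printed rather than run so that it stays a deliberate manual step.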
-
@george1421 I do get the parameter when entering that command, and the storageip is the one I was expecting. The device is not on the same subnet as the storage node or master server, but it’s within the same building and as mentioned other devices are deploying as expected.
Just in case, I switched over to the same subnet and switch as one of our storage nodes (a different subnet than the main server) and saw the same results.
-
@sebastian-roth Thanks for looking into that Sebastian. I can confirm that we’re connected at gigabit, and it is already connected to an unmanaged switch at my desk.
-
@jacob-gallant So just to clarify, it’s only the “HP ProBook 640 G8” computers that are behaving this way? Installing another computer on the same network jack returns better performance? If so, then we can rule out infrastructure as the root of the problem.
Is the firmware up to date on that 640?
-
@george1421 That’s correct, just the ProBook 640 G8. The firmware is up to date.
-
@george1421 @Sebastian-Roth Currently building a completely separate new FOG environment to see if it’s any version/upgrade weirdness from our current environment that may be causing issues, I’ll let you know what I find out.
-
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
just the ProBook 640 G8
The debugging truth table points to the ProBook as the root of the fault.
So what is unique about this workstation from previous models? NVMe vs SATA? Specific NVMe disk?
If you want to debug this hardware a bit more we can do that. The issue will be with either the network stack or the disk subsystem.
Here is a link to iperf3: https://drive.google.com/file/d/1fLYGI-roYGongTVRS_4zQ7dWJ7osN4AW/view?usp=sharing
Place it in the /images directory.
The concept of testing is coming from this post: https://forums.fogproject.org/post/98231
The setup for testing is pretty easy: just set up a debug deployment to this computer (tick the debug checkbox before submitting the deployment task).
Now PXE boot the target computer. After a few screens of text you will be dropped to the FOS Linux command prompt.
At the FOS Linux command prompt key in
fog
You will need to press Enter after each breakpoint in the imaging code. After you see the first partclone copy complete, press Ctrl-C to break out of the deployment script.
The iperf command will be in the /images directory on the PXE booting computer. Copy it over to the /tmp directory on the target computer:
cp /images/iperf3 /tmp
Then run the iperf command as outlined in this post. https://forums.fogproject.org/post/98230
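The linked post is not reproduced here, so the following is only a minimal sketch of a typical iperf3 receiver/client run, not necessarily the exact commands from that post. The server address 192.0.2.10 is a placeholder, and the 10-second duration is an arbitrary choice.

```shell
#!/bin/sh
# Placeholder address of the FOG server; replace with the real one.
FOG_SERVER_IP="${FOG_SERVER_IP:-192.0.2.10}"
# Location the binary was copied to on the target computer.
IPERF_BIN="${IPERF_BIN:-/tmp/iperf3}"

# Receiver side, run on the FOG server:  iperf3 -s
# Client side, run on the target computer at the FOS Linux prompt:
client_cmd="$IPERF_BIN -c $FOG_SERVER_IP -t 10"

echo "on the FOG server run: iperf3 -s"
echo "on the target run:     $client_cmd"

# Only run the client when the binary is actually present.
if [ -x "$IPERF_BIN" ]; then
    $client_cmd
fi
```

In the client output, the Retr column on the sender lines shows TCP retransmissions for each interval.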
You will need to set up the iperf3 receiver on the FOG server and then the client on the target computer. This test will show how fast the network is between the target computer and the FOG server.
Once the network bits have been tested, testing the hard drive is next.
This will be a bit more complicated to test, so first see if the hdparm command comes back with something in FOS Linux.
-
@george1421 Sure, give me some time and I’ll run those tests and see what I find. Thanks again!
-
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
@george1421 @Sebastian-Roth Currently building a completely separate new FOG environment to see if it’s any version/upgrade weirdness from our current environment that may be causing issues, I’ll let you know what I find out.
Same results @george1421 @Sebastian-Roth
-
@george1421 Here are the iperf3 results: https://photos.app.goo.gl/xXFPLZFHAJT7dPEo9
-
@jacob-gallant Well, what I find troubling is the Retr (retransmits) column. On a stable network that should be zero. It makes me think this is networking (it could be loaded network infrastructure, or it could be the NIC in the computer). These retransmits would cause the transfer to start with pretty good performance but then back off right away until it settles at a slower transfer rate. It would be interesting to see if you get the same retransmissions on different hardware.
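As a cross-check on the iperf3 Retr column, the kernel's own TCP retransmission counter can be read before and after a test on the sending machine. This is a sketch under the assumption that /proc/net/snmp is available there; the function name retrans_segs is made up for illustration.

```shell
#!/bin/sh
# Print the kernel's cumulative TCP RetransSegs counter.
# The optional argument exists only so the parser can be tried on a saved copy.
retrans_segs() {
    awk '/^Tcp:/ {
             if (!col) { for (i = 1; i <= NF; i++) if ($i == "RetransSegs") col = i }
             else      { print $col }
         }' "${1:-/proc/net/snmp}"
}

# Take one reading before and one after the transfer and compare.
if [ -r /proc/net/snmp ]; then
    before=$(retrans_segs)
    # ... run the iperf3 client here ...
    after=$(retrans_segs)
    echo "retransmitted segments during test: $((after - before))"
fi
```

The counter covers all TCP connections on the machine, so the difference is only meaningful when little else is using the network during the test.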
If you are still at the FOS Linux command prompt, see if the hdparm command is installed. If so, then run
lsblk
to find the drive. It will be /dev/sda or /dev/nvme(something).
Once you’ve found the disk, run this:
hdparm -Tt /dev/sda
assuming the disk is /dev/sda, the first SATA disk.
There is another test where we need to use fdisk to remove the current partition, make a new partition the size of the disk, and then format the partition as ext4. Then we can mount it and run the dd command from the article to test write speed to the disk, but I have to run off to a meeting so I won’t have a chance to write down the testing procedure right now.
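A small sketch of the read test above that also covers the NVMe case. The device names /dev/sda and /dev/nvme0n1 are the usual defaults for a first disk and are assumptions; confirm the real name with lsblk. The helper first_disk is hypothetical.

```shell
#!/bin/sh
# Map a disk type to the usual first-device name (assumed defaults).
first_disk() {
    case "$1" in
        nvme) echo /dev/nvme0n1 ;;
        sata) echo /dev/sda ;;
        *)    return 1 ;;
    esac
}

DISK="$(first_disk "${DISK_TYPE:-nvme}")"

# Run the read-only timing test only if the tool and the device both exist.
if command -v hdparm >/dev/null 2>&1 && [ -b "$DISK" ]; then
    lsblk
    hdparm -Tt "$DISK"
else
    echo "would run: hdparm -Tt $DISK"
fi
```

hdparm -T times cached reads and -t times buffered reads from the device, so this only measures read speed and does not modify the disk.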
-
@george1421 I ran iperf on a working device and here are the results, 0 retransmits as you mentioned. You’re also correct that that is exactly what I’m seeing on the 640 G8, starts off with reasonable performance but quickly drops down to a crawl:
https://photos.app.goo.gl/oVrtqpnhmYHh39LK9
Here are the results of the hdparm command.
-
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
I ran iperf on a working device and here are the results, 0 retransmits as you mentioned.
In the same network jack as the 640 G8?
The network adapter in the 640 G8 is built in or USB based?
-
@george1421 said in HP ProBook 640 G8 imaging extremely slowly:
@jacob-gallant said in HP ProBook 640 G8 imaging extremely slowly:
I ran iperf on a working device and here are the results, 0 retransmits as you mentioned.
In the same network jack as the 640 G8?
The network adapter in the 640 G8 is built in or USB based?
The very same, yes. And it’s built-in.
-
@jacob-gallant If your other computer that works has Windows loaded on it, can you get the hardware ID of its network interface? We know the 640 G8 is 8086:15fc (Linux format). So the question is: is the working one the same?
I have a one off kernel 5.10.x that we might want to try. But so far I’m leaning towards the nic itself or the kernel nic driver in 5.6.18.
-
@george1421 The working one is different, 8086:15e3
-
@jacob-gallant OK, the 15e3 NIC is an older NIC that was first introduced in the 4.6 Linux kernel. The 15fc was first introduced in the 5.5 Linux kernel, and we are currently trying 5.6.18, right? (From the FOS Linux debug console you can key in
uname -r
to give you the kernel version.)
Here is an experimental FOS Linux kernel 5.10.2. Download this file and rename it as
bzImage
(case is important):
https://drive.google.com/file/d/1-4HyQD8ttz_GCE_vKrvuydFVqcPUMqzU/view?usp=sharing
Rename the existing bzImage file in the
/var/www/html/fog/service/ipxe
directory and drop this file in there. Let’s see if this kernel gives us a better deployment. I know there was again a major rewrite in the 5.9.x series of the Linux kernel, akin to what happened with 5.5.
-
@george1421 Same results with 5.10.12 I’m afraid. We were using 5.6.18 for all of the previous tests, that’s right.
-
@jacob-gallant Well nuts. I was hoping the updated kernel would function better. Yes, we need 5.6.18 to have support for that network interface; if you were using 4.19.x the network interface wouldn’t work at all.
-
@Jacob-Gallant @george1421 So far it all looks like a driver issue in the Linux kernel. Though I am really surprised that we don’t find other users’ reports about this NIC.
Maybe this is some kind of jumbo frame issue?
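One way to probe for an MTU or jumbo frame mismatch is to send a ping that is not allowed to fragment and just fits the interface MTU. The sketch below only computes the payload size and prints the probe command; eth0 is a placeholder interface name, and the -M do option is assumed to be available (it is an iputils ping option and may be missing from a minimal ping).

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload() { echo $(( $1 - 28 )); }

IFACE="${IFACE:-eth0}"   # placeholder interface name
if [ -r "/sys/class/net/$IFACE/mtu" ]; then
    mtu=$(cat "/sys/class/net/$IFACE/mtu")
else
    mtu=1500             # standard Ethernet MTU as a fallback
fi

echo "MTU $mtu -> probe with: ping -M do -s $(icmp_payload "$mtu") <fog-server-ip>"
```

If the probe at the full MTU fails while a smaller payload succeeds, something on the path is using a smaller MTU than the interface.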
@Jacob-Gallant Would you be willing to capture a short part of the network traffic on your FOG server and upload the PCAP so we can take a look? Schedule a debug deploy task. Boot the host up, run
ip a s
and note down the IP address before you start the job via the fog command. Now run
tcpdump -w /tmp/dump.pcap host x.x.x.x
as root on your FOG server, using the IP address you noted down. Leave tcpdump running and step through the deploy task on the machine. Shortly after the first blue partclone screen starts, stop tcpdump on your FOG server (Ctrl+C) so the PCAP file does not grow too much! I am fairly sure we will see the retransmits at that point already and might find out why.
Just copy the file /tmp/dump.pcap from your server and upload it to a share we can access.
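The capture step could be sketched as follows. The host address 192.0.2.50 is a placeholder for the IP address noted on the target computer, and the quick look at the first 20 packets is an optional addition that is not part of the instructions above.

```shell
#!/bin/sh
HOST_IP="${HOST_IP:-192.0.2.50}"   # placeholder for the target computer's IP
PCAP="${PCAP:-/tmp/dump.pcap}"

# Capture command from the post, to be run as root on the FOG server
# and stopped with Ctrl+C shortly after partclone starts.
capture_cmd="tcpdump -w $PCAP host $HOST_IP"
echo "run as root on the FOG server: $capture_cmd"

# Optional sanity check once a capture exists: print the first 20 packets
# without resolving names, to confirm the file contains the expected traffic.
if command -v tcpdump >/dev/null 2>&1 && [ -r "$PCAP" ]; then
    tcpdump -nn -r "$PCAP" -c 20
fi
```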