Lenovo ThinkCentre M70a
-
@dmaret Possibly our kernel is missing just one particular firmware blob for the NIC you have. Not sure yet, but on a quick look at our kernel config I find “rtl8168” 12 times, while I find it 13 times in the kernel firmware repo. We’ll compile a fresh 5.10.19 kernel soon to see if that helps.
Edit: We are missing
rtl8168fp-3.fw
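The comparison described above can be sketched in Python. This is only a minimal sketch, assuming the firmware blobs are built into the kernel through a `CONFIG_EXTRA_FIRMWARE` line in the kernel config. The helper names and the sample data are hypothetical; only `rtl8168fp-3.fw` is taken from this thread.

```python
import re


def firmware_in_config(config_text, needle="rtl8168"):
    """Return the firmware file names containing `needle` that are listed
    in the CONFIG_EXTRA_FIRMWARE line of a kernel config (assumed layout)."""
    match = re.search(r'^CONFIG_EXTRA_FIRMWARE="([^"]*)"', config_text, re.MULTILINE)
    if not match:
        return set()
    return {
        path.rsplit("/", 1)[-1]
        for path in match.group(1).split()
        if needle in path
    }


def missing_firmware(config_text, repo_files, needle="rtl8168"):
    """Return firmware names present in the repo listing but not in the config."""
    in_config = firmware_in_config(config_text, needle)
    in_repo = {name.rsplit("/", 1)[-1] for name in repo_files if needle in name}
    return sorted(in_repo - in_config)


# Illustrative sample data, not the real FOG kernel config or repo listing.
SAMPLE_CONFIG = 'CONFIG_EXTRA_FIRMWARE="rtl_nic/rtl8168g-2.fw rtl_nic/rtl8168h-2.fw"\n'
SAMPLE_REPO = [
    "rtl_nic/rtl8168g-2.fw",
    "rtl_nic/rtl8168h-2.fw",
    "rtl_nic/rtl8168fp-3.fw",
]

print(missing_firmware(SAMPLE_CONFIG, SAMPLE_REPO))
```

Run against the real config and a real listing of the firmware repo, this would name the missing blob directly rather than only showing that the counts differ.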
-
@dmaret I’m also finding posts about this NIC just turning off/powering down during file transfer; the NIC fails to connect to its PHY device.
I have found some kernel switches that might work, though others say they do not:
pcie_aspm=off
iommu=soft
You could run your deployment in debug mode. When partclone throws the error, instead of rebooting you could check whether the network is gone and search the /var/log/messages log file for any clue. That said, this error appears to be common for this Realtek NIC.
Lastly have you updated the firmware on this computer? Is it the latest?
-
@george1421 Hello George,
I am not sure where to find and update these switches. Could you please elaborate?
Also I did update the BIOS to latest available version on this computer:
M2SKT1FA 10 Mar 2021
Is there anything else I should try to update?
I will try to run again in debug mode as you suggest.
Thank you again.
David.
-
@dmaret In the FOG configuration, near where you found the bzImage field, there is one called kernel args. You would place
pcie_aspm=off
in that field. Understand this is the global settings space, so that kernel arg will apply to every deployment. But for now we are just testing. When we are done testing you will need to reset these values.
The more research I do, the more it’s leaning towards a Linux driver conflict with the hardware. One recommendation I found was to switch back to a 4.15.x kernel. You can get older FOG kernels from here: https://fogproject.org/kernels/ I would download the 64-bit 4.15.3 kernel, rename it bzImage4153, and then move it to the
/var/www/html/fog/service/ipxe
directory, then update the global kernel parameter to bzImage4153 and test your deployment. The Linux kernel developers changed how the driver for the Realtek NIC works after the 4.15.x series of Linux kernels, as well as after the 4.9.x series. But that is going back to a pretty old version of the Linux kernel. Just be aware that newer(ish) hardware will not run on these old kernels. Right now we are trying to solve the problem with this Lenovo system. Afterwards we will need to reset everything to put FOG back to normal.
-
@george1421 Hi George,
I have added pcie_aspm=off to the kernel args field but it did not change anything.
However I could not find 4.15.3 but I found 4.15.2 and tried with this one, and it worked! It was going at a slower rate but it went all the way and the imaging completed successfully!
David.
-
@dmaret How old is this lenovo?
Now that you have a working, but slower, kernel, you can hard code this special 4.15.2 kernel right into the host definition for this Lenovo. Then every time that computer needs imaging it will use the 4.15.2 kernel. I don’t know if the FOG Project devs will be able to fix this, since it’s a Linux kernel change / hardware conflict causing the issue.
You could try this kernel parameter instead, though I don’t have high hopes that it will work:
acpi=off
The error seems to relate to a component of the network interface going to sleep or losing communication inside the NIC. I have found some references to updating the NIC firmware, but if this NIC is built into the mobo of the computer, the BIOS / firmware update should take care of that for you.
-
@dmaret @george1421 Interesting findings! Just wondering if you think the mentioned firmware blob could make a difference or not? Shall I add it?
-
@george1421 Hi,
This Lenovo is brand new.
M70a Desktop (ThinkCentre) - Type 11CK
Machine Type Model: 11CKS03900
For now, I will try testing my way up the different kernel versions and see what is the most recent kernel version that works with this model and hard code that one into the host definition as you suggest.
Thank you again so much for your help.
David.
-
@sebastian-roth We can surely try it. But I would think the nic wouldn’t work if it needed the firmware. BUT that may be the nic firmware that is missing/needs to be patched. I guess what I’m saying is we should try it to see if it fixes the problem with the current kernel. We know now rolling back to 4.15 also fixes it, but that is not a good long term strategy.
-
@dmaret Please give this kernel a try: https://fogproject.org/kernels/bzImage-5.10.19-rtl8168fp-3 (not saying it will help, but it will be interesting to see if adding the firmware blob makes a difference)
-
@sebastian-roth Hi,
I have tried but it does not resolve the issue.
So since I successfully imaged with 4.15.2, I have also tested the following successfully: 4.16.6, 4.17.0, 4.18.3.
And I will continue to try to identify the most recent version which works with this Lenovo model.
Thanks again.
David.
-
@dmaret said in Lenovo ThinkCentre M70a:
So since I successfully imaged with 4.15.2, I have also tested the following successfully: 4.16.6, 4.17.0, 4.18.3.
Wow, didn’t expect that! Good to know. Are they all going full speed or kind of slow as you said with 4.15.2?
-
@sebastian-roth I think 4.16.6 was also slow, but the following were back to usual rate. 4.19.1 and 4.19.6 do not seem to work.
David.
-
@dmaret Do I get this right: 4.18.3 (and maybe 4.17.0) is the best candidate we have so far, with no error and normal speed?!
Did you test 4.18.11 yet? The closer we get, the better the chance that we find which change in the kernel is causing this, and we might be able to provide a patched, up-to-date kernel.
-
@sebastian-roth Hi Sebastian,
Let me recap my findings:
- 4.15.2: working, slow
- 4.16.6: working, slow
- 4.17.0: working, normal speed
- 4.18.3: working, normal speed
- 4.18.11: working, normal speed
- 4.19.1: not working
- 4.19.6: not working
- 4.19.36: not working
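
The recap above narrows the regression to a window between two kernel versions. As a minimal sketch (the function names are hypothetical; the results are the ones listed above), that window could be computed like this:

```python
def version_key(version):
    """Turn '4.18.11' into (4, 18, 11) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


# True = imaging completed, False = imaging failed (from the recap above).
RESULTS = {
    "4.15.2": True,
    "4.16.6": True,
    "4.17.0": True,
    "4.18.3": True,
    "4.18.11": True,
    "4.19.1": False,
    "4.19.6": False,
    "4.19.36": False,
}


def regression_window(results):
    """Return (last working version, first failing version).

    Raises ValueError if a newer version works although an older one
    fails, because then there is no single boundary to report.
    """
    last_good = None
    first_bad = None
    for version in sorted(results, key=version_key):
        if results[version]:
            if first_bad is not None:
                raise ValueError(f"{version} works although {first_bad} fails")
            last_good = version
        elif first_bad is None:
            first_bad = version
    return last_good, first_bad


print(regression_window(RESULTS))
```

Sorting on the numeric tuple matters here: a plain string sort would place 4.18.11 before 4.18.3 and 4.19.36 before 4.19.6.
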
NB: I am using realtek.pxe
Thanks.
David.
-
@dmaret said in Lenovo ThinkCentre M70a:
NB: I am using realtek.pxe
So this computer is in BIOS mode? Does undionly.kpxe work for the boot loader or will only the realtek.pxe get this system to the iPXE menu?
-
@george1421 Hi George,
This computer does not offer Legacy BIOS, only UEFI. It does not work with undionly.kpxe
There was a typo in my previous message, I actually use realtek.efi not realtek.pxe
Thank you.
David.
-
@dmaret said in Lenovo ThinkCentre M70a:
I actually use realtek.efi not realtek.pxe
So then let me ask the question with this new info. Does ipxe.efi work as well as the hardware specific realtek.efi?
-
@george1421 Hi George,
Yes it works with ipxe.efi but with a message which I do not have with realtek.efi, see screenshot:
https://drive.google.com/file/d/1eupCfc1Z5FROhi45pLlSWJsOk-AdtHt_/view?usp=sharing
The 2nd line on the screenshot starting with r8169 does not appear with realtek.efi
That is why I thought it was better to use realtek.efi
However it does not change anything when it comes to the kernel versions: same results as with realtek.efi, i.e. working up to 4.18.11, and not working after that.
NB: the first line with the error message starting with “db_root: cannot open: /etc/target” appears with both realtek.efi and ipxe.efi
Thanks
David.
-
@dmaret said in Lenovo ThinkCentre M70a:
The 2nd line on the screenshot starting with r8169 does not appear with realtek.efi
Interesting, because the error message in the picture comes from FOS Linux and not iPXE. As soon as FOS Linux starts, iPXE is moved out of memory. This tells us that FOS Linux is not configuring the network adapter correctly on its own: iPXE configures something in the network adapter and leaves it behind for FOS Linux to use, whereas in version 4.19 the kernel resets the network adapter cleanly but doesn’t initialize it correctly. At least that is what I think I see.