Lenovo ThinkCentre M70a
-
@george1421 Hi George,
Yes it works with ipxe.efi but with a message which I do not have with realtek.efi, see screenshot:
https://drive.google.com/file/d/1eupCfc1Z5FROhi45pLlSWJsOk-AdtHt_/view?usp=sharing
The 2nd line on the screenshot starting with r8169 does not appear with realtek.efi
That is why I thought it was better to use realtek.efi.
However, switching to ipxe.efi does not change anything when it comes to the kernel versions: same results as with realtek.efi, i.e. working up to 4.18.11, and not working after that.
NB: the first line with the error message starting with “db_root: cannot open: /etc/target” appears with both realtek.efi and ipxe.efi
Thanks
David.
-
@dmaret said in Lenovo ThinkCentre M70a:
The 2nd line on the screenshot starting with r8169 does not appear with realtek.efi
Interesting, because the error message in the picture comes from FOS Linux and not iPXE. As soon as FOS Linux starts, iPXE is moved out of memory. This tells us that FOS Linux is not configuring the network adapter correctly, whereas iPXE configures something in the network adapter and leaves it behind for FOS Linux to use. In version 4.19 the kernel seems to reset the network adapter cleanly but doesn’t init it correctly. At least that is what I think I see.
-
@dmaret Will compile a few more kernels for you to test soon.
-
@dmaret Just built and uploaded some more 4.18.x versions:
https://fogproject.org/kernels/Kernel.TomElliott.4.18.12.64
https://fogproject.org/kernels/Kernel.TomElliott.4.18.15.64
https://fogproject.org/kernels/Kernel.TomElliott.4.18.20.64
Please give those a try and let us know how well those work.
-
@sebastian-roth Hi Sebastian,
I tried all three. I did not see any difference, no better no worse, imaging works, normal speed, except for the last one (4.18.20.64) which was slow.
Thank you.
David.
-
@dmaret said in Lenovo ThinkCentre M70a:
I tried all three. I did not see any difference, no better no worse, imaging works, normal speed, except for the last one (4.18.20.64) which was slow.
Hmm, interesting. I did not expect the slowness to come back in a later version. Mainly because a kernel usually stabilizes over the lifetime of one version branch. Would you mind re-testing with 4.18.20.64 again? Any chance the slowness comes from other network traffic?
There are more for you to test. I would expect the 4.19 kernel to show the initially reported issue but let’s see.
https://fogproject.org/kernels/Kernel.TomElliott.4.18.17.64
https://fogproject.org/kernels/Kernel.TomElliott.4.18.19.64
https://fogproject.org/kernels/Kernel.TomElliott.4.19.64
Can’t promise you that we’ll actually find the root cause of the issue. But it’s still worth a try.
-
@sebastian-roth Hi Sebastian,
I tried 4.18.20.64 again, and I got normal speed this time.
4.18.17.64 and 4.18.19.64 worked just fine as well (normal speed).
4.19.64 did not work, as you expected.
Thanks again!
David.
-
@dmaret So the issue starts with kernel version 4.19. I had a first look at the changes between 4.18.20 and 4.19, but it’s a hell of a lot of code changes, even when just looking at the Realtek driver alone.
Are you keen to go ahead and test 4.19 release candidate versions (those between 4.18.20 and 4.19)? I have to say that it’s very hard to tell whether we will find the exact code change between kernel major versions that is causing the initial error. When we first started this I was still hoping a change in the 4.18.x series might be it. That would be easier to nail down because there are fewer changes in between.
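To keep the number of test boots down, the release candidates could be tried in bisection order rather than one by one. Below is a rough sketch of picking the next version to test. The version list (mainline 4.18 as the last known-good point, followed by eight release candidates) and the helper name next_to_test are assumptions for illustration, not something FOG provides:

```shell
#!/bin/sh
# Assumed candidate list in release order (hypothetical, adjust to the
# kernels that are actually built). Position 1 is known good, the last
# position is known bad.
versions="4.18 4.19-rc1 4.19-rc2 4.19-rc3 4.19-rc4 4.19-rc5 4.19-rc6 4.19-rc7 4.19-rc8 4.19"

# next_to_test GOOD_POS BAD_POS
# Prints the version halfway between the last good and the first bad
# position (1-based positions in $versions).
next_to_test() {
    good=$1
    bad=$2
    mid=$(( (good + bad) / 2 ))
    echo "$versions" | cut -d' ' -f"$mid"
}
```

For example, next_to_test 1 10 suggests 4.19-rc4 as the first kernel to try. If that one works, the good position moves up to 5; if it fails, the bad position moves down to 5, and so on until the two positions are next to each other.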
-
@dmaret I have done some more reading on topics that show the same error messages (1, 2), and the more I read, the more I suspect that the messages from the Realtek driver are not the cause of the problem but maybe just one of the symptoms of a kernel issue (even a soft crash) just before that.
Do you have more than one of these Lenovo ThinkCentre M70a? Have you tried deploying to different ones? Do all show the same issue? Just trying to rule out a hardware issue on one of the machines.
I am not too sure how to take this further. We don’t see other error messages on screen. Does the partclone screen seem to freeze completely after the error messages come up?
Maybe you can schedule a debug deploy task, boot up the machine, set a root password (command: passwd), check the IP address (command: ip a s) and connect to it via SSH. Now you have two command shells to go ahead. Start the deployment (command: fog) on the machine directly and step through it. Meanwhile also issue a command on the SSH shell just to see it’s still up and running. Especially when the blue partclone screen starts you want to stick around in the SSH shell to see if that freezes at some point - e.g. just type random characters to see if they appear on screen or stop at some point.
If partclone stops but your SSH shell still works, then run the command dmesg | tail -20 in the SSH shell, take a picture and post that here.
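Instead of typing random characters, a small loop in the SSH shell can show whether the machine is still responding. This is only a sketch: the function name heartbeat is made up for illustration, and it assumes the FOS shell provides the usual date and sleep commands:

```shell
#!/bin/sh
# heartbeat COUNT INTERVAL
# Prints COUNT numbered timestamps, INTERVAL seconds apart. If the
# output stops advancing while partclone is running, the system
# (not just partclone) has most likely frozen.
heartbeat() {
    count=$1
    interval=$2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "$i $(date +%T)"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Paste the function into the SSH shell and run something like heartbeat 600 1 just before starting the deployment on the console. If the timestamps keep coming but partclone has stopped, that is the moment to grab the dmesg | tail -20 output.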