Lenovo M73 network fails on FOG OS boot
I have several Lenovo M73 MT-M 10B7-S01900 desktops. I am using one as a FOG server and trying to manage the others as clients. The clients successfully PXE boot to the FOG menu, but when attempting to register the clients I get the following error:
starting enp2s0 interface and waiting for the link to come up
failed to get an IP via DHCP!
tried on interfaces(s): enp2s0
please check your network setup and try again
FOG Version: 1.5.9-RC2
Note: the FOG server is dual-homed. Interface enp2s0 is on the isolated imaging network (192.168.168.0/24) and the other interface is on the main LAN (10.0.0.0/24). The imaging network is using FOG DHCP.
Other machines, including a Lenovo ThinkPad T500 and a Dell OptiPlex 9020, register successfully.
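With two networks in play, it can help to confirm which side a given address belongs to, for example when checking a lease or a packet capture. Here is a minimal Python sketch using the two subnets from the note above; the labels and the `classify` helper are hypothetical, purely for illustration:

```python
import ipaddress

# Subnets taken from the note above; the labels are made up for this sketch.
NETWORKS = {
    "imaging": ipaddress.ip_network("192.168.168.0/24"),
    "main LAN": ipaddress.ip_network("10.0.0.0/24"),
}


def classify(address):
    """Return the label of the network containing address, or None."""
    ip = ipaddress.ip_address(address)
    for label, network in NETWORKS.items():
        if ip in network:
            return label
    return None


if __name__ == "__main__":
    # Example addresses only; substitute whatever shows up in your capture.
    for addr in ("192.168.168.50", "10.0.0.7", "172.16.0.1"):
        print(addr, "->", classify(addr))
```

An address that maps to neither network (the function returns `None`) would suggest the client is hearing from a DHCP server you did not expect.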
The M73s also successfully register on another FOG server on a separate network (single-homed with separate DHCP/DNS) that has been upgraded from 1.5.8 to 1.5.9-RC2.
Does anyone have any tips on where to start troubleshooting this?
Thanks, @george1421. I haven’t worked out the DHCP problem yet, though - I just need to walk away from it for a while. I’m stumped.
Sorry for the delay, @george1421 - I’ve been battling with a DHCP issue (handing out the same IP to every machine) and it’s beaten me! I’ve just run through your test on one machine and, although it is reporting 7 urandom warnings like the 5.6 kernel, it is successfully imaging now.
@3mu I have a one-off kernel that has the updated Realtek drivers in it. Let's give this one a shot. Download this kernel and save it as bzImageRT (watch the case): https://drive.google.com/file/d/1wZwwOwbEr0nR3mnPLKg7AsulwJaGhO0A/view?usp=sharing
- Move that file to the `/var/www/html/fog/service/ipxe` directory on the FOG server.
- Manually register this target host
- In the host definition there is a field Kernel. In that field enter
- Save the host definition
- Setup a capture/deploy task
- PXE boot the target computer and see if it picks up an IP address with this kernel.
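Because the file name is case-sensitive, it may be worth checking that the kernel landed in the right directory with the exact name before PXE booting. Here is a minimal Python sketch, assuming the directory and file name from the steps above; `find_kernel` is a hypothetical helper, not part of FOG:

```python
from pathlib import Path


def find_kernel(ipxe_dir, name="bzImageRT"):
    """Return (exact_match, near_misses) for a kernel file name.

    exact_match is True when a regular file with exactly this name exists.
    near_misses lists files whose names differ from it only by letter case.
    """
    directory = Path(ipxe_dir)
    if not directory.is_dir():
        return False, []
    names = [p.name for p in directory.iterdir() if p.is_file()]
    exact = name in names
    near = sorted(n for n in names if n != name and n.lower() == name.lower())
    return exact, near


if __name__ == "__main__":
    # Directory taken from the steps above; adjust if your install differs.
    found, near = find_kernel("/var/www/html/fog/service/ipxe")
    print("found" if found else "missing", near)
```

A non-empty list of near misses (say, `bzimagert`) would point to a naming slip rather than a driver problem.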
@george1421 - 5.6 fails (I think the messages above were from 5.6). I think 4.19.123 was the same. 4.18.3 works, but does have “3 urandom warning(s) missed due to ratelimiting.” 4.17 gives no urandom warnings.
Would you like me to test each and record the results?
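If it helps, the results so far could be recorded in a small script like the one below. This is only a sketch: the entries restate what I reported above (the 4.19.123 result is from memory, not re-tested), and the helper names are made up.

```python
def version_key(version):
    """Turn '4.18.3' into (4, 18, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# (kernel version, got an IP?, note) as reported in this thread.
RESULTS = [
    ("5.6", False, "fails"),
    ("4.19.123", False, "believed to fail, not re-tested"),
    ("4.18.3", True, "3 urandom warnings"),
    ("4.17", True, "no urandom warnings"),
]


def newest_working(results):
    """Return the highest kernel version that obtained an IP, or None."""
    working = [version for version, ok, _note in results if ok]
    return max(working, key=version_key) if working else None


if __name__ == "__main__":
    for version, ok, note in sorted(RESULTS, key=lambda r: version_key(r[0])):
        print(f"{version:10} {'OK' if ok else 'FAIL':5} {note}")
    print("newest working:", newest_working(RESULTS))
```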
@3mu It sounds like the issue is that the Linux kernel needs the updated Realtek drivers. If you switch to the 5.6.x Linux kernel using the FOG web UI, does it work better with these NICs?
I did create a one-off kernel with the updated NIC drivers for the 4.19.x series, but we are finding that the newest hardware drivers are no longer being backported to the 4.19.x series.
I had two machines that booted successfully once on a different switch, but then never again. I did a packet capture and saw that the client wasn’t requesting an IP address when booting FogOS. I changed KERNELLOGLEVEL to 7 to get some more clues:
Starting haveged: haveged: listening socket at 3 OK
random: crng init done
random: 7 urandom warning(s) missed due to ratelimiting
starting enp2s0 interface and waiting for the link to come up
Generic PHY r8169-200:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
r8169 0000:02:00.0 enp2s0: no native access to PCI extended config space, falling back to CSI
No link detected on enp2s0 for 35 seconds, skipping it.
Failed to get an IP via DHCP! Tried on interface(s): enp2s0
Please check your network setup and try again
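For comparing runs across kernels, the interesting bits of that output (driver, interface, link timeout, urandom warning count) can be pulled out with a few regular expressions. This is a rough Python sketch; the patterns are my own guesses based only on the messages quoted above, so they may not match other kernels or drivers:

```python
import re

# Patterns modelled on the messages quoted above; they are assumptions.
NO_LINK = re.compile(r"No link detected on (\S+) for (\d+) seconds")
PCI_NIC = re.compile(
    r"(\S+) ([0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]) (\S+):"
)
URANDOM = re.compile(r"(\d+) urandom warning\(s\) missed")


def summarize_boot_log(log_text):
    """Extract driver, interface, link timeout and urandom warning count."""
    summary = {}
    nic = PCI_NIC.search(log_text)
    if nic:
        summary["driver"] = nic.group(1)
        summary["pci_address"] = nic.group(2)
    no_link = NO_LINK.search(log_text)
    if no_link:
        summary["interface"] = no_link.group(1)
        summary["waited_seconds"] = int(no_link.group(2))
    warnings = URANDOM.search(log_text)
    if warnings:
        summary["urandom_warnings"] = int(warnings.group(1))
    return summary


if __name__ == "__main__":
    sample = (
        "random: 7 urandom warning(s) missed due to ratelimiting\n"
        "r8169 0000:02:00.0 enp2s0: no native access to PCI extended "
        "config space, falling back to CSI\n"
        "No link detected on enp2s0 for 35 seconds, skipping it.\n"
    )
    print(summarize_boot_log(sample))
```

The searches do not depend on line breaks, so pasting the console output as one long line should work as well.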
Reading up on urandom and haveged led me (eventually) to the Kernel Update function, and changing to kernel 4.18.3 now gives me a reliable boot.