Strange registration messages
-
Hello,
When I try to register a host, I get the following message before the FOG screen appears:
On the Lenovo ThinkCentre E73, I then get the following message (this works with FOG 1.2.0):
The other computer models I own work perfectly: Optiplex 390, 790, 3010 and 7010.
Thanks for your help
PS: FOG git version 6519
-
While I don’t have an answer for you I just want to recap what I understand so far. Let me restate this in my words to ensure your issue is clear.
I have a Lenovo E73 that will not register under the trunk build 6519. The booting process works fine until FOS starts. FOS is unable to pick up a dhcp address and eventually fails to register the host. I can register my o390, o790, 3010 and 7010 without issue. I was able to register this system without issue before under fog 1.2.0 stable.
-
I just looked at the specs for the E73: the network adapter is a Realtek RTL8111GN. The @Developers may have to take a look to ensure this Realtek driver is part of the FOS kernel.
-
Does this computer’s NIC have activity lights? Can you monitor them to see if they are active or not when this last error appears?
I’m just working off of a hunch here.
-
On the Optiplex 390, 790, 3010 and 7010, no issue.
On the ThinkCentre E73:
- in the FOG boot menu, I choose “Quick Registration and Inventory” (the same happens with every inventory option)
- the messages in the first screenshot appear
- after a few seconds, the message in the second screenshot appears
I had no problem with the ThinkCentre on FOG 1.2.0, but I need to deploy Windows 10. So I reinstalled a fresh version of FOG from git. Currently, I’m on FOG build 6519 and Debian Jessie.
-
If it were me, I’d start by seeing when the issue occurred/occurs.
What kernel version are you using to boot the clients? (FOG Configuration Page-> Storage Node name should show bzImage and bzImage32 kernel versions.)
Can you try downgrading the kernel version? Try 3.19 series and maybe 4.0 thru 4.3?
-
Kernel version
bzImage Version: 4.4.3 bzImage32 Version: 4.4.3
When I try to downgrade, I get the following message:
Error: Failed to install new kernel
On FOG 1.2.0 I had version 3.19.3.
-
I guess I’m not understanding how this is a BUG?
Also,
Have you ensured the FOG FTP credentials are correct? The credentials you need for kernel updating are located in:
FOG Configuration Page->FOG Settings->TFTP Server->FOG_TFTP_FTP_USERNAME and FOG_TFTP_FTP_PASSWORD
-
@Tom-Elliott This is not a credential issue but a directory one…
Bad value during install?
/var/www/fogservice/ipxe/
Instead of:
/var/www/fog/service/ipxe/
The kernel is downgraded to 3.19.3 and everything works!
I just want to clarify that whatever the machine used (Optiplex or ThinkCentre), the first screenshot remains true.
-
I spoke too soon: it does not work on the ThinkCentre, and I get the same message…
-
Did you fix the missing slash ‘/’ in the file path? You are not the first person to report this slash missing.
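For illustration only, this is the kind of join that avoids ending up with fogservice instead of fog/service. It is a minimal sketch, not the FOG installer's code; the function name join_path is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): join two path parts with exactly one "/".
join_path() {
    base="${1%/}"   # drop one trailing slash from the first part, if present
    rest="${2#/}"   # drop one leading slash from the second part, if present
    printf '%s/%s\n' "$base" "$rest"
}
```

With this, join_path /var/www/fog service/ipxe/ and join_path /var/www/fog/ /service/ipxe/ both give the same path, whether or not the configured web root ends in a slash.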
-
@george1421 Yes, of course.
-
Ok then, do you know how to do a debug capture/deploy debug?
Manually register the device with FOG.
Then select either capture or deploy (which one is not that important now).
Before you create the task make sure you select “Schedule this task as debug task”. Then submit the task.
PXE boot that E73 again and the FOS kernel should start. You will see a bunch of commands displayed on the screen; at the end press Enter and you should be dropped into a Linux command shell. From there we need to check whether the network adapter is working. You can do this by issuing the command
ip addr show
It should list the network adapters the FOS kernel has discovered.
-
@aruhuno As well please run
lspci -nn | grep Ethernet
while you are in debug mode. Please take a picture of the commands and their output on screen and upload it to the forum. That would be very helpful.
We actually see the NIC being recognized (message “Starting eth0 interface” in the first picture). So it seems like it cannot get an IP via DHCP. Maybe it’s a spanning tree issue?
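In case it helps when reading the "ip addr show" output, here is a small sketch that separates the two questions "is there a link?" and "is there an IPv4 address?". The function names has_ipv4 and link_state are made up for this example, and it assumes the usual iproute2-style output with a flags list in angle brackets; the ip applet inside FOS may print something slightly different.

```shell
#!/bin/sh
# Hypothetical helpers (not part of FOG). Both read "ip addr show <iface>" output on stdin.

# Succeeds if the output contains an IPv4 line ("inet a.b.c.d/nn ...").
# The trailing space in the pattern keeps "inet6" lines from matching.
has_ipv4() {
    grep -q '^[[:space:]]*inet '
}

# Prints "up" if the interface flags include LOWER_UP (link detected), otherwise "down".
link_state() {
    if grep -q '<[^>]*LOWER_UP[^>]*>'; then
        echo up
    else
        echo down
    fi
}
```

In the debug shell this could be used as: ip addr show eth0 | link_state, and: ip addr show eth0 | has_ipv4 && echo "eth0 has an IPv4 address". A result of "down" with no address would point at the link never coming up rather than at the DHCP server.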
-
@Sebastian-Roth & @george1421:
@Sebastian-Roth: The switches are configured correctly
-
@aruhuno This is great!! Nice details. While I think I know the answer, just to verify: was the link light on the target device on when you took this picture? From the picture it appears that eth0 is not picking up an IP address (what I suspected).
Sebastian: Isn’t there a lsmod command or something that should show if the realtek driver is loaded into memory?
-
@george1421 said:
Isn’t there a lsmod command or something that should show if the realtek driver is loaded into memory?
We have all the drivers compiled directly into the kernel. So you won’t see it with
lsmod
But I am sure the NIC is properly recognized by the driver, as we wouldn’t see eth0 otherwise!!
@aruhuno Please keep an eye on the NIC LEDs while booting up. Are they on or off while you see the “udhcpc … Sending discovery…”? Also, try debug mode again, wait for half a minute or so (check the NIC LEDs) and then try requesting an IP:
udhcpc -i eth0
As well, please show us the output of
dmesg | grep r8169
-
@Sebastian-Roth For sure you are right. If the kernel didn’t have the driver built in, there would be no eth0 adapter. The only thing I can think of is that the kernel has a driver that is close enough to the network adapter to init the NIC but still can’t talk to it. I remember something about some network boards requiring two drivers: one for the NIC and one for the interface (the name MII comes to mind for the interface).
But again, I agree the OP should watch the link lights on this device to see if the link comes up after the FOS kernel has already tried the network interface. If the link lights are on and the udhcpc command doesn’t pick up an IP address, we will have to dig deeper.
-
@george1421: Ok, thanks!
@Sebastian-Roth: When “Starting eth0” is shown, the NIC LED is off and it never comes on!
Now, the results of your commands:
After
udhcpc -i eth0
- the NIC LED is on
- I have an IP
- ping works
-
Those RTL8168/8169/8111 NICs are driving me nuts. I have seen similar behavior with one of these NICs, but that was with FOG 1.2.0, where the network startup scripts were very different. The current FOG trunk network init scripts should actually bring the interface UP and wait until it’s up. I don’t understand what’s going on with those NICs. They tend to stay down…
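For anyone who wants to test the "bring it up and wait until it's really up" idea by hand in the debug shell, here is a rough sketch. This is not the FOG init script. The function name wait_for_carrier, the SYSFS_NET override and the 30 second default are assumptions made for this example. It polls /sys/class/net/<iface>/carrier, which reads 1 once the kernel sees a link.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): wait until the kernel reports link on an interface.
# Usage: wait_for_carrier IFACE [TIMEOUT_SECONDS]
# SYSFS_NET may be set to another directory for testing; it defaults to /sys/class/net.
wait_for_carrier() {
    iface="$1"
    timeout="${2:-30}"
    sysfs="${SYSFS_NET:-/sys/class/net}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The read fails while the interface is administratively down,
        # so errors are silenced and treated as "no link yet".
        if [ "$(cat "$sysfs/$iface/carrier" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In the debug shell one would first run ip link set eth0 up, then wait_for_carrier eth0 30 && udhcpc -i eth0. If the function times out while the NIC LED stays off, that would fit the "they tend to stay down" behavior described above.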