Strange registration messages
-
@george1421: Ok, thanks!
@Sebastian-Roth: When “Starting eth0” is shown, the NIC LED is off and it never comes on!
Now, the result of your commands. After
udhcpc -i eth0
- NIC LED is on
- I have an IP
- ping works
-
Those RTL8168/8169/8111 NICs are driving me nuts. I have seen similar behavior with one of these NICs, but that was with FOG 1.2.0, where the network startup scripts were very different. The current FOG trunk network init scripts should actually bring the interface up and wait until it’s up. I don’t understand what’s going on with those NICs. They tend to stay down…
-
@Sebastian-Roth: I have found that once the prompt appears, the network card is active. I have no IP yet, of course, but that is already a good sign. Could the startup script be the problem? An IP request sent too early, or something like that?
I will try booting Lubuntu 14.04.4 to see whether the NIC is recognized or not.
-
Test with the latest Lubuntu LTS: the NIC is recognized and works at boot.
- kernel version is 3.13.0-24
- the command
lsmod | grep real
returns only the snd realtek module; no additional module is used for this NIC
-
@Sebastian-Roth Does the FOS kernel contain the Realtek driver that is part of the official Linux kernel, or does it contain the latest driver from Realtek? I’ve seen several recommendations to use the Realtek drivers over the Linux kernel drivers.
@aruhuno, just so I understand this: you booted into debug mode, the FOS kernel did not pick up an IP address, and the link light stayed off. You executed the udhcpc command (only), and the interface came up and picked up an IP address. If this were a spanning tree issue, I think we would see the issue sooner in the boot process, since each time the transition between kernels happens (during PXE booting) the network adapter should be reset and the link dropped.
If you watch the link light during the entire target device booting, the link light should momentarily shut off during PXE rom -> iPXE kernel startup and then when the FOS kernel starts.
-
@george1421 I highly doubt the issue he’s seeing here has anything to do with the driver at all.
My guess as to what’s being seen here is the 802.3az (Energy-Efficient Ethernet) issues I have seen. This is also my suspicion about what @Wayne-Workman was seeing with his network and the random ports.
The fix, for us, is to simply take another switch and connect it.
802.3az is nice when the OS is booted and running and the system sleeps. When it sleeps it disconnects the NIC. Not all NICs have this type of setup, which is also likely why @aruhuno is not seeing this on other models of the system.
The issue, as I’ve seen it occur, presents itself very slowly. You go to boot a system, and sometimes TFTP is fast (usually after a cold boot) and gets you into iPXE and into the FOS system just fine (albeit with slow or failed IP address acquisition). Other times the TFTP PXE process even seems to hang, but after 30 - 50 seconds it might get a DHCP lease anyway (people usually don’t like such minor delays, though, and FOS definitely doesn’t like them).
These are all guesses on my part, though. Because the issue is so few and far between, it’s not easy to track down.
@aruhuno, can you start the tasking in “Debug” mode (Create the tasking like you normally would, but before confirming check the box for “Schedule as debug task”.)
When the client boots it should drop you into a linux shell prompt. From there, you can try to see if it will pick up an IP address after a small period of time. Just run:
/etc/init.d/S40network stop && sleep 2
/etc/init.d/S40network start
and it should try to get you an IP address again.
-
@george1421
After a new test, when I run in debug mode:
- NIC is on every time
- ifconfig: no IP
- ping: no response
If I run
udhcpc -i eth0
or
/etc/init.d/S40network stop && sleep 2
/etc/init.d/S40network start
everything works.
@Tom-Elliott
Ok, when I run your command, I get an IP.
-
@aruhuno This leads me to believe it is indeed due to power saving features on the NIC.
-
@Tom-Elliott
What can I do right away?
-
@aruhuno Put in another switch (one that doesn’t use 802.3az would be helpful too) or edit the firmware from within Windows to always enable the LAN.
Unfortunately there’s no straightforward method to doing this.
-
@Tom-Elliott
I don’t know whether my switch supports 802.3az, but I don’t think so.
I changed the option in the firmware from Windows, but it had no impact.
-
@aruhuno Do you have an old unmanaged switch you can insert between this computer and your building network switch? If Tom is right, inserting a really dumb unmanaged switch should disable the 802.3az port protocol.
@Wayne-Workman you may want to follow this thread.
-
@aruhuno Please have a look at what the network script actually does. On line 25 the network interface is brought up, then link state is detected (in a loop ten times!) and when the link is ready we start udhcpc to get an IP. What else could we possibly do??
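For illustration, the wait-for-link step described above can be sketched roughly as follows. This is not the actual S40network code; the function name wait_for_carrier and the SYS_NET and SLEEP_SECS variables are made up for this example, and the longer retry count in the usage comment is only an assumption about what might help a NIC that is slow to report link.

```shell
#!/bin/sh
# Hypothetical helper, NOT the real FOG network init script.
# SYS_NET and SLEEP_SECS are overridable only so the function can be
# tried out without a real NIC.
SYS_NET="${SYS_NET:-/sys/class/net}"
SLEEP_SECS="${SLEEP_SECS:-1}"

# wait_for_carrier IFACE [TRIES]
# Returns 0 as soon as the carrier file reads 1, or 1 if it never does.
wait_for_carrier() {
    iface="$1"
    tries="${2:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Reading carrier fails while the interface is administratively
        # down, so errors are silenced and treated as "no link yet".
        state="$(cat "$SYS_NET/$iface/carrier" 2>/dev/null)"
        if [ "$state" = "1" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$SLEEP_SECS"
    done
    return 1
}

# Intended use in a debug shell (left commented out so that sourcing
# this file has no side effects):
#   ip link set eth0 up
#   wait_for_carrier eth0 30 && udhcpc -i eth0
```

Raising the number of tries (30 here instead of 10) is just one way to test whether the NIC simply needs more time before it reports a link.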
One thing you could also try is adding
has_usb_nic=1
in the field “Host Kernel Arguments” in that host’s configuration on the web interface. This way you will be prompted to un/re-plug the USB NIC. Ignore that. But you will have some sleep time on bootup just before the network configuration.
-
@Sebastian-Roth
Ok, but look at my cold boot:
- Power on: no LED (link down)
- PXE found: LED (link up)
- PXE boot: LED (link up)
- “Starting eth0” message: no LED (link down)
- “Computer will reboot in 1 minute” message: no LED (link down)
- A few seconds later: LED (link up)
But by the time the link is up, the script is already waiting to reboot the computer… is the boot too quick, or something else?
In the script, the comment on line 18 says “wait 10 seconds”, but where is the wait/sleep?
-
@aruhuno The last time I had this issue I switched ethernet cables and it worked fine. Your mileage may vary of course.
-
@aruhuno If you have time, I would like to see if you can create a FOS-L (FOG OS Live boot flash drive). https://forums.fogproject.org/topic/6532/usb-boot-target-device-into-fog-os-live-fosl-for-debugging
Actually I want you to follow the Option #4 process:
https://forums.fogproject.org/topic/6532/usb-boot-target-device-into-fog-os-live-fosl-for-debugging/19
While I don’t think this will give us different results than PXE booting, I would like you to create this boot flash drive to see whether anything changes if we eliminate the PXE and iPXE parts from the boot process. Booting with this flash drive will send you directly to the debug console. I’m interested in when the link light comes on. Does it repeat the pattern of PXE booting, or do you get an IP address right away?
I’ve done quite a bit of research on this in the last few days, and it seems these Realtek NICs are generally a problem for Linux. I have found references saying that the 8169 driver is trouble for this particular NIC, where the recommendation is to blacklist this Linux driver and use the 8168 driver instead. If this is the case, then it is a kernel builder issue.
-
@aruhuno I ran into a similar problem last night. For some reason, the FOG_WEB_HOST setting under: FOG Configuration Page->FOG Settings->Web Server was blank when it should have been /fog. Please check this setting and make sure it’s set to /fog. It should work then (hopefully).
-
@george1421
Hmm, just a question about your script: on line 18 it says “if not wait up to 10 seconds”, but where is the sleep command? Is it possible to edit this file directly in my FOG installation?
@Tom-Elliott:
It’s not blank for me.
-
@aruhuno said:
Hum, just a question in your script: line 18, it’s written if not wait up to 10 seconds but where is the sleep command ? It’s possible to edit this file directly in my FOG installation?
Unfortunately not. The script is packed into the init.xz file and cannot be modified as is. But you can extract the init.xz and modify it! About your question on the wait: take a look at line 33 (sleep for 1 second) and line 26 (loop ten times or until the NIC reports that it is up). Possibly your NIC reports UP status although it is actually down. Please PXE boot in debug mode and check
cat /sys/class/net/eth0/carrier
(1 means connected, 0 means disconnected). This check works for most NICs, I suppose, as users don’t report having issues. But possibly your NIC is behaving differently.
Please let us know about your findings so we can use them to improve the network startup script…
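To capture when the link actually comes up, the carrier check above could be sampled repeatedly from the debug shell. The following is only a sketch: the function names carrier_state and log_carrier and the SYS_NET and SLEEP_SECS variables are invented for this example and are not part of FOG.

```shell
#!/bin/sh
# Hypothetical debugging helpers, not part of the FOG init scripts.
# SYS_NET and SLEEP_SECS are overridable so the functions can be tried
# against a scratch directory instead of the real sysfs.
SYS_NET="${SYS_NET:-/sys/class/net}"
SLEEP_SECS="${SLEEP_SECS:-1}"

# carrier_state IFACE
# Prints 1 (link), 0 (no link), or "unreadable" when the carrier file
# cannot be read (for example while the interface is still down).
carrier_state() {
    cat "$SYS_NET/$1/carrier" 2>/dev/null || echo "unreadable"
}

# log_carrier IFACE COUNT
# Prints COUNT lines of the form "<sample number> <state>", one per
# SLEEP_SECS interval.
log_carrier() {
    n=0
    while [ "$n" -lt "$2" ]; do
        echo "$n $(carrier_state "$1")"
        n=$((n + 1))
        sleep "$SLEEP_SECS"
    done
}

# Example (commented out so that sourcing this file has no side effects):
#   log_carrier eth0 30
```

Comparing the sample number at which the state first flips to 1 against the moment the NIC LED turns on would show whether the NIC reports link before it is really usable.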