Failed to get an IP via DHCP!
-
@jes6309 (A bit confused now.) In your OP you said that when you selected full registration you were getting the “Sending discover…” message. That would mean the FOS kernel is running with the virtual hard drive but is not able to get a DHCP address. But when you do a debug deploy, the system freezes after the kernel gets the FOS virtual hard drive (init.xz)?
This is a bit inconsistent.
Thinking <…>
-
@george1421 hmmmm
Idk if this will help, but when I run Kernel Update (after having fixed the credentials as per Wayne) it gets stuck on “Moving to TFTP server…”. If I close out of the operation and attempt to run Kernel Update again, I receive the “Type: 2, File: /var/www/html/fog/lib/fog/fogftp.class.php, Line: 144, Message: ftp_put(/var/www/html/fog/service/ipxe/bzImage): failed to open stream: No such file or directory.” error. On the FOG server, if I check /var/www/html/fog/service/ipxe, “bzImage” is nowhere to be found. Note, it was present before attempting the kernel update. It’s as if the kernel update operation is just deleting the bzImage file rather than actually updating it. I feel like these bizarre errors I’m getting when booting my client to FOG and/or registering it as a host have got to be related to a bad kernel, or, after “updating”, a missing kernel.
I’m just at a loss for a fix.
-
@jes6309 ok that is now understandable. I missed the part about the failed update. Let me see if I can get the url to manually download the kernels.
-
Execute the following commands on the fog server
cd /var/www/html/fog/service/ipxe
wget https://fogproject.org/inits/init.xz
wget https://fogproject.org/inits/init_32.xz
wget https://fogproject.org/kernels/bzImage
wget https://fogproject.org/kernels/bzImage32
This will download the latest (current) kernel and inits.
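Since the failed Kernel Update left the directory without a bzImage, it may be worth confirming that all four files actually landed before booting the client again. A minimal sketch, assuming the default path used above; check_fos_files is a hypothetical helper name, not something FOG ships:

```shell
#!/bin/sh
# Sketch: confirm the four FOS boot files exist and are non-empty.
# File names are taken from the wget commands above.
# check_fos_files is a hypothetical helper, not part of FOG.
check_fos_files() {
    dir="${1:-/var/www/html/fog/service/ipxe}"
    status=0
    for f in init.xz init_32.xz bzImage bzImage32; do
        # -s is true only when the file exists and has a size above zero
        if [ -s "$dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

Run it as `check_fos_files /var/www/html/fog/service/ipxe`. A non-zero exit status means at least one file is missing or empty. It does not verify that the files are valid kernels or inits, only that the downloads produced something.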
-
@george1421 Ok that got me past “init.xz… ok”
However, I’m getting the “Sending discover…” message again. This is with the debug deploy task scheduled.
-
@jes6309 Good that should eventually time out (I think) and drop you to the linux shell.
-
@george1421 Yup!
[Thu Apr 14 root@fogclient /]# is what I am seeing. What commands did you want me to run?
I went ahead and ran lshw -short, and the Ethernet controller listed is “RTL8101E/RTL8102E PCI Express Fast Ethernet Controller”.
-
@jes6309 OK, that is what I expected. I need to ping the @Developers to see if that hardware is in the current FOS kernel, since I haven’t seen that one before. We actually need to get the device ID. There is a Linux command (thinking lspci) that will display the vendor and hardware ID, or we can get the same information from a Windows system.
<edit> I just checked and it’s
lspci -k
I’m surprised that my FOG server running CentOS 7 doesn’t have lspci
ref: https://wiki.fogproject.org/wiki/index.php/Troubleshooting_Driver_Issues
</edit>
-
@george1421 Ok here is the info I believe you are looking for:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller (rev 07)
Subsystem: Hewlett-Packard Company Device 8137
Kernel driver in use: r8169
-
@jes6309 There are actually numbers like 8086:1502 that we’ll be looking for. The first group is the vendor ID and the second is the hardware ID.
lspci -n
Then you will need to find the network adapter from the description.
-
@george1421 Ok, I think it’s 10ec:8136
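For reference, the lookup described above can be narrowed so the numeric ID does not have to be matched by eye. This is a sketch that assumes the usual `lspci -n` line layout of slot, class, then vendor:device, and relies on PCI class 0200 being “Ethernet controller”. ethernet_pci_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print vendor:device IDs of Ethernet controllers from `lspci -n`
# output read on stdin.
# Assumed line layout: "<slot> <class>: <vendor>:<device> (rev NN)".
# PCI class 0200 is "Ethernet controller".
# ethernet_pci_ids is a hypothetical helper name.
ethernet_pci_ids() {
    awk '$2 == "0200:" { print $3 }'
}
```

Usage: `lspci -n | ethernet_pci_ids`. On the client in this thread that should give the same 10ec:8136 reported above; wireless adapters use a different class code and are not matched.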
-
@jes6309 As you can see here, this NIC model has been supported since kernel version 2.6.x. So it should not be a driver issue! Also, “Sending discover…” actually means that we found a network interface and are trying to get an IP for it via DHCP. So it could be:
- Layer 1 issue like cable (you already checked that)
- Spanning tree issue (make sure you have RSTP or configured “port fast”)
- Auto-negotiation issue (try configuring static speed instead of auto-negotiation for that port)
- Ethernet energy saving (see if your switch has EEE/802.3az feature and disable if possible)
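To look at the auto-negotiation bullet from the client side, the negotiated link state can be summarized from `ethtool` output. This is only a sketch: it assumes `ethtool` is available on the machine where you run it (not confirmed for the FOS debug shell), that the interface is called eth0, and link_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: condense `ethtool <iface>` output (read on stdin) to one line.
# Relies on the standard ethtool labels "Speed", "Duplex",
# "Auto-negotiation" and "Link detected".
# link_summary is a hypothetical helper name.
link_summary() {
    awk -F': *' '
        $1 ~ /Speed$/            { speed = $2 }
        $1 ~ /Duplex$/           { duplex = $2 }
        $1 ~ /Auto-negotiation$/ { autoneg = $2 }
        $1 ~ /Link detected$/    { link = $2 }
        END {
            printf "link=%s speed=%s duplex=%s autoneg=%s\n",
                   link, speed, duplex, autoneg
        }
    '
}
```

Usage: `ethtool eth0 | link_summary`. A line such as `link=no` or an unexpected speed or duplex would point toward the negotiation or cabling bullets; it says nothing about spanning tree, which is decided on the switch.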
-
@Sebastian-Roth I will start looking into these things, but if a switch configuration was causing the issue, wouldn’t I have run into the problem before upgrading to trunk? Never had these issues occur on the normal version of fog.
Thanks!
-
@jes6309 said:
if a switch configuration was causing the issue, wouldn’t I have run into the problem before upgrading to trunk? Never had these issues occur on the normal version of fog.
Good point! Looking at it from this side, only the last two bullet points seem likely to cause the issue. A newer Linux kernel with a newer driver for the NIC might have an auto-negotiation problem, or is now able to put the NIC in energy saving mode… and is doing it wrong. Please check those two first.
-
@Sebastian-Roth said in Failed to get an IP via DHCP!:
@jes6309 As you can see here, this NIC model has been supported since kernel version 2.6.x. So it should not be a driver issue! Also, “Sending discover…” actually means that we found a network interface and are trying to get an IP for it via DHCP. So it could be:
- Layer 1 issue like cable (you already checked that)
- Spanning tree issue (make sure you have RSTP or configured “port fast”)
- Auto-negotiation issue (try configuring static speed instead of auto-negotiation for that port)
- Ethernet energy saving (see if your switch has EEE/802.3az feature and disable if possible)
wiki worthy
-
@Sebastian-Roth Thanks! Setting static speed and enabling port-fast worked! I am now able to successfully register the host.
-
@jes6309 Can you please verify whether it is static speed or port-fast that makes it work in your case? We have seen more auto-negotiation issues with newer kernels lately. I am a bit concerned about this! Are we able to fix this from our side (kernel parameter or whatever)??
-
@Sebastian-Roth After reverting the port back to auto and leaving port-fast enabled, registration and imaging were still successful.
Looks like port-fast was the fix.
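That fits the spanning tree bullet: with classic spanning tree and default timers, a port can spend roughly 30 seconds in listening and learning before it forwards traffic, which can outlast a DHCP client’s attempts. Where the switch cannot be changed, one workaround is simply to keep retrying. The sketch below is a generic retry helper for illustration only; it is not how FOS handles DHCP, and retry is a hypothetical name.

```shell
#!/bin/sh
# Sketch: run a command up to <tries> times, pausing <delay> seconds
# between failed attempts. Returns 0 on the first success, 1 otherwise.
# retry is a hypothetical helper, not part of FOG or FOS.
retry() {
    tries="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Only pause when another attempt will follow
        if [ "$n" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry 5 10 udhcpc -i eth0 -n -q` would try BusyBox udhcpc up to five times, ten seconds apart, assuming the interface is eth0. Enabling port-fast (or RSTP) on the switch port, as was done here, remains the cleaner fix.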
-
@Wayne-Workman I am getting the same issue as well. I have ensured that the TFTP username and password are correct, but I get the same error. Cannot update kernel and my kernel version does not show, same as @jes6309