Failed to get an IP via DHCP!
-
@tom-elliott said in Failed to get an IP via DHCP!:
t what’s providing DHCP for the isolated network? (How is it getting the PXE menu to begin with?)
The FOG server is handling that. When I was setting up FOG I had it handle the addressing. At first the DHCP services weren’t working. I couldn’t get the PXE menu or connect to the web page. I found that the laptop wasn’t getting an IP. I went back to the server and verified the DHCP settings and found that the service wasn’t started. Started it and everything was good to go then.
I’ll take a look at the DHCP logs to see if something turns up there.
-
@kellyg This error suggests that the FOS engine is trying to connect to the FOG server, failing, then assuming that it doesn’t have an IP address and trying the discovery process again.
In the FOG web GUI settings, make sure (way at the bottom, going from memory) that the TFTP server and web server addresses are populated.
-
@Tom-Elliott Hey Tom, I pulled the DHCP logs. If I boot to the hard drive, the server gets an IP.
The 12:53 time frame was one of the attempts to image the system. The later ones are when I booted to the hard drive.
********** syslog **********
Jul 27 12:53:49 <Redacted> dhcpd: DHCPDISCOVER from 2c:59:e5:47:ba:dc via em1
Jul 27 12:53:50 <Redacted> dhcpd: DHCPOFFER on 192.168.122.30 to 2c:59:e5:47:ba:dc via em1
Jul 27 12:53:50 <Redacted> dhcpd: DHCPDISCOVER from 2c:59:e5:47:ba:dc via em1
Jul 27 12:53:50 <Redacted> dhcpd: DHCPOFFER on 192.168.122.30 to 2c:59:e5:47:ba:dc via em1
Jul 27 12:53:52 <Redacted> dhcpd: DHCPREQUEST for 192.168.122.30 (192.168.122.5) from 2c:59:e5:47:ba:dc via em1
Jul 27 12:53:52 <Redacted> dhcpd: DHCPACK on 192.168.122.30 to 2c:59:e5:47:ba:dc via em1
Jul 27 13:09:01 <Redacted> CRON[17577]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Jul 27 13:17:01 <Redacted> CRON[18054]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jul 27 13:39:01 <Redacted> CRON[19536]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Jul 27 13:45:45 <Redacted> dhcpd: DHCPDISCOVER from 2c:59:e5:47:ba:dc via em1
Jul 27 13:45:46 <Redacted> dhcpd: DHCPOFFER on 192.168.122.31 to 2c:59:e5:47:ba:dc (<Redacted>) via em1
Jul 27 13:45:46 <Redacted> dhcpd: DHCPREQUEST for 192.168.122.31 (192.168.122.5) from 2c:59:e5:47:ba:dc (<Redacted>) via em1
Jul 27 13:45:46 <Redacted> dhcpd: DHCPACK on 192.168.122.31 to 2c:59:e5:47:ba:dc (<Redacted>) via em1
Jul 27 13:48:16 <Redacted> kernel: [761587.349200] usb 1-5: new high-speed USB device number 8 using ehci-pci
Jul 27 13:48:16 <Redacted> kernel: [761587.676868] usb 1-5: New USB device found, idVendor=0951, idProduct=1666
Jul 27 13:48:16 <Redacted> kernel: [761587.676873] usb 1-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
-
@kellyg So the pictures you showed us are of eth4 and eth5. Do eth0, eth1, eth2, or eth3 actually get a lease but fail after that point? (I don’t know that you have a patch cable attached to all 6 NICs at the same time, right?)
-
@tom-elliott - I’ve got a cable in eth0, eth3, and eth5. BIOS is set to accept PXE from all but the last 2 ports. I did try switching ports and at one time had a cable in all 6 ports, but got the same message.
@george1421 - I took a look and both the web server address and the tftp address are populated with the same IP, 192.168.122.5.
-
@kellyg Is 2c:59:e5:47:ba:dc one of the macs on this target computer?
As for the web settings, I should have asked you if you can image other hardware before jumping on the web settings.
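To line up the server’s view with one particular NIC, the dhcpd entries for a single MAC can be pulled out of the syslog. A minimal sketch, assuming the log format posted above and that the log lives at /var/log/syslog; both the default path and the helper name `dhcp_events_for_mac` are assumptions, not something FOG provides:

```shell
# Print timestamp and DHCP message type for every dhcpd entry that
# mentions the given MAC address (matched case-insensitively).
# $1 = MAC address, $2 = log file (assumed default: /var/log/syslog)
dhcp_events_for_mac() {
  awk -v mac="$1" '
    $5 ~ /^dhcpd/ && $6 ~ /^DHCP/ && index(tolower($0), tolower(mac)) > 0 {
      print $1, $2, $3, $6
    }
  ' "${2:-/var/log/syslog}"
}

# Usage (hypothetical): dhcp_events_for_mac 2c:59:e5:47:ba:dc /var/log/syslog
```

A run that shows DISCOVER and OFFER entries without a matching ACK, or no entries at all during an imaging attempt, helps narrow down which side of the exchange to look at next.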
-
@george1421 - yes
-
@george1421 sorry to keep sending you back to the well, but is FOG_WEB_ROOT set to `/fog/`? The other fields I mentioned are FOG_TFTP_HOST and FOG_WEB_HOST.
-
@george1421 That is the target system. I’ve tried other servers to see if they will image, but they are all HP and they all do the same thing. I’ve got an old Dell I brought up to test whether the problem is isolated to one manufacturer or to FOG.
-
@kellyg OK that is what I was going to suggest next. Try a different target computer. If that one can load an IP address we can almost 100% rule out FOG server.
-
@george1421 said in Failed to get an IP via DHCP!:
@george1421 sorry to keep sending you back to the well, but is FOG_WEB_ROOT set to `/fog/`? The other fields I mentioned are FOG_TFTP_HOST and FOG_WEB_HOST.
Don’t worry about that. Yes, both are set to the IP address and FOG_WEB_ROOT is set to `/fog/`.
-
@george1421 - Ok… The Dell server does the exact same thing. I did find something interesting, however. For this Dell, there are two onboard NICs and a PCIe NIC. I initially plugged a cable into eth0 and got nothing. I added a cable to eth1 and still got nothing. The system would attempt to get an IP and then fail. However, when I added a cable to the PCIe card, the system inventoried. I then added a card to one of the HP servers and was able to get it to inventory. So… the embedded NIC is good for PXE but bad for getting a DHCP lease, and the PCIe NIC is good for getting a DHCP lease but bad for PXE?
While this is a viable workaround, I won’t have the time to pop an add-in card into every system just to get it to image. I did notice that both of the servers are running NetXtreme II NICs onboard. Is there something with that manufacturer that doesn’t work with isc-dhcp?
-
@kellyg If I remember correctly (at least on Dell servers), when it goes and enumerates the network interfaces it will start with the add-on adapters first and assign the LOM adapters last.
If FOS is seeing the mac address of the LOM network adapters then it should be able to use them to pxe boot. If the mac addresses are not showing up then FOS probably doesn’t have the nic drivers.
-
@george1421 - I’m not sure about the drivers. So here’s what I did. With the add-on card connected to the switch and the LOM connected to the switch, I was able to register and inventory the server with no problem. However, FOS records the add-on card as the primary MAC address. The server won’t PXE to the add-on card, even though it’s listed in the boot order; it only wants to use the embedded. Don’t know why only the onboard, must be some connection problem between the server and the chair.
I connected the cable back to LOM port 1 and the system booted to PXE fine. However, unless I change the primary MAC in FOS to match the embedded, it will report that the system has not been registered or inventoried. So if I create a task to push an image to the server, it doesn’t recognize it and never starts.
After changing the MAC in FOS, everything works correctly. But then I’m back to the original problem: I’ve got to install an add-in card into the server just to get it to image. Might as well pop in the winblows DVD and install it that way.
-
@kellyg Here’s what I want to try (sorry, I don’t have a clean understanding just yet).
1. Place the PCI NIC in the server.
2. On the FOG management GUI, schedule a debug task (capture or deploy, it doesn’t matter); we need access to the FOS command prompt.
3. PXE boot the target computer.
4. After a few presses of the Enter key it should drop you to a command prompt.
5. Give root a password; anything is fine, root just needs a password. Set it with `passwd`. This password is only temporary since FOS runs out of RAM.
6. Get the IP address of the target computer with `ip addr`.
7. Now from a Windows computer use PuTTY to connect to FOS at the IP address collected in #6 and log in as root with the password defined in #5. (We are doing this to allow easy copy and paste of commands between Windows and FOS.)
8. Now let’s understand what FOS sees: `ip link show`. Please post this here. If possible, note which MAC address belongs to which physical network interface. Hopefully you will have one ethX interface for each physical interface.
9. Next let’s see what FOS sees as PCI devices: `lspci |grep net`
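For the interface listing, the interface-to-MAC pairing can be reduced to one line per NIC, which is easier to post. A rough sketch, assuming the usual iproute2 layout of `ip link show` output (an interface header line followed by an indented `link/ether` line) and assuming awk is available in the FOS environment; `list_iface_macs` is a made-up helper name:

```shell
# Read "ip link show" output on stdin and print "<interface> <mac>"
# for every Ethernet interface (loopback has no link/ether line).
list_iface_macs() {
  awk '
    /^[0-9]+: / { iface = $2; sub(/:$/, "", iface) }
    $1 == "link/ether" { print iface, $2 }
  '
}

# Usage (hypothetical): ip link show | list_iface_macs
```

Any physical port that does not appear in this list is one FOS has not created an interface for.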
Leave this setup configured. It would be interesting to know whether, if you connect one of the non-functional network ports to a second network cable (not the one on the PCI NIC), it picks up an IP address with FOS running. But this is something we should test after collecting what we need to know. With the second network cable installed, `ip addr show` should show whether these other LOM NICs pick up an IP address.
-
@KellyG Could spanning tree or auto-speed negotiation issues play a role here as well? Just to rule that out please connect a dumb unmanaged switch in between the client and the actual switch.
To figure out which NICs play nicely with the Linux kernel (as this is where you experience the issue) you might want to boot any recent live Linux CD (kernel version 4.8 or newer I’d recommend). Plug the cable into the first NIC, boot from CD and run `lspci -vv | grep -e "^[0-9]" -e "Kernel driver" | grep -A 1 "net"`. On my PC, for example, this looks like this:
    # lspci -vv | grep -e "^[0-9]" -e "Kernel driver" | grep -A 1 "net"
    00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
        Kernel driver in use: e1000e
Then see if you can set up the network properly and ping other PCs. Note down driver, NIC, MAC address. Then shut down, re-plug to a different NIC and redo the same thing again. Please post all the information here so we can see if we have all the kernel drivers included in our build. If not we can add those for you.
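The same information can be condensed to one line per NIC, with an explicit marker when no driver is bound. This is a sketch only, assuming the usual `lspci -vv` layout (device line starting at column 0, indented detail lines below it); `nic_drivers` is a hypothetical helper name:

```shell
# Read "lspci -vv" output on stdin and print "<slot> <driver>" for each
# Ethernet/Network controller, or "<slot> none" if no driver is bound.
nic_drivers() {
  awk '
    function emit() {
      if (slot != "") print slot, (driver == "" ? "none" : driver)
    }
    /^[0-9a-fA-F]/ {
      emit()
      slot = ""; driver = ""
      if ($0 ~ /Ethernet controller|Network controller/) slot = $1
      next
    }
    slot != "" && /Kernel driver in use:/ { driver = $NF }
    END { emit() }
  '
}

# Usage (hypothetical): lspci -vv | nic_drivers
```

A NIC reported with `none` is a candidate for a missing kernel driver or firmware.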
-
@sebastian-roth Sorry for the delay in getting back, I only work Monday through Thursday. Booting to a live CD and pulling the NIC list gives me the following
    02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
        Kernel Driver in use: bnx2
    02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
        Kernel Driver in use: bnx2
    03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
        Kernel Driver in use: bnx2
    03:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
        Kernel Driver in use: bnx2
-
@george1421 Just scrolling through all of the responses. I’ll run through your steps here shortly and get back to you.
-
@KellyG Hmm, from the output you posted it seems like it only recognized the Broadcom NetXtreme II 5709 Gigabit NIC - Quad Port. What about those two onboard NICs? Not recognized at all? I wonder why they don’t show up in the `lspci` listing. Can you get the PCI IDs (for the onboard NICs and the PCIe NetXtreme card) from the Windows Device Manager (post the full “Hardware ID” string you find in the Details tab)? Possibly we need to add a firmware driver to the kernel.
-
@sebastian-roth Those are the onboard NICs. I had to pull the PCIe card for another server. If I have the card in the system, I can boot to the LOM, but it doesn’t recognize that there is a DHCP server responding. The PCIe NIC will respond to the DHCP, but won’t PXE boot.
I’m not finding anything specific about the NetXtreme, are they supported for imaging?