FOG will not boot - "Failed to get an IP via DHCP!
-
I just installed my first FOG server (1.3.0 SVN Rev 6050) and am receiving this same error message, “Failed to get an IP via DHCP! Tried on interfaces(s):” when attempting to register a client.
I’ve simplified the infrastructure to the client and the FOG server connected through a dumb switch.
The client appears to receive an IP from the FOG server during PXE boot. The error message is received only after full or quick registration is selected.
This is a mintbox mini (http://www.fit-pc.com/web/products/mintbox/mintbox-mini/).
A developer previously asked another user for the output of lspci -nn | grep Ethernet. Mine is as follows:
01:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
Any hints, help, or pointers would be appreciated.
Cheers!
-
@Wirefall Update: Looking at the daemon.log, it appears that the client makes a DHCP request and is provided an IP from the FOG server of X.X.X.11. There is then an error message of “init.tftpd[XXXX]: tftp: client does not accept options.” A new DHCP Discover is then completed and the client receives X.X.X.10 (the first available entry in the DHCP pool).
-
@Wirefall It sounds like the target computer is picking up an IP address and PXE booting, since the FOG iPXE menu is being displayed and you can pick registration. The FOS engine (the customized Linux OS that captures and deploys images on the target) is starting up, but it’s FOS that can’t seem to get an IP address.
This could be a spanning tree issue, though if it is, I’m a bit surprised that it didn’t show up earlier in the boot process. You can test this by putting a dumb (unmanaged) switch between the booting computer and the building network switch. If the computer then boots correctly into registration and imaging, it’s probably a spanning tree issue.
It could also be a driver issue in the FOS engine. FOS uses Linux 4.8.1 (if I remember correctly), so it should support that NIC with no problem. To check, manually register that target computer and then schedule a debug capture: create a dummy image definition, assign that dummy image to the host, schedule a capture of that host (remembering to check the debug option when you schedule it), and finally PXE boot the target computer. The FOS engine will boot and, after a few key presses, drop you at the FOS command prompt. (I suspect you’ve done this before, since you ran the lspci command on the target computer.) I would be interested in seeing what the result of
ip addr show
is telling us.
-
@george1421 said in FOG will not boot - "Failed to get an IP via DHCP!:
You can test this by putting a dumb (unmanaged) switch between the booting computer and the building network switch.
He did that, in his first post:
I’ve simplified the infrastructure to the client and the FOG server connected through a dumb switch.
-
@Wirefall said in FOG will not boot - "Failed to get an IP via DHCP!:
@Wirefall Update: Looking at the daemon.log, it appears that the client makes a DHCP request and is provided an IP from the FOG server of X.X.X.11. There is then an error message of “init.tftpd[XXXX]: tftp: client does not accept options.” A new DHCP Discover is then completed and the client receives X.X.X.10 (the first available entry in the DHCP pool).
The “tftp: client does not accept options.” message is a misnomer. It doesn’t mean anything in regards to the problem(s) you’re seeing. The DHCP request appears to be the issue but only within the init itself. This leads me to think the switch is potentially a “power saving” switch?
-
@Tom-Elliott said in FOG will not boot - "Failed to get an IP via DHCP!:
The “tftp: client does not accept options.” message is a misnomer. It doesn’t mean anything in regards to the problem(s) you’re seeing.
I agree, that tftp message is just a warning message.
We’ve seen a similar issue before with Realtek NICs when green ethernet/power saving/802.3az is enabled on the switch. But this is the first time I’ve seen an Intel NIC do this.
-
@george1421 Thanks for the pointer, I’ll check to see if there’s anything in the BIOS I can manipulate regarding power saving features.
-
@Tom-Elliott I’ve tried two separate switches: one managed, the other as cheap as cheap can get…I doubt it has any additional functionality. I’ll dig out an old hub to make sure this isn’t the problem. Thanks for the input!
-
@Wirefall I think our next step is to get the target computer to boot into a debug capture console to see what is going on. If a dumb switch with no fancy support doesn’t work I think a hub would be a waste of time (unless you had one under your desk already).
-
@george1421 I’ll attempt a manual assignment and look at the debug info if successful. Cheers
-
@george1421 Roger that. I’ll attempt the manual insertion and debug capture this evening. Thanks for the prompt responses!
And, yes, I have all sorts of junk under my desk…
-
@george1421 @TOM ELLIOTT @WAYNE WORKMAN
I manually added the host and assigned a debug task. The contents of /var/log/ are attached. I also ran ifconfig; it only shows the loopback adapter, and it wasn’t possible to bring up eth0.
Also, previously daemon.log on the FOG server showed a DHCP offer of X.X.X.11 that was then requested by the client and ack’d by the server. This was almost immediately followed by a new discover and an offer made for X.X.X.10, which was also requested and ack’d. This time the same thing occurred, other than the initial offer being for .13, which was still followed up with .10.
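For anyone wanting to lay that sequence out for a single client, here is a minimal sketch. It assumes the FOG server runs ISC dhcpd and logs lines of the form "DHCPOFFER on <ip> to <mac> via <iface>" to daemon.log; the function name dhcp_trace is made up for illustration.

```shell
#!/bin/sh
# Print the DHCP message type (and address, when present) for one client MAC.
# $1 = client MAC address, $2 = log file (e.g. /var/log/daemon.log).
# Assumes ISC dhcpd style log lines; other DHCP servers log differently.
dhcp_trace() {
    grep -iF -- "$1" "$2" | awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^DHCP(DISCOVER|OFFER|REQUEST|ACK|NAK)$/) {
                    # OFFER and ACK are followed by "on <ip>", REQUEST by "for <ip>"
                    ip = ($(i + 1) == "on" || $(i + 1) == "for") ? $(i + 2) : "-"
                    print $i, ip
                    break
                }
            }
        }'
}
```

Running dhcp_trace with the client's MAC and /var/log/daemon.log prints one line per DHCP message in log order, which makes it easier to see where each new discover starts and which address was offered each time.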
Thanks for looking at this. Let me know if there’s any other information that would be helpful to troubleshoot.
Cheers!
-
@Wirefall What is the model of the problematic host? Is this a server?
-
We’d also need to see
ifconfig -a
rather than just
ifconfig
(the latter only shows active devices).
-
@Tom-Elliott Identical output - lo only. When not PXE booting, the device runs a Debian variant (Kali) with full network capabilities.
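If ifconfig -a shows lo only, the kernel has not registered any Ethernet device at all, which points at the driver rather than at DHCP. A minimal sketch for confirming this straight from sysfs follows; it assumes /sys is mounted in the debug console, and the function name list_nics is made up for illustration.

```shell
#!/bin/sh
# List network interfaces the kernel has registered, excluding loopback.
# $1 is the sysfs directory to scan; it defaults to /sys/class/net.
list_nics() {
    dir=${1:-/sys/class/net}
    for path in "$dir"/*; do
        [ -e "$path" ] || continue    # empty directory: the glob did not expand
        name=${path##*/}              # strip the directory part
        [ "$name" = "lo" ] && continue
        echo "$name"
    done
}
```

Calling list_nics with no arguments and getting no output matches the "lo only" result. In that case the kernel log may say why; the I211 is normally handled by the igb driver, so something like dmesg | grep -i igb would be the next place to look.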
-
@Wayne-Workman The full specs for the MintBox can be found here - http://www.fit-pc.com/web/products/mintbox/mintbox-mini/
Max Screen Resolution: 1920 x 1200 @ 60Hz
Processor: AMD A Series
RAM: 4 GB SO-DIMM DDR3
Hard Drive: 64 GB
Graphics Coprocessor: Radeon
Card Description: dedicated
Wireless Type: 802.11bgn
Number of USB 2.0 Ports: 3
Other Technical Details
Brand Name: MintBox
Series: MintBox Mini
Item model number: FITLET-GB-C64-D4-M64S-W-XL-BMint
Hardware Platform: PC
Operating System: linux
Item Weight: 8.8 ounces
Product Dimensions: 4.2 x 3.3 x 0.9 inches
Item Dimensions L x W x H: 4.25 x 3.27 x 0.94 inches
Color: Green
Processor Brand: AMD
Computer Memory Type: DDR3 SDRAM
Hard Drive Interface: Solid State
Audio-out Ports (#): 1
Power Source: DC
Voltage: 12 volts
-
@Wirefall From within the FOG debug console (the same place where you entered the
ifconfig -a
command), key in
lspci -nn|grep etwork
We will be interested in a line that looks like this
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
More precisely, we want the hex numbers in the square brackets (e.g. [8086:1502]). Those numbers identify the NIC controller.
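If it helps, the ID can be pulled out of the lspci line directly. This is only a sketch, and the function name pci_ids is made up for illustration; it relies on the vendor:device pair being the only bracketed field with a colon in it.

```shell
#!/bin/sh
# Print the vendor:device ID (e.g. 8086:1502) from each lspci -nn line on stdin.
# The class code in the first bracket pair (e.g. [0200]) has no colon,
# so the pattern skips it and matches only the vendor:device pair.
pci_ids() {
    sed -n 's/.*\[\([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\)\].*/\1/p'
}
```

Piping the command above into it, as in lspci -nn|grep etwork | pci_ids, prints just the ID, which is easier to copy out of the console than the whole line.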
-
Just a simple question: seeing as you state the device ID is 8086:1539 (which has been in the Linux kernel for a very long time), do your registration issues happen on multiple systems of the same model or just this one system? I only ask because this doesn’t seem to make sense. The only things I can think of:
- Device is having issues (unlikely, seeing as TFTP and iPXE appear to work properly).
- Device patch cable is screwed up? (Likely, because TFTP doesn’t require the same cabling layout that full network support typically does. It is confusing, though, because while TFTP might work, iPXE shouldn’t.)
-
@george1421 [8086:1539]
-
@Tom-Elliott This does occur on both of the two test boxes I have.
To rule out flaky hardware/cables, I’ve taken the following steps.
- Swapped smart switch for unmanaged switch.
- Replaced all cables.
- Replaced NIC on FOG server - both are Intel Gigabit, [8086:10ce] and [8086:10d3]
Thanks and Happy New Year!