Failed to get an ip from DHCP ! Tried on interfaces eth0 eth0 .
-
Server
- FOG Version: 1.3.5-RC-16
- OS: Debian 8
Client
- Service Version:
- OS:
Description
So my problem is that FOG is suddenly unable to bring up the network interfaces on machines and get IP addresses. The machine PXE boots, I select Full Inventory and Registration, and it loads some files, but it always fails at: Starting eth0 interface ... Also, before that I am getting an error: Populating /dev using udev: error creating epoll fd: Function not implemented. This is very strange; everything worked fine until today, when I suddenly started having this issue. I checked on different switches, but nothing changed. Different devices show the same error. Can anyone help?
-
@kleanthis It sounds like the FOS engine (bzImage + init.xz) is making it to the target computer, but when FOS initializes it can’t find the network and pick up an IP address. Did this issue just start with RC16?
This sounds like a spanning tree issue, but it’s strange that it would just start. Could you test something for us? Please place an unmanaged network switch between the PXE booting computer and your building network switch. See if that solves the DHCP issue.
Also, for the sake of completeness, what is the target hardware that is having an issue right now?
-
Same issue with a D-Link unmanaged switch.
The strange thing is that the Ethernet port lights are on until init.xz is loaded. Then the screen goes black for 2 seconds, comes back up, and the lights on the machine's Ethernet port are off and stay off. The machine is an Advantech ARK-3360F:
http://www.advantech.com/products/1-2jkd2d/ark-3360f/mod_db130bfd-5684-4598-b1bc-01877c5f8160
No, the issue started with the previous versions. I just updated today to make sure it's not something that has been fixed in the latest version.
-
@kleanthis Outside of this embedded device, does imaging work correctly on traditional office computers?
Looking into the data sheet this device has 3 network adapters, each has a different model number. LAN 1 Intel 82567V GbE, LAN 2 Intel 82583V GbE, LAN 3 Intel 82541PI GbE
I wonder if FOS is picking an interface that doesn’t have a network adapter as its primary.
-
@kleanthis What version of FOG were you on before you updated to RC16?
What version of FOG was the last one you remembered worked correctly with this device? I know the FOS kernel (bzImage) was updated with RC14 (I think). If we can narrow down when it worked vs now we may be able to point to the kernel update that is at fault.
-
On my PC, a Lenovo L460 with a single Ethernet port, it loads fine, although I am still getting the Populating /dev using udev: error creating epoll fd: Function not implemented error. I don't know if it's relevant, but it continues on to registration. On these embedded devices FOG recognizes only LAN 2 (eth1). Also, when trying to start the Ethernet links it says Starting eth0 …, times out after 35, then says Starting eth0 again and does not move on to eth1, which it used to do in the past. The last time I checked that it was working OK was on 1.3.5-RC-10, but I didn't update from that until today, when I had this issue.
-
/dev using udev : error creating epoll fd
That is just a warning message and is not related to your current issue.
OK, I think the next step is to manually register this device and set up a debug capture/deploy (whatever you were doing). A debug deploy/capture will instruct the FOS engine to drop you to a command line on the target computer. From there we need to run some debugging commands like
ip addr show
to see if any IP address is being picked up by any interface.
OK, so can we say that RC10 was also defective with regard to this system?
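As a rough sketch of that check: the helper below (the name `ifaces_with_ipv4` is made up for illustration) reads one-line-per-address output of the kind printed by `ip -o -4 addr show` and lists which interfaces currently hold an IPv4 address. It assumes the `ip` build on the target accepts `-o` and `-4`; if it does not, reading plain `ip addr show` by eye works just as well.

```shell
#!/bin/sh
# Hypothetical helper, not part of FOS.
# Reads `ip -o -4 addr show` style lines on stdin and prints
# "<interface> <address/prefix>" for every IPv4 address found.
ifaces_with_ipv4() {
    # Field 2 is the interface name, field 3 the family, field 4 the address.
    awk '$3 == "inet" { print $2, $4 }'
}
```

On the target this would be used as `ip -o -4 addr show | ifaces_with_ipv4`. An interface that is missing from the output has not picked up an address.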
-
One more thing. I have now tried the 3rd LAN port of the box, which loads Intel Boot Agent 1.2.6 rather than 1.3.x. FOG was able to grab an IP, register and start deployment. What could be causing this? LAN 1 and LAN 2, with Intel Boot Agent 1.3.x, don't work, but LAN 3, which loads with Intel Boot Agent 1.2.x, works just fine. I am waiting to see if the deployment finishes fine.
-
@kleanthis So it looks like it does not work with LAN 1 and LAN 2, only with LAN 3. I don't know why, though. Can you check?
-
@kleanthis It would be interesting (again) to set up a debug deploy/capture to get access to the FOS engine command line. I suspect that FOS is setting the eth0, eth1, eth2 order based on hardware discovery order, which seems to be inconsistent with what the iPXE kernel is doing.
-
Is there a guide on this? Debug deployment?
-
@kleanthis It’s simple. Just schedule another capture/deploy to the same target. Since you have LAN 3 working, use that interface. But before you submit the task, select the check box that says debug.
This tells FOS not to start the task automatically but to drop the user to a Linux command line. This is what the developers use to debug the FOS engine.
-
@george1421 OK, so use LAN 3 (the one that's working) for the debug deployment?
-
@kleanthis what we “need” is to get to the Linux command prompt on the target computer THEN key in
ip addr show
or
ip link show
From the output we can then see the reference between ethX and MAC address. That will help us identify what FOS “thinks” eth0 is.
Don’t worry about breaking what you were just able to do; we are NOT going to start the deploy/capture as long as debug was enabled.
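If copying the full command output is awkward, the same ethX-to-MAC pairing can usually be read straight from sysfs. The sketch below assumes a Linux sysfs layout under /sys/class/net; `list_macs` is a hypothetical helper name, not something FOS ships.

```shell
#!/bin/sh
# Hypothetical helper, not part of FOS.
# Prints "<interface> <mac>" for every interface found under a sysfs-style
# directory. Defaults to /sys/class/net (an assumption about the target).
list_macs() {
    net_dir="${1:-/sys/class/net}"
    for path in "$net_dir"/*; do
        # Skip entries that do not expose a readable MAC address file.
        [ -r "$path/address" ] || continue
        iface="${path##*/}"
        printf '%s %s\n' "$iface" "$(cat "$path/address")"
    done
}
```

Running `list_macs` with no argument on the target should print one line per interface, which can then be compared against the MAC addresses printed on the device or shown by the boot ROM.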
-
@george1421 When you get to the FOS command line, if you give root a password, you can connect to the FOS engine via putty to help with any copy paste operations you need.
At the FOS command line key in
ip addr show
to get the device’s IP address, then key in
passwd
and give root a password. It doesn’t matter what password you give it, as long as it’s not blank. From there you can connect to the FOS engine using putty and log in as root with whatever password you gave it. Then you can run the debugging commands from there.
-
@kleanthis This connecting via putty is nice because you can copy/paste the contents of the output from commands we might request.
-
There you go:
[Fri Mar 17 root@fogclient ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0B:AB:70:9A:21
          inet addr:192.168.1.111  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1600 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:183345 (179.0 KiB)  TX bytes:15605 (15.2 KiB)
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
[Fri Mar 17 root@fogclient ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0b:ab:70:9a:21 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.111/24 brd 192.168.1.255 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:0b:ab:70:aa:69 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:0b:ab:70:aa:6a brd ff:ff:ff:ff:ff:ff
-
@kleanthis OK now we know the order that FOS is seeing the interfaces.
eth0: 00:0b:ab:70:9a:21 (192.168.1.111)
eth1: 00:0b:ab:70:aa:69
eth2: 00:0b:ab:70:aa:6a
If this was a typical “server” I would say that eth0 is considered an add-on network adapter (which is typically discovered first), followed by the built-in network adapters (eth1 and eth2). The eth1 and eth2 MAC addresses are consistent with a dual-port network adapter. What’s happening is understandable.
For the sake of discussion, if you plug in the other two NICs and run
ip addr show
again, do the other interfaces get an IP address too? You may lose access via putty when you do that, but you can still key things in on the FOS console.
The bigger question now is why FOS isn’t testing each network interface for an uplink. Right now it sounds like it’s giving up after testing eth0 (LAN 3) and not continuing to loop.
-
@george1421 Nope, the other two LAN ports don't get an IP. Their LEDs don't even come on. It's as if they are unplugged or not activated. As soon as I plug the cable into LAN 3, the lights come on.
-
@kleanthis this is kind of by design, as @george1421 was saying earlier.
It’s not caring “what” the IP address is. The idea is, get an IP address, and as soon as one is available continue forward with the tasking. Otherwise you’re waiting for each interface to come up before your tasking can start. (This can be a bad thing if there is nothing for the device to pick up).
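The "first address wins" idea described above can be sketched roughly as follows. This is not the actual FOS init script, only an illustration under stated assumptions: `try_dhcp` and `first_iface_with_lease` are hypothetical names, and the DHCP client is assumed to be BusyBox `udhcpc`, whose `-n` flag exits when no lease is obtained and `-q` exits once a lease is obtained.

```shell
#!/bin/sh
# Illustrative sketch only; not taken from the FOS source.

# Hypothetical helper: ask for a DHCP lease on one interface.
# Assumes BusyBox udhcpc is installed; swap in another client if needed.
try_dhcp() {
    udhcpc -i "$1" -n -q >/dev/null 2>&1
}

# Try each named interface in order and stop at the first one that
# obtains a lease. Prints that interface name; returns 1 if none succeed.
first_iface_with_lease() {
    for iface in "$@"; do
        # Bring the link up before asking for an address.
        ip link set dev "$iface" up 2>/dev/null
        if try_dhcp "$iface"; then
            echo "$iface"
            return 0
        fi
    done
    return 1
}
```

Called as `first_iface_with_lease eth0 eth1 eth2`, the loop moves on to the next interface only after the previous one fails, so the remaining interfaces are never touched once one of them has an address.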