No DHCP after PxE Menu
-
Hello there,
OS : Centos 7
FOG Version : 1.5.5
bzImage Version: 4.19.1
bzImage32 Version: 4.19.1
For 2 months now I have had a serious issue deploying images from my FOG server.
Explanation :
My FOG server is on a particular VLAN on my network and is configured as a DHCP server (I have been using it for 3 years now).
First I plug a computer into my network in order to image it.
PXE Boot ==> OK
PXE Menu ==> select inventory (or deploy image for an existing computer in my DB) ==> No DHCP response on interface eth0, skipping it
Failed to get an IP via DHCP! Tried on interface(s): eth0
Before Christmas everything worked perfectly ==> a dozen computers were imaged and I also captured 2 new images.
My network expert insists it doesn’t come from the equipment. OK, so why doesn’t it work now?
I read something about spanning tree, but they say no.
I’m convinced it’s a network issue, but I cannot prove it.
Can you help me please?
Contrary to totoro in a previous post, I don’t get the same response in debug mode:
-
Let’s start out with 2 questions:
- If you place a functional Windows computer on this same VLAN, does it get an IP address?
- If you place a cheap/dumb/unmanaged switch between the PXE booting computer and the building network switch, can you image every time?
My experience tells me that if you don’t get an IP address during FOS startup, but can issue the udhcpc command at the command prompt and get an IP address, then it’s probably a spanning tree issue. Question #2 above would prove that out.
-
1\ Better than that: I can deploy snapins all over my company network and, of course, if I place a computer in this VLAN it gets an IP.
2\ I have to test this, but I don’t have such a switch available.
I am really convinced of a network problem (spanning tree for example), but I can’t prove it, and it annoys me because without proof I can’t ask the network team to solve my problem.
-
@Seydoo said in No DHCP after PxE Menu:
I am really convinced of a network problem (spanning tree for example), but I can’t prove it, and it annoys me because without proof I can’t ask the network team to solve my problem.
Well, that dumb switch is the easiest and best way to prove it. What I suspect is going on here is that as the PXE booting computer boots, it momentarily drops the network link as each kernel hands off control to the next one. You will see this network “wink” when the PXE boot ROM hands over control to the iPXE boot kernel, and then again when iPXE hands over control to the FOS kernel (bzImage).
Now where spanning tree comes into play: if standard spanning tree is used, it uses pessimistic blocking, in that it won’t forward any data until 27 seconds after the link goes up. During these 27 seconds it is listening for duplicate BPDU packets. If it hears none, it starts forwarding data. If one of the fast spanning tree protocols is used (fast-stp, rstp, mstp, etc.), it uses optimistic blocking, in that it starts forwarding right away while listening for a duplicate BPDU packet.
The problem is this: if standard spanning tree is used, that 27 second delay before forwarding is too long. FOS boots and is ready to go so fast that by the time the 27 seconds have passed and the port starts forwarding data, FOS has already given up trying to get a network address. But if you boot into debug mode and issue the udhcpc command, that is probably after the 27 second timeout, and it gets an IP address like it should.
At least that’s the way I see why it’s not working.
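To make the timing argument a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the function name `dhcp_gets_through`, the start time, the number of attempts and the spacing between them are made-up values, not what FOS actually does. Only the 27 second blocking window comes from the explanation above.

```python
def dhcp_gets_through(forward_delay_s, first_attempt_s, attempts, interval_s):
    """Return True if at least one DHCP request is sent after the switch
    port has started forwarding.

    All times are in seconds, measured from the moment the link comes up.
    The retry schedule is a hypothetical model, not FOS's real behaviour.
    """
    for i in range(attempts):
        send_time = first_attempt_s + i * interval_s
        if send_time >= forward_delay_s:
            return True
    return False


# Standard spanning tree: the port blocks for 27 s, the client asks early
# and gives up (5 s start, 3 attempts, 3 s apart are illustration values).
print(dhcp_gets_through(27, first_attempt_s=5, attempts=3, interval_s=3))   # False

# Same port, but udhcpc is run by hand in debug mode well after 27 s.
print(dhcp_gets_through(27, first_attempt_s=40, attempts=1, interval_s=3))  # True

# Port fast or a rapid spanning tree variant: forwarding starts right away.
print(dhcp_gets_through(0, first_attempt_s=5, attempts=3, interval_s=3))    # True
```

The point is only that every request sent inside the blocking window is lost, so a client that finishes all of its attempts before the port forwards never gets an answer, while the same request sent later succeeds.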
-
Ok I’ll try to get a dumb switch and test it in the network core
-
some news
I plugged a computer into the switch which is directly connected to the physical server and… it works
so now… I have to find where the issue is
-
@Seydoo said in No DHCP after PxE Menu:
I have to find where the issue is
So the questions are now:
- Is the switch connected to the physical server the same model as the one where you have problems?
- Are the two switches configured the same way? I can understand that on switches where end users are connected you would want spanning tree enabled, but in the data center, where there is no chance of a loop, it’s not needed.
-
@Seydoo said in No DHCP after PxE Menu:
I plugged a computer into the switch which is directly connected to the physical server and… it works
so now… I have to find where the issue is
Spanning tree (talk to your network guys and ask them to set the ports to “port fast”) or some kind of EEE (Energy-Efficient Ethernet) stuff. Just my guess on what could be the issue.
-
spanning tree, spanning tree la la la la la (jingle bells)
I’ve been saying this for 2 months
I’m glad to read you saying it ^^