FOG DHCP problems with possible printer interference?
-
This looks like a spanning tree issue to me. Initially the workstation gets a DHCP address because it’s PXE booting, and the iPXE kernel makes it to the target, but that’s when things go sideways. iPXE cannot pick up a DHCP address and it fails. BUT if you issue a few commands in the iPXE kernel, you get the FOG menu.
What is happening here is that PXE boots, and then when the iPXE kernel starts up it winks (momentarily turns the network link off and on), which causes the switch to start the spanning tree timer again. The port will stay in a listening state for 27 seconds and only then start forwarding data. To lightning-fast FOG, 27 seconds is an eternity. FOG has already given up and gone to sleep by the time STP starts forwarding data. This is a function of the switch and not the PC or any of FOG’s sub-components.
A quick check for spanning tree issues is to put a dumb (unmanaged) switch between the building switch and the PXE booting computer. If the target computer then boots to the FOG menu, you’ve found the issue.
To fix the issue, you need to turn on one of the fast spanning tree options (fast STP, PortFast, RSTP) on the switch. That eliminates the delay while keeping the benefits of spanning tree.
-
Hey Tom.
“A rogue DHCP server”. I don’t see a second DHCP server anywhere on our network, that I’m aware of. Is wireshark one of the only ways to see if there’s a rogue/secret DHCP server hiding somewhere?
I think there’s a switch in between, but even if I find that switch, would I have to set up any type of forwarding on it?
-
@afriedman said in FOG DHCP problems with possible printer interference?:
Hey Tom.
“A rogue DHCP server”. I don’t see a second DHCP server anywhere on our network, that I’m aware of. Is wireshark one of the only ways to see if there’s a rogue/secret DHCP server hiding somewhere?
Yes, Wireshark will tell you this. Since DHCP is broadcast traffic, you just need to attach your Wireshark computer to the subnet where your target computer is and then set your filter to
port 67 and port 68
then PXE boot your target computer. You will see “DHCP Offer” packets from all of the DHCP servers that can hear the initial client DHCP request. But in this case I still think it’s a spanning tree issue.
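If you would rather script the "who answered?" check than eyeball it in Wireshark, here is a minimal sketch. It assumes you already have the raw UDP payloads of the captured DHCP packets (how you get them is up to you), and the function names `parse_dhcp_options`, `offer_server` and `offer_servers` are made up for this example. It relies only on the standard DHCP layout: a 236-byte BOOTP header, the magic cookie, then options, where option 53 is the message type (2 = OFFER) and option 54 is the server identifier.

```python
import socket

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options field


def parse_dhcp_options(payload: bytes) -> dict:
    """Return {option_code: raw_value} from a BOOTP/DHCP UDP payload."""
    # The fixed BOOTP header is 236 bytes, followed by the 4-byte magic cookie.
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    options = {}
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option, a single byte with no length
            i += 1
            continue
        if i + 1 >= len(payload):  # truncated option, stop parsing
            break
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def offer_server(payload: bytes):
    """Return the server identifier of a DHCPOFFER, or None for other messages."""
    opts = parse_dhcp_options(payload)
    if opts.get(53) != b"\x02":  # option 53 = message type, 2 = OFFER
        return None
    server_id = opts.get(54)  # option 54 = server identifier (IPv4 address)
    if server_id is None or len(server_id) != 4:
        return None
    return socket.inet_ntoa(server_id)


def offer_servers(payloads) -> list:
    """Return the sorted list of distinct servers that sent an offer."""
    found = {offer_server(p) for p in payloads}
    found.discard(None)
    return sorted(found)
```

More than one address coming back from `offer_servers` means more than one server answered the client's request, which is exactly what the Wireshark check is looking for.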
-
@george1421 Spanning tree issue? Please explain.
-
@afriedman said in FOG DHCP problems with possible printer interference?:
@george1421 Spanning tree issue? Please explain.
I thought I did in my first post??
If you are not using a fast spanning tree protocol the switch port won’t start transmitting data until 27 seconds after the link comes up.
-
@Joe-Schmitt I think that Sebastian was working on one too, using node-js. I’m not sure if it is the same one you were working on or not.
-
@george1421 Sorry, I didn’t see your original post. I’ll look at that.
@Joe-Schmitt sounds good. I’ll be waiting for your answer.
-
@Joe-Schmitt Thank you very much for this program.
When I run this program, should I then turn on a different machine in the same area and look at the results on the computer where I’m running your program?
-
@afriedman Nope, the program will simulate a computer booting up requesting PXE information and capture who responds and with what.
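For the curious, the idea of "simulating a computer booting up" can be sketched roughly like this. To be clear, this is not Joe's program, just a minimal illustration of the technique under some assumptions: the client announces itself with vendor class `PXEClient` (option 60), sets the broadcast flag, and asks for the TFTP server and boot file options (66 and 67). Real PXE firmware also sends options such as 93, 94 and 97, which some servers may insist on. The function names and the MAC address in the usage note are hypothetical, and `send_discover` needs root because it binds UDP port 68.

```python
import os
import socket
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"


def build_discover(mac: bytes, xid=None) -> bytes:
    """Build a DHCPDISCOVER that looks roughly like a PXE client's request."""
    if len(mac) != 6:
        raise ValueError("mac must be 6 raw bytes")
    if xid is None:
        xid = struct.unpack("!I", os.urandom(4))[0]  # random transaction id
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,        # op: BOOTREQUEST
        1,        # htype: Ethernet
        6,        # hlen: MAC address length
        0,        # hops
        xid,
        0,        # secs
        0x8000,   # flags: ask servers to reply by broadcast
        b"", b"", b"", b"",  # ciaddr, yiaddr, siaddr, giaddr (all zero)
        mac,      # chaddr, zero-padded to 16 bytes by struct
        b"",      # sname
        b"",      # file
    )
    vendor = b"PXEClient"
    options = (
        b"\x35\x01\x01"                           # option 53: DHCPDISCOVER
        + bytes([60, len(vendor)]) + vendor       # option 60: vendor class
        + bytes([55, 4, 1, 3, 66, 67])            # option 55: requested params
        + b"\xff"                                 # end option
    )
    return header + MAGIC_COOKIE + options


def send_discover(packet: bytes, wait: float = 5.0) -> list:
    """Broadcast the packet and collect (sender_ip, payload) replies.

    Stops once `wait` seconds pass without a reply. Needs root to bind port 68.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", 68))
    sock.settimeout(wait)
    replies = []
    try:
        sock.sendto(packet, ("255.255.255.255", 67))
        while True:
            data, addr = sock.recvfrom(4096)
            replies.append((addr[0], data))
    except socket.timeout:
        pass
    finally:
        sock.close()
    return replies
```

Usage would be something like `send_discover(build_discover(bytes.fromhex("001122334455")))`, run as root on a machine plugged into the problem subnet. A ProxyDHCP server may also answer on port 4011, which this sketch does not listen on.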
-
@Joe-Schmitt Ahhhhhh okay. I’ll let you know the results very soon!
-
@Joe-Schmitt Side note: it will need to capture at least two offers from DHCP servers, if not more. If we are running dnsmasq, you will get two offers right away: one from the DHCP server and one from the ProxyDHCP server.
-
The bottom image is the first half, and the top image is the second half.
-
@Joe-Schmitt You weren’t expecting that outcome? Lol interesting.
-
@Joe-Schmitt Oh alright. Well it’s a pretty neat program.
@george1421 I’m going to try to talk to Cisco Technical Support either today or tomorrow about having them remote into our switch and turn on one of the fast STP protocols. I’ll let you know the results, unless you’d prefer I do something else before talking to Cisco.
-
@afriedman As I said, place the dumbest switch you can find (that’s still functional) between your Cisco switch and the target computer, then PXE boot the target computer. If you can get to the FOG menu where you couldn’t without the dumb switch, then it’s most likely a spanning tree issue.
I can say typically they would turn on one of the fast STP protocols by default (just for this reason). There have been documented cases of target computers not getting dhcp addresses because of this.
-
@afriedman Well, I was just talking with Joe through chat. What he’s seeing and what I thought I saw are two different things.
It would be helpful if you can capture a pcap of the pxe booting process.
Please do the following (assuming your FOG server, DHCP server, and PXE booting client are on the same subnet).
- Install tcpdump on your fog server
- Launch the tcpdump program with this command
tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4011
- PXE boot the target computer until you get the error
- Press ctrl-c to exit out of the tcpdump program
- Upload the pcap file here for review.
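If you want a quick look at what landed in the capture before uploading it, here is a rough sketch that counts UDP packets per source and destination port. It assumes the file is in the classic pcap format with an Ethernet link type and untagged IPv4 frames; anything else (pcapng, VLAN tags, IPv6) is skipped or rejected. The function name `summarize_pcap` is made up for this example.

```python
import struct
from collections import Counter


def summarize_pcap(data: bytes) -> Counter:
    """Count UDP packets per (src_port, dst_port) in a classic pcap capture."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file")
    # The 24-byte global header ends with the link-layer type (1 = Ethernet).
    linktype = struct.unpack(endian + "I", data[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(data):
        # Per-packet record header: ts_sec, ts_usec, captured len, original len.
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        frame = data[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 14 + 20 + 8:
            continue  # too short for Ethernet + IPv4 + UDP headers
        if frame[12:14] != b"\x08\x00":
            continue  # not an untagged IPv4 frame
        ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
        if frame[14 + 9] != 17:
            continue  # IP protocol 17 = UDP
        udp = 14 + ihl
        if len(frame) < udp + 8:
            continue
        sport, dport = struct.unpack("!HH", frame[udp:udp + 4])
        counts[(sport, dport)] += 1
    return counts
```

You could run it as `summarize_pcap(open("output.pcap", "rb").read())`. Traffic from port 68 to port 67 is the client asking, and traffic from port 67 to port 68 is a server answering.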
-
Amazing!! I just used a dumb switch in between that trouble computer, and it booted to FOG INSTANTLY, no hesitation.
Still want me to install tcpdump and follow your instructions for it?
-
@afriedman Yes please, that would help us understand the data that Joe’s script is spitting out.
But you know you need to talk with your network group about the switch configuration too, we kind of have two threads running inside this one.
-
Sounds good. I’ll try to run that when I do some work from home tonight.
Yeah, I’m waiting on some responses from her. I believe I’ve narrowed it down to one specific cluster of computers (about 24-28 of them) in our building. The other 90% of computers see the FOG server with no problems.
I’ll post an update when I can.
-
Sorry about the delayed response, I was out sick yesterday.
Do you want me to PXE boot a computer that isn’t booting to FOG with the tcpdump program? Or PXE boot a computer that is working with FOG?