DNSMasq ProxyDHCP tries to pull default.ipxe from wrong IP
-
@Wayne-Workman It does work and most of the time it works just fine without issues, but usually after some network issues with the ISP we end up having IP conflicts everywhere that often take hours to resolve which involves mostly praying for good luck and not much else.
-
Are you sure your ISP isn’t providing DHCP for your network? Is DHCP being stopped at your router?
-
@Wayne-Workman The modem/router most certainly acts like a DHCP server, hence why we have issues with ISC-DHCP every once in a while. Most of the time it is able to coexist peacefully and everything (even multicast) works perfectly fine.
However, we can’t really change anything about the modem/router (believe me we were already trying to get into it to change things before we even implemented FOG)
All in all, it’s not super high priority or anything, but it would be pretty nice to see it resolved.
I know one of the devs is working on a ProxyDHCP implementation which would also allow for UEFI/BIOS combo (which I’d also like, even though even with ISC this has not really worked at all, it just boots from hard drive after pulling the ipxe.efi file)
Anyway, I do believe I have my answer to this question and I guess I’ll just have to be more patient
-
@Quazz said:
Is it possible to capture network information to the router itself?
Yes, it’s possible. You’d just use Wireshark, and then just do an
ipconfig /release
and then
ipconfig /renew
in Windows to see the DHCP stuff that the router is sending. I expect you to find what @Sebastian-Roth forewarned, which is the device incorrectly sending out option 66. If that is indeed the case, you have maybe four options.
-
Figure out how to set DHCP properly on that device.
-
Figure out how to just turn DHCP off for that device and use ISC-DHCP only (best option).
-
Buy an inexpensive router that allows you to turn off DHCP (or be flashed with dd-wrt) from your nearest electronics store and put it directly behind the ISP router, then use that as your new gateway. Then you could use 10.0.0.0/16 or 10.0.0.0/24 for your internal network depending on the number of internet connected devices you need to support. I’d go with /16. (this option will increase your internet traffic latency a little).
-
Hope that @Sebastian-Roth 's ProxyDHCP works for you.
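For the /16 versus /24 choice in the third option, a quick way to compare how many devices each would support is Python's standard ipaddress module. This is only a sketch; the helper name usable_hosts is made up for illustration, and it assumes ordinary subnets (not /31 or /32).

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Assignable host addresses in an IPv4 network.

    Subtracts the network and broadcast addresses, so this is only
    meaningful for prefixes of /30 or shorter.
    """
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - 2

for cidr in ("10.0.0.0/16", "10.0.0.0/24"):
    print(cidr, usable_hosts(cidr))
# 10.0.0.0/16 65534
# 10.0.0.0/24 254
```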
-
-
@Quazz said:
Is it possible to capture network information to the router itself?
There is always a way to capture traffic if you are willing. Depending on your network setup you might be able to configure a mirror port on the switch (if this is a managed model) or connect a hub in between to see the traffic. I would be really interested to see this packet dump!!! You can also try capturing the traffic on the client end. Connect the client to the rest of your network using a hub and capture dhcp/bootp and tftp traffic from there.
but usually after some network issues with the ISP we end up having IP conflicts everywhere
Where do the clients get their IPs from? The ISP or ISC DHCP or both? Somehow I feel that we’ve talked about this some time ago already. Or maybe it was someone else having a similar issue. One way would be to prevent DHCP messages from leaving/entering your network. Then you can handle everything with ISC-DHCP. If this is not possible (e.g. router is provided and controlled by the ISP) then I think you are better off using proxy DHCP. Right now dnsmasq is the official way to do this. But it’s not very good at booting UEFI devices. I am working on a JavaScript/node.js implementation but it’s still alpha stage.
But even using the proxy DHCP you might run into trouble. At least if I am reading correctly between the lines of what you wrote (some clients/NICs work and some are not). I guess you need to get into action and start capturing traffic to see what’s really going on DHCP-wise. I am happy to assist if you upload packet dumps here.
-
Okay, so, I did some wireshark capturing and when ISC-DHCP-SERVER is running, all devices seem to receive their DHCP info from ISC-DHCP.
So I guess the main reason the issue of IP conflict comes up is when it’s unable to communicate with the rest of the network which typically happens when the router has issues as that is referred to as the DNS. So I guess for ISC-DHCP if I can resolve that I would be fine.
Also tried to capture traffic directly to the router and next-server option seems to be set as 0.0.0.0 according to wireshark. However option 63 is not passed along. (does this mean it defaults to the router’s ip or that it’s not set?)
Thanks for the help already, guys.
-
Relay agent IP being 192.168.1.1 makes me think this might be a DHCP answer from your ISP. But this is just a wild guess. Would you mind uploading the full packet dump (use display filter
bootp || tftp
and then save those packets to a new PCAP file) so I can have a closer look? You can also send me a chat message with a link if you don’t want to publicly upload the file, although there should not be any reason to be concerned as this seems to be all private IP addresses!
-
@Quazz Interesting one. I only see one DHCP conversation in that dump file and no TFTP traffic.
The DHCP discover and request packets have option 60 (Vendor class identifier) set to ‘MSFT 5.0’, which indicates these packets are coming from a Windows client. They are not PXE boot requests, as they would have ‘PXEClient’ set as vendor class if the NIC were trying to boot via PXE! As a result the DHCP server answering those does not send option 66/67 or next-server/filename, which is just fine because the client didn’t ask for PXE info.
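To make the option 60 check concrete, here is a minimal sketch of how the options field of a DHCP packet can be walked and the vendor class inspected. It assumes you already have the raw option bytes (everything after the DHCP magic cookie); the function names and the sample byte strings are made up for illustration and are not taken from the dump.

```python
def parse_dhcp_options(options: bytes) -> dict:
    """Parse the DHCP options field into {option_code: raw_value}.

    Each option is code (1 byte), length (1 byte), value (length bytes).
    Code 0 is a single-byte pad and code 255 marks the end.
    """
    parsed = {}
    i = 0
    while i < len(options):
        code = options[i]
        if code == 255:          # End option
            break
        if code == 0:            # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option, stop parsing
        length = options[i + 1]
        parsed[code] = options[i + 2 : i + 2 + length]
        i += 2 + length
    return parsed

def is_pxe_request(options: bytes) -> bool:
    """True if option 60 (vendor class identifier) starts with 'PXEClient'."""
    vendor_class = parse_dhcp_options(options).get(60, b"")
    return vendor_class.startswith(b"PXEClient")

def make_option(code: int, value: bytes) -> bytes:
    """Helper to build one sample option for the demo below."""
    return bytes([code, len(value)]) + value

# Hypothetical sample option fields: a Windows client and a PXE ROM.
windows = make_option(53, b"\x01") + make_option(60, b"MSFT 5.0") + b"\xff"
pxe_rom = make_option(53, b"\x01") + make_option(60, b"PXEClient:Arch:00000:UNDI:002001") + b"\xff"

print(is_pxe_request(windows))  # False
print(is_pxe_request(pxe_rom))  # True
```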
Where exactly did you capture this traffic? Did you see the client booting up via PXE? My guess would be that you captured traffic on the “router side” and therefore missed the PXE DHCP traffic to the FOG server. But wait? Clients send their very first DHCP discovery and request to 255.255.255.255 broadcast. We should see those! I am confused.
-
@Sebastian-Roth I may have dun’goofed.
I just captured this on Windows, as you already know.
I suppose I should set up a debug task and capture it like that, then filter it later, right?
Should dnsmasq be enabled when doing this? Should ISC-DHCP?
EDIT: I also noticed dnsbootimage was set to 192.168.1.1 in .fogsettings . I don’t really know what that setting does, do you reckon that should be switched out?
I set up a DNS server in the meantime anyway, as it was a pain to not be able to access our internal network during network outages.
Thanks
-
@Quazz Edit the default.ipxe file?
vi /tftpboot/default.ipxe
From the sounds of it, the ipxe boot process is set correctly, but the default.ipxe file may have the wrong IP to get the information from to begin with.
For upgrade/install fixing, yes edit the /opt/fog/.fogsettings particularly the ipaddress= line.
-
@Tom-Elliott Just checked and all those are pointing to the FOG server, which only makes sense since it works with ISC-DHCP and on some clients with dnsmasq ProxyDHCP.
And seeing as it looks on the wrong IP address for the default.ipxe, I’m not sure changing anything in that file would help. (even so everything looks normal in it)
-
@Quazz Could you please try capturing the traffic on your FOG server. Start dnsmasq and stop ISC-DHCP service. Install tcpdump (package is called tcpdump on redhat/centos/fedora and debian/ubuntu) and then start it:
sudo tcpdump -i eth0 -w pxeboot.pcap port 67 or port 68 or port 69
Leave this command sitting there and start up one of your clients. When you see the FOG boot menu, shut down the client, go back to your FOG server and stop tcpdump with Ctrl-c. Please upload this dump file here.
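If you want a quick sanity check that the capture actually contains DHCP and TFTP packets before uploading it, something like the following sketch could help. It assumes tcpdump wrote a classic (microsecond) pcap file from an Ethernet interface carrying IPv4; the function name count_boot_packets is made up, and it only looks at the UDP ports used in the filter above.

```python
import struct
import sys
from collections import Counter

def count_boot_packets(pcap_bytes: bytes) -> Counter:
    """Count DHCP (UDP 67/68) and TFTP (UDP 69) packets in a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"             # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"             # file written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")

    counts = Counter()
    offset = 24                  # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Record header: ts_sec, ts_usec, incl_len, orig_len (4 bytes each)
        incl_len = struct.unpack(endian + "I", pcap_bytes[offset + 8 : offset + 12])[0]
        frame = pcap_bytes[offset + 16 : offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue             # not IPv4 over Ethernet
        ihl = (frame[14] & 0x0F) * 4
        if frame[23] != 17 or len(frame) < 14 + ihl + 4:
            continue             # not UDP
        src, dst = struct.unpack("!HH", frame[14 + ihl : 14 + ihl + 4])
        ports = {src, dst}
        if ports & {67, 68}:
            counts["dhcp"] += 1
        elif 69 in ports:
            counts["tftp"] += 1
    return counts

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 count_boot.py pxeboot.pcap
    with open(sys.argv[1], "rb") as f:
        print(count_boot_packets(f.read()))
```

A capture with zero "dhcp" packets would suggest tcpdump was listening on the wrong interface or the client never tried to PXE boot.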
@Sebastian-Roth I did as you say, the file will be lower in this comment. First though, something strange happened on the iPXE load screen that I have not seen before (perhaps related to upgrading FOG to latest earlier). It said it got info from both the dhcp server and the proxydhcp (and then selects proxyDHCP correctly).
Anyway, here’s the dump file:
Thanks
-
@Quazz Packet two is concerning…
Maybe try this?
port=0
log-dhcp
tftp-root=/tftpboot
dhcp-boot=undionly.kpxe,192.168.1.156,192.168.1.156
dhcp-no-override
pxe-service=X86PC, "Boot from network", undionly
dhcp-range=192.168.1.156,proxy
-
Will be interesting to see if Wayne’s suggestion will make a difference. From my experience the clients don’t care about ‘server host name’ being set or not! But give it a try and let us know.
For further reference here in this thread I try to describe what I see in the pcap file. Overall this looks pretty good to me. Client (‘vendor class’ = PXEClient…) broadcasts a DHCP discovery and gets two answers. One from the router (192.168.1.1) offering IP, netmask, DNS server etc. The other answer comes from dnsmasq running on the FOG server (192.168.1.156) and provides PXE boot information next-server pointing to itself and filename = undionly.0
Then the client sends a DHCP request to confirm the IP information and is presented a DHCP ACK from 192.168.1.1. All fine from what I can see. Then the client (192.168.1.26) requests undionly.0 via TFTP.
After that first round of DHCP/TFTP from the NIC ROM we see another DHCP communication with very similar information being exchanged. This is iPXE requesting an IP. Looks good as well! Then iPXE receives default.ipxe
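The two-answer exchange described above can be modelled roughly like this. It is a simplified sketch, not how any particular PXE ROM decides: it assumes the proxy DHCP answer carries no client address (0.0.0.0) and that the client simply takes addressing from the offer that has an IP and boot info from the offer that has a filename. The Offer class and combine_offers function are made up for illustration; the addresses and filename are the ones seen in the dump.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Offer:
    server: str                         # who sent the offer
    your_ip: str = "0.0.0.0"            # address offered to the client; assumed empty for proxy offers
    next_server: Optional[str] = None   # TFTP server to boot from
    filename: Optional[str] = None      # boot file to request

def combine_offers(offers):
    """Take addressing from the offer with an IP and boot info from the offer with a filename."""
    ip_offer = next((o for o in offers if o.your_ip != "0.0.0.0"), None)
    if ip_offer is None:
        raise ValueError("no offer contained a client IP address")
    boot_offer = next((o for o in offers if o.filename), None)
    return {
        "client_ip": ip_offer.your_ip,
        "dhcp_server": ip_offer.server,
        "tftp_server": boot_offer.next_server if boot_offer else None,
        "filename": boot_offer.filename if boot_offer else None,
    }

router = Offer(server="192.168.1.1", your_ip="192.168.1.26")
fog_proxy = Offer(server="192.168.1.156", next_server="192.168.1.156", filename="undionly.0")

result = combine_offers([router, fog_proxy])
print(result)
```

In this model, if the router's own offer also carried a filename or next-server, it would be picked up first, which is the kind of mix-up worth looking for in a dump from a client that fails.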
To sum things up: To me this looks pretty good! Now can you please try the exact same thing but boot up one of the clients that does not work…
-
I work in a small repairshop, things come and go at a fast pace, as soon as I come across one that doesn’t work again I’ll update this.
-
I have not come across this issue since starting this thread (I did update FOG somewhere after the start of the thread which included updates for the binaries)
At worst it will ask me to enter the IP address which is a small price to pay for peace of mind.
So I guess this can be considered solved for now.