Yet another Person with DNSMasq issues
-
Hey, everyone.
I have the fog trunk up and running and so far (knock on wood) I haven’t had any problems with re-imaging/Kernel issues/iPXE issues etc. All that seems to be working well.
This is more a question of whether this is related to the Windows Server 2012 DHCP server. I previously had DNSMasq working perfectly for re-imaging labs (on fog 1.2.0), but the Windows servers were due to be updated/replaced and have since had Windows Server 2012 installed with DHCP running. As is normal, I can’t actually access the settings at all to see what they changed.
As a result DNSMasq no longer injects any of my fog TFTP settings into our Desktop clients and I’m wondering if anyone else has noticed the same?
I’ve currently installed DNSMasq on my fog server with the following settings from /etc/dnsmasq.d/fog.conf:
Dnsmasq version 2.75
port=0
log-dhcp
tftp-root=/tftpboot
dhcp-boot=undionly.0,,10.254.14.77
#dhcp-option=17,/images
dhcp-option-force=17,/images
#dhcp-option=vendor:PXEClient,6,2b
dhcp-option-force=vendor:PXEClient,6,2b
dhcp-no-override
pxe-prompt="Press F8 for boot menu", 3
pxe-service=X86PC, .Boot from network., undionly
pxe-service=X86PC, "Boot from local hard disk", 0
dhcp-range=10.254.14.77,proxy
I’ve gone through the Wiki and also DNSMasq logs and it looks like it should be working fine; the clients just don’t respond.
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 available DHCP subnet: 10.254.14.77/255.255.240.0
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 vendor class: PXEClient:Arch:00000:UNDI:002001
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 PXE(eno16780032) f8:b1:56:cd:89:8b proxy
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 tags: eno16780032
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 bootfile name: undionly.0
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 next server: 10.254.14.77
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 broadcast response
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 sent size: 1 option: 53 message-type 2
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 sent size: 4 option: 54 server-identifier 10.254.14.77
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 sent size: 9 option: 60 vendor-class 50:58:45:43:6c:69:65:6e:74
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 sent size: 17 option: 97 client-machine-id 00:44:45:4c:4c:59:00:10:42:80:57:c6:c0:4f...
Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 sent size: 93 option: 43 vendor-encap 06:01:03:08:07:80:00:01:0a:fe:0e:4d:09:32...

Feb 8 15:53:10 fog2 dnsmasq-dhcp[15267]: 1456310667 available DHCP subnet: 10.254.14.77/255.255.240.0
Feb 8 15:53:10 fog2 dnsmasq-dhcp[15267]: 1456310667 vendor class: PXEClient:Arch:00000:UNDI:002001
Feb 8 15:53:11 fog2 kernel: device eno16780032 left promiscuous mode
Feb 8 15:53:11 fog2 audit: <audit-1700> dev=eno16780032 prom=0 old_prom=256 auid=0 uid=72 gid=72 ses=24
Feb 8 15:53:34 fog2 dnsmasq-dhcp[15267]: 1704800392 available DHCP subnet: 10.254.14.77/255.255.240.0
Feb 8 15:53:34 fog2 dnsmasq-dhcp[15267]: 1704800392 vendor class: MSFT 5.0
Feb 8 15:53:34 fog2 dnsmasq-dhcp[15267]: 1704800392 client provides name: PC-D-FYBWC02
Feb 8 15:54:12 fog2 dnsmasq-dhcp[15267]: 2750416450 available DHCP subnet: 10.254.14.77/255.255.240.0
Feb 8 15:54:12 fog2 dnsmasq-dhcp[15267]: 2750416450 vendor class: MSFT 5.0
Feb 8 15:54:12 fog2 dnsmasq-dhcp[15267]: 2750416450 client provides name: LT-M-07909124
Feb 8 15:54:18 fog2 dnsmasq-dhcp[15267]: 2321604315 available DHCP subnet: 10.254.14.77/255.255.240.0
Feb 8 15:54:18 fog2 dnsmasq-dhcp[15267]: 2321604315 vendor class: MSFT 5.0
Feb 8 15:54:18 fog2 dnsmasq-dhcp[15267]: 2321604315 client provides name: LT-M-07909124
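For readers following along: the hex strings in the dnsmasq log above can be decoded by hand. Below is a minimal Python sketch, assuming the usual PXE layout for option 43 (one-byte tag, one-byte length, then the value) and assuming sub-option 8 holds a two-byte server type, a one-byte address count and then the IPv4 addresses. The helper names (parse_hex, walk_tlv, boot_server_ips) are made up for this illustration and are not part of dnsmasq or FOG. The log truncates the option 43 value with "...", so only the visible bytes are decoded.

```python
import ipaddress

# Values copied from the dnsmasq log in this post (option 43 is truncated there).
VENDOR_CLASS_HEX = "50:58:45:43:6c:69:65:6e:74"
VENDOR_ENCAP_HEX = "06:01:03:08:07:80:00:01:0a:fe:0e:4d:09:32..."


def parse_hex(colon_hex):
    """Turn dnsmasq's 'aa:bb:cc' notation into bytes, dropping a trailing '...'."""
    cleaned = colon_hex.rstrip(".")
    return bytes.fromhex(cleaned.replace(":", ""))


def walk_tlv(data):
    """Yield (tag, declared_length, value) per sub-option.

    The value may be shorter than declared_length when the log cut the data off.
    """
    pos = 0
    while pos + 2 <= len(data):
        tag, length = data[pos], data[pos + 1]
        value = data[pos + 2 : pos + 2 + length]
        yield tag, length, value
        pos += 2 + length


def boot_server_ips(value):
    """Assumed layout of sub-option 8: type (2 bytes), count (1 byte), IPv4 list."""
    count = value[2]
    return [
        str(ipaddress.IPv4Address(value[3 + 4 * i : 7 + 4 * i]))
        for i in range(count)
    ]


if __name__ == "__main__":
    print("option 60:", parse_hex(VENDOR_CLASS_HEX).decode("ascii"))
    for tag, length, value in walk_tlv(parse_hex(VENDOR_ENCAP_HEX)):
        print(f"option 43 sub-option {tag}: declared {length} bytes, have {len(value)}")
        if tag == 8 and len(value) == length:
            print("  boot servers:", boot_server_ips(value))
```

With the bytes from the log, option 60 reads as PXEClient and sub-option 8 carries 10.254.14.77, which matches the next server dnsmasq reports, so the proxy offer itself looks consistent.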
Just wondering if I’m missing something?
Thanks again.
-
@RipAU said:
As a result DNSMasq no longer injects any of my fog TFTP settings into our Desktop clients and I’m wondering if anyone else has noticed the same?
What exactly do you mean by this? I guess you see clients not properly PXE booting anymore. But is there any error message? Please describe what goes wrong now and maybe post a picture of what you see. Log file and config seem fine from what I can see.
I noticed the message “device eno16780032 left promiscuous mode”. Did you capture the traffic using wireshark or tcpdump? That would be really great! Please upload the PCAP file. Use display filter
bootp || tftp
in wireshark to reduce the noise. You can save it as a new (smaller) PCAP file after filtering.
-
Yeah all of our Desktops/Laptops seem to just continue with the normal Windows network booting (exactly the same if I didn’t have DNSMasq running) pointed at the Windows Deployment Servers.
The main change obviously is the Windows servers were updated to 2012. So I am assuming it is related to some setting that can/has been set?
Yeah I was running tcpdump to watch the traffic. I’ll run this again when I’m back at work in the morning and see if I can upload the PCAP file somewhere.
I guess this is my last main hurdle to get fog working perfectly again.
Thanks again.
-
Try editing out the dhcp-option-force.
Also
pxe-service=X86PC, .Boot from network., undionly
pxe-service=X86PC, "Boot from local hard disk", 0
Try removing the boot from local hard disk line and putting the Boot from network text between " ".
Does anyone know if dhcp-option-force actually works in proxymode? I know that a lot of the DHCP options don’t work in proxy mode.
edit: also make sure there are NO other config files in /etc/dnsmasq.d/
-
@RipAU Also:
dhcp-boot=undionly.0,10.254.14.77,10.254.14.77
-
@RipAU said:
Yeah all of our Desktops/Laptops seem to just continue with the normal Windows network booting (exactly the same if I didn’t have DNSMasq running) pointed at the Windows Deployment Servers.
I am not familiar with the details of Windows Deployment Services (WDS) but I thought this is using proxyDHCP just as dnsmasq does. Would they interfere? Why didn’t that cause issues with the older windows DHCP?
-
@Sebastian-Roth said:
I am not familiar with the details of Windows Deployment Services (WDS) but I thought this is using proxyDHCP just as dnsmasq does.
I only tried WDS once, but when I did, I noticed it sets DHCP options 066 and 067 for you if your DHCP server is a DC like ours is.
-
I tested those options and it still acts the same.
I’ve put the options back to what previously worked and am just running tcpdump. I know the main Windows DHCP server used to be configured as a DHCP proxy with a Domain controller, but it did previously work.
I’m unsure how it is configured now though.
What’s the best way to post up wireshark dumps on the forums?
-
I’ve attached a pcap file. I’m not an expert on TFTP or DHCP, but I don’t seem to be able to see any issues?
It just looked to me as if the main DHCP server is still handing out all the details and the clients are just ignoring DNSMasq?
-
@RipAU Yes, we see two DHCP offer replies. One from dnsmasq (10.254.14.77) and the other one from 10.254.14.53 (probably the new Windows 2012 DHCP). At first I thought those offers were fine but now I think I found something. DHCP messages have two places for pointing to TFTP servers for PXE booting: ‘next-server’, which is kind of in the DHCP header, and option 66. Those offers from the Windows DHCP point to 10.254.14.53 (next-server) but 10.254.14.55 (option 66).
So the client is offered three different IP addresses to get the boot file from. No wonder it is confused. I am wondering about the configuration of the old Windows DHCP server. Maybe it did not offer next-server/option 66 at all? Beyond that I don’t have a good idea why it worked before the change (don’t know enough about WDS). Maybe ask your Windows DHCP admins why they point option 66 to 10.254.14.55 and we might take it from there.
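To make the conflict concrete, here is a rough Python sketch that lists every boot-server address a client would see across the offers described above. The Offer class and the boot_server_candidates function are hypothetical helpers, and the field values are typed in from this thread rather than parsed from the PCAP. The dnsmasq offer is assumed to carry no option 66 because the posted log shows none being sent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Offer:
    source: str                  # who sent the offer
    next_server: Optional[str]   # 'next-server' field in the DHCP header
    option_66: Optional[str]     # TFTP server name option, if present


# Values as described in this thread (not read from the capture itself).
OFFERS = [
    Offer("dnsmasq 10.254.14.77", "10.254.14.77", None),
    Offer("Windows DHCP 10.254.14.53", "10.254.14.53", "10.254.14.55"),
]


def boot_server_candidates(offers):
    """Map each advertised boot-server address to the places that advertise it."""
    candidates = {}
    for offer in offers:
        fields = (("next-server", offer.next_server), ("option 66", offer.option_66))
        for field_name, address in fields:
            if address:
                candidates.setdefault(address, []).append(f"{offer.source} {field_name}")
    return candidates


if __name__ == "__main__":
    found = boot_server_candidates(OFFERS)
    for address, origins in sorted(found.items()):
        print(address, "<-", ", ".join(origins))
    if len(found) > 1:
        print(f"{len(found)} different boot servers advertised; the client has to pick one")
```

Which of the candidates a given PXE firmware ends up using is not something this sketch can tell; it only shows that the offers disagree.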
Edit: One possible reason I could think of is that your FOG server was installed on hardware (seems like VMware now from what I see in the packet dump) and used to answer faster than it does now. Just a wild guess…
-
Thanks, yeah I’m thinking it is a configuration of the new servers, I’ll have to give them a call and find out how it is configured.
They have a few VMs set up to handle the DC/DNS/DHCP. 14.55 is the DC and DNS; I’m unsure what other roles 14.53 has.
I’ve always had FOG as a VM and I’ve only had this issue since the new Windows 2012 servers have been running, so I don’t think it’s a timing thing.
As you mentioned the old Windows servers may not have had option 66 enabled which allowed me to use DNSmasq previously.
I’ll go through the pcap files and see if I can understand it better and harass the guys in charge about why next-server and option 66 are pointing at different locations.
Thanks again.
-
@RipAU You are welcome! Hope you can find out what changed or at least how to make it work again. Do they need next-server/option 66 at all on the windows side? Usually it is not a good idea to have more than one DHCP service offering “different” information anyway.
As there are requests from different clients in that PCAP file I’d suggest using this filter so you only see the stuff you need to worry about:
bootp.id == 0x56cd898b
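The value in that filter is the same transaction ID that dnsmasq prints in decimal after the process ID in its log lines (1456310667 for the PXE client in the log posted earlier). A small Python sketch, assuming that log layout; the function names are hypothetical and only meant for illustration:

```python
import re

# Matches the decimal transaction ID that follows 'dnsmasq-dhcp[pid]: ' in a log line.
XID_RE = re.compile(r"dnsmasq-dhcp\[\d+\]: (\d+) ")


def xid_to_filter(xid_decimal):
    """Format a decimal transaction ID as a Wireshark display filter."""
    return f"bootp.id == 0x{xid_decimal:08x}"


def xids_in_log(lines):
    """Return the distinct transaction IDs in the order they first appear."""
    seen = {}
    for line in lines:
        match = XID_RE.search(line)
        if match:
            seen.setdefault(int(match.group(1)), None)
    return list(seen)


if __name__ == "__main__":
    sample = [
        "Feb 8 15:53:06 fog2 dnsmasq-dhcp[15267]: 1456310667 vendor class: PXEClient:Arch:00000:UNDI:002001",
        "Feb 8 15:53:11 fog2 kernel: device eno16780032 left promiscuous mode",
        "Feb 8 15:53:34 fog2 dnsmasq-dhcp[15267]: 1704800392 vendor class: MSFT 5.0",
    ]
    for xid in xids_in_log(sample):
        print(xid_to_filter(xid))
```

Newer Wireshark releases may expect the field to be written as dhcp.id instead of bootp.id; treat that as something to check against your installed version.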
(don’t use this filter all the time, just in this case it is very helpful)
-
I’m waiting for someone to give me a call back now, to work out what exactly is set and why.
So far the DHCP/TFTP server is 14.53, so I’m not sure why they have 14.55 listed as option 66; it doesn’t actually have any TFTP or DHCP functions outside of being a DC.
I’ll have to wait and see what they say. This image is a standard PXE boot and 14.55 isn’t even popping up.
Thanks, I’ll have a go with that filter.
Cheers,
-
@RipAU So what did it look like with the old windows DHCP server? You never got into the “Downloaded WDSNBP” thing? Do you actually need to boot your clients from WDS at some point? You really need to think about which PXE boot you need. WDS or FOG? It’s not a good idea to try and ride two bicycles at the same time and hop from one to the other by chance. In PXE-network speak this would mean the faster PXE server is booting your client. Randomly one or the other is faster.
-
Yeah we have always had WDS and the entire windows server with DHCP/WDS etc is controlled by an outside organisation and due to policy I am unable to change it.
That said: We have had fog running and doing our imaging for quite a few years and we did previously have DNSmasq working perfectly next to the older Windows deployment server. Just trying to figure out what has changed in the setup. Worst case I have got iPXE USB keys I can use to boot desktops but it would be handy to have a proxyDHCP working again.
I’m still waiting for feedback from the Windows admins about this and I can cross my fingers and hope.
I’ve also tested CWDS ProxyDHCP on another PC to see if it works but that has the same issue as DNSMasq in regards to the clients only booting from the WDS.
Thanks again.
If I can figure it out I’ll let you know just in case others run into something similar.
-
I’m wondering if DHCP set to authoritative would impact ProxyDHCP. If I have time and you are available we might get together and just look at some traffic from Wireshark and see what’s happening.
-
I’m happy to dump more data from wireshark?
-
Another question coming to my mind is: what if you make it work with dnsmasq? Will this interfere with the other guys’ WDS stuff? Anyway it’s good to work together with those people instead of against them. Try to find a way to make both work. A good starting point is that it obviously used to work in your environment. Unfortunately you can’t go back in time to trace the packets and look at what’s different. So I’d suggest getting into the details of PXE booting and seeing if you can find any hints on proxyDHCP overwriting DHCP settings: ftp://download.intel.com/design/archives/wfm/downloads/pxespec.pdf
As well I just remembered seeing clients showing “PROXY IP” information on the PXE boot screen. See here: http://www.nclone.com/wp-content/uploads/2012/04/nclone-pxeboot1.png
Sure, this is not the exact same firmware (the one in the picture might even be a VM) but you see what I mean. I don’t see this information in the picture you posted. Do you remember seeing that back when it all worked?
-
@RipAU Any news on this topic? Really looking forward to hearing news on this.
-
@RipAU Marking this solved as I feel there is nothing much more we can do. Feel free to post again if you have more questions on this.