UEFI pxe boot problem from a network
-
@lebrun78 I’m going to have to look into this, but I have to ask: why do the DHCP servers have two different IP addresses? Both are listed in the pcaps.
-
Yes. We don’t have a DHCP relay; the DHCP server has several virtual network interfaces, one on each VLAN.
-
@lebrun78 I can’t see from the config how or why it’s sending out the wrong router address, unless something in
include "/etc/dhcp/vip.conf";
is doing it.
Wait, there is something strange going on here. Look at the base address and the subnet mask as defined.
subnet 148.60.10.0 netmask 255.255.255.0 {
    ##########################################
    option domain-name-servers 148.60.15.109,148.60.15.106 ;
    option domain-name "istic.univ-rennes1.fr" ;
    option routers 148.60.10.254 ;
    option subnet-mask 255.255.255.0 ;
    default-lease-time 600 ;
    max-lease-time 1200 ;
    group {
        # The next two lines are commented out to avoid the FOG menu
        next-server 148.60.4.1;
But look in the pcap at what the client is being told.
As you see in the picture, the client is being told that its subnet mask is 255.255.248.0, but your config file says 255.255.255.0. The client is being told the router is 148.60.7.254, but your config file says 148.60.10.254.
So I’ll ask you the same question again in a different way. Are DHCP servers 148.60.10.252 and 148.60.4.3 the same computer? If so, do you have two different instances of the isc-dhcp server running, each bound to a different network interface? Something is strange with the 148.60.10.252 DHCP server.
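For anyone following along, here is a minimal Python sketch of why the offered values cannot work. It uses only the standard `ipaddress` module. The mask and router values are the ones quoted above; the client address 148.60.10.198 is an assumption chosen for illustration from the 148.60.10.0/24 range, and `router_reachable` is a hypothetical helper name.

```python
import ipaddress

def router_reachable(client_ip, netmask, router):
    """Return True if the router sits inside the client's own network."""
    # ip_interface accepts "address/netmask" and derives the network from it.
    iface = ipaddress.ip_interface(f"{client_ip}/{netmask}")
    return ipaddress.ip_address(router) in iface.network

# Values from dhcpd.conf: mask 255.255.255.0, router 148.60.10.254.
configured = router_reachable("148.60.10.198", "255.255.255.0", "148.60.10.254")

# Values seen in the pcap: mask 255.255.248.0, router 148.60.7.254.
offered = router_reachable("148.60.10.198", "255.255.248.0", "148.60.7.254")

print("configured values consistent:", configured)
print("offered values consistent:", offered)
```

With the /21 mask the client computes its network as 148.60.8.0/21, which does not contain 148.60.7.254, so the offered gateway is not on-link for the client.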
-
So I’ll ask you the same question again in a different way. Are DHCP servers 148.60.10.252 and 148.60.4.3 the same computer?
YES
I have only one dhcpd.conf file, so only one instance of dhcpd.
Here is what I get on the same machine on VLAN 148.60.10.0/24 once Windows has loaded:
It’s crazy, no?
-
I booted the Windows machine twice: once via UEFI PXE and once from the hard drive.
I get these logs on my DHCP server:
Apr 6 09:46:02 sybille2 dhcpd: PXEClient:Arch:00007:UNDI:003016
Apr 6 09:46:02 sybille2 dhcpd: DHCPDISCOVER from 10:65:30:83:5c:4b via em2.10
Apr 6 09:46:03 sybille2 dhcpd: DHCPOFFER on 148.60.10.198 to 10:65:30:83:5c:4b via em2.10
Apr 6 09:46:05 sybille2 dhcpd: PXEClient:Arch:00007:UNDI:003016
Apr 6 09:46:05 sybille2 dhcpd: DHCPREQUEST for 148.60.10.198 (148.60.10.252) from 10:65:30:83:5c:4b via em2.10
Apr 6 09:46:05 sybille2 dhcpd: DHCPACK on 148.60.10.198 to 10:65:30:83:5c:4b via em2.10
Apr 6 09:46:41 sybille2 dhcpd: MSFT 5.0
Apr 6 09:46:41 sybille2 dhcpd: DHCPDISCOVER from 10:65:30:83:5c:4b via em2.10
Apr 6 09:46:42 sybille2 dhcpd: DHCPOFFER on 148.60.10.190 to 10:65:30:83:5c:4b (MININT-S9D1BSU) via em2.10
Apr 6 09:46:42 sybille2 dhcpd: MSFT 5.0
Apr 6 09:46:42 sybille2 dhcpd: DHCPREQUEST for 148.60.10.190 (148.60.10.252) from 10:65:30:83:5c:4b (MININT-S9D1BSU) via em2.10
Apr 6 09:46:42 sybille2 dhcpd: DHCPACK on 148.60.10.190 to 10:65:30:83:5c:4b (MININT-S9D1BSU) via em2.10
Apr 6 09:46:42 sybille2 dhcpd: Unable to add forward map from MININT-S9D1BSU.istic.univ-rennes1.fr to 148.60.10.190: not found
The same machine gets two different IPs: 148.60.10.198 at 09:46:03 (PXE boot) and 148.60.10.190 at 09:46:42 (hard drive boot).
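A rough way to spot this pattern in a longer log is to group the acknowledged addresses by MAC. The sketch below is only an illustration: it assumes the `DHCPACK on <ip> to <mac>` wording seen in this thread's dhcpd log, and `acks_by_mac` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches the "DHCPACK on <ip> to <mac>" part of a dhcpd syslog line.
ACK_RE = re.compile(r"DHCPACK on (\S+) to ([0-9a-fA-F:]{17})")

def acks_by_mac(lines):
    """Map each MAC address to the distinct IPs it was ACKed, in log order."""
    seen = defaultdict(list)
    for line in lines:
        match = ACK_RE.search(line)
        if match:
            ip, mac = match.groups()
            if ip not in seen[mac]:
                seen[mac].append(ip)
    return dict(seen)

# Two ACK lines taken from the log in this thread.
log = [
    "Apr 6 09:46:05 sybille2 dhcpd: DHCPACK on 148.60.10.198 to 10:65:30:83:5c:4b via em2.10",
    "Apr 6 09:46:42 sybille2 dhcpd: DHCPACK on 148.60.10.190 to 10:65:30:83:5c:4b (MININT-S9D1BSU) via em2.10",
]

result = acks_by_mac(log)
for mac, ips in result.items():
    print(mac, "->", ", ".join(ips))
```

Any MAC listed with more than one address received different leases during the logged period, as the PXE and Windows boots did here.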
-
@george1421 said in UEFI pxe boot problem from a network:
I really don’t understand how this is possible. I can understand the DHCP server giving it a new IP address as it’s booting. I’ve seen that before. What I don’t understand is how it would give it information that is not from its pool. That is totally confusing. If it were giving the complete information from the wrong pool I might understand, but the original pcap has the right IP address range and the wrong router and subnet information.
Can you grab a pcap from a witness computer on this VLAN 10 using this new capture filter? PXE boot it to the error and then let it boot into Windows. I want to see the response to both DHCP requests.
port 67 or port 68 and ether host 10:65:30:83:5c:4b
The only thing I can think that we might do is create a second instance of the DHCP server, adjust the DHCP config files accordingly, and then bind each instance, with its config file, to the proper interface. Your setup is not a traditional one using DHCP helper services and a single DHCP interface on the server. It’s possible that the DHCP server is getting confused about where the BOOTP request is coming from. Right now I’m just grabbing at ideas, because what you are reporting should not be happening.
-
@george1421
Hello George,
This morning I ran a test, swapping the order of the subnet declarations.
It turns out the client picks up the mask and router of the first declared subnet…
-
@lebrun78 Hey, I was just comparing your DHCP config file with an example Ubuntu dual-interface configuration. I loaded your configuration into Notepad++ and it pointed out that you have an extra curly brace at the end of your config file. I don’t know if this was a typo when you pasted it in or if you really do have an extra curly brace in the config.
-
@george1421 Never mind, I just got excited about finding nothing. Still looking into the setup.
-
@george1421
I have just searched for the extra curly brace with Notepad++ and didn’t see the problem.
The file in the first post is an extract; you can view the production file here:
https://filesender.renater.fr/?s=download&token=11cc357f-4663-41c8-830b-71938d2d2aa7
-
@lebrun78 I just had an idea. Maybe this is caused by a problematic entry in the DHCP leases cache file? Take a look at /var/lib/dhcpd/dhcpd.leases. Not really sure what we are looking for, but you might search that file for the pattern 148.60.10. to see what leases are in the store. If you find something concerning, then I would stop the DHCP service for a second, make a backup copy of that file, edit it to remove the problematic entry, and start the DHCP service up again.
As you seem to have a lot of fixed addresses defined, you might not even care much about the leases. In that case you could even clear the whole leases file (stop DHCP first) and see if it makes a difference.
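As a rough illustration of that search, here is a small Python sketch that lists lease entries whose address starts with a given prefix. It is a simplification, not a full parser: it assumes each lease block ends with a closing brace at the start of a line, as dhcpd normally writes them. The sample text and the helper name `find_leases` are made up for the example; on a real server you would read the text from /var/lib/dhcpd/dhcpd.leases instead.

```python
import re

# One lease block: "lease <ip> {" up to a "}" at the start of a line.
LEASE_RE = re.compile(r"^lease\s+(\S+)\s*\{(.*?)^\}", re.DOTALL | re.MULTILINE)
MAC_RE = re.compile(r"hardware ethernet\s+([0-9a-fA-F:]+);")

def find_leases(text, prefix):
    """Return (ip, mac) pairs for leases whose IP starts with prefix."""
    results = []
    for ip, body in LEASE_RE.findall(text):
        if ip.startswith(prefix):
            mac = MAC_RE.search(body)
            results.append((ip, mac.group(1) if mac else None))
    return results

# Hypothetical sample in the dhcpd.leases(5) format.
sample = """
lease 148.60.10.180 {
  binding state active;
  hardware ethernet 10:65:30:83:5c:4b;
}
lease 148.60.4.20 {
  binding state free;
}
"""

matches = find_leases(sample, "148.60.10.")
for ip, mac in matches:
    print(ip, mac)
```

The same file can hold several entries for one address, since dhcpd appends to it; a repeated address in the output is expected and not a sign of corruption by itself.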
-
@Sebastian-Roth
Hello,
I have blanked the lease file. After rebooting the client, same problem.
Here is the current lease file:
cat dhcpd.leases
# The format of this file is documented in the dhcpd.leases(5) manual page.
# This lease file was written by isc-dhcp-4.2.5
server-duid "\000\001\000\001&\036\337\215P\232L\202P~";
lease 148.60.10.180 {
  starts 2 2020/04/07 06:53:04;
  ends 3 2020/04/08 06:53:04;
  cltt 2 2020/04/07 06:53:04;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 10:65:30:83:5c:4b;
  set vendor-string = "PXEClient:Arch:00007:UNDI:003016";
-
@george1421
So here is a capture with two requests: a PXE boot at time 0 and a USB boot of Ubuntu at time 196.
Once Ubuntu has loaded, ip a shows the correct IP address, router and netmask.
capturedhcp.pcap
-
@lebrun78 I have some good news and some bad. The good news is I found some time to set up a VM to try to replicate your setup and play with it. I found that this is happening in my test setup as well!
Bad news is that I have not found why this is happening yet. But I am fairly sure I will! Stay tuned.
-
@Sebastian-Roth
Thank you very much for your work!
-
I found the solution to the problem.
After I posted to dhcp-users-request@lists.isc.org, Niall O’Reilly proposed declaring the hosts outside the subnet.
And indeed, the hosts are now declared in a group but not inside the subnet, and it works.
Thank you very much for your help, Sebastian and George.
-
@lebrun78 Would you mind linking to the exact post you found? Since this issue isn’t specifically related to FOG, it would be nice for others who run into the same problem to be able to share in your success.
-
@lebrun78 Good to hear you have figured it out! I played with it a bit last night but hadn’t found a solution yet.
Niall O’Reilly proposed declaring the hosts outside the subnet.
Sounds interesting.
-
Here is the post:
https://lists.isc.org/pipermail/dhcp-users/2020-April/022039.html