Yet another PXE-M0F error topic!
-
At this point I’m not sure of your configuration. Dealing with clients on a different subnet from the fog server and dhcp server is not the easiest thing to debug.
Your pcaps are not complete, and what is there is a bit strange to say the least. Understand that I’m not saying you captured them wrong; they are just not what you would typically see. If they are accurate, I can understand why pxe booting is not working for you. One other comment is about the boot file name. undionly.kpxe is for bios (legacy) mode computers. A uefi based computer cannot boot with a bios (legacy) mode boot file. For uefi computers you will need to send the ipxe.efi boot file name. The point is that dhcp servers that hand out a single static boot file name are now problematic.
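To make the boot file point concrete, here is a minimal Python sketch of the decision a dhcp server has to make instead of handing out one static file name. It assumes the client reports its architecture in DHCP option 93, with 0 meaning bios/legacy and 7 or 9 meaning 64-bit uefi. The function name and the lookup table are made up for illustration and do not come from any particular dhcp server.

```python
from typing import Optional

# Assumed mapping from DHCP option 93 (client system architecture) to the
# boot files named in this thread: 0 = bios/legacy x86, 7 and 9 = 64-bit uefi.
BOOT_FILES = {
    0: "undionly.kpxe",
    7: "ipxe.efi",
    9: "ipxe.efi",
}


def boot_file_for(arch: int) -> Optional[str]:
    """Return the boot file name for an architecture code, or None if unknown."""
    return BOOT_FILES.get(arch)


if __name__ == "__main__":
    for arch in (0, 7, 9):
        print(arch, boot_file_for(arch))
```

A server that can only send one name for option 67 is effectively stuck on a single row of that table.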
A proper dhcp sequence goes:
- Discover (client to world)
- Offer (dhcp server(s) to client)
- Request (client to dhcp server)
- ACK (dhcp server to client)
- (PXE) client requests boot image size from tftp server
- tftp server responds
- client asks for file
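If it helps when reading a capture, the dhcp part of that sequence can be checked with a few lines of Python. This is only a sketch: it assumes you have read the DHCP message type (option 53) off each packet by hand, and the function name and sample input are hypothetical.

```python
# DHCP message type codes carried in option 53.
DHCP_TYPES = {1: "Discover", 2: "Offer", 3: "Request", 5: "ACK"}
EXPECTED = ["Discover", "Offer", "Request", "ACK"]


def missing_steps(seen_types):
    """Return the expected DHCP steps that never appear in a capture.

    seen_types is a list of option 53 values noted from the packets.
    """
    seen = {DHCP_TYPES.get(t) for t in seen_types}
    return [step for step in EXPECTED if step not in seen]


if __name__ == "__main__":
    # Hypothetical capture that only shows the client's own broadcasts.
    print(missing_steps([1, 1, 3]))
```

A capture with Discover and Request but no Offer or ACK suggests the capture point cannot see the server's replies, not necessarily that the replies were never sent.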
What I would recommend is a few things. Let’s go back to your original configuration with the company managed dhcp server. Since the fog server and pxe booting clients are on different subnets you shouldn’t need to change your fog server. You might shut off dnsmasq and/or the isc-dhcp server if you turned them on.
Now take a computer on the same subnet as the computer you are trying to pxe boot. Install wireshark on it and use the same capture filter. Then pxe boot the target computer. You will be capturing the conversation between the pxe booting computer and the dhcp server. We need to find out what the main dhcp server is telling the pxe booting client. I suspect it is something other than what we expect.
Also make sure the target computer is in bios (legacy) mode so it will boot the undionly.kpxe file. We can get this going; we just need to understand what is happening on your network.
-
I am pretty sure I had the Dnsmasq service disabled, but just to be on the safe side I am doing a complete nuke n’ pave as I’m typing this. I already have Wireshark set up on the same subnet. I shall post the results as soon as I get them.
-
First of all, thank you!
Latest fresh FOG install config:
* Here are the settings FOG will use:
* Base Linux: Redhat
* Detected Linux Distribution: CentOS Linux
* Server IP Address: 10.254.10.29
* Server Subnet Mask: 255.255.255.0
* Interface: ens192
* Installation Type: Normal Server
* Internationalization: 0
* Image Storage Location: /images
* Using FOG DHCP: No
* DHCP will NOT be setup but you must setup your
| current DHCP server to use FOG for PXE services.
* On a Linux DHCP server you must set: next-server and filename
* On a Windows DHCP server you must set options 066 and 067
* Option 066/next-server is the IP of the FOG Server: (e.g. 10.254.10.29)
* Option 067/filename is the bootfile: (e.g. undionly.kpxe)
* Are you sure you wish to continue (Y/N)
Attached are 2 files. “Output 4” is from the FOG machine itself and the other one (67,68,69,4011Capture1) is from Wireshark scanning the same ports at the same time.
0_1525439990605_output4.pcap
0_1525440047448_67,68,69,4011capture1.pcap
EDIT: I did some additional testing and I’ll post these results as code snippets for easier referencing:
FOG machine on a 10.254.10.1/24 subnet
0_1525443426142_output5.pcap
Host machine subnet 10.252.80.0/22
0_1525443818489_67,68,69,4011capture2.pcap
As far as I can tell, there is no actual PXE communication going on in the 10.252.80.0/22 subnet.
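For reference, a small Python sketch (standard ipaddress module only) confirming that the FOG server address and the host subnet above really are on different subnets, which means a dhcp relay or router has to carry the PXE information between them. The helper name is made up for illustration.

```python
import ipaddress

fog_server = ipaddress.ip_address("10.254.10.29")
# strict=False accepts the /24 written with a host address, as in the post.
fog_subnet = ipaddress.ip_network("10.254.10.1/24", strict=False)
host_subnet = ipaddress.ip_network("10.252.80.0/22")


def same_subnet(addr, network):
    """True if the address falls inside the given network."""
    return addr in network


if __name__ == "__main__":
    print("FOG subnet:", fog_subnet)
    print("FOG server in FOG subnet:", same_subnet(fog_server, fog_subnet))
    print("FOG server in host subnet:", same_subnet(fog_server, host_subnet))
```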
-
@sven-ervin again your pcaps are strange (not that you are doing anything wrong collecting them).
If I had to guess, your VLAN/subnet router is not using broadcasts to relay dhcp information back to the client, but is using unicasts from the dhcp relay back to the target computer. In this case wireshark will not see the unicast messages unless the wireshark computer is on a port that mirrors the pxe booting client’s port (or you happen to know of a network hub still in existence at your site; a hub would mirror all traffic to all network ports).
So from here on out I’m just going to read the tea leaves.
- Your pxe booting client is a bios/legacy mode system (Dell, to be specific).
- The client computer is being told to boot boot\pxeboot.com, which is a WDS/SCCM boot loader.
- WDS/SCCM uses a proxydhcp configuration (it’s akin to dnsmasq), so its settings will override anything you set in dhcp server option 67.
So what can you do? It will be hard to overcome if WDS is running in your network. You will need to configure the dhcp relay on your subnet router to not send dhcp boot requests from the pxe booting subnet to your WDS server. That way it won’t respond and take over the client.
You should also check with your networking team to see why/how the dhcp boot file boot\pxeboot.com is being sent to the target computer. They may have insight on this.
-
Well… that sucks
Alright, well, I’m just going to try putting the FOG machine in the same subnet with the host machine and see if that is going to change anything.
EDIT Deleted rant
-
@sven-ervin One way to “get round” (I don’t like the term since it implies deception) the issue is to create a dedicated deployment network. In this case your fog server would have its roles changed a bit. Your fog server would have 2 network adapters: one connected to your isolated deployment network and one to your business network.
The fog server would then image on the dedicated imaging network. You could manage the fog server from the business network. The fog server would supply dhcp and pxe boot info only to the imaging network. If your clients need to connect to your AD during OOBE then you would need to configure your fog server as a NAT router so that traffic could flow between the imaging network and the business network without needing any modifications to your business network infrastructure.
In this configuration, once the system was imaged you could then move it from the imaging network to the production network without issue. To use the fog client properly you will need to use the dns name of the fog server and create what is known as a split horizon dns. For example, let’s say your fog server FQDN will be known as fog1.domain.com. On the business network that FQDN should map to the business network interface of your fog server. On the dedicated imaging network you will need to install a dns server on FOG and then create an entry for fog1.domain.com to point to the imaging network interface of the fog server. At least that is how I think it should work.
I know it’s a lot of fiddling around, but if you have a locked in environment you have to be a bit creative to be functional.
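To show what I mean by split horizon dns, here is a rough Python sketch of the behaviour, not of any real dns server. The imaging network range and the imaging-side address are invented for the example; only fog1.domain.com and 10.254.10.29 come from this thread. In practice this would be two separate dns servers, one per network, each holding its own record for the same name.

```python
import ipaddress

# Hypothetical addressing: the imaging network range and the FOG server's
# imaging-side address are invented for this example.
IMAGING_NET = ipaddress.ip_network("192.168.100.0/24")
FOG_IMAGING_IP = "192.168.100.1"
FOG_BUSINESS_IP = "10.254.10.29"  # business-side address from this thread


def resolve_fog(client_ip: str, name: str = "fog1.domain.com") -> str:
    """Answer the same FQDN differently depending on where the client sits."""
    if name != "fog1.domain.com":
        raise KeyError(name)
    if ipaddress.ip_address(client_ip) in IMAGING_NET:
        return FOG_IMAGING_IP
    return FOG_BUSINESS_IP


if __name__ == "__main__":
    print(resolve_fog("192.168.100.50"))  # client on the imaging network
    print(resolve_fog("10.252.80.17"))    # client on the business network
```

The fog client only ever knows the name, so it keeps working when the machine moves from one network to the other.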
-
@george1421 Alright, I will be sure to keep that in mind! But as today is Friday and I’ve already stayed in almost 3 hours too long, I’ll try to get some rest over the weekend and have a new crack at it on Monday.
Huge thank you for all your help!
-
@george1421 Now that I’ve read through your last response again, there is one thing that is confusing me. If I create this dedicated imaging network, then how would I point host machines to PXE boot on this network?
-
@sven-ervin You would have to physically move that machine to the imaging network to install your image on it. At the moment the problem is your networking infrastructure, not FOG. This is only guessing since we can’t see what the client is being told, but my guess is that you have a WDS or SCCM server on your campus that is currently configured for imaging.
-
@george1421 Oh yeah, of course! My bad. I totally agree that it’s a problem with our infrastructure. As far as I can tell FOG is doing what it is supposed to do just fine.
I also shot an email to our network administration explaining the whole situation. Let’s see where that lands us.
-
A little bit of an update:
I ended up contacting our “helpdesk” where my problem got directed to the coolest guy I have ever had the opportunity to work with.
We wound up installing FOG-server on another machine on the same subnet as the hosts (10.252.80.0/22). After doing this everything works as expected.
We are still working on getting it to work across different subnets. Is there any possibility that the FOG server configuration might be at fault when dealing with communication across subnets?
-
@sven-ervin If you have the default route configured correctly on the fog server that is all you need to make imaging work across subnets from the fog server side.
For the remote subnets, you need to ensure that the pxe boot options are being sent to target computers so they can locate the FOG server.
Issues you will have when the FOG server is separated from the target computers by a router:
- Multicasting will not work unless you set up a multicast router.
- WOL may have some difficulties.
-
Hi once again!
Our problem, as it stands today, is solved. We ended up trying to install the FOG server on the same subnet as the host machines, which worked. After that we tried putting the machine back on the subnet where all our VMs are located. That broke it. Later the network admin realized that during his configuration he accidentally left a space after the IP on option 66. After that was fixed things started working out.
So in the end what I was chasing was a typo.
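For anyone hitting the same thing, a check like the following would have caught it. This is just a sketch in Python using the standard ipaddress module; it assumes option 66 holds a plain IP address (it can also be a host name, which this does not handle), and the function name is made up.

```python
import ipaddress


def check_option_66(value: str) -> str:
    """Return a short verdict on a next-server / option 66 value."""
    if value != value.strip():
        return "leading or trailing whitespace"
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return "not a valid IP address"
    return "ok"


if __name__ == "__main__":
    for candidate in ("10.254.10.29", "10.254.10.29 "):
        print(repr(candidate), "->", check_option_66(candidate))
```

Printing the value with repr() is what makes the stray space visible; in most admin consoles the two values look identical.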
A big thanks to @george1421 for being as helpful as you could during the troubleshooting process.
I am now happily deploying my Windows 10 images.