Coexistence of UEFI and Legacy - Troubleshooting help



  • Hi.

    So, my situation is as follows:
    In our network we have two Domain Controllers, one based on Win Server 2012, the other on Server 2008.

    I read the wiki on making UEFI and Legacy coexist and have all the settings set according to the guide there, and I saw that it says it won't work on Server 2008. In general though, every change I make on our DC1 (the one with Server 2012) is picked up by the hosts when booting without me having to change the settings on DC2 (the Server 2008); I guess it's prioritized that way…

    When I try to boot a non-UEFI PC via the LAN, it tries to get the ipxe.efi file, which is too big for its memory, and aborts the boot process. This happens even though the policies are in place to send the fitting ipxe.efi or undionly.kpxe to the host depending on Vendor ID.

    Can anybody give me a hand and help me find a possible solution?

    I made a tcpdump that might say something - https://drive.google.com/file/d/0B0TWuKtXovQOaEJTcDI4MFVHZVk/view?usp=sharing - but I can't seem to understand it …

    Thanks for any support you can give.



  • Already enabled and works like a charm. :)

    Can be closed, thx for helping out.


  • Moderator

    @taspharel No worries, it’s always good to know that neither of us is insane: you knew what should be happening, and Wireshark showed us what was actually happening.

    I assume they will re-enable your policies? You kind of need them to support dynamic booting. Well, that’s not totally true: if you can’t have the policies in your MS DHCP server, you can run an external proxy DHCP to supply the missing PXE boot information. The point is there are other ways to achieve what you need.

    Can we close this issue?



  • I am so so so sorry.

    Just came out of a call with our external IT partner and casually he said: Oh btw.: I deactivated your Policies on the DHCP server.

    And I didn’t notice of course … so sorry for wasting time, it works now. :(



  • Okay, thanks.

    So, we have two DHCP servers; one gets its settings from the other via replication, so both should have identical settings.

    I’m referring to: https://wiki.fogproject.org/wiki/index.php?title=BIOS_and_UEFI_Co-Existence

    Our new DCs are Windows Server 2016.

    Screenshots from my dhcp server settings are in the same folder as before, the details for the other vendor classes are the same.

    I added policies with a * in front of the vendor class because I hoped that would change something; it didn’t work without those either, though.


  • Moderator

    @taspharel

    Issue 1: This looks like a normal DHCP/PXE boot process (as in the flow). There is only one DHCP server involved here [192.168.43.3]. It's saying the boot server is 192.168.43.17. It's passing undionly.kpxe to the target computer. The target computer downloads this file from the TFTP server and halts. The issue I see here is that your DHCP server is passing undionly.kpxe to a computer that has identified itself as a UEFI BC arch 0007 type. It appears the filters on your DHCP server are not honoring the client request.

    Issue 2: This computer is talking with 192.168.43.2 (different from issue 1!!). The client is identifying itself as a BIOS (legacy) PC. 192.168.43.2 is saying the boot server is 192.168.43.17 with a boot file of undionly.kpxe. This client appears to be a Microsoft VM(??). Later on I see that it loads default.ipxe, so this one appears to be working.

    So now it appears you have 2 AD DCs, each providing DHCP information. With only 2 samples it's hard to say whether they are set up correctly. It appears the BIOS systems are booting OK; it's just the UEFI systems that fail. The systems are properly reporting themselves. The DHCP server is not honoring this.

    So now we need to identify a few things.

    1. What is your DHCP server OS? Is it 2012 or newer?
    2. What wiki page are you referring to?
    3. Do you have both DHCP servers configured the same?

    If you have both DHCP servers set up the same, you need to inspect your rules to ensure arch 0007 and arch 0009 are being set up correctly.
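
    As a rough illustration of the check described above, the sketch below pulls the architecture field out of a PXE vendor class string and compares the boot file the DHCP server offered with the one the policies should have selected. The mapping table and the function names are assumptions made for this example; only the architecture codes and file names come from this thread, and a real DHCP server does this through its vendor class policies.

```python
# Illustrative mapping only: codes and file names are the ones mentioned in
# this thread; the table itself is an assumption for this sketch.
BOOT_FILE_BY_ARCH = {
    "00000": "undionly.kpxe",  # legacy BIOS
    "00007": "ipxe.efi",       # UEFI (EFI BC)
    "00009": "ipxe.efi",       # UEFI
}


def arch_from_vendor_class(vendor_class):
    """Return the architecture field of a PXE vendor class string, or None."""
    parts = vendor_class.split(":")
    # Expected shape: PXEClient:Arch:<code>:UNDI:<version>
    if len(parts) >= 3 and parts[0] == "PXEClient" and parts[1] == "Arch":
        return parts[2]
    return None


def expected_boot_file(vendor_class):
    """Return the boot file the policies should hand out, or None if unknown."""
    return BOOT_FILE_BY_ARCH.get(arch_from_vendor_class(vendor_class))


def check_offer(vendor_class, offered_file):
    """Describe whether the offered boot file fits the client architecture."""
    expected = expected_boot_file(vendor_class)
    if expected is None:
        return "unknown client architecture"
    if offered_file == expected:
        return "ok"
    return "mismatch: expected %s, got %s" % (expected, offered_file)


if __name__ == "__main__":
    # The UEFI laptop from issue 1 was offered the BIOS boot file.
    print(check_offer("PXEClient:Arch:00007:UNDI:003016", "undionly.kpxe"))
```

    Feeding it the vendor class and the offered file from a capture makes a mismatch like the one in issue 1 easy to spot.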



  • @george1421 The pcap files are here:
    https://drive.google.com/drive/folders/0B_yYSiicYNIAMmNKaGoyOG0wd0k?usp=sharing

    issue1 is from the boot attempt of a UEFI-only Lenovo X270 machine (internal name is NBW10070; the old one would have been NBW7070) that doesn't boot via LAN with the settings we have.

    issue2 is from a desktop (WSW10074, or old: WSW7074) that successfully boots with the settings we have atm.

    Settings are: according to the wiki, with undionly.kpxe set as Option 67 without Vendor Classes. IP addresses for the new DC servers end with .2 and .3

    @Wayne-Workman The old servers have been decommissioned, yes.


  • Moderator

    @wayne-workman Ideally that would be the best solution. Practically it would be pretty tough to program because sometimes the DHCP server and PXE client do different things. There is no clear-cut way to PXE boot. Sometimes the DHCP server will reply with the next server in the header and the boot file in DHCP option 67 instead of the boot file in the header. Yes, you could code for that, but a skilled tech can look at the pcap file and understand what is going on pretty quickly even with the variations. I know that we do get a lot of tickets over PXE booting that are not specifically related to FOG, but more about the environment the FOG admin is trying to PXE boot in.

    I guess in the end I don’t know if it would be a useful tool or not.
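
    To make the variation mentioned above concrete, here is a minimal sketch that reads the boot information from a raw DHCP payload (the UDP payload of an offer or ack), looking both at the fixed header fields and at options 66 and 67. The function names and the "prefer the header, fall back to the option" rule are assumptions made for this example, and it ignores corner cases such as option overload.

```python
import socket

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options


def _text(raw):
    """Decode a NUL-terminated ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", "replace")


def parse_options(data):
    """Parse DHCP options (code, length, value) into a dict keyed by code."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:      # end option
            break
        if code == 0:        # pad option, single byte
            i += 1
            continue
        if i + 1 >= len(data):
            break            # truncated option, stop parsing
        length = data[i + 1]
        options[code] = data[i + 2:i + 2 + length]
        i += 2 + length
    return options


def boot_info(payload):
    """Collect next-server and boot-file data from header and options."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    options = parse_options(payload[240:])
    header_next_server = socket.inet_ntoa(payload[20:24])  # siaddr field
    header_file = _text(payload[108:236])                  # file field
    option_66 = _text(options.get(66, b""))                # TFTP server name
    option_67 = _text(options.get(67, b""))                # boot file name
    server_id = options.get(54)                            # answering server
    return {
        "server_id": socket.inet_ntoa(server_id) if server_id and len(server_id) == 4 else None,
        "vendor_class": _text(options.get(60, b"")),
        "header_next_server": header_next_server,
        "header_file": header_file,
        "option_66": option_66,
        "option_67": option_67,
        # Assumed rule for this sketch: prefer the header, else the option.
        "next_server": header_next_server if header_next_server != "0.0.0.0" else option_66,
        "boot_file": header_file or option_67,
    }
```

    Reporting the header fields and the options side by side keeps both variants visible instead of hiding which one the server actually used.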


  • Moderator

    @george1421 As often as we review PCAP files - I wonder if it would make sense to just write a tool that does the same things… like identifying all DHCP servers, identifying information about all DHCP discovery packets, identifying what DHCP server answered, what options it provided.
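
    A very small sketch of the summarizing part of such a tool is shown below. It assumes the packets have already been decoded into simple dictionaries; the field names `message_type`, `server` and `boot_file` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def summarize_offers(packets):
    """Group the boot files seen in DHCP offers by the server that sent them."""
    offers = defaultdict(set)
    for packet in packets:
        if packet["message_type"] == "offer":
            offers[packet["server"]].add(packet["boot_file"])
    # Sorted lists make the summary stable and easy to read.
    return {server: sorted(files) for server, files in offers.items()}


if __name__ == "__main__":
    # Sample records shaped after the two captures discussed in this thread.
    sample = [
        {"message_type": "discover", "server": None, "boot_file": None},
        {"message_type": "offer", "server": "192.168.43.3", "boot_file": "undionly.kpxe"},
        {"message_type": "offer", "server": "192.168.43.2", "boot_file": "undionly.kpxe"},
    ]
    for server, files in summarize_offers(sample).items():
        print(server, files)
```

    More than one key in the result means more than one DHCP server answered, which is often the first thing worth knowing.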


  • Moderator

    @taspharel We absolutely need a short packet capture as George said. Did the old DHCP servers get decommissioned?


  • Moderator

    @taspharel Well, to fully understand what is going on, you can capture the DHCP/PXE booting process using the FOG server or Wireshark: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue

    The above works best if you have the FOG server, PXE client and DHCP server on the same subnet. I would put your DHCP policies back in place as they are listed in the wiki. I’d be interested in what the client is sending in the DHCP discover packet as well as the response from the DHCP server.

    If you are not familiar with reviewing a pcap file, upload it to a google drive and post the link here or send me an IM and I will look it over.



  • Hei again.

    So, we finally got two shiny new virtual machines running Windows Server 2016.

    I have them set up with vendor classes as described in the wiki, even added the specific vendor class from one of our X270 Lenovo Laptops, but no luck.

    If I set undionly.kpxe as DHCP Policy Option 67 (without Vendor Class), then that is sent to each and every client trying to connect, regardless of UEFI or LEGACY settings on the host.
    If I remove that and keep only the Vendor Class based policies, my UEFI machine goes to "Start PXE over IPv4", hangs there for some time and then reverts to the BIOS boot menu.

    Anybody have an idea how I would go forward from this point?
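
    Both behaviours described above are what one would expect if no vendor class policy is actually being applied. The sketch below is a simplified model of that evaluation, not how Windows DHCP is implemented: the policy dictionaries, the `enabled` flag and the function name are assumptions made for this example.

```python
from fnmatch import fnmatchcase


def select_boot_file(vendor_class, policies, server_default=None):
    """Return the boot file of the first enabled matching policy, else the default."""
    for policy in policies:
        if policy["enabled"] and fnmatchcase(vendor_class, policy["pattern"]):
            return policy["boot_file"]
    return server_default


def make_policies(enabled):
    """Build the two policies used in this sketch with a common enabled flag."""
    return [
        {"pattern": "PXEClient:Arch:00000*", "boot_file": "undionly.kpxe", "enabled": enabled},
        {"pattern": "PXEClient:Arch:00007*", "boot_file": "ipxe.efi", "enabled": enabled},
    ]


if __name__ == "__main__":
    uefi = "PXEClient:Arch:00007:UNDI:003016"
    # Policies active: the UEFI client gets the EFI boot file.
    print(select_boot_file(uefi, make_policies(True), "undionly.kpxe"))
    # Policies inactive, default set: every client gets the default.
    print(select_boot_file(uefi, make_policies(False), "undionly.kpxe"))
    # Policies inactive, no default: no boot file, so the client times out.
    print(select_boot_file(uefi, make_policies(False)))
```

    In this model, a default Option 67 reaching every client, or a UEFI client timing out once the default is removed, both point at the policies themselves rather than at the client.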


  • Testers

    @george1421 said in Coexistence of UEFI and Legacy - Troubleshooting help:

    @psycholiquid This is a bit off topic, but I see in your snapshot of “Here is a screenshot of my setup” you are using the ipxe7156.efi. I just saw a pull request on GitHub saying that the “7156” flavor of kernels was being removed from the FOG distribution because the latest ipxe.efi version (in the 1.5.0 branch) is now working with the Surface Pros.

    1. Just be aware that this kernel is being removed from the package and you will need to update your DHCP setup.
    2. You probably should confirm that the latest version of ipxe.efi (pulled from the working branch) does what YOU need it to, because ipxe7156.efi was left in the distribution to address an issue with the Surface Pros and not VMware.

    I’m not saying it’s a problem. I’m only trying to raise awareness that there was a change and, based on your configuration, you may have issues.

    I know I was part of the testing process for the whole thing.


  • Moderator

    @psycholiquid This is a bit off topic, but I see in your snapshot of “Here is a screenshot of my setup” you are using the ipxe7156.efi. I just saw a pull request on GitHub saying that the “7156” flavor of kernels was being removed from the FOG distribution because the latest ipxe.efi version (in the 1.5.0 branch) is now working with the Surface Pros.

    1. Just be aware that this kernel is being removed from the package and you will need to update your DHCP setup.
    2. You probably should confirm that the latest version of ipxe.efi (pulled from the working branch) does what YOU need it to, because ipxe7156.efi was left in the distribution to address an issue with the Surface Pros and not VMware.

    I’m not saying it’s a problem. I’m only trying to raise awareness that there was a change and, based on your configuration, you may have issues.



  • @wayne-workman

    Yeah, they’re working on it … :)

    But as an NGO we are not one of the “big payers”, so it’s not always easy to “urge” them to do something in a timely manner ;)

    But they are great at working with us, of course.

    We are planning a big rollout on the 20th of October, so it’s the timeline that is making me try to work around restrictions.


  • Moderator

    @taspharel You could just turn off the older DC. :-)

    Ask them to build you two new 2016 boxes already, what are you paying them for?


  • Testers

    @taspharel Well that should depend on where or how the IP helpers are setup in your switches or routers (assuming you have cisco or some other managed switching environment)



  • So … both DHCP requests go to the second Domain Controller, the one that has Server 2008 on it … crap.

    I guess there is no way to make the request somehow go back to the other DC when it reaches the fog server? :)

    I would love to avoid having the company that manages our Domain Controllers install a new one :-/


  • Testers

    @taspharel I tend to see that even if you put the whole vendor class in there, you have to play with adding an asterisk as a prefix or suffix in order to get it to work. Please play with it a little.



  • So, I managed to look at the pcap file and get the vendor ID from it.

    The HP PC that only runs on legacy BIOS has the vendor ID PXEClient:Arch:00000:UNDI:00201, which should be caught when I add the Vendor ID PXEClient:Arch:00000 followed by an asterisk, but to make sure I added it as a separate option in the DHCP server, and it still won't boot with undionly.kpxe.

    I changed the default (you were right) from ipxe.efi to undionly.kpxe, and that machine boots to undionly.kpxe.

    So, trying the other side, I checked that Vendor ID and it is PXEClient:Arch:00007:UNDI:003016. I added a rule to specifically address this, but it still tries to load undionly.kpxe instead of ipxe.efi.

    I will check if the DHCP request goes to the Windows Server 2012 or the Server 2008, maybe it requests it from the “older” DC :-/

