Hyper V and Pxe boot to Fog problems
-
@george1421 said in Hyper V and Pxe boot to Fog problems:
Note these are not official iPXE boot loaders. These are tests to see whether the certificate code offered by the iPXE developers is causing problems with 1709-based Hyper-V virtual machines.
Here are the rom-o-matic build URLs for undionly.kpxe and ipxe.kpxe, built with these options disabled:
#undef DOWNLOAD_PROTO_HTTPS
#undef IMAGE_TRUST_CMD
#undef CERT_CMD
The following link will create undionly.kpxe (you will need to rename the output of this script to undionly.kpxe, since it will default to ipxe.kpxe):
https://rom-o-matic.eu/build.fcgi?BINARY=ipxe.kpxe&BINDIR=bin&REVISION=master&DEBUG=&EMBED.00script.ipxe=%23%21ipxe%0Aisset%20%24%7Bnet0/mac%7D%20%26%26%20ifopen%20net0%20%26%26%20dhcp%20net0%20%7C%7C%20goto%20dhcpnet1%0Aecho%20Received%20DHCP%20answer%20on%20interface%20net0%20%26%26%20goto%20proxycheck%0A%0A%3Adhcpnet1%0Aisset%20%24%7Bnet1/mac%7D%20%26%26%20ifopen%20net1%20%26%26%20dhcp%20net1%20%7C%7C%20goto%20dhcpnet2%0Aecho%20Received%20DHCP%20answer%20on%20interface%20net1%20%26%26%20goto%20proxycheck%0A%0A%3Adhcpnet2%0Aisset%20%24%7Bnet2/mac%7D%20%26%26%20ifopen%20net2%20%26%26%20dhcp%20net2%20%7C%7C%20goto%20dhcpall%0Aecho%20Received%20DHCP%20anser%20on%20infterface%20net2%20%26%26%20goto%20proxycheck%0A%0A%3Adhcpall%0Adhcp%20%26%26%20goto%20proxycheck%20%7C%7C%20goto%20dhcperror%0A%0A%3Adhcperror%0Aprompt%20--key%20s%20--timeout%2010000%20DHCP%20failed%2C%20hit%20%27s%27%20for%20the%20iPXE%20shell%3B%20reboot%20in%2010%20seconds%20%26%26%20shell%20%7C%7C%20reboot%0A%0A%3Aproxycheck%0Aisset%20%24%7Bproxydhcp/next-server%7D%20%26%26%20set%20next-server%20%24%7Bproxydhcp/next-server%7D%20%7C%7C%20goto%20nextservercheck%0A%0A%3Anextservercheck%0Aisset%20%24%7Bnext-server%7D%20%26%26%20goto%20netboot%20%7C%7C%20goto%20setserv%0A%0A%3Asetserv%0Aecho%20-n%20Please%20enter%20tftp%20server%3A%20%26%26%20read%20next-server%20%26%26%20goto%20netboot%20%7C%7C%20goto%20setserv%0A%0A%3Anetboot%0Achain%20tftp%3A//%24%7Bnext-server%7D/default.ipxe%20%7C%7C%0Aprompt%20--key%20s%20--timeout%2010000%20Chainloading%20failed%2C%20hit%20%27s%27%20for%20the%20iPXE%20shell%3B%20reboot%20in%2010%20seconds%20%26%26%20shell%20%7C%7C%20reboot&settings.h/VMWARE_SETTINGS:=1&general.h/PXE_STACK:=1&general.h/PXE_MENU:=1&general.h/DOWNLOAD_PROTO_NFS:=1&general.h/IMAGE_PXE:=1&general.h/IMAGE_SCRIPT:=1&general.h/IMAGE_BZIMAGE:=1&general.h/IMAGE_PNM:=1&general.h/IWMGMT_CMD:=0&general.h/NSLOOKUP_CMD:=1&general.h/TIME_CMD:=1&general.h/DIGEST_CMD:=1&general.h/LOTEST_CMD:=1&general.h/VLAN_CMD:=1&
general.h/PXE_CMD:=1&general.h/REBOOT_CMD:=1&general.h/POWEROFF_CMD:=1&general.h/PCI_CMD:=1&general.h/PARAM_CMD:=1&general.h/NEIGHBOUR_CMD:=1&general.h/PING_CMD:=1&general.h/CONSOLE_CMD:=1&general.h/IPSTAT_CMD:=1&general.h/NTP_CMD:=1&console.h/CONSOLE_FRAMEBUFFER:=1&console.h/CONSOLE_VMWARE:=1&general.h/ROM_BANNER_TIMEOUT=40&branding.h/PRODUCT_NAME=FOG%20iPXE&branding.h/PRODUCT_TAG_LINE=FOG%20Network%20Boot%20Firmware&
This produces the error message:
iPXE initialising devices...WARNING: Using legacy NIC wrapper on <MAC Address>
The following link will create the ipxe.kpxe file (all drivers)
https://rom-o-matic.eu/build.fcgi?BINARY=ipxe.kpxe&BINDIR=bin&REVISION=master&DEBUG=&EMBED.00script.ipxe=%23%21ipxe%0Aisset%20%24%7Bnet0/mac%7D%20%26%26%20ifopen%20net0%20%26%26%20dhcp%20net0%20%7C%7C%20goto%20dhcpnet1%0Aecho%20Received%20DHCP%20answer%20on%20interface%20net0%20%26%26%20goto%20proxycheck%0A%0A%3Adhcpnet1%0Aisset%20%24%7Bnet1/mac%7D%20%26%26%20ifopen%20net1%20%26%26%20dhcp%20net1%20%7C%7C%20goto%20dhcpnet2%0Aecho%20Received%20DHCP%20answer%20on%20interface%20net1%20%26%26%20goto%20proxycheck%0A%0A%3Adhcpnet2%0Aisset%20%24%7Bnet2/mac%7D%20%26%26%20ifopen%20net2%20%26%26%20dhcp%20net2%20%7C%7C%20goto%20dhcpall%0Aecho%20Received%20DHCP%20anser%20on%20infterface%20net2%20%26%26%20goto%20proxycheck%0A%0A%3Adhcpall%0Adhcp%20%26%26%20goto%20proxycheck%20%7C%7C%20goto%20dhcperror%0A%0A%3Adhcperror%0Aprompt%20--key%20s%20--timeout%2010000%20DHCP%20failed%2C%20hit%20%27s%27%20for%20the%20iPXE%20shell%3B%20reboot%20in%2010%20seconds%20%26%26%20shell%20%7C%7C%20reboot%0A%0A%3Aproxycheck%0Aisset%20%24%7Bproxydhcp/next-server%7D%20%26%26%20set%20next-server%20%24%7Bproxydhcp/next-server%7D%20%7C%7C%20goto%20nextservercheck%0A%0A%3Anextservercheck%0Aisset%20%24%7Bnext-server%7D%20%26%26%20goto%20netboot%20%7C%7C%20goto%20setserv%0A%0A%3Asetserv%0Aecho%20-n%20Please%20enter%20tftp%20server%3A%20%26%26%20read%20next-server%20%26%26%20goto%20netboot%20%7C%7C%20goto%20setserv%0A%0A%3Anetboot%0Achain%20tftp%3A//%24%7Bnext-server%7D/default.ipxe%20%7C%7C%0Aprompt%20--key%20s%20--timeout%2010000%20Chainloading%20failed%2C%20hit%20%27s%27%20for%20the%20iPXE%20shell%3B%20reboot%20in%2010%20seconds%20%26%26%20shell%20%7C%7C%20reboot&settings.h/VMWARE_SETTINGS:=1&general.h/PXE_STACK:=1&general.h/PXE_MENU:=1&general.h/DOWNLOAD_PROTO_NFS:=1&general.h/IMAGE_PXE:=1&general.h/IMAGE_SCRIPT:=1&general.h/IMAGE_BZIMAGE:=1&general.h/IMAGE_PNM:=1&general.h/IWMGMT_CMD:=0&general.h/NSLOOKUP_CMD:=1&general.h/TIME_CMD:=1&general.h/DIGEST_CMD:=1&general.h/LOTEST_CMD:=1&general.h/VLAN_CMD:=1&
general.h/PXE_CMD:=1&general.h/REBOOT_CMD:=1&general.h/POWEROFF_CMD:=1&general.h/PCI_CMD:=1&general.h/PARAM_CMD:=1&general.h/NEIGHBOUR_CMD:=1&general.h/PING_CMD:=1&general.h/CONSOLE_CMD:=1&general.h/IPSTAT_CMD:=1&general.h/NTP_CMD:=1&console.h/CONSOLE_FRAMEBUFFER:=1&console.h/CONSOLE_VMWARE:=1&general.h/ROM_BANNER_TIMEOUT=40&branding.h/PRODUCT_NAME=FOG%20iPXE&branding.h/PRODUCT_TAG_LINE=FOG%20Network%20Boot%20Firmware&
I get the same screen with the second file as well. It still stalls there and does not get past that point.
-
@lukebarone Here is the one I built from source that is working for us:
https://drive.google.com/file/d/1_HRH87klCVQ46raZDAeB8kwCdto0z8MR/view?usp=sharing
-
@paulman9 This worked for me! I could boot to do a Quick Registration, and I’m currently capturing my golden image!
Thank you very much!
-
@lukebarone @Paulman9 I have the same issue with Hyper-V and Windows 10 1709. What is the process for downloading the updated undionly.kpxe from the Google Drive link and installing it on a FOG server running Ubuntu?
-
@jkoos101 I dropped the file into my /tftpboot folder on the FOG server. In my DHCP server, I made sure the filename attribute matched the name of the file to serve.
-
@lukebarone Thanks! I’ll try that.
-
@lukebarone I was able to copy the file over and now I’m currently capturing my image! Thanks again!
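For anyone else following along, the copy step amounts to something like the sketch below. It assumes the TFTP root is /tftpboot as mentioned above; install_bootfile is just a helper name made up for this example, and the download path in the comment is hypothetical. It keeps a backup of whatever boot file was there before, so you can roll back.

```shell
#!/bin/sh
# Sketch: back up the existing boot file, then copy the new one into place.
# install_bootfile is a hypothetical helper, not part of FOG.
install_bootfile() {
    src=$1                      # path to the downloaded boot file
    tftp_dir=${2:-/tftpboot}    # TFTP root (assumed default: /tftpboot)
    name=$(basename "$src")
    # Keep the previous file as <name>.bak if one exists.
    if [ -f "$tftp_dir/$name" ]; then
        cp -p "$tftp_dir/$name" "$tftp_dir/$name.bak"
    fi
    # Copy the new file and make it world-readable so the TFTP daemon can serve it.
    cp "$src" "$tftp_dir/$name" && chmod 644 "$tftp_dir/$name"
}

# Example (run as root on the FOG server; the source path is hypothetical):
#   install_bootfile ~/Downloads/undionly.kpxe /tftpboot
```

The DHCP side is not shown because it depends on which DHCP server you run. The only requirement is that the boot file name it hands out matches the file you just copied.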
-
@lukebarone said in Hyper V and Pxe boot to Fog problems:
Is there any update on this? Or are we waiting on the iPXE crew to fix it?
Thanks for bringing that back up on screen. If I remember correctly I had only reported the rom-o-matic issue so far.
About the actual Hyper-V iPXE crypto code issue - I don’t think this has been reported anywhere yet. @Paulman9 I think you know the most about exactly which options play a role here. Did you get to compile a debug-enabled binary and look into the code as I suggested (make ... DEBUG=https)?
-
It works for me too, thanks a lot!
-
@sebastian-roth Sorry I must have missed your last comment originally. Compiling with any one of those three lines (even just the https line) causes it to fail. I didn’t see any difference using the debug command to compile it. Honestly, not sure what I should be looking for with that though.
Also, I did submit this to the iPXE forums, where the response was that this is a Microsoft issue. While I did also submit this in the feedback app in Windows 10, I’m not holding my breath for Microsoft to do anything about this.
-
@Paulman9 Thanks for letting me know! Good to hear you already posted this in the iPXE forums. Too bad that it didn’t get more attention but I guess there are more important issues to fix from their point of view.
If you are keen to get into debugging this I can give you a bit of advice. I am sure you’d have noticed the debug output if it were there. Just to give you an idea of what iPXE debug output looks like - it’s in color!!!
To find startup issues I usually start by adding my own debug output code. To do that, edit src/core/init.c, find the function startup (around line 65) and modify it to read like this:
void startup ( void ) {
    struct startup_fn *startup_fn;
    if ( started )
        return;
    /* Call registered startup functions */
    for_each_table_entry ( startup_fn, STARTUP_FNS ) {
        DBGC(0x023223, "calling startup function 0x%p\n", startup_fn);
        if ( startup_fn->startup )
            startup_fn->startup();
    }
    DBGC(0x023223, "done\n");
    sleep(10);
    started = 1;
}
Recompile with make ... DEBUG=init and run that binary first on a working VM - the sleep will give you enough time to take a picture of the function pointers printed on the console. Then run the same binary on a non-working VM and take a picture too. Please post both pictures and I am sure I can help you find out which initialization code is hanging.
Not saying that we’ll definitely find a fix but possibly we can come up with some more information for the iPXE devs to work with.
-
@paulman9 Can you post the link of how you built that from source?
-
@lukebarone I found the build instructions on the wiki. The only difference is that, after downloading, you open general.h and change #define to #undef for these lines:
#define DOWNLOAD_PROTO_HTTPS
#define IMAGE_TRUST_CMD
#define CERT_CMD
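If you would rather not hunt for them by hand, a minimal sketch of that edit is below. It assumes GNU sed (as shipped on Ubuntu); disable_option is a made-up helper name, and the default GENERAL_H path is only a guess at where general.h sits in your checkout, so adjust it. Note that the preprocessor directive is spelled #undef.

```shell
#!/bin/sh
# Sketch: switch selected iPXE build options from "#define" to "#undef".
# Assumes GNU sed (for -i). The default path is an assumption; override it
# by setting GENERAL_H before running.
GENERAL_H="${GENERAL_H:-ipxe/src/config/general.h}"

disable_option() {
    # $1 = option name, $2 = header file to edit in place.
    # The name must be followed by whitespace or end of line, so only the
    # exact option is changed and longer names sharing the prefix are not.
    sed -i -E "s/^#define([[:space:]]+$1)([[:space:]]|\$)/#undef\1\2/" "$2"
}

if [ -f "$GENERAL_H" ]; then
    for opt in DOWNLOAD_PROTO_HTTPS IMAGE_TRUST_CMD CERT_CMD; do
        disable_option "$opt" "$GENERAL_H"
    done
    # Show the affected lines so you can confirm the result by eye.
    grep -n -E 'DOWNLOAD_PROTO_HTTPS|IMAGE_TRUST_CMD|CERT_CMD' "$GENERAL_H"
else
    echo "general.h not found at $GENERAL_H; set GENERAL_H to the right path" >&2
fi
```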
These lines aren’t consecutive in the file, so you’ll have to look for each one.
-
@sebastian-roth Finally got it to work. Unsure if something is wrong on my side but I had to delete the sleep line to get it to compile properly. As a result, my images are dim since I had to pause the VMs to get screenshots.
Working:
No-worky:
-
@Paulman9 Great to see you got it working and figured out a way to get the pictures. Sorry for the sleep compile issue. I forgot to tell you that you need to add the unistd header at the top (e.g. line 29) for that to work:
...
#include <ipxe/device.h>
#include <ipxe/console.h>
#include <ipxe/init.h>
#include <unistd.h>
...
Now the next step is to figure out which startup functions are called and where it hangs. Unfortunately there does not seem to be an easy way to get the function names from the pointers. So you need to add debug code to each of the startup functions by hand - sorry!
Here is a list of all eleven startup functions in iPXE (found by running find ipxe/src/ -type f -exec grep "\.startup =" {} /dev/null \;):
ipxe/src/hci/linux_args.c: .startup = linux_args_parse,
ipxe/src/arch/x86/interface/pcbios/hidemem.c: .startup = hide_etherboot,
ipxe/src/arch/x86/interface/pcbios/bios_console.c: .startup = bios_inject_startup,
ipxe/src/arch/x86/image/initrd.c: .startup = initrd_startup,
ipxe/src/arch/x86/core/cachedhcp.c: .startup = cachedhcp_startup,
ipxe/src/arch/x86/core/runtime.c: .startup = runtime_init,
ipxe/src/interface/linux/linux_console.c: .startup = linux_console_startup,
ipxe/src/interface/efi/efi_timer.c: .startup = efi_tick_startup,
ipxe/src/core/device.c: .startup = probe_devices,
ipxe/src/crypto/rootcert.c: .startup = rootcert_init,
ipxe/src/crypto/rbg.c: .startup = rbg_startup_fn,
We should maybe start with the crypto stuff, as we think this might be causing it here. So edit ipxe/src/crypto/rootcert.c, jump to line 95 and add a DBGC as the first call after the function header:
...
static void rootcert_init ( void ) {
    DBGC(0x1, "rootcert start");
    static int initialised;
    ...
Now compile the binary with make ... DEBUG=init,rootcert and try it out. Follow the same pattern for all the other startup functions. The first parameter is just a color code and can be any hex number really, so you can use 0x1 for the first, 0x2 for the second and so on if you like it colorful.
Note that you get the calling startup function 0x... printout first and then your newly added output when it enters the particular startup routine. So I expect it to halt after one of your newly added printouts. From there you can add more printouts throughout that function and the functions it calls. Let me know what you find or if you get stuck at some point.
PS: I’ve done those debugging steps a couple of times when trying to find out why iPXE would hang on some particular hardware. Usually I’d just compile the binary and give it to users for testing. So this is the first time I hand over the knowledge on how to debug iPXE init code and I am grateful @Paulman9 is keen to follow this. @Wayne-Workman mind adding that to the wiki as well?
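Since every file that gets a DBGC also has to be named in DEBUG=, a small helper can assemble that list. This is only a sketch: startup_debug_list is a made-up name, and it assumes each DEBUG= entry is simply the source file name without its .c extension, which is what init and rootcert above suggest. It reuses the same "\.startup =" pattern as the find command.

```shell
#!/bin/sh
# Sketch: list the source files that register a startup function and turn
# their base names into a comma-separated string for DEBUG=.
# startup_debug_list is a hypothetical helper, not part of iPXE.
startup_debug_list() {
    # $1 = root of the iPXE source tree (e.g. ipxe/src)
    grep -rl --include='*.c' '\.startup =' "$1" \
        | sed -e 's|.*/||' -e 's|\.c$||' \
        | sort -u \
        | paste -sd, -
}

# Example: prepend init so the output added to init.c stays enabled too.
#   echo "DEBUG=init,$(startup_debug_list ipxe/src)"
```

Enabling everything at once produces more output, so adding one file at a time as described above may still be easier to read on screen.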
-
@sebastian-roth Here is the output from a working VM:
I’m no programmer, so you’re way over my head here, haha. The output seemed the same as before on the non-working one.
-
@Paulman9 Ok, so it’s definitely not the rootcert code causing the hang. Just keep going like this with all the other files. I’d suggest adding debug output to ipxe/src/crypto/rbg.c next, and ipxe/src/interface/efi/efi_timer.c is a good candidate for an issue as well! Just don’t forget to add those to the make ... DEBUG= command too when compiling.
-
@sebastian-roth I suppose this is what I am looking for then?
Non working machine stalled here
-
@Paulman9 Yeah, exactly. Now from here just put more DBGC statements in the rbg_startup function (line 73 ff.) to see if it gets past the fetch_uuid_setting and drbg_instantiate calls.
-
@Paulman9 Any news on this? Please let me know if you need further assistance.