Hyper V and Pxe boot to Fog problems
-
@lukebarone said in Hyper V and Pxe boot to Fog problems:
Is there any update on this? Or are we waiting on the iPXE crew to fix it?
Thanks for bringing that back up on screen. If I remember correctly I had only reported the rom-o-matic issue so far.
About the actual HyperV iPXE crypto code issue - I don’t think this has been reported anywhere yet. @Paulman9 I think you know the most about which options exactly play a role here. Did you get to compile a debug enabled binary and look into the code as I suggested (make ... DEBUG=https)?
-
It works for me too, thanks a lot!
-
@sebastian-roth Sorry I must have missed your last comment originally. Compiling with any one of those three lines (even just the https line) causes it to fail. I didn’t see any difference using the debug command to compile it. Honestly, not sure what I should be looking for with that though.
Also, I did submit this to the iPXE forums, who responded that this was a Microsoft issue. While I did also submit this in the Feedback app in Windows 10, I’m not holding my breath for Microsoft to do anything about this.
-
@Paulman9 Thanks for letting me know! Good to hear you already posted this in the iPXE forums. Too bad that it didn’t get more attention but I guess there are more important issues to fix from their point of view.
If you are keen to get into debugging this I can give you a bit of advice. I am sure you’d have noticed the debug output if it were there. Just to give you an idea of what iPXE debug output looks like - it’s in color!!!
To find startup issues I usually start by adding my own debug output code. For that, edit src/core/init.c, find the function startup (around line 65) and modify it to read like this:
    void startup ( void ) {
        struct startup_fn *startup_fn;
        if ( started )
            return;
        /* Call registered startup functions */
        for_each_table_entry ( startup_fn, STARTUP_FNS ) {
            DBGC(0x023223, "calling startup function 0x%p\n", startup_fn);
            if ( startup_fn->startup )
                startup_fn->startup();
        }
        DBGC(0x023223, "done\n");
        sleep(10);
        started = 1;
    }
Recompile with make ... DEBUG=init and run that binary first on a working VM - the sleep will give you enough time to take a picture of the function pointers printed on the console. Then run the same binary on a non-working VM and take a picture too. Please post both pictures and I am sure I can help you find out which initialization code is hanging.
Not saying that we’ll definitely find a fix but possibly we can come up with some more information for the iPXE devs to work with.
-
@paulman9 Can you post the link of how you built that from source?
-
@lukebarone Found on the wiki. The only difference is that, after downloading, you open general.h and change #define to #undef for these lines:
#define DOWNLOAD_PROTO_HTTPS
#define IMAGE_TRUST_CMD
#define CERT_CMD
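The same change can be scripted instead of edited by hand. The following is only a sketch: disable_crypto_opts is a hypothetical helper name, and it assumes each option sits on its own line starting with #define, as in the three lines above.

```shell
# Hypothetical helper (not part of iPXE or FOG): switch the three options
# above from #define to #undef in the general.h passed as the first argument.
disable_crypto_opts() {
    file="$1"
    for opt in DOWNLOAD_PROTO_HTTPS IMAGE_TRUST_CMD CERT_CMD; do
        # Match the exact option name only, so DOWNLOAD_PROTO_HTTP is left
        # alone; anything after the name (e.g. a comment) is preserved.
        sed -E "s/^#define([[:space:]]+${opt})([[:space:]]|\$)/#undef\1\2/" \
            "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    done
}
```

Call it with the path to general.h in your own checkout, then compile as described on the wiki.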
These lines aren’t consecutive in the file so you’ll have to look for each.
-
@sebastian-roth Finally got it to work. Unsure if something is wrong on my side but I had to delete the sleep line to get it to compile properly. As a result, my images are dim since I had to pause the VMs to get screenshots.
Working: [screenshot]
No-worky: [screenshot]
-
@Paulman9 Great to see you got it working and figured out a way to get the pictures. Sorry for the sleep compile issue. I forgot to tell you that you need to add the unistd header at the top (e.g. line 29) for that to work.
    ...
    #include <ipxe/device.h>
    #include <ipxe/console.h>
    #include <ipxe/init.h>
    #include <unistd.h>
    ...
Now the next step is to figure out which startup functions are called and where it hangs. Unfortunately there does not seem to be an easy way to get the function names from the pointers. So you need to add debug code to each of the startup functions by hand - sorry!
Here is a list of all eleven startup functions in iPXE (found by running find ipxe/src/ -type f -exec grep "\.startup =" {} /dev/null \;):
    ipxe/src/hci/linux_args.c: .startup = linux_args_parse,
    ipxe/src/arch/x86/interface/pcbios/hidemem.c: .startup = hide_etherboot,
    ipxe/src/arch/x86/interface/pcbios/bios_console.c: .startup = bios_inject_startup,
    ipxe/src/arch/x86/image/initrd.c: .startup = initrd_startup,
    ipxe/src/arch/x86/core/cachedhcp.c: .startup = cachedhcp_startup,
    ipxe/src/arch/x86/core/runtime.c: .startup = runtime_init,
    ipxe/src/interface/linux/linux_console.c: .startup = linux_console_startup,
    ipxe/src/interface/efi/efi_timer.c: .startup = efi_tick_startup,
    ipxe/src/core/device.c: .startup = probe_devices,
    ipxe/src/crypto/rootcert.c: .startup = rootcert_init,
    ipxe/src/crypto/rbg.c: .startup = rbg_startup_fn,
Should maybe start with the crypto stuff as we think this might be causing it here. So edit ipxe/src/crypto/rootcert.c, jump to line 95 and add a DBGC as first call after the function header:
    ...
    static void rootcert_init ( void ) {
        DBGC(0x1, "rootcert start");
        static int initialised;
    ...
Now compile the binary with make ... DEBUG=init,rootcert and try it out. Follow the same schema for all the other startup functions. The first parameter is just a color code, can be any hex number really. So you can use 0x1 for the first, 0x2 for the second if you like it colorful.
Note that you have the calling startup function 0x... printout first and then your newly added output when it enters the particular startup routine. So I suspect it to halt after one of your newly added printouts. From there you can add more printouts throughout that function and those being called. Let me know what you find or if you get stuck at some point.
PS: I’ve done those debugging steps a couple of times when trying to find out why iPXE would hang on some particular hardware. Usually I’d just compile the binary and give it to users for testing. So this is the first time I hand over the knowledge on how to debug iPXE init code and I am grateful @Paulman9 is keen to follow this. @Wayne-Workman mind adding that to the wiki as well?
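As more startup files get instrumented, the DEBUG= list grows with them. A small shell helper can build it from the file paths. This is only a sketch: debug_list is a hypothetical name, and it assumes each DEBUG entry is simply the source file's base name without the .c suffix, as with init.c and rootcert.c above.

```shell
# Hypothetical helper: print a comma-separated DEBUG list for the given
# source files, e.g. "init,rootcert" for init.c and rootcert.c.
debug_list() {
    list=""
    for path in "$@"; do
        # Strip the directory part and the .c suffix.
        name=$(basename "$path" .c)
        # Prepend a comma only once the list already has an entry.
        list="${list:+$list,}$name"
    done
    printf '%s\n' "$list"
}
```

The output can then be pasted after DEBUG= in the make command used so far.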
-
@sebastian-roth Here is the output from a working VM: [screenshot]
I’m no programmer, so you’re way over my head here, haha. The output seemed the same as before on the non-working one.
-
@Paulman9 Ok, so it’s definitely not the rootcert code causing the hang. Just keep going like this with all the other files. I’d suggest adding debug output to ipxe/src/crypto/rbg.c next, and ipxe/src/interface/efi/efi_timer.c is a good candidate for an issue as well!! Just don’t forget to add those to the make ... DEBUG= command too when compiling.
-
@sebastian-roth I suppose this is what I am looking for then?
Non working machine stalled here
-
@Paulman9 Yeah, exactly. Now from here just put in more DBGC statements in the rbg_startup function (line 73ff) to see if it gets past the fetch_uuid_setting and drbg_instantiate calls.
-
@Paulman9 Any news on this? Please let me know if you need further assistance.
-
I just came across this by accident and wondered if this was ever solved. Reading through it and the related iPXE forum post (link) it seems like this was caused and fixed by Microsoft. So if you have this issue, update and you should be fine.
-
@Sebastian-Roth Sorry, I completely forgot about this. Just updated to the latest kernel on my server and tested on 1803, and it worked perfectly. Thanks for the update.