Chainloading Failed when using EXIT method for drive boot
-
@piotr86pl said in Chainloading Failed when using EXIT method for drive boot:
I’ve tried setting “exit” as the Boot to Hard Drive option, but it is failing. iPXE throws a “Chainloading failed” error when I choose the Boot to Hard Drive option in the FOG iPXE menu, and the PC reboots after a few seconds. It does give me the opportunity to enter the iPXE shell, though, and there I can type “exit” and it works.
The EXIT from the fog bios exit mode and keying in exit from the ipxe console should be exactly the same thing.
This CSM auto mode might be confusing iPXE, so let’s set both exit modes to EXIT.
Does this target computer have a true bios mode?
-
@george1421 Yes, the target computer has a true BIOS mode. When I turn on CSM, I can select whether to boot from PXE, hard disks, etc. using “Legacy Mode” or “UEFI Mode”, so I’ve selected Legacy Mode everywhere. Now everything is working in classic BIOS mode. The OSes are installed in BIOS mode, and PXE is also booting using legacy PXE (and the undionly.kpxe file). So I guess it has a true BIOS mode (of some sort). I’ll try setting both exit modes to EXIT when I get the opportunity to get to the school (it is holidays in my country, so I’m not at school every day).
I have vSphere infrastructure at school, so I remotely tested FOG on a virtual machine with both exit modes set to EXIT - and I get the same error there. So it may not be a problem with the motherboard but with FOG itself. I can post a screenshot of this issue. The virtual machine is running a true BIOS, so it is not an issue of using CSM. Typing “exit” in the shell works here too.
-
I might have found the core of the issue. The iPXE binary which is generated by FOG embeds an iPXE script (Link to FOG Github). This script tries to chainload default.ipxe from the TFTP server, and if the chainloading fails it throws the “Chainloading failed” error.
I thought it was an error notification created by the iPXE team, but to my surprise it is written by the FOG Project team. So I misunderstood some things (I was searching for the solution on the iPXE forums, thinking iPXE might be causing this error). When I deleted the whole chainload error handler, leaving only this:
:netboot
chain tftp://${next-server}/default.ipxe
and recompiled iPXE, the exit mode started working properly and iPXE now exits. I don’t know, though, why the original script causes the error. According to the iPXE wiki, “exit” should exit the whole shell. But apparently it exits only the chainloaded scripts - not the embedded one. The embedded script then continues to execute, throwing the chainload error. I don’t know whether it is supposed to work that way or whether it is an iPXE bug that “exit” does not exit the whole of iPXE but only the chainloaded scripts (and not the embedded one).
Either way - I found what’s causing the issue and repaired it (it works on the VM; I must check whether the ASUS PCs are also working now). It is not an elegant fix, though, so I’m looking forward to hearing a more reliable solution to this issue from the FOG Project team. This might involve further and more sophisticated debugging.
Thanks for helping!
-
@piotr86pl OK I’m totally confused now.
In iPXE there is a FOG-provided script, that is for sure. That allows iPXE to find the FOG server. You are right that it chainloads default.ipxe. default.ipxe then calls boot.php on the FOG server, and boot.php builds the FOG iPXE menu.
Are you saying you never were able to get to the FOG iPXE menu?
-
No, no. I was able to get to the FOG iPXE menu. Everything there is working, except the “Boot from Hard Drive” option when I set “Exit to Hard Drive Type” to “EXIT” in the FOG menu. SANBOOT is not working because it does not recognize NVMe drives, so there is no chance of using SANBOOT.
When “Exit to Hard Drive Type” is set to “EXIT” and I choose “Boot from Hard Drive” in the FOG iPXE menu, an error pops up - “Chainloading failed, hit ‘s’ for the iPXE shell; reboot in 10 seconds”.
I thought it was an error which iPXE has in its own code and that it was directly iPXE’s fault that this error happens. But today I searched through FOG’s source code, where I found that this error (“Chainloading failed…”) is not produced by iPXE itself but rather by an iPXE script that FOG embeds in the iPXE binary. I mean this script.
Here is this code:
:netboot
chain tftp://${next-server}/default.ipxe ||
prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
and if I understand correctly, it should inform the user when iPXE can’t chainload the default.ipxe file, and it should be invoked only in that case.
But it turns out that when you use iPXE’s exit command somewhere after loading default.ipxe (for example in the Boot from Hard Drive option), it doesn’t exit the WHOLE of iPXE but only the chainloaded scripts. iPXE then continues to execute the embedded script, which reaches the “prompt” command and shows the “Chainloading failed” error.
When I left only this command, deleting the “prompt --key…” part, like this:
:netboot
chain tftp://${next-server}/default.ipxe
then iPXE’s exit command works when invoked from scripts, and so the “Boot from Hard Drive” option started to work.
I don’t know if this is how iPXE should work. I think it should exit the whole of iPXE, even with the fragment of code which I deleted to make things work. But it does not work that way; only deleting it and leaving just the “chain” command makes the “exit” command work. Of course, I left the rest of the script intact.
-
:netboot
chain tftp://${next-server}/default.ipxe ||
prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
It is a bit strange that it works when you edit it. Basically the double pipe (||) is shorthand for “execute this if the previous command fails”, whereas && means “execute this command if the previous command succeeds”.
So this bit.
prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
In the last part of that command, if you hit s the prompt exits true and the shell command is called; if you hit something other than s, or you don’t hit a key before the timeout, false is returned and a reboot happens. That whole bit is only executed if the previous bit fails, which is chain booting to default.ipxe. I’m almost of a mind that something is wrong with the ${next-server} variable that is causing the chain to fail, and by removing the prompt part you are just not seeing it fail.
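The short-circuit behaviour described here can be sketched as a small Python model. It is not iPXE code, only an illustration of the rule as stated in this post; run_chain, TAIL and the succeeds mapping are hypothetical names, and the model assumes the operators are applied strictly left to right.

```python
def run_chain(steps, succeeds):
    """Evaluate (operator, command) pairs from left to right.

    "&&" runs its command only if the last executed command succeeded,
    "||" only if it failed. Returns the commands that actually ran.
    Commands missing from `succeeds` are assumed to succeed.
    """
    ran = []
    last_ok = True
    for operator, command in steps:
        if operator == "&&" and not last_ok:
            continue
        if operator == "||" and last_ok:
            continue
        ran.append(command)
        last_ok = succeeds.get(command, True)
    return ran


# The tail of the embedded script: prompt ... && shell || reboot
TAIL = [("", "prompt"), ("&&", "shell"), ("||", "reboot")]

print(run_chain(TAIL, {"prompt": True}))   # 's' pressed: ['prompt', 'shell']
print(run_chain(TAIL, {"prompt": False}))  # other key or timeout: ['prompt', 'reboot']
```

In this model a prompt that returns true leads to the shell, and anything else falls through to the reboot.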
For testing in the FOG Web ui -> FOG Configuration -> FOG Settings hit the expand all button then search for “bios exit”. Change both bios and efi exit modes to EXIT. Then save the settings.
Now with a browser go to
http://<fog_server_ip>/fog/service/ipxe/boot.php?mac=00:00:00:00:00:00
That will display the text behind the iPXE menu. On that page search for default. That will be the section that is called when the timeout happens and that boots from the local hard drive. By changing the exit modes globally for both BIOS and UEFI to EXIT, it should put the exit command in the iPXE menu script. I don’t have a FOG server in front of me to test this at the moment, but if it says exit there, it should be the same thing as typing exit in the iPXE shell.
-
@george1421 said in Chainloading Failed when using EXIT method for drive boot:
Now with a browser go to
http://<fog_server_ip>/fog/service/ipxe/boot.php?mac=00:00:00:00:00:00
That will display the text behind the iPXE menu. On that page search for default. That will be the section that is called when the timeout happens and that boots from the local hard drive. By changing the exit modes globally for both BIOS and UEFI to EXIT, it should put the exit command in the iPXE menu script.
And it is putting the exit command in the iPXE menu script:
choose --default fog.local --timeout 5000 target && goto ${target}
:fog.local
exit || goto MENU
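The same check can be scripted instead of done by eye in a browser. The following is a minimal Python sketch, assuming the menu text has the shape of the excerpt above (a choose --default line followed by a matching label). fetch_menu and default_target_commands are hypothetical helper names, and the server address is a placeholder you would fill in yourself.

```python
import re
import urllib.request


def fetch_menu(server_ip, mac="00:00:00:00:00:00"):
    """Download the generated iPXE menu text from boot.php (URL as given in this thread)."""
    url = f"http://{server_ip}/fog/service/ipxe/boot.php?mac={mac}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def default_target_commands(menu_text):
    """Return the commands that follow the label named by `choose --default`."""
    match = re.search(r"^choose\s.*--default\s+(\S+)", menu_text, re.MULTILINE)
    if match is None:
        return []
    label = ":" + match.group(1)
    lines = [line.strip() for line in menu_text.splitlines()]
    if label not in lines:
        return []
    commands = []
    for line in lines[lines.index(label) + 1:]:
        if line.startswith(":"):  # the next label ends the section
            break
        if line:
            commands.append(line)
    return commands


# Excerpt copied from the menu output quoted in this thread.
SAMPLE = """choose --default fog.local --timeout 5000 target && goto ${target}
:fog.local
exit || goto MENU
"""

print(default_target_commands(SAMPLE))  # ['exit || goto MENU']
```

Against a live server, something like default_target_commands(fetch_menu("<fog_server_ip>")) should list whatever the timeout entry runs, which makes it easy to confirm that the exit mode setting took effect.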
I’ve tried to mess with the embedded script itself and I found that if I replace
:netboot
chain tftp://${next-server}/default.ipxe ||
prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
with this
chain tftp://${next-server}/default.ipxe || echo Chainloading failed
and try to “Boot from Hard Drive” using the “EXIT” method, it is WORKING! “Chainloading failed” is not echoed back to me. But if I write it like this
chain tftp://${next-server}/default.ipxe ||
echo Chainloading failed
then “Chainloading failed” is echoed back to me. So I guess the issue here is not with the chain command but with the syntax.
Apparently
command || command
is not the same as
command ||
command
So I’ve tried to leave this “prompt” command but in this manner:
:chainloadfailed
prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
:netboot
chain tftp://${next-server}/default.ipxe || goto chainloadfailed
And now it works! When I use EXIT by choosing “Boot from Hard Drive”, iPXE exits correctly. And if I rename default.ipxe on my server to something else (to simulate failed chainloading), the “Chainloading failed, hit ‘s’ (…)” message appears. So I guess that the core of this issue is incorrect syntax in the iPXE script, and the solution is to write it the way I did.
I should say straight away that if I write it like this (in one line)
:netboot
chain tftp://${next-server}/default.ipxe || prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
the script does not work, because it drops me straight to the iPXE shell, so the better solution is to write it like this:
:chainloadfailed
prompt --key s --timeout 10000 Chainloading failed, hit 's' for the iPXE shell; reboot in 10 seconds && shell || reboot
:netboot
chain tftp://${next-server}/default.ipxe || goto chainloadfailed
So I guess we’ve solved this mystery. By the way, the next-server variable is working - I tried to echo it and it echoed the IP address of my FOG server.
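The difference between the three ways of writing the script can be reproduced with a toy model in Python. This is only a sketch, not iPXE code. It assumes commands on a line are evaluated left to right with no operator precedence, that a line ending in || has an empty right-hand side which counts as a successful no-op, and that every new line starts unconditionally. run_script and the ok mapping are hypothetical names, labels are skipped, goto is not followed, and the prompt text is abbreviated.

```python
def run_script(script, ok):
    """Return the commands a simplified iPXE-like interpreter would run.

    `ok` maps a command name to True/False; unknown commands succeed.
    """
    ran = []
    for line in script.strip().splitlines():
        line = line.strip()
        if line.startswith(":"):                 # label line, nothing to execute
            continue
        failed, skip, command = False, False, []
        for token in line.split() + [None]:      # None marks the end of the line
            if token not in ("||", "&&", None):
                command.append(token)
                continue
            if not skip:
                if command:
                    ran.append(command[0])
                    failed = not ok.get(command[0], True)
                else:
                    failed = False               # empty command: assumed no-op success
            # "||" runs the next command only after a failure, "&&" only after success
            skip = (not failed) if token == "||" else failed
            command = []
    return ran


SPLIT = """
:netboot
chain tftp://${next-server}/default.ipxe ||
prompt --key s --timeout 10000 Chainloading failed && shell || reboot
"""

ONE_LINE = """
:netboot
chain tftp://${next-server}/default.ipxe || prompt --key s --timeout 10000 Chainloading failed && shell || reboot
"""

FIXED = """
:netboot
chain tftp://${next-server}/default.ipxe || goto chainloadfailed
"""

# chain succeeds (default.ipxe loaded and later left via exit); nobody presses 's'.
outcome = {"chain": True, "prompt": False}
print(run_script(SPLIT, outcome))     # ['chain', 'prompt', 'reboot']
print(run_script(ONE_LINE, outcome))  # ['chain', 'shell']
print(run_script(FIXED, outcome))     # ['chain']
```

Under these assumptions the split form shows the prompt even though chain succeeded, the one-line form lands in the shell, and the goto form ends quietly, which matches what was observed on the test machine.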
-
@piotr86pl Very nice find!! It’s strange that this issue has not come up before now. Maybe EXIT never worked for people and they just assumed it didn’t work at all. I know it’s been a really long time since that script was worked on.
But anyway great find.
@Developers you might want to review this thread.
-
Almost lost track of this topic over the summer. Be sure I will look into this soon.
-
Just realized this has already been added thanks to @Piotr86PL’s pull request in August (which I had obviously forgotten about, never mind).