Dell servers R740/R750 display YSOD after image capture/deploy
-
I am evaluating FOG as an imaging and deployment solution and am seeing problems with Dell PowerEdge R740/R750 servers.
After the scheduled image capture job completes and the server reboots, I get the following YSOD:
It might not be a FOG problem; it may have something to do with rEFInd storing its discovery configuration on the SSD partition. Has anyone seen this, and is there a known working config that avoids it?
My rEFInd config is as posted below:
It sets the scan_delay option so that rEFInd rescans the EFI boot options and properly detects the Windows boot option. Otherwise, everything is left at the defaults, and this works fine when tested on ESXi-based virtual machines.
I have not yet tried the NVRAM option in rEFInd, but I wonder whether others have seen similar behavior and found workarounds.
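For anyone not familiar with these settings, here is a rough sketch of how the two options mentioned above appear in refind.conf. The values are assumptions for illustration only, not the config from the affected servers, and I am assuming that "the NVRAM option" corresponds to rEFInd's use_nvram token:

```
# Illustrative refind.conf fragment; values are assumed, not the tested config.

# Wait this many seconds before scanning for boot loaders,
# giving slow firmware or controllers time to expose the disks.
scan_delay 5

# Assumed mapping of "the NVRAM option": keep rEFInd's own variables
# in firmware NVRAM instead of in files on the disk.
use_nvram true
```

Whether use_nvram changes anything on the R740/R750 is untested here.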
-
@anvanster I’ve never seen this before, but let’s see if rEFInd is doing something strange.
rEFInd is only called when you are in UEFI mode and you exit the FOG iPXE menu. Selecting a menu option does not call it, because that boots FOS Linux. So the only time rEFInd is executed is when the FOG iPXE menu exits.
So simply don’t let the FOG iPXE menu exit through rEFInd. As a test, you can change the default exit manager for UEFI to something like EXIT, which uses the UEFI exit method built into iPXE. Or simply don’t use the FOG menu after you image the computer.
After imaging, the FOS Linux engine reboots the machine. Since it’s a real OS, it doesn’t need rEFInd, and rEFInd is never called directly from FOS Linux. In the context of FOG and exiting from the iPXE menu, rEFInd is only used to locate the EFI boot loader. Unless someone has changed the configuration, it should never install itself into the EFI partition.
So why are you getting that screen? Is Secure Boot enabled? If yes, then FOS should not have run…
Does this happen on the source computer after capture, or on both the source and target computers?
-
@george1421 There seems to be an issue with the Dell boot manager.
Configuration is as follows:
Boot mode: UEFI
First boot option: Network boot
Second boot option: RAID drive (it is in HBA mode, but the same behavior is observed in RAID mode as well)
Exit option from FOG GRUB is configured as EXIT right now.
Default boot option in GRUB is Boot Local Disk.
Booting the Network option is successful and the GRUB menu is called. I can reimage the system as needed. But when I just wait and let it boot from the local drive, it fails to recognize that it is supposed to move on to the second option in the Dell boot menu, and instead of booting from the SSD, it goes back to the Dell boot menu.
It can successfully boot into Windows if I choose this option directly from the Dell boot manager, and if I put RAID as the first option it boots into Windows fine. But this disables FOG GRUB by default and requires manual interaction if the system needs to be re-imaged.
I can install a PCIe NVMe drive and disable RAID completely for the sake of experiment.
On Supermicro servers, FOG GRUB exits and boots into the local drive just fine.