Chainloading Failed on EXIT - Hangs on REFIND_EFI



  • Server
    • FOG Version: 1.3.0-RC-25
    • OS: CentOS 6.8
    Client
    • Service Version:
    • OS: Windows Server 2012 R2
    • System Manufacturer: Gigabyte Technology Co., Ltd.
    • System Product: GA-78LMT-USB3 6.0
    Description

    This motherboard boots to the menu just fine but gives the Chainloading error when using the EXIT option. When using REFIND_EFI or SANBOOT it hangs with the blinking cursor in the top left corner. The GRUB options don’t work because it’s a Windows EFI installation.

    I have no “Secure Chip” or Secure Boot settings in the BIOS so this is why I’m posting a new thread. It seems my issue might be different than the others.

    I’d love to get this working as I have several of these motherboards in my install base. I’m happy to do some digging, just please let me know what I have to do.
    Here is my boot.php file from (http://x.x.x.x/fog/service/ipxe/boot.php)

    #!ipxe
    set fog-ip x.x.x.x
    set fog-webroot fog
    set boot-url http://${fog-ip}/${fog-webroot}
    cpuid --ext 29 && set arch x86_64 || set arch i386
    goto get_console
    :console_set
    colour --rgb 0x00567a 1 ||
    colour --rgb 0x00567a 2 ||
    colour --rgb 0x00567a 4 ||
    cpair --foreground 7 --background 2 2 ||
    goto MENU
    :alt_console
    cpair --background 0 1 ||
    cpair --background 1 2 ||
    goto MENU
    :get_console
    console --picture http://x.x.x.x/fog/service/ipxe/bg.png --left 100 --right 80 && goto console_set || goto alt_console
    :MENU
    menu
    colour --rgb 0xff0000 0 ||
    cpair --foreground 1 1 ||
    cpair --foreground 0 3 ||
    cpair --foreground 4 4 ||
    item --gap Host is NOT registered!
    item --gap -- -------------------------------------
    item fog.local Boot from hard disk
    item fog.memtest Run Memtest86+
    item fog.reginput Perform Full Host Registration and Inventory
    item fog.reg Quick Registration and Inventory
    item fog.deployimage Deploy Image
    item fog.multijoin Join Multicast Session
    item fog.sysinfo Client System Information (Compatibility)
    choose --default fog.local --timeout 5000 target && goto ${target}
    :fog.local
    sanboot --no-describe --drive 0x80 || goto MENU
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4
    imgfetch init_32.xz
    boot || goto MENU
    :fog.memtest
    kernel memdisk iso raw
    initrd memtest.bin
    boot || goto MENU
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4
    imgfetch init_32.xz
    boot || goto MENU
    :fog.reginput
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=manreg
    imgfetch init_32.xz
    boot || goto MENU
    :fog.reg
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=autoreg
    imgfetch init_32.xz
    boot || goto MENU
    :fog.deployimage
    login
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param qihost 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    :fog.multijoin
    login
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param sessionJoin 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    :fog.sysinfo
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=x.x.x.x/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=sysinfo
    imgfetch init_32.xz
    boot || goto MENU
    :bootme
    chain -ar http://x.x.x.x/fog/service/ipxe/boot.php##params ||
    goto MENU
    autoboot
    

    thanks
    Lucas


  • Testers

    @88fingerslukee In the BIOS/UEFI firmware settings, what is the boot order?

    And/or also:

    Are both UEFI and Legacy enabled, or just UEFI?
    Which PXE boot file are you using? I’ve seen that make a difference. I have some cheaper HPs that won’t boot to the hard drive with sanboot if you use ipxe.kkpxe, but they work fine with ipxe.pxe or ipxe.kpxe.

    Also, have you played with the rEFInd boot settings? There’s a file on the FOG server at
    /var/www/fog/service/ipxe/refind.conf
    with a line that starts with scanfor. I’ve changed mine to only look for internal,external, which means it only looks for UEFI options and not legacy options. On a few of my devices set to UEFI only (UEFI CSM not enabled), the host pauses on every boot while it searches for the BIOS options that are enabled in FOG by default.
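
    A minimal sketch of the scanfor change described above, assuming GNU sed (as shipped with CentOS) and the refind.conf path quoted in this post. The function name set_scanfor_uefi_only is made up for illustration. It keeps a .bak copy and leaves commented-out scanfor lines untouched.

```shell
#!/bin/sh
# Sketch only: rewrite the active "scanfor" line so rEFInd scans UEFI sources only.
# Default path is the one quoted in the post; pass another path as $1 to try it on a copy.
set_scanfor_uefi_only() {
    conf="${1:-/var/www/fog/service/ipxe/refind.conf}"
    # Keep a backup next to the original before editing in place.
    cp -p "$conf" "$conf.bak" || return 1
    # Replace the whole uncommented scanfor line; lines starting with '#' do not match.
    sed -i 's/^[[:space:]]*scanfor[[:space:]].*$/scanfor internal,external/' "$conf"
}

# Usage (edits the default path, so run it deliberately):
#   set_scanfor_uefi_only
#   grep '^scanfor' /var/www/fog/service/ipxe/refind.conf
```

    Trying it on a copy of the file first makes it easy to diff the result against the original before touching the live configuration.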



  • @Tom-Elliott This depends on the motherboard / BIOS.

    I have several models of Lenovo set to UEFI, but allow booting to both. I can drop a Legacy/CSM or UEFI image onto these systems and they will boot either OS fine.

    In fact I just Legacy PXE booted an M73z with undionly.kpxe to transfer an OEM UEFI image. It transferred and booted fine and runs as a UEFI OS.



  • @Tom-Elliott I’m not certain I understand what you’re saying here. Are you suggesting an issue with the motherboard’s implementation of iPXE or something to do with FOG?


  • Senior Developer

    @88fingerslukee If your system expects to boot in UEFI but the hand-off from iPXE to the system happens in legacy mode, it doesn’t matter which form of PXE on the NIC you’re using.

    UEFI/EFI, as far as I know, uses the onboard ‘nvram’ to know how to boot the system, but in legacy mode that element would not be accessible.



  • @Tom-Elliott The BIOS has two different network boot options, iPXE and Legacy LAN. Both of them exhibit the same behaviour.


  • Senior Developer

    Another possibility: the open timeout isn’t caused by a problem with the exit type itself, but by how the exit is being passed based on how the system is seen. For example, something booted in legacy mode would, as far as I can guess, not be able to return the UEFI-based data to the system.
    It then thinks there’s no OS to load and restarts the system to try again. I believe you can boot down, though (from EFI to legacy), as long as the data needed to boot up can be found.



  • @Tom-Elliott No error logs at all. I only have ssl_error_log and access_log. Neither of them has anything relevant.


  • Senior Developer

    @88fingerslukee then are there any errors that might seem relevant in the Apache error logs?



  • Unfortunately, no. I don’t use any plugins at all.


  • Senior Developer

    @88fingerslukee Do you use the location plugin by chance? RC-26 fixed an issue with a method call, and this sounds like it’s failing on that RC-25 bug.



  • Okay, so totally weird. It must’ve worked at some point because I had to leave for a while but when I came back it was at the linux prompt. Here’s the lsblk command, sda is the system SSD.

    NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
    sdb      8:16   0 298.1G  0 disk
    └─sdb1   8:17   0 298.1G  0 part
    sda      8:0    0 238.5G  0 disk
    ├─sda4   8:4    0   238G  0 part
    ├─sda2   8:2    0   100M  0 part
    ├─sda3   8:3    0   128M  0 part
    └─sda1   8:1    0   300M  0 part
    
    

    When I rebooted it went back to the open timeout. So I have no idea what the deal is here.


  • Moderator

    @88fingerslukee Interesting, this looks like a normal capture menu. This is giving you a tftp timeout error?

    I’m not seeing any TFTP image pulls here, only HTTP (as it should be). Can you inspect the Apache error.log file and see if there is anything useful at the end of that file? (If everything is working perfectly, this file should be empty.)
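
    One way to check the end of the Apache error log, as a rough sketch. The function name apache_error_tail is hypothetical, and the two paths in the usage line are common distribution defaults (CentOS/RHEL and Debian/Ubuntu), not something confirmed for this server.

```shell
#!/bin/sh
# Sketch only: print the last N lines of the first readable log among the given paths.
# Usage: apache_error_tail <lines> <path> [<path> ...]
apache_error_tail() {
    lines="$1"
    shift
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "== $f"
            tail -n "$lines" "$f"
            return 0
        fi
    done
    echo "no readable error log found" >&2
    return 1
}

# Example with assumed default locations:
#   apache_error_tail 20 /var/log/httpd/error_log /var/log/apache2/error.log
```

    If neither path exists, the vhost configuration may send errors elsewhere (for example to an ssl_error_log), so that file is worth checking too.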



  • Sorry, that was without the capture task. Here is what you actually asked for:

    #!ipxe
    set fog-ip 192.168.1.184
    set fog-webroot fog
    set boot-url http://${fog-ip}/${fog-webroot}
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 mac=xx:xx:xx:xx:xx:xx ftp=192.168.1.184 storage=192.168.1.184://files/images/dev/ storageip=192.168.1.184 osid=7 irqpoll hostname=HOST chkdsk=0 img=IMG imgType=n imgPartitionType=all imgid=10 imgFormat= PIGZ_COMP=-6 hostearly=1 pct=5 ignorepg=1 isdebug=yes type=up
    imgfetch init_32.xz
    boot



  • I get this:

    #!ipxe
    set fog-ip 192.168.1.184
    set fog-webroot fog
    set boot-url http://${fog-ip}/${fog-webroot}
    cpuid --ext 29 && set arch x86_64 || set arch i386
    goto get_console
    :console_set
    colour --rgb 0x00567a 1 ||
    colour --rgb 0x00567a 2 ||
    colour --rgb 0x00567a 4 ||
    cpair --foreground 7 --background 2 2 ||
    goto MENU
    :alt_console
    cpair --background 0 1 ||
    cpair --background 1 2 ||
    goto MENU
    :get_console
    console --picture http://192.168.1.184/fog/service/ipxe/bg.png --left 100 --right 80 && goto console_set || goto alt_console
    :MENU
    menu
    colour --rgb 0x00567a 0 ||
    cpair --foreground 1 1 ||
    cpair --foreground 0 3 ||
    cpair --foreground 4 4 ||
    item --gap Host is registered as PSI1!
    item --gap -- -------------------------------------
    item fog.local Boot from hard disk
    item fog.memtest Run Memtest86+
    item fog.keyreg Update Product Key
    item fog.deployimage Deploy Image
    item fog.multijoin Join Multicast Session
    item fog.quickdel Quick Host Deletion
    item fog.sysinfo Client System Information (Compatibility)
    choose --default fog.local --timeout 5000 target && goto ${target}
    :fog.local
    imgfetch ${boot-url}/service/ipxe/refind.conf
    chain -ar ${boot-url}/service/ipxe/refind.efi || goto MENU
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4
    imgfetch init_32.xz
    boot || goto MENU
    :fog.memtest
    kernel memdisk iso raw
    initrd memtest.bin
    boot || goto MENU
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4
    imgfetch init_32.xz
    boot || goto MENU
    :fog.keyreg
    login
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param keyreg 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    :fog.deployimage
    login
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param qihost 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    :fog.multijoin
    login
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param sessionJoin 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    :fog.quickdel
    login
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param delhost 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    :fog.sysinfo
    kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 web=192.168.1.184/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=sysinfo
    imgfetch init_32.xz
    boot || goto MENU
    :bootme
    chain -ar http://192.168.1.184/fog/service/ipxe/boot.php##params ||
    goto MENU
    autoboot
    

  • Moderator

    What do you get when you key this URL into a browser? Make sure you still have the capture/deploy task scheduled.

    http://<fog_server_ip>/fog/service/ipxe/boot.php?mac=<mac_of_broken_system>

    That should spit out the FOG iPXE configuration menu for that target computer. If it’s incomplete then we’ll need to tail the Apache error.log file.
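
    The same check can be done from a shell instead of a browser. This is a sketch that assumes curl is installed and that the web root is /fog as in the URL above. Both function names are made up for illustration, and the server address and MAC are placeholders you supply.

```shell
#!/bin/sh
# Sketch only: build the boot.php URL for one host.
# $1 = FOG server IP or hostname, $2 = MAC address of the target machine.
fog_boot_url() {
    printf 'http://%s/fog/service/ipxe/boot.php?mac=%s\n' "$1" "$2"
}

# Fetch the iPXE script that FOG generates for that host and print it.
# -f makes curl return an error on HTTP failures, -sS hides progress but shows errors.
fetch_boot_script() {
    curl -fsS "$(fog_boot_url "$1" "$2")"
}

# Usage (placeholders, not real values):
#   fetch_boot_script <fog_server_ip> <mac_of_broken_system>
```

    With a task scheduled, the output should be the short kernel/imgfetch/boot script rather than the full menu, which is how the two listings in this thread differ.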



  • Okay, now it’s getting weird.

    When I create a capture task I get a TFTP Open Timeout error. So it’ll TFTP just fine into the menu without any task enabled but it has an issue with a task? How does that work?


  • Moderator

    It’s most likely due to the multiple disks. Once you identify the OS disk (as George asked), you can specify that as the primary disk for hosts of this type. Of course it’s possible that the FOG linux kernel isn’t even seeing the SSD, though I doubt this.


  • Moderator

    OK here is what I want you to do here.

    Schedule a debug capture or deploy (I don’t care which) to this hardware. (Hint: ensure the debug check box is selected when scheduling the task.) Then PXE boot the target computer. You may have to manually register this computer if you can’t get that far. When you PXE boot this computer you will see several screens of text, which you clear by pressing the Enter key. Eventually you will be dropped to a Linux command prompt on the target computer.

    Then key in lsblk to show the disk structure of this ssd/hdd combination.

    You don’t need to use refind_efi if this is a bios (legacy) firmware. If this is a uefi system then you might need to use refind to find the proper boot blocks.
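
    For the lsblk step, a small filter can help pick out the SSD on a machine with an SSD/HDD pair. This is a sketch that assumes the lsblk on the debug prompt is the util-linux version with the ROTA and TYPE columns, and that the kernel reports the rotational flag correctly for the controller. The function name ssd_disks is hypothetical.

```shell
#!/bin/sh
# Sketch only: read "NAME SIZE ROTA TYPE" rows on stdin and print whole disks
# that report ROTA=0 (non-rotating), prefixed with /dev/.
ssd_disks() {
    awk '$4 == "disk" && $3 == 0 { print "/dev/" $1, $2 }'
}

# Usage on the target computer's debug prompt:
#   lsblk -dn -o NAME,SIZE,ROTA,TYPE | ssd_disks
# -d lists whole devices only (no partitions), -n drops the header row.
```

    The device name it prints is a candidate for the OS disk; comparing its size with the known SSD capacity is a reasonable sanity check before relying on it.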



  • No RAID. There are 2 drives, one SSD with Windows on it, the other as a general storage drive:

    System Drive:
    Drive Controller: Serial ATA 3Gb/s
    Drive Model: CRUCIAL_CT256M225
    Drive Revision: 1819
    Drive Capacity: 244,198 MBytes (256 GB)
    Media Rotation Rate: SSD Drive (Non-rotating)

    Storage Drive:
    Drive Controller: Serial ATA 3Gb/s
    Drive Model: WDC WD3200AAKS-00L9A0
    Drive Revision: 01.03E01
    Drive Capacity: 305,245 MBytes (320 GB)

