Stalling on FOG splash screen



  • Motherboard:

    Maximus VII Ranger
    Bios version 3003
    Build date 10/28/2015
    baseboard-serial-number 140627739500080

    CPU

    intel i5-4460 @ 3.20GHz

    FOG version 1.5.5
    Chainloader = undionly.kpxe

    I have a dual-boot Windows 10/Linux (Zorin 12.4) installation.

    Windows 10 is installed on an NVMe 250GB SSD.

    Zorin was installed on a Western Digital hard drive.

    Both operating systems were installed in BIOS legacy mode.

    I had Network Booting set up and FOG was working fine.

    Then the Western Digital hard drive failed so I replaced it with hardware raid using a StarTech 4 port PCIe SATA III 6Gbps RAID Controller Card with two 2TB Western Digital hard drives.

    I had to re-install Zorin so I did so in UEFI mode. Then I had problems dual booting as BIOS legacy and UEFI are not compatible, so I converted my Windows 10 installation from BIOS legacy to UEFI by following the instructions here:

    https://www.maketecheasier.com/convert-legacy-bios-uefi-windows10/

    Current BIOS settings:

    Advanced --> Network Stack Configuration --> Network Stack [Enabled]
    Ipv4 PXE Support [Enabled]
    Ipv6 PXE Support [Enabled]

    CSM (Compatibility Support Module)

    Launch CSM                                      [Enabled]
        Boot Device Control                         [UEFI and Legacy OPROM]
        Boot from Network Devices                   [Legacy OPROM first]        
        Boot from Storage Devices                   [Both, UEFI first]
        Boot from PCI-E/PCI Expansion Devices       [UEFI driver first]
    

    Advanced --> Onboard Devices Configuration --> Intel LAN Controller --> Intel LAN PXE Option ROM [Enabled]

    My current Boot Options (in order) are:

    IBA GE Slot 00C8 v1547
    ubuntu
    MARVELL Raid VD (1907649MB)
    Windows Boot Manager (Samsung SSD 970 EVO 250GB)
    P2: Asus DRW-24BSST
    

    After the FOG splash screen I now get a blank screen with a blinking cursor in the top right hand corner. I also get this outcome if I make “MARVELL Raid VD (1907649)” the second boot option (after “IBA GE Slot 00C8 v1547”). I also get this outcome if I choose “MARVELL Raid VD (1907649)” as the “Boot Override” in the BIOS.

    If I go into the BIOS and choose “ubuntu” as the Boot Override, or make “ubuntu” the first boot option, bypassing network booting entirely, the Grub menu appears and I can boot into either Zorin or Windows no problem.

    If I remove “MARVELL Raid VD (1907649MB)” as a Boot Option entirely, then once I get to the FOG splash screen it counts down from 3 with the “Boot from hard disk” option selected, then the screen refreshes and it starts the count again, over and over. That’s as far as it gets. For some reason FOG does not want to pass boot control over to “ubuntu”?

    As I am able to continue the boot process by choosing “ubuntu” from the BIOS I am thinking there is a setting in FOG somewhere I need to tweak to resolve this?

    Have pressed the CLR_CMOS button to reset the BIOS and repeated the entire process. No change.


  • Senior Developer

    @Jedi exactly this.



  • @Tom-Elliott Hi Tom, thank you for your reply.

    Ok I have created an image definition, assigned that image to the host and now can capture.

    “I suspect network boot is using BIOS and MBR but your machine is configured for UEFI.”

    By “network boot is using BIOS and MBR” I have interpreted that to mean a reference to Network Devices being set to Legacy OPROM first under CSM in the bios.

    “your machine” I have interpreted to be a reference to my Windows 10 operating system.

    So if my translation is correct what you are saying is that if I network boot in bios mode FOG can only facilitate booting into an OS installed in bios mode and if I network boot in UEFI mode FOG can only facilitate booting into an OS installed in UEFI mode?
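    Before comparing the network boot mode with the OS install mode, it can help to confirm how a running Linux system was actually started. The following is a minimal Python sketch, not part of FOG: it relies on the Linux kernel exposing /sys/firmware/efi only when it was booted through UEFI firmware. The function name and the sys_root parameter are made up for illustration.

```python
from pathlib import Path


def firmware_mode(sys_root: str = "/sys") -> str:
    """Report how the running Linux kernel was booted.

    The kernel creates <sys_root>/firmware/efi only on a UEFI boot,
    so its absence is taken to mean a legacy BIOS (CSM) boot.
    """
    efi_dir = Path(sys_root) / "firmware" / "efi"
    return "uefi" if efi_dir.is_dir() else "bios"
```

    Calling firmware_mode() from inside Zorin returns "uefi" or "bios" for the current boot only; it says nothing about what the network card's option ROM is set to.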


  • Senior Developer

    @Jedi your natted interface likely has the host you specified, this is perfectly fine and likely not causing any issues. You likely have an external domain that points to your network with that as the name.

    If you’re trying to capture an image, you need to create the definition in fog, then assign that image to the newly registered host. Then you can capture. As to why you’re getting a blank screen, I suspect network boot is using BIOS and MBR but your machine is configured for UEFI.



  • @george1421 “it should find the windows disk no problem”

    I decided to test that, make sure the fundamentals are right.

    I set the windows drive as second in the boot order (after network boot in bios mode) and disabled all other boot options. Took the raid controller right out of the equation.

    On reboot I ended up with a blank screen and blinking cursor.

    As FOG was installed on a virtual machine I decided to restore a snapshot that was before the FOG install and do a complete reinstall of FOG (I am using Bridged Adapter and Promiscuous Mode is set to Allow All). I performed a full host registration. When asked if I would like to deploy an image to this computer now I said yes. It said the task was complete and it was going to reboot and take an image. It rebooted but did not take an image. I ended up with the blank screen and blinking cursor again (exit to hard drive type is set to SANBOOT).

    I repeated the process with a different ethernet cable connected to a different port on the switch. No change.

    I went to tasks --> List All Hosts --> Capture and it said “Failed to create task - Invalid image assigned to host”.

    In the logs under Image Replicator it says

    Starting image replication
    Please physically associate images to a storage group
    There is nothing to replicate

    In the logs under Image Size it says

    [04-06-19 9:58:51 pm] * Completed.
    [04-06-19 9:58:51 pm] * No images associated with this group as master.
    [04-06-19 9:58:51 pm] * Finding any images associated with this group as its primary group
    [04-06-19 9:58:51 pm] * We are node ID: 1. We are node name: DefaultMember
    [04-06-19 9:58:51 pm] * We are group ID: 1. We are group name: default
    [04-06-19 9:58:51 pm] * Starting Image Size Service.
    [04-06-19 9:58:50 pm] * Starting service loop
    [04-06-19 9:58:50 pm] * Checking for new items every 3600 seconds
    [04-06-19 9:58:50 pm] * Starting ImageSize Service
    [04-06-19 9:58:50 pm] Interface Ready with IP Address: tessa-vm <-- this is the host name of the virtual machine FOG runs on
    [04-06-19 9:58:50 pm] Interface Ready with IP Address: mail.odysseytours.nz
    [04-06-19 9:58:50 pm] Interface Ready with IP Address: 210.54.90.13 <-- this is my IP address assigned to me by my ISP
    [04-06-19 9:58:50 pm] Interface Ready with IP Address: 192.168.1.149 <-- this is the IP address of the virtual machine FOG runs on
    [04-06-19 9:58:50 pm] Interface Ready with IP Address: 127.0.1.1
    [04-06-19 9:58:50 pm] Interface Ready with IP Address: 127.0.0.1

    I am surprised to see the reference to mail.odysseytours.nz

    This is an email server I am running on Ubuntu 18.04 on another computer. I have never registered this computer with FOG. Could there be a conflict with having two servers on the same subnet?

    Any suggestions?


  • Moderator

    @Jedi As for your unique configuration, I don’t think booting through iPXE is the best choice for you. In uefi mode or bios mode, fog is mainly configured to boot to the first hard drive only. Both GRUB and rEFInd try to find the first disk with a boot partition on it. It should find the windows disk no problem, but your linux OS behind the raid controller may be an issue. What I think you should do is set your disk as first in the boot order; then, when you want to image, reboot your computer and use the F12 boot menu to select network. This is a directed network boot instead of booting through iPXE to get to your OS. Actually this is how we have it set up at my office. I want the techs sitting in front of the computer they are imaging, so they must hit F12 to get into the boot menu to select PXE boot. We do this to ensure we don’t accidentally image the wrong computer, not because of a technical issue.

    You might be able to get it to work with rEFInd by configuring the refind.conf file to create a boot menu where you manually select the boot disk, but this adds just another step between boot up and OS running.


  • Moderator

    @Jedi said in Stalling on FOG splash screen:

    but it seems that because I have specified ipxe.efi as the chainloader in the router that means every computer I want to deploy an image to must also network boot via UEFI

    Let’s tackle the easy part first. If you have to support both uefi and bios computers on the same network then having a static dhcp boot option is very annoying. There are several ways to get around this limitation. If you are using a linux or a windows 2012 or newer dhcp server there is a configuration guide to help you configure your dhcp server for dynamic boot file support: https://wiki.fogproject.org/wiki/index.php/BIOS_and_UEFI_Co-Existence

    In your case you’re using a router for dhcp. Some routers, like pfsense, support dynamic boot configuration files; others, like meraki, do not. Actually meraki is worse because it always points to itself as the boot server and not the fog server, even if it’s configured to point elsewhere. Anyway… If you want to support dynamic boot files and your router does not support it, I would recommend that you install dnsmasq on your FOG server to supply the pxe boot information. In the configuration I’m going to give you, FOG/dnsmasq will only supply the pxe boot information and not issue dhcp addresses. That function will remain with your existing dhcp server.

    Here is a tutorial on how to install dnsmasq on your fog server: https://forums.fogproject.org/topic/12796/installing-dnsmasq-on-your-fog-server If you use my configuration file exactly but replace <fog_server_ip> with your real fog server’s IP address, it should take you about 10 minutes to install and get running. If your fog server, dhcp server and pxe boot clients are all on the same subnet, then you are done. If your pxe boot clients are on a different subnet, then you will need to log into your subnet router and add the FOG server’s IP address as the last host in your dhcp-relay (dhcp-helper) service on your subnet router. It should just work for dynamic boot file settings.
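    The idea behind a dynamic boot file can be sketched in a few lines of Python. This is an illustration only, not the dnsmasq or FOG configuration: it assumes the server decides from the client architecture value that a PXE client sends in DHCP option 93 (0 for legacy BIOS, 7 or 9 for 64-bit UEFI), and it maps only to the two boot files named in this thread. The table and function names are hypothetical.

```python
BIOS_BOOT_FILE = "undionly.kpxe"  # loader for legacy BIOS clients
UEFI_BOOT_FILE = "ipxe.efi"       # loader for 64-bit UEFI clients

# DHCP option 93 (client system architecture) -> boot file.
# Only the values relevant to this thread are listed (assumption).
ARCH_TO_BOOT_FILE = {
    0: BIOS_BOOT_FILE,  # x86 PC, legacy BIOS
    7: UEFI_BOOT_FILE,  # x86-64 UEFI
    9: UEFI_BOOT_FILE,  # x86-64 UEFI (alternate value some firmware sends)
}


def boot_file_for(arch: int) -> str:
    """Return the boot file for a client architecture code, or raise ValueError."""
    try:
        return ARCH_TO_BOOT_FILE[arch]
    except KeyError:
        raise ValueError(f"no boot file configured for architecture {arch}") from None
```

    With this kind of lookup a BIOS client (architecture 0) is offered undionly.kpxe and a UEFI client (architecture 7 or 9) is offered ipxe.efi, so neither receives a loader built for the other firmware type.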



  • @george1421 Hi George, sorry to keep bothering you with this but it seems that because I have specified ipxe.efi as the chainloader in the router that means every computer I want to deploy an image to must also network boot via UEFI. From what I can gather every computer can network boot in bios mode but not every computer can network boot in UEFI mode. Virtualbox virtual machines are one notable example (https://forums.virtualbox.org/viewtopic.php?f=9&t=84349). So I need to be able to network boot in bios mode. I have changed the chainloader back to undionly.kpxe in the router. I am using SANBOOT as “Exit to Hard Drive Type”.

    My system has four storage devices:

    A NVMe 250GB drive with Windows 10 installed (ntfs)
    A 120GB SSD with DOOM installed (ext4)
    A 650GB Western Digital hard drive used as a back up (ext4)
    StarTech 4 port PCIe SATA III 6Gbps RAID Controller Card with two 2TB Western Digital hard drives which my linux OS is installed on (LVM2)

    If I specify “IBA GE Slot 00C8 v1547” as the first boot option and “ubuntu” as the second boot option (I can successfully boot into my linux OS with this option if not network booting in bios mode) and disable all other boot options I end up with a blank screen and a blinking cursor top left hand corner.

    If I add either the 120GB SSD or the 650GB HDD as a third boot option I end up with the following:

    error! no such device: 4a05a32b-f942-4bf2-815a-584d501366a.
    Entering rescue mode…
    grub rescue>

    If I boot into my linux OS via bios the output of lsblk is:

    NAME                  FSTYPE  LABEL      UUID                                   MOUNTPOINT SIZE   OWNER GROUP MODE
    sdb                                                                                        111.8G root  disk  brw-rw----
    `-sdb1                ext4               85e0230a-9028-4bfd-ae02-770525c04399   /mnt/85e02 111.8G root  disk  brw-rw----
    sr0                                                                                        1024M  root  cdrom brw-rw----
    sdc                                                                                        1.8T   root  disk  brw-rw----
    |-sdc2                ext2               456c2955-b3fc-46e8-9340-484fd24e350a   /boot      732M   root  disk  brw-rw----
    |-sdc3                LVM2_me            B4wxhE-1z8r-GV9D-j5ov-rOGD-1kty-wzMhM2            1.8T   root  disk  brw-rw----
    | |-zorin--vg-swap_1  swap               835f8cbf-8f28-4552-9b40-3f851841f78f   [SWAP]     976M   root  disk  brw-rw----
    | `-zorin--vg-root    ext4               545dd428-5b28-4ad4-9062-159d5e100767   /          1.8T   root  disk  brw-rw----
    `-sdc1                vfat               31BE-A03A                              /boot/efi  512M   root  disk  brw-rw----
    sda                                                                                        596.2G root  disk  brw-rw----
    `-sda1                ext4               ed26adf1-8d42-4af6-b52d-6c037c616847   /media/sdb 596.2G root  disk  brw-rw----
    nvme0n1                                                                                    232.9G root  disk  brw-rw----
    |-nvme0n1p3           ntfs               FAA41560A41520A5                                  502M   root  disk  brw-rw----
    |-nvme0n1p1           ntfs    New Volume 01D313DE0D108410                                  232.3G root  disk  brw-rw----
    `-nvme0n1p2           vfat               AA05-C22B                                         100M   root  disk  brw-rw----

    So as you can see Grub appears to be looking for a UUID that does not exist.
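    The comparison above can be checked mechanically. Below is a minimal Python sketch, not a FOG or GRUB tool: it pulls the 8-4-4-4-12 style UUIDs out of lsblk text and tests whether the UUID from the GRUB error is among them. The sample text is copied from the lsblk output in this post, the function name is made up for illustration, and the pattern deliberately ignores the short vfat/ntfs IDs and the LVM one. Note that the UUID as quoted in the error has only 11 characters in its last group, so it may have lost a character when it was copied.

```python
import re

# Matches standard 8-4-4-4-12 hexadecimal UUIDs only.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def filesystem_uuids(lsblk_text: str) -> set:
    """Return the set of standard-format UUIDs found in lsblk output."""
    return {match.lower() for match in UUID_RE.findall(lsblk_text)}


# UUID rows copied from the lsblk output in this post.
LSBLK_SAMPLE = """\
sdb1 ext4 85e0230a-9028-4bfd-ae02-770525c04399
sdc2 ext2 456c2955-b3fc-46e8-9340-484fd24e350a
zorin--vg-swap_1 swap 835f8cbf-8f28-4552-9b40-3f851841f78f
zorin--vg-root ext4 545dd428-5b28-4ad4-9062-159d5e100767
sda1 ext4 ed26adf1-8d42-4af6-b52d-6c037c616847
"""

# UUID exactly as quoted from the GRUB error in this post.
WANTED = "4a05a32b-f942-4bf2-815a-584d501366a"

found = filesystem_uuids(LSBLK_SAMPLE)
print("present" if WANTED in found else "not present")
```

    On a live system the same function could be fed the text of lsblk -f instead of the pasted sample.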

    Is the Grub Rescue program being provided by FOG? Because if it’s not then this is possibly not a FOG related issue and I need to be looking elsewhere.

    But if the Grub Rescue software is being hosted on the FOG server do you have any insights as to where Grub Rescue is getting the “4a05a32b-f942-4bf2-815a-584d501366a” UUID from? I am thinking if I can get Grub to look for the correct UUID that may solve my problem.

    Thank you for your help.


  • Moderator

    @Jedi said in Stalling on FOG splash screen:

    During the boot process I get the message:
    Waiting for link-up on net0…Down (http://ipxe.org/38086193)
    which takes about 15 seconds then it takes another fifteen seconds or so with
    Waiting for link-up on net1…
    This adds a lot to the time it takes for booting to complete, is there any way to avoid this?

    So if I understand this correctly, this device has 2 physical network adapters? The device that has the network connection is being detected as the second network interface? From the error message this is an ipxe (before fog menu is displayed) error and not FOS.

    I just looked at the fog ipxe script ( https://github.com/FOGProject/fogproject/blob/master/src/ipxe/src-efi/ipxescript ). I don’t see any 15 second delay, so the delay must be in the iPXE code itself, which the fog developers don’t control.



  • @george1421 Hi George, thank you for the time you have taken to reply to my post.

    I am happy to report I have solved my problem.

    When I said ‘if I change the “Boot from Network Devices” option to [UEFI first] I lose “IBA GE Slot 00C8 v1547” as a boot option entirely’ what I omitted to add was that I gained two new options:

    UEFI: IP4 Intel® Ethernet Connection (H) I219-V

    and

    UEFI: IP6 Intel® Ethernet Connection (H) I219-V

    I had tried these previously without success but that was with “undionly.kpxe” as the loader. Your post “You will get similar error if the computer is in bios mode and you send it ipxe.efi” was the key. Once I changed the computer to UEFI mode and set “UEFI: IP4 Intel® Ethernet Connection (H) I219-V” as the first boot option in the BIOS and changed the loader in the router to ipxe.efi it worked!

    During the boot process I get the message:

    Waiting for link-up on net0…Down (http://ipxe.org/38086193)

    which takes about 15 seconds then it takes another fifteen seconds or so with

    Waiting for link-up on net1…

    This adds a lot to the time it takes for booting to complete, is there any way to avoid this?

    I have made a small contribution of $50USD to FOG by way of thanks for your help.

    Receipt # 4157-1929-2864-5066


  • Moderator

    @Jedi said in Stalling on FOG splash screen:

    best raid controller

    Ok we have way too much going on here to focus on a solution. So let’s reboot this with specific issues.

    Can you image with FOG? Let’s ignore the pxe boot through issue. Does fog capture and deploy correctly to this hardware?


  • Moderator

    @Jedi said in Stalling on FOG splash screen:

    I have tried changing the boot loader in the router to ipxe.efi but that produced the error “NBP is too big to fit in free base memory”.
    This post indicated this error was caused by the client being “set to PXE boot in legacy BIOS mode but the binary offered to the client is UEFI” however if I change the “Boot from Network Devices” option to [UEFI first] I lose “IBA GE Slot 00C8 v1547” as a boot option entirely. Is this to be expected?

    I have to break the response into smaller bite-size chunks because you have a lot going on here.
    The NBP size issue is because you are sending the wrong boot loader based on the target hardware. Meaning if your target computer is running in uefi mode and you send it undionly.kpxe you will see that error. You will get a similar error if the computer is in bios mode and you send it ipxe.efi.

    On the uefi mode: I can’t speak for your hardware, but on Dells you need to enable the uefi network stack in the bios for the network adapter to be allowed for pxe booting.



  • @Jedi

    I have tried changing the boot loader in the router to ipxe.efi but that produced the error “NBP is too big to fit in free base memory”.

    This post indicated this error was caused by the client being “set to PXE boot in legacy BIOS mode but the binary offered to the client is UEFI” however if I change the “Boot from Network Devices” option to [UEFI first] I lose “IBA GE Slot 00C8 v1547” as a boot option entirely. Is this to be expected?

    https://forums.fogproject.org/topic/11828/nbp-is-too-big-to-fit-in-free-base-memory/2

    I then tried pxelinux.0.old but ended up with a blank screen and a blinking cursor so that was a big circle.

    So I went back to undionly.kpxe then tried all the options, one by one, under FOG Configuration --> iPXE General Configuration --> Boot Exit settings --> Exit to Hard Drive Type(EFI).

    No change.

    I then tried all the options, one by one, under FOG Configuration --> iPXE General Configuration --> Boot Exit settings --> Exit to Hard Drive Type

    The “GRUB” option produced a blank screen and blinking cursor.

    The “GRUB_FIRST_HDD” option produced a

    “Launching grub
    Begin pxe scan start cmain()”

    screen where it stalled.

    Google produced this page which did not provide a solution.

    https://forums.fogproject.org/topic/9906/ubuntu-image-for-fog-clients/2

    The “GRUB_FIRST_FOUND_WINDOWS” option resulted in a grub4dos screen. Typing “ls” resulted in an error indicating it could not locate any drives so that seems a dead end also.

    I have read carefully through the refind.conf file. The only option that looks like it may be of any benefit is “also_scan_dirs boot,ESP2:EFI/linux/kernels”. I have tried uncommenting that with “REFIND_EFI” set as the option under FOG Configuration --> iPXE General Configuration --> Boot Exit settings --> Exit to Hard Drive Type. No change.

    I have gone as far as I can go.

    I chose the “StarTech 4 port PCIe SATA III 6Gbps RAID Controller Card” because I googled “best raid controller”. One site ranked it second, another third. This suggests to me this hardware is not obscure. I do not consider it unreasonable to expect FOG to support hardware as mainstream as this. It’s a bit disappointing really.

    Does anyone know of a raid controller FOG supports?



  • @george1421 Hi George, I apologise for misunderstanding your earlier post, I am new to FOG.

    If you are referring to FOG Configuration --> iPXE General Configuration --> Boot Exit settings
    Exit to Hard Drive Type is set to SANBOOT
    Exit to Hard Drive Type(EFI) is set to REFIND-EFI

    If I need to edit the refind.conf file in the /var/www/html/fog/service/ipxe directory, are you able to provide clear and specific information on exactly what edits I need to make?

    Thank you for your help with this, I really appreciate it.


  • Moderator

    @Jedi What I’m referring to is in the host configuration in FOG. If you look at the host configuration there are 2 exit modes. One is for uefi and one for bios. Typically you would set sanboot for bios mode and rEFInd for uefi mode. There are other exit modes so you may try each until you find one that works with this hardware. Typically rEFInd will locate the boot disk properly. There are a few settings you can tweak if needed.

    So to my question: if you look at the host definition, what is the current exit mode for uefi?
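    The pairing of exit modes described above can be pictured as a simple lookup. This is a hedged Python sketch, not FOG code: it assumes the exit type applied is the one matching the firmware mode the client network-booted in, and it uses the option names mentioned in this thread (SANBOOT and REFIND_EFI). The table and function names are hypothetical.

```python
# Hypothetical table: firmware mode of the network boot -> exit type.
EXIT_TYPE_BY_FIRMWARE = {
    "bios": "SANBOOT",     # typical choice for legacy BIOS clients
    "uefi": "REFIND_EFI",  # typical choice for UEFI clients
}


def exit_type_for(firmware: str) -> str:
    """Return the exit type for a firmware mode ("bios" or "uefi")."""
    mode = firmware.strip().lower()
    if mode not in EXIT_TYPE_BY_FIRMWARE:
        raise ValueError(f"unknown firmware mode: {firmware!r}")
    return EXIT_TYPE_BY_FIRMWARE[mode]
```

    Under that assumption, changing only the bios exit type has no effect on a client that network-boots in uefi mode, and the reverse also holds.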



  • Hi George,

    Thank you for your reply.

    The host computer is Linux Mint 19.1 running as a virtual machine in VirtualBox on a Windows 10 host.

    So I guess that would make the boot manager Grub2?


  • Moderator

    There are a couple of things here.

    1. You changed the system configuration from bios to uefi. So your fog boot loader will go from undionly.kpxe to ipxe.efi, but that isn’t your problem here.
    2. In the host configuration for this computer, the exit mode bios will then change to exit mode uefi. What do you have configured for the uefi exit mode for this computer? Is it rEFInd or something else? If it’s rEFInd you might need to tweak the refind.conf file in the /var/www/html/fog/service/ipxe directory to get it to detect your boot disk.
