Quick or Full Host Registration fails - screen goes black

  • Any ideas on this one:

    This issue is affecting me with both FOG 1.2.0 and now Trunk, running Ubuntu 14.04 Server. The host machine I am trying to register is a Dell Optiplex 7040M. I am using iPXE and everything works right up until I select either Full Host Registration and Inventory or Quick Host Registration and Inventory. Once I select either one of these the screen goes black. The same thing happened on FOG stable and FOG Trunk (just updated today using SVN).

    Any help would be greatly appreciated.

  • @Tom-Elliott Sorry for the delay - I am on an isolated network for my FOG Server - so right now only 3 machines in the FOG environment.

  • Moderator

    @Tom-Elliott I still have my test system setup. I’ll try different combinations in the morning to see if I can get it to fail while I’m watching it.

  • Do you see the same thing if booting in legacy mode? I’m going to guess a network uplink bug may be present, so it can’t bring up the NIC to get on the network. This means it can’t boot to find the menu and subsequently gets an error returned, causing it to fail. I wonder if this only happens on these systems from a cold boot?

  • Moderator

    @Tom-Elliott In my case it was on the dev box, so only 2 or 3 systems were connected. Unfortunately I did not see what the error was; only when I turned around did I see the chainloading error. I rebooted it without turning it off and watched it, and the iPXE menu came up. It could have been a fluke, or the update to the latest trunk today, or booting these 7040s in EFI mode. I don’t do EFI on any system in our production environment, so I can’t comment other than to say I saw the results of the error twice during iPXE booting.

  • How many hosts are there in your environments, both of you?

    I don’t think it’s timing, but it kind of is. If there were an error in the code it would show up every time for all machines. It would not show up randomly, even on the same machine. Just a guess, but I believe what you guys are seeing is an inability to connect to the database because it is hitting the database’s max connection limit and/or Apache’s max limits.
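
    A rough way to sanity-check that guess, assuming a stock MySQL backend: compare the `Threads_connected` status value against the `max_connections` variable (both are standard MySQL names, readable with `SHOW STATUS` and `SHOW VARIABLES`). The helper below is hypothetical and not part of FOG; it only does the arithmetic on numbers you have already read from the server, and the 80% warning threshold is an arbitrary assumption.

```python
def connection_headroom(threads_connected, max_connections, warn_ratio=0.8):
    """Return (free_slots, near_limit) for a database connection pool.

    threads_connected -- current value of MySQL's Threads_connected status
    max_connections   -- current value of MySQL's max_connections variable
    warn_ratio        -- assumed warning threshold; 0.8 is an arbitrary choice
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    # Slots still available before new connections are refused.
    free_slots = max(max_connections - threads_connected, 0)
    # Flag the server as "near the limit" once usage crosses the threshold.
    near_limit = threads_connected >= warn_ratio * max_connections
    return free_slots, near_limit


if __name__ == "__main__":
    # Example numbers only, not measurements from this thread.
    print(connection_headroom(3, 151))
    print(connection_headroom(150, 151))
```

    With only 2 or 3 hosts talking to the server, the first case is the likely one, which would argue against connection limits being the cause here.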

  • Moderator

    @kinger37 said in Quick or Full Host Registration fails - screen goes black:

    @TOM ELLIOTT On my first time booting into the FOG PXE menu I receive the chainloading error, but after rebooting and going back in, it doesn’t show up again.

    I have to say I saw that at random when I was testing too. I was doing 3 things at a time and just thought I missed something when I turned back. The time I actually watched the booting process it booted normally. So there may be a timing issue that is just on the edge of working. It will be interesting to know your production experience with this configuration.

  • @kinger37 Deployment was successful - I think I have it working now - thank you for everyone’s help.

    @TOM ELLIOTT On my first time booting into the FOG PXE menu I receive the chainloading error, but after rebooting and going back in, it doesn’t show up again.

  • @kinger37 #wiki worthy

  • @george1421 @Wayne-Workman I have just updated to the latest SVN 5858 and updated the firmware of the 7040 to 1.4.4. I wasn’t getting a plain black screen anymore, but registration kept failing with “An Error has been detected” “Cannot find disk on system”.

    The solution to that issue was in my BIOS/UEFI settings: under Device Config > SATA Operation, RAID On was enabled and I had to change it to AHCI.

    Testing deploying an image right now.

  • Just my personal experience - FOG Trunk from a month ago and a 7040sff out of the box worked fine. We are using them in UEFI mode, but are using legacy for network booting.

  • @george1421 Hi George - I just had it work without doing any updates - not sure what changed. The only thing I did try was doing the Legacy boot after scheduling a task to capture the image. I am going to try a straight-from-the-factory 7040 and see what happens.

    If it fails again though I will update firmware and FOG trunk.

    Thanks again for all your help!

  • Moderator

    @george1421 Update 2: Success

    I had success, but not sure what really fixed it. The 7040 is booting in uefi mode.

    1. I updated the 7040 firmware from 1.2.8 to 1.4.4 (note this is not a 7040M but a 7040 SFF)
    2. I first attempted to boot the FOS engine from a USB stick and it worked, and it was fast!! With the FOS engine I was able to see the disk, and it showed up as an NVMe disk with its crazy name.
    3. Next I updated the FOG trunk to whatever the latest release was about 10 minutes ago (8460)
    4. I then pxe booted into the fog menu, this time iPXE took about 6 seconds to initialize hardware. I selected the compatibility test and this time it detected the hard drive and network adapter.

    So what fixed it? Either the firmware update or the FOG trunk update. I have not yet deployed any images to this computer, but I suspect it will work, but I don’t have a uefi/gpt image to test on it.
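
    On the “crazy name”: Linux names NVMe namespaces `nvme<controller>n<namespace>`, with partitions as a `p<number>` suffix (for example `nvme0n1p1`), unlike the `sda`/`sda1` pattern used for SATA disks. A minimal sketch of parsing that naming, using a hypothetical helper that is not taken from FOG or the FOS engine:

```python
import re

# nvme<controller>n<namespace>[p<partition>], e.g. nvme0n1 or nvme0n1p2
_NVME_RE = re.compile(r"nvme(\d+)n(\d+)(?:p(\d+))?")


def parse_nvme_name(name):
    """Split a Linux NVMe block device name into its numeric parts.

    Returns a dict with controller, namespace and partition (None for a
    whole-namespace device), or None if the name is not an NVMe device name.
    """
    m = _NVME_RE.fullmatch(name)
    if m is None:
        return None
    controller, namespace, partition = m.groups()
    return {
        "controller": int(controller),
        "namespace": int(namespace),
        "partition": int(partition) if partition is not None else None,
    }


if __name__ == "__main__":
    for dev in ("nvme0n1", "nvme0n1p2", "sda1"):
        print(dev, parse_nvme_name(dev))
```

    Any script that assumes partitions are just the disk name plus a digit (as with `sda1`) will get these devices wrong, which is one plausible reason older tooling fails to find the disk.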

  • Moderator

    @george1421 Update:

    I pxe booted using ipxe.efi into the fog menu. It sat for about 20 seconds at initializing devices then eventually called up the FOG iPXE menu. From there I selected check compatibility and the screen went black (like you noted). It was black for so long I turned around and started looking for firmware updates. Then I had a phone call. When I turned around the computer was sitting at the compatibility menu. This was probably about 5 minutes from the time I booted. I ran the compatibility test and the network card passed and the drive failed to detect.

    I know with these systems we had to add 2 KBs to our Windows 7 build so that Windows 7 could detect these drives. Not sure about the FOS engine.

    I’m going to do the following.

    1. Update my FOG dev box to the latest release
    2. Search for firmware updates for the 7040
    3. USB boot the FOS Engine to try to get into debug mode so I can attempt to interact with the drive.

  • Moderator

    @george1421 OK, our 7040s come from Dell preconfigured for legacy. So in legacy mode I can see the M.2 disk without issue. It shows up under the boot sequence as PM951 NVMe …

    Make sure your legacy ROM setting is turned on under advanced boot options (but again, this is only if you want to run legacy mode).

    Let me switch this over to uefi mode and see what I can break.

  • Moderator

    @kinger37 Let me fire one up, I got sidetracked today (not in a good way).

  • @george1421 Hi George, I just finished testing that and you are correct - it registers correctly. However, when I check the inventory of that machine, it shows no hard drive installed 😞

  • I also want to add something I just noticed - I have to use UEFI, or my hard drive (M.2 256GB PCIe SSD Samsung PM951 NVMe Drive) isn’t detected under Legacy boot. @george1421 Do you have the same hard drive as I do?

  • @george1421 I know what’s wrong, but I don’t know how to fix it :(.

  • Moderator

    @kinger37 Just to rule out some other issue: if you switch one of these target computers to BIOS (legacy) mode, does it register correctly?