FOG menu boot loop after image deployment

AllFoggedOut

My “possible bug” comment related to the fact that, after changing “Image Path” and “FTP Path” in Storage Management -> DefaultMember to a new path, the FOG script file “src/buildroot/package/fog/scripts/bin/fog.checkmount” still contained the old path, which caused image deployment to fail at the “Checking Mounted File System” stage. It failed because I had deleted the old, defunct path from disk.

@george1421, yes, this is still an iPXE exit mode issue.

In summary, Win7x64 Legacy == iPXE Exit Mode loop. Win10x64 EFI == no issues.

Tom Elliott

@AllFoggedOut Unfortunately, once the client is loaded into FOS, the only way to change the kernel-based variables is to reboot the client. You could make a call to get new data, but that could put tremendous load back on the server (imagine 20 hosts booting to perform an imaging task, all calling out to the main server to update their scripts).

It’s simpler just to reboot the machine (or machines as the case may be) to pick up the new data.

AllFoggedOut

@Omar.rodriguez said in FOG menu boot loop after image deployment:

I understand I might be late to this thread… but we’ve run into this issue here at the office I work at. Do you happen to have 2 hard drives on your computer?

The NUC contains a single M.2 SSD

When you get the FOG menu on the PC and you press the “Esc” key does it boot into the OS?

No, I get an error message saying chainloading failed, followed by an automatic reboot after 10 seconds. This is for Win7x64 using SANBOOT.

AllFoggedOut

@Tom-Elliott said in FOG menu boot loop after image deployment:

It’s simpler just to reboot the machine (or machines as the case may be) to pick up the new data.

The script I’m referring to is/was on the server.

Tom Elliott

@AllFoggedOut You’re starting to confuse me.

The “bugs” you were referring to related directly to imaging. What I’m trying to say is that the “bugs” you saw were in the FOS system. They have nothing to do with what you’re seeing with the boot menu loop.

The only reason you saw issues in imaging earlier is because you made a change somewhere. These issues should no longer be present now that you’ve taken corrective action.

Tom Elliott

@AllFoggedOut said in FOG menu boot loop after image deployment:

During the course of testing I had to move my image storage directory owing to a lack of space. I moved all files under /images to another LVM volume group. I then updated the path in Storage Management. I then ran into an issue post-image capture (repeated “Database Update failed” messages) which was resolved by changing ownership and permissions on the new folder and its sub-folders (fog:fog, mode 775). Not sure if this is correct, but it worked. Apache error_log showed the FTP rename operation failing. I then had a 2nd issue during image deployment where “Checking Mounted File System” failed. Seems the script “src/buildroot/package/fog/scripts/bin/fog.checkmount” still contained the old storage path (possible bug?). I also must have missed the “.mntcheck” file when moving files around - had to recreate it.

This is what I’m referring to.

Your first issue was caused by permissions; once that was fixed, the next issue was that the .mntcheck file didn’t exist. Now that those issues are corrected (these are both within the inits, btw), the bugs you thought you were seeing will no longer be there.

AllFoggedOut

Ok, I’ll try and explain this from a different angle.

I have an issue with iPXE Exit Mode loop that is affecting Win7x64/Legacy BIOS.

In the process of trying to diagnose this issue, I decided to install Windows 10/EFI to see if this also suffered from the same problem.

During the cloning of Windows 10 I realised I was running out of disk space in “/images” on the FOG server.

I aborted the clone, cancelled the imaging task and set about creating a folder for use by FOG to store images.

I moved everything under “/images” to “/stor/fog”. Via the FOG web interface, I went into Storage Management -> DefaultMember, and changed “Image Path” and “FTP Path” to the new folder.

I then recreated the imaging task which failed because I had the wrong permissions on my folder (my fault, not a bug).

When I then attempted to deploy the same image, it failed because the FOG script “fog.checkmount” still contained the OLD storage path. Why did this script still contain the old path? Shouldn’t it have been updated to the new path when I updated DefaultMember?

I also failed to copy “.mntcheck” from /images (my fault, not a bug).

I also deleted “/images” from the server. If I had left this folder in place, I’m guessing the “Checking Mounted File System” would have just continued to work, despite pointing to a defunct/no longer used location.

Now that I’m thinking about it, the NFS exports were also not updated. I had to manually edit /etc/exports, put the new folder in and run exportfs -a.
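
For anyone repeating this kind of move, here is a minimal sketch of a read-only sanity check covering two of the manual steps above: the “.mntcheck” marker file and the /etc/exports entry. It is written in Python purely for illustration and is not part of FOG; the function name check_storage is made up, and the exports parsing is simplified (it assumes one export per line with an unquoted path and no line continuations).

```python
import os


def check_storage(path, exports_text):
    """Return a list of problems found for a storage path (empty list = none found).

    Hypothetical helper, not part of FOG. `exports_text` is the content of
    /etc/exports, passed in as a string so the check itself changes nothing.
    """
    path = os.path.normpath(path)
    if not os.path.isdir(path):
        return [f"{path} does not exist"]

    problems = []

    # The marker file that had to be recreated after the move.
    if not os.path.isfile(os.path.join(path, ".mntcheck")):
        problems.append(f"{path}/.mntcheck is missing")

    # Simplified parsing: the first whitespace-separated field of each
    # non-comment line is taken as the exported directory.
    exported = {
        os.path.normpath(line.split()[0])
        for line in exports_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    if path not in exported:
        problems.append(f"{path} is not listed in the NFS exports")

    return problems
```

On the server, something like check_storage("/stor/fog", open("/etc/exports").read()) should return an empty list once the marker file and the export entry are in place. It does not check ownership or mode, and it does not run exportfs -a for you.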

My assumption was that updating DefaultMember would mean:

a) NFS exports is updated with the new folder
b) fog.checkmount is updated with the new folder

Perhaps the preferred/supported approach is to simply add new storage areas, rather than editing the existing ones.

None of the above has any relation to my iPXE Boot Menu issue with Win7x64/Legacy. It is a coincidental observation that I thought I’d bring to your attention in case it’s a bug with updating Storage Management paths.

Tom Elliott

@AllFoggedOut The client (FOS), in debug mode or otherwise, will only pick up the updated elements once the client is rebooted.

I’m going to guess you were in debug mode on the client you were telling to re-deploy the image?

Tom Elliott

Pinging you on chat so as to keep the arguing in forums to a minimum.

Ultimately, you should not have the problem any more now that the system is set up properly. If you want to test this, you can send the client to deploy again. It will pick up the updated storage information and should work properly.

If you can post screenshots of the error (if it is indeed occurring) it would certainly go a lot further toward verifying and fixing the issue. This, however, I have not seen.

This, of course, doesn’t fix the boot loop mechanism. I’m just trying to help clarify points as best I can.

AllFoggedOut

Thanks to @TomElliott for clearing things up via Chat. Seems I dug myself a hole when changing storage paths.

I’ve posted about the F10 boot weirdness on the Intel Support forums. Will update based on what comes back.

Quazz

AFAIK, aside from updating the Storage node, you also need to update the /opt/fog/.fogsettings file with the correct path and then rerun the installer. (Not necessarily addressed to anyone in particular, more for people who might run into similar problems.)
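
To see which path the installer will pick up before rerunning it, the settings file can be read with a few lines of Python. This is only a sketch: it assumes .fogsettings uses shell-style KEY='value' lines, and the key name storageLocation mentioned in the usage note is an assumption to verify against your own file. read_settings is a hypothetical helper, not part of FOG.

```python
import shlex


def read_settings(text):
    """Parse shell-style KEY='value' lines into a dict.

    Blank lines, comment lines and lines without '=' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split removes surrounding quotes the way a shell would.
        parts = shlex.split(raw)
        settings[key.strip()] = parts[0] if parts else ""
    return settings
```

For example, read_settings(open("/opt/fog/.fogsettings").read()).get("storageLocation") would show the stored path, if that key exists in your file. Compare it with the Image Path set on the Storage node before rerunning the installer.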

AllFoggedOut

Just a quick update: Intel are still investigating. They did release v57 BIOS asking me to test it ~~but it hasn’t resolved the issue~~.

Will update if any/more progress is made.

Edit: see my update below.

AllFoggedOut

So, for reasons I don’t quite fully understand, the problem appears to be resolved (it wasn’t working yesterday after I installed v57 BIOS…)

I was looking at BIOS options, hoping to find something along the lines of “Always show Boot Menu”, thinking this might be a hack to work around the problem. Following a reboot I discovered the NUC now boots from iPXE Boot Menu w/o issue.

I’m still a little confused as to how this is now working, after testing it yesterday and finding it wasn’t playing ball. Not sure if the act of disabling the SATA channel has somehow adjusted configuration somewhere (I re-enabled SATA to confirm if that was the cause, but it made no difference - everything still works fine).

I’ve just completed a Legacy Win7x64 SP1 image deployment + reboot - no issues at all.

Weird…but in a good way.

george1421

@AllFoggedOut I did find, while playing with the NUCs, that sometimes you need to power them off after making changes in the firmware, as opposed to just restarting the device. I can’t remember the exact situation now, but we had one model where, after you changed the BIOS settings and saved them, you actually had to power off the NUC (remove the power cable) before it would use the new setting.

So now when you get the second one it should only take you about half the time this one took to get things moving, right? Either way, great job. These NUCs are sweet, but a bit on the finicky side too.

AllFoggedOut

@george1421 Yeah, maybe that was it - just a cold boot after the BIOS update to “bed things in”. Like I said, I was stunned to see the NUC suddenly behaving, so I went back into the BIOS and undid my changes to confirm if that was the cause - still boots w/o issue.

I recall there being a Wiki page somewhere listing hardware that is known-to-work with FOG. Would be good to update that with this information. Not sure if this account has permissions to update…otherwise I’ll create a new one.
