Unfreeze drive from FOG init image
-
We have been modifying the FOG init image to include some bespoke functions, useful to our organisation. One of these requires us to occasionally unfreeze frozen drives. Normally we would use systemctl suspend but this functionality is not available in the FOG init image.
Any ideas?
Thanks!
EDIT: Might have solved my own problem (imagine that)
https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate
Will update here if I get it to work.
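For anyone landing here: the low-level interface that wiki page covers is the kernel's own /sys/power/state file, which needs no systemd. Below is a minimal sketch, assuming the running kernel was built with suspend support and /sys is mounted. The function names and the optional file argument are made up for illustration, and something (power button, keyboard, RTC alarm) still has to wake the machine.

```sh
#!/bin/sh
# Sketch: suspend to RAM without systemd, via the kernel's sysfs interface.
# The optional file argument exists only so the check can be tried against
# a test file; on a real system the default /sys/power/state is used.

# Succeeds if the kernel advertises the "mem" sleep state.
supports_mem_sleep() {
    state_file="${1:-/sys/power/state}"
    [ -r "$state_file" ] || return 1
    # The file holds a space-separated list such as "freeze mem disk".
    for state in $(cat "$state_file"); do
        [ "$state" = "mem" ] && return 0
    done
    return 1
}

# Suspends the machine and returns after resume (run as root).
suspend_to_ram() {
    supports_mem_sleep || { echo "suspend to RAM not available" >&2; return 1; }
    echo mem > /sys/power/state
}

# Usage on the target machine:
#   suspend_to_ram
```

If supports_mem_sleep fails, the kernel most likely lacks suspend support, which is what turned out to be the case with the stock FOS kernel.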
-
@c4c If you need to add functions or additional programs to the inits: FOS is built on buildroot. If you are extensively tweaking the inits, there may be value in setting up a buildroot environment to rebuild the inits with your modifications added in. While this is not hard to do, it does take a little time, and the first build will take some additional time while it creates the toolchains.
If you need to go this far I can give you some guidance.
-
@george1421 Thank you for your response.
We compiled a custom FOS kernel following the instructions https://docs.fogproject.org/en/latest/reference/compile_fos_kernel.html?highlight=FOS but with CONFIG_SUSPEND=y etc.
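A quick way to confirm that options like these actually ended up built in is to check the kernel .config after configuring. This is only a sketch: the function name is hypothetical, and which options matter beyond CONFIG_SUSPEND is an assumption (ACPI and RTC_CLASS are used here purely as examples).

```sh
#!/bin/sh
# Sketch: print every requested option that is NOT set to "y" in a kernel
# .config file. Option names are passed without the CONFIG_ prefix.
# Options built as modules (=m) are reported too, since FOS compiles
# its drivers in rather than loading modules.
missing_kernel_options() {
    config_file="$1"
    shift
    for opt in "$@"; do
        grep -q "^CONFIG_${opt}=y\$" "$config_file" || echo "CONFIG_${opt}"
    done
}

# Usage (the path is a placeholder for wherever the kernel was configured):
#   missing_kernel_options /path/to/linux/.config SUSPEND ACPI RTC_CLASS
```

No output means every listed option is built in.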
This technically works, we can suspend and resume now and machines can be accessed via SSH but their screens stay blank. We’re assuming graphics driver issues but all of the information we can find for diagnosing and fixing these issues relies on tools which are not in the init image and are more complicated to add to it.
As such, we are now looking at following your advice and setting up our own buildroot environment but we’d like to keep everything as close to the default FOS setup as possible. I cannot find anything on docs.fogproject.org that would help with this so any guidance you can give would be extremely helpful!
Thank you.
-
@c4c OK. First, I can tell you are a bit more advanced from what you have done so far. If you are not familiar with Linux then don’t follow these directions, since there will be holes; it’s not a complete step-by-step, only a direction of the path you need to walk.
Let me also say my buildroot environment is set up differently than the way the devs set up theirs. Mine’s different because I build other embedded OS images outside of FOG.
You will need a Linux computer to build the buildroot environment. Use a current release of Debian [10 or 11] or Ubuntu 20.04 and install the build-essential package.
For FOG 1.5.9 use this version of buildroot: https://buildroot.org/downloads/buildroot-2020.02.tar.gz
Expand that tarball into a working directory (e.g. ~/work). In the same working directory (~/work), clone the FOS Linux repository from GitHub.
git clone https://github.com/FOGProject/fos
That will create a fos directory (~/work/fos).
Copy the contents of
~/work/fos/Buildroot
into
~/work/buildroot-2020.02
Then edit
~/work/buildroot-2020.02/package/Config.in
and add the section from `~/work/buildroot-2020.02/package/newConfig.in` near the bottom of Config.in. This will add the FOG package options into the buildroot menus.
Lastly you need to copy over the fog settings for buildroot into your buildroot tree.
Copy
~/work/fos/configs/fsx64.config
to
~/work/buildroot-2020.02/.config
(yes, the hidden config file that starts with a dot).
Once you have everything in place, from the buildroot base directory (~/work/buildroot-2020.02) key in
make nconfig
(you might get an error about missing libraries here; go back and install them, then run it again).
You should now be in the buildroot configuration menu. I want you to check that the FOG package menus are listed and that they are checked. This will confirm you have set up everything correctly. I know this is a lot of manual setup work, but in the end it will allow you to start at the same point FOS Linux is at for 1.5.9.
The FOG added in menus will appear as this:
Save the settings in the buildroot menu configuration and exit.
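The same check can be done from the command line once the configuration is saved. This is a rough sketch that assumes the FOG package symbols contain the string FOG in their name; the exact symbol names are not given in this thread, so treat the pattern as a guess and the function name as hypothetical.

```sh
#!/bin/sh
# Sketch: list enabled Buildroot package options whose symbol name mentions
# FOG. The assumption that the symbols contain "FOG" is not confirmed here.
list_fog_options() {
    grep -i '^BR2_PACKAGE_[A-Z0-9_]*FOG[A-Z0-9_]*=y' "$1"
}

# Usage:
#   list_fog_options ~/work/buildroot-2020.02/.config
```

If it prints nothing, fall back to checking the menus by eye in make nconfig before concluding the setup is wrong.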
Now key in
make -j4
and that will start the process. If you have more than 4 processors you can increase the number of threads to decrease the build time. The first time through it may take about an hour to build the init.xz file. This is because it’s building the toolchain and all of the programs needed to build the init.xz file. On the second run it will be much faster since it only rebuilds what goes into the init.xz file.
Once the init.xz file is complete, move it to the FOG server as init_test.xz (so you don’t mess up the FOG-provided init.xz file). Now for your test target computer, go into the host management page for this specific computer, insert init_test.xz into the initrd field, and save it. Now PXE boot the target computer and pick a task such as hardware verification. Watch the screen closely, as it will quickly transfer bzImage and then init_test.xz to the test computer. If it does transfer init_test.xz then you have FOG configured correctly.
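The setup and build steps above can be sketched as a script. This is an illustration, not the developers' build procedure: the names WORK, RUN, JOBS, step, fos_prepare and fos_build are made up here, wget is assumed to be installed, and by default every step is only printed so the plan can be reviewed first (set RUN=1 to actually execute). The Config.in edit is left as a manual step.

```sh
#!/bin/sh
# Sketch of the ~/work layout described in this post. Dry run by default.
WORK="${WORK:-$HOME/work}"
BR="$WORK/buildroot-2020.02"
BR_URL="https://buildroot.org/downloads/buildroot-2020.02.tar.gz"
JOBS="${JOBS:-4}"

# Print a command, and run it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Fetch Buildroot and FOS, then lay the FOS files over the Buildroot tree.
fos_prepare() {
    step mkdir -p "$WORK" &&
    step wget -P "$WORK" "$BR_URL" &&
    step tar -xzf "$WORK/buildroot-2020.02.tar.gz" -C "$WORK" &&
    step git clone https://github.com/FOGProject/fos "$WORK/fos" &&
    step cp -a "$WORK/fos/Buildroot/." "$BR/" &&
    step cp "$WORK/fos/configs/fsx64.config" "$BR/.config" &&
    echo "next: add the FOG section to Config.in by hand, then check the menus with make nconfig"
}

# Build the init once the configuration has been checked.
fos_build() {
    step make -C "$BR" -j"$JOBS"
}

# Usage:
#   fos_prepare            # review the printed plan
#   RUN=1 fos_prepare      # execute it
#   RUN=1 JOBS=8 fos_build
```

Keeping the build in its own function means it is never started before the manual Config.in edit and the menu check have been done.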
On this first run, don’t change any settings from what the FOG developers have provided. You want to test to make sure you can successfully build the init.xz file. From that baseline you can then make changes to the configuration using the
make nconfig
command. If you need to include files or other content in the init.xz file, you can add them to the
~/work/buildroot-2020.02/board/FOG/FOS/rootfs_overlay
directory structure. These files get copied into the init.xz file as it’s being created. Any tweaks you made by unpacking the init.xz file can be inserted here.
I know this is A LOT of information, because buildroot IS very complex. BUT you can choose exactly which buildroot packages to include, to give you the exact initrd you need.
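As an illustration of the overlay idea, here is a small sketch that drops a file into the overlay at the path it should have inside the init. The function name, the OVERLAY variable and the example script name are hypothetical; only the overlay directory itself comes from this post.

```sh
#!/bin/sh
# Sketch: copy a file into the rootfs overlay so it appears at the given
# absolute path inside the built init.
OVERLAY="${OVERLAY:-$HOME/work/buildroot-2020.02/board/FOG/FOS/rootfs_overlay}"

add_to_overlay() {
    src="$1"            # file on the build host
    dest="$2"           # absolute path inside the init
    mode="${3:-755}"    # permissions for the copied file
    mkdir -p "$OVERLAY$(dirname "$dest")" &&
    cp "$src" "$OVERLAY$dest" &&
    chmod "$mode" "$OVERLAY$dest"
}

# Usage (my_wipe.sh is a placeholder for your own script):
#   add_to_overlay ./my_wipe.sh /usr/local/bin/my_wipe.sh
#   make -j4    # rebuild so the overlay is applied
```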
-
@george1421 That’s literally perfect, thanks! Seems fairly straightforward, I’ll update this thread with progress later.
As a note, we are a charity refurbishing computers/laptops and donating them on. We have our own external database that we push data to, but we use FOG to harvest the data (we’ve created our own inventory and registration scripts to record all drives, PCI devices and every RAM module separately). We also use FOG to image outgoing devices and to load tools like Parted Magic. We are now writing our own data erasure and verification script, which is why we need to be able to sleep machines (in order to unfreeze drives). This will effectively complete the automation of processing incoming items. Currently we have to use external tools to perform data erasure. Some of these we have configured to also report into our database, but it would be quicker and more convenient to do it all through one boot of the machine, hence adding the functionality to our FOS image.
-
@c4c Well that’s great you found a new way to extend fog beyond how the developers imagined it. Well done.
Through some testing I found that while FOS Linux is its own operating system it most closely resembles Ubuntu 16.04 in its libraries as shipped. So, I found that precompiled applications for Ubuntu 16.04 seem to work on FOS Linux built on buildroot 2020.02. So if the application you need is not in the buildroot catalog sometimes you can borrow it from other sources precompiled.
-
@george1421 Just wanted to update and say thanks again! Your guide is excellent, very easy to follow and everything is working. It would be really useful to have something like this up on fog docs.
Unfortunately we still haven’t been able to solve our core issue though. We can suspend to ram and then wake up (and ssh in) but the screens don’t wake up. I’m sure we’ll figure it out sooner or later. We are certainly learning a lot in the process!
-
@c4c The screen doesn’t wake up. I think I was focused on you adding things to the init based on your post subject.
You have to remember that FOG is intended for hit and run computing. It boots, it images, it reboots. There is no time for sleeping.
With that said, I’m betting there are bits missing in the Linux kernel, since the kernel is responsible for managing the computer’s hardware. The FOS Linux kernel is not a general purpose OS; it’s customized specifically for imaging (hit and run computing).
If it doesn’t come out of sleep then I would look at the ACPI settings in the kernel config. These values are not enabled in the FOS Linux kernel.
It may also be a display driver. I would first just turn on all of the ACPI functions and rebuild the kernel to see how it goes, then do the hard part described next. If you want to go down this path (and learn a bunch more), load a full Linux distro onto this hardware. Then run this command
lsmod
This will list all of the kernel drivers that are loaded as modules. You will need to look at each one to decide what it does. There may be one specific to either the display or possibly the backlight that needs to be included in the FOS Linux kernel.
One thing you need to know about Linux: you can either have dynamically loaded drivers (modules) or integrated ones (compiled in). For speed and simplicity FOG uses a select set of compiled-in drivers; a traditional Linux OS will typically use modular drivers to support a larger range of hardware. The lsmod command will list those dynamically loaded modules. You cannot use the lsmod command to find the names of the compiled-in drivers, but there are other ways to determine that.
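To make the lsmod output easier to work through, it can be reduced to a plain sorted list of module names. A minimal sketch follows; the function name is hypothetical, and the commented usage lines use i915 and card0 only as examples of what you might see, not as a claim about your hardware.

```sh
#!/bin/sh
# Sketch: read `lsmod` output on stdin and print just the module names,
# sorted, skipping the header line.
module_names() {
    awk 'NR > 1 { print $1 }' | sort
}

# Usage on the full distro:
#   lsmod | module_names > modules.txt
# The driver bound to the first graphics device (if it is card0) can be
# read from sysfs:
#   readlink /sys/class/drm/card0/device/driver
# In the kernel source, the Makefile line that builds a module shows the
# config option to enable, for example:
#   grep -rn 'i915\.o' drivers/ --include=Makefile
```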
Building the linux kernel is much like building the initrd file with buildroot. The process is similar so what you know now will help you with the next phase if you want it.
-
@george1421 We’re half a step ahead here! Already enabled all the CONFIG_ACPI options and CONFIG_SUSPEND etc. and built a new kernel. We also enabled RTC for rtcwake but that still leaves the screen blank. It also turns out that acpitools (something we thought might work) tries to suspend using /proc/acpi/sleep which, from what I have read, is deprecated because it’s an abuse of /proc.
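For reference, the rtcwake test cycle looks roughly like the sketch below. It assumes rtcwake from util-linux is in the init and the kernel has RTC and suspend support. The function name and the RTCWAKE override are hypothetical; the override exists only so the command can be previewed with echo instead of actually suspending.

```sh
#!/bin/sh
# Sketch: suspend to RAM and have the RTC alarm wake the machine again
# after the given number of seconds (run as root).
timed_suspend() {
    secs="${1:-20}"
    case "$secs" in
        ''|*[!0-9]*) echo "usage: timed_suspend SECONDS" >&2; return 2 ;;
    esac
    # -m mem selects suspend-to-RAM, -s sets the wake-up delay in seconds.
    ${RTCWAKE:-rtcwake} -m mem -s "$secs"
}

# Usage:
#   RTCWAKE=echo timed_suspend 15   # preview the arguments only
#   timed_suspend 15                # really suspend for about 15 seconds
```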
We are running under the assumption that at least some power management tools have scripts built in to reinitialise monitors/displays etc. using methods which are currently unknown to us and that the easiest solution would be to just get one of them to do it all for us. Unfortunately, so far, this hasn’t worked out.
Right now I’m thinking we need to see if we can just tell stuff to turn back on so I might have to figure out enough ASL to turn the screen back on using acpiexec and an AML file. I’ve also added kexec in to our init though so maybe we can just reload the init after sleeping to unfreeze and that will restart everything.
Another option might be (if this is possible) to use ASL/AML/acpiexec to directly remove power from the drive and then power it back up.
hdparm has an unfreeze command (doesn’t always work) but unfortunately nvme-cli has no equivalent. We even tried setting the power states to their lowest possible and then setting them back. We also tried the reset command (didn’t work), and the reset-subsystem command just made the drive disappear.
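For SATA drives, whether a sleep cycle actually cleared the frozen state can be checked from the Security section of the hdparm -I report. A minimal sketch, assuming hdparm's usual output where the state appears as a line reading either "frozen" or "not frozen"; the function name is hypothetical and this does not cover NVMe drives.

```sh
#!/bin/sh
# Sketch: read `hdparm -I` output on stdin and succeed only if the drive
# reports itself as security-frozen. A frozen drive prints a line whose
# only word is "frozen"; an unfrozen one prints "not frozen".
is_frozen() {
    awk 'NF == 1 && $1 == "frozen" { found = 1 } END { exit !found }'
}

# Usage (/dev/sda is a placeholder for the drive being checked):
#   if hdparm -I /dev/sda | is_frozen; then
#       echo "still frozen, sleep the machine and check again"
#   fi
```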
In general, in buildroot, I can’t find any power management tools other than the acpi stuff and powertop (which is useless). Another option might be to try and add in uswsusp manually but the last build of that is from 2011!
The amount of stuff we are learning to solve one, small problem is nuts.
-
Final update: Got everything working! The monitor not waking up after sleep was fixed by adding some extra graphics drivers to the kernel. We’ve also found some solutions for wiping difficult machines (like Lenovos), which normally force you to use tools built into the UEFI or a special Lenovo wipe program that only works on certain models, asks you to confirm 5 times, and gives you a randomly generated key that you have to enter after a reboot. As a note, even Parted Magic fails to wipe Lenovo drives.
Anyway, thank you @george1421 you’ve helped me to learn a great deal and enabled our success in this.