Clients (randomly?) boot to Debug Mode on regular deploy or capture
-
FOG server: Debian 12
FOG storage nodes (x5): Linux Mint 22.1
All on FOG Stable: 1.5.10.1667
Issue: Clients, seemingly at random (or I just haven’t pinned down the context), boot into Debug Mode instead of automatically deploying or capturing. In those cases, running fog.download or fog.upload works correctly, step-by-step, but obviously isn’t great for mass deployment.
I’m primarily working on deploying Windows 11 to Dell 3430 series PCs, but the Debug Menu problem also occurs with our older, stable Windows 10 / Dell 7010 images.
What I’ve done to get myself into this:
I inherited a network / system that was running FOG 1.5.9 on Debian 9. I used it as-was for several months until I saw a schedule break that made sense for upgrading.
I made new installs of Debian 12 (on the FOG server) and Linux Mint 22.1 (on the 5 storage nodes) on new(er) PCs. Then I copied the old /opt/fog and /images directories to the new PCs, as well as the existing mariadb database. Finally, I ran the 1.5.10.1667 install script on each, which used the pre-existing .fogsettings.
Coming at this upgrade from my oblique and ignorant angle, I did run into several problems with permissions, etc, but learned a fair bit about FOG in the process of head-banging. But now I’m stuck on this Debug Mode issue. I’m really sorry that I haven’t figured out what makes it (deploy/capture) work automatically (no Debug) sometimes and not others. My best guess so far is it might depend on which storage node gets selected for the task.
I think I’ll power down the 2 storage nodes that I know have been used when Debug Mode engaged and work from there. But I’m hoping someone here will immediately recognize this problem and say, “Hey Dummy, just do [this simple thing] right!”
Thanks in advance.
PS: yes, I know Mint isn’t supported, but other than one package install glitch, I’ve seen no obvious problems.
-
@RAThomas If you’re comfortable with mysql, I might suggest taking a look at the database:
You’ll have to get your fogmaster password from the /opt/fog/.fogsettings file (the snmysqlpass value).
sudo mysql -u fogmaster fog -p
select hostID,hostName,hostKernelArgs from hosts where hostKernelArgs LIKE '%debug%';
This might not be the best way but since I don’t know where the debug will show up in the hostKernelArgs, this is the best method I can give right now.
Most likely, I believe, you’re going to see your hosts with these.
If that doesn’t yield anything, are the hosts part of a group? If so, the group likely has the kernel arg also, so check with:
select groupID,groupName,groupKernelArgs from groups where groupKernelArgs LIKE '%debug%';
I’m suspecting one if not both of these will yield you back the results to help you fix the affected items?
-
I’d already browsed the DB (via phpMyAdmin) for anything debug-ish, but didn’t really know where to look. Anyway:
MariaDB [fog]> select hostID,hostName,hostKernelArgs from hosts where hostKernelArgs like '%debug%';
Empty set (0.001 sec)
MariaDB [fog]> select groupID,groupName,groupKernelArgs from groups where groupKernelArgs like '%debug%';
Empty set (0.001 sec)
For completeness, a BIOS boot client (previously a months-stable FOG client) that booted to debug mode on deploy just yesterday uses ‘undionly.kpxe’ via DHCP. My more recently added UEFI boot client that has booted to debug mode uses ‘snponly.efi’ via DHCP.
Edit: It just occurred to me that I should check the actual kernel args once booted into debug mode. Check logs too? Anything else I can check on the client side at runtime to provide clues?
-
@RAThomas I’m concerned that it’s randomly entering debug, rather than on all machines, all the time.
That said, when you create the image you can define the task type to also include debug mode.
Can you walk us through the exact steps?
If it were all, you’d look at fog settings.
The UI probably shows it as just Kernel Args, or something similar.
The DB side would be:
select * from globalSettings where settingKey = 'FOG_KERNEL_ARGS'\G
Sorry for not having a clean direct answer right away, but working through it is all we can do right?
-
@Tom-Elliott said in Clients (randomly?) boot to Debug Mode on regular deploy or capture:
but working through it is all we can do right?
Got that right.
I’ve verified “isdebug=yes” for the booted client kernel args, so I’m probably just missing something obvious on the other end. Checking…
Result:
MariaDB [fog]> select * from globalSettings where settingKey = 'FOG_KERNEL_ARGS'\G
*************************** 1. row ***************************
      settingID: 51
     settingKey: FOG_KERNEL_ARGS
    settingDesc: This setting allows you to add additional kernel arguments to the client boot image. This setting is global for all hosts.
   settingValue:
settingCategory: General Settings
1 row in set (0.001 sec)
For the client ‘Master-3430’ that I’ve got booted to debug mode right now:
MariaDB [fog]> select hostName,hostKernelArgs from hosts where hostName = 'Master-3430'\G
*************************** 1. row ***************************
      hostName: Master-3430
hostKernelArgs:
1 row in set (0.000 sec)
-
Once a task is created, it will have 'taskDebug' set to 1, I believe. So maybe:
select * from tasks WHERE taskDebug=1;
It may help point us to why you’re seeing them go into debug. There are a few other options, though they may need a remote session (if and when you’re able, and I’m able as well, I suppose) to troubleshoot a bit further.
-
I think I’m onto it. Browsing the table ‘globalSettings’ I found FOG_KERNEL_DEBUG which was empty:
MariaDB [fog]> select * from globalSettings where settingKey = 'FOG_KERNEL_DEBUG'\G
*************************** 1. row ***************************
      settingID: 133
     settingKey: FOG_KERNEL_DEBUG
    settingDesc: This setting allows the user to have the kernel debug flag set. Default is off.
   settingValue:
settingCategory: FOG Boot Settings
1 row in set (0.001 sec)
So I changed its value to 0:
MariaDB [fog]> select * from globalSettings where settingKey = 'FOG_KERNEL_DEBUG'\G
*************************** 1. row ***************************
      settingID: 133
     settingKey: FOG_KERNEL_DEBUG
    settingDesc: This setting allows the user to have the kernel debug flag set. Default is off.
   settingValue: 0
settingCategory: FOG Boot Settings
1 row in set (0.001 sec)
Then I canceled and restarted my deploy task to client Master-3430 and it went straight into deploying, no debug mode.
Assuming this is my ‘global’ fix (Edit: looks like it is. 2 auto deploys to Master-3430 and 2 to an older BIOS boot client), it leaves some questions. Is “empty” supposed to be a valid value for FOG_KERNEL_DEBUG? If not, did my db miss some upgrade step because I fumbled my way through copying files? How did this problem crop up just in my case / is it likely to happen to other upgraded installations?
I could resurrect my old server and inspect its database if that is helpful to you.
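The post above shows only the before and after select output, not the statement that made the change. A minimal sketch of an update that would produce that result, assuming the value was set directly from the MariaDB prompt (the exact statement used is not given in the thread; the table and column names are taken from the output above):

```sql
-- Assumed statement, not quoted from the thread:
-- set the empty FOG_KERNEL_DEBUG value to 0 (off).
update globalSettings
set settingValue = '0'
where settingKey = 'FOG_KERNEL_DEBUG';

-- Confirm the change.
select settingKey, settingValue
from globalSettings
where settingKey = 'FOG_KERNEL_DEBUG';
```

Backing up the fog database before editing it by hand is a sensible precaution.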
-
I bet that going into the web UI’s “FOG Boot Settings” and toggling the “Kernel Debug” check box ON and then back OFF would have fixed it for me too, instead of me editing the database directly.
I’m ready to call this one “Solved” for my own purposes. I do wholeheartedly thank you for pointing me in the right direction!
-
@RAThomas If the value is empty it should think it’s false, and since it seemed to be random that’s where I got confused. But I’m glad this is working for you.
Thank you!
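On the open question of whether other settings came through the copied database blank, a read-only check along these lines might help. It is only a sketch, assuming the same globalSettings columns shown earlier in the thread. It just lists rows; whether an empty value is actually a problem for any given setting is not something this thread establishes.

```sql
-- List FOG Boot Settings rows whose value is NULL or an empty string.
select settingID, settingKey, settingValue
from globalSettings
where settingCategory = 'FOG Boot Settings'
  and (settingValue is null or settingValue = '');
```

Dropping the settingCategory condition widens the check to every category.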