All hosts deleted/not showing in host list/groups
We have been using FOG for quite a few years and it worked great until a few weeks ago, when hosts stopped showing up when searched for by host name; they could only be found by MAC or by scrolling through groups. When a host was not found by host name, creating a new host failed because the host already existed, meaning the host was there but did not appear in search. Yesterday I rebooted the server in hopes of fixing the issue, with no luck. Today, ALL of our hosts are gone and cannot be found in groups or by searching anywhere in the console. Everything was there yesterday and now there are no hosts. Images are still there, taking up approximately 500 GB of space; however, it says 1.8 TB is being used, leading me to think hosts are still taking up the remaining 1.3 TB, but I am unsure of this. Any ideas? Are there any automatic backups/logs that we can check? We are currently on 1.5.9. Thanks in advance!
@zaccx32 I have never seen hosts disappearing from the database out of the blue. Please check the database directly. On your FOG server open a terminal (or connect via SSH) and open the file /var/www/html/fog/lib/fog/config.class.php to note down the DB credentials (user is usually fogmaster). Then connect to the DB:
shell> mysql -u fogmaster -p
Password:
...
mysql> use fog;
...
mysql> SELECT * FROM hosts;
...
mysql> SELECT COUNT(*) FROM hosts;
...
mysql> quit
Post a picture or text output of what you get when querying the database like that.
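For anyone following along, here is a minimal sketch of the credential lookup described above. It assumes the settings in config.class.php are define() constants whose names start with DATABASE_ (that naming is an assumption, so check your own file if nothing matches). The helper name fog_db_settings is made up for this example.

```shell
#!/bin/sh
# Sketch: print the database-related lines from a FOG config file.
# ASSUMPTION: the settings are constants whose names start with DATABASE_
# (for example DATABASE_USERNAME). fog_db_settings is a hypothetical helper.
fog_db_settings() {
    cfg="${1:-/var/www/html/fog/lib/fog/config.class.php}"
    grep -E "DATABASE_[A-Z]+" "$cfg"
}

# Example use on the FOG server (not run automatically here):
#   fog_db_settings
#   mysql -u fogmaster -p -e 'SELECT COUNT(*) FROM hosts;' fog
```

The second commented command runs the same COUNT query as in the post without opening an interactive mysql shell; it still prompts for the password.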
Thanks for your reply, Sebastian. I ran the following commands in mysql and it appears the hosts table has crashed. Do you have any recommendations or suggestions on how to approach and repair this? Thanks.
@zaccx32 Oh well, that explains why you don’t see the hosts.
First check the available disk space as crashed tables are often caused by zero disk space. In the Linux command shell run:
df -h
(if you need help reading the output, post a picture here)
If disk space is ok you go back to the mysql shell and run the query:
REPAIR TABLE hosts;
(don’t forget to switch to the fog database first, see the queries I posted last). If that returns without an obvious error you can try the SELECT again.
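The disk space check from the first step can also be scripted. The sketch below is only an illustration: it reads df -P style output on standard input and prints every filesystem at or above a usage threshold. The function name full_filesystems and the default threshold of 90% are choices made for this example, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: list filesystems at or above a usage threshold (default 90%).
# Reads `df -P` style output on stdin, so it also works on saved output.
# full_filesystems is a hypothetical helper name.
full_filesystems() {
    limit="${1:-90}"
    awk -v limit="$limit" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                      # "99%" -> 99
        if (use + 0 >= limit) print $6, $5     # mount point, usage
    }'
}

# Example use on the server:
#   df -P | full_filesystems 90
```

An empty result means no filesystem reached the threshold, in which case a full disk is less likely to be the cause of the crashed table.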
We have checked the disk usage and, as suspected, it is at 99%, although we don't know how or why. We cannot find anything that would be taking up that much space, as our images only use approx. 450 GB. Also, when trying to access the fog DB today, it cannot be found any more, almost like it is gone. It's almost like something keeps using up space and it is getting worse over time. No idea what is happening!
Okay, we were able to find where all of the storage was being used. It was taken up by images that had been deleted in the GUI but were still saved on the server. Do we have to manually delete images on the backend in addition to the GUI? We had images from before 2009 still saved in the directory. After cleaning that up, a few hosts showed up but most are still missing and our images are not showing up in the GUI either. Any other suggestions on how to get us back up and running? Thanks in advance!
@zaccx32 When you delete images from the image management page you MUST go into the image definition and delete them from there. If you delete them only from the image list, then only the database entry is removed and the raw image files are not touched. This is by design, to keep people from accidentally removing both the image record and the raw files. If you have deleted the image definitions while in the list view, you will need to go into the FOG server Linux console manually, change to the /images directory, and remove the old image directories with this very powerful command:
sudo rm -rf /images/<image_name>
The -rf switch is very powerful in that it will remove things without asking, so a carelessly typed file path can delete far more than you intended.
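Before running a recursive delete like the one above, it can help to see how much space each directory actually uses. This is a rough sketch, not part of FOG: the helper name image_dir_sizes is made up, and the listing includes every subdirectory of /images, so not everything it shows is necessarily an old image.

```shell
#!/bin/sh
# Sketch: print the size in KiB of each subdirectory, largest first,
# so stale image directories can be spotted before anything is removed.
# image_dir_sizes is a hypothetical helper; it only reads, never deletes.
image_dir_sizes() {
    base="${1:-/images}"
    for d in "$base"/*/; do
        [ -d "$d" ] || continue      # skip if the glob matched nothing
        du -sk "$d"
    done | sort -rn
}

# Example use on the server:
#   image_dir_sizes /images
```

Comparing that list against the images still defined in the web UI shows which directories are leftovers.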
Another place to look is the /images/dev directory. If there are any directories in there that look like MAC addresses, those can be removed as long as you don’t currently have a capture task underway. These leftover files are from botched image captures and can be deleted.
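To find those leftover capture directories without deleting anything yet, something like the following could be used. It is a sketch under an assumption: that the directory names are 12 hexadecimal digits, with or without ":" or "-" separators. The helper name list_mac_dirs is made up; adjust the pattern if your directories are named differently.

```shell
#!/bin/sh
# Sketch: list directories whose names look like MAC addresses.
# ASSUMPTION: leftover capture directories are named with 12 hex digits,
# with or without ':' or '-' separators. list_mac_dirs is hypothetical
# and only prints paths; it does not remove anything.
list_mac_dirs() {
    dev="${1:-/images/dev}"
    for d in "$dev"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        if printf '%s\n' "$name" | grep -Eiq '^([0-9a-f]{2}[:-]?){5}[0-9a-f]{2}$'; then
            printf '%s\n' "$dev/$name"
        fi
    done
}

# Example use on the server (only with no capture task running):
#   list_mac_dirs /images/dev
```

Review the printed paths by hand before removing any of them.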
After cleaning that up, a few hosts showed up but most are still missing and our images are not showing up in the GUI either.
While disk space is important and 99% seems a lot at first sight, I don’t think this was causing the issue here. Why? Because we see you have a large disk that still shows 31 GB free. I find it very strange that you see some hosts now. From the SHOW DATABASES; output you posted I would have guessed the fog DB is fully corrupted now and needs recovery (keeping my fingers crossed it’s still possible).
Also, when trying to access the fog DB today, it cannot be found any more, almost like it is gone.
Now we really need to figure out what’s going wrong with the DB. First of all, let us take a look at the DB files on disk, the real basis:
ls -al /var/lib/mysql
(post a picture here)
Also stop the DB and take a file-level backup. Then start it up again and check the log:
service mysqld stop
cd /root
mkdir backup
rsync -av /var/lib/mysql backup
service mysqld start
tail -n 30 /var/log/my*
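If you want the file backup step to confirm that the copy is complete, a variation along these lines may help. It is only a sketch: it uses cp -a instead of the rsync call above, the helper name backup_datadir and the timestamped directory name are made up for this example, and it should be run only while the database service is stopped.

```shell
#!/bin/sh
# Sketch: copy a data directory into a timestamped backup directory and
# confirm the copy matches. Run it only while the database is stopped.
# NOTE: uses cp -a rather than rsync; backup_datadir is a hypothetical helper.
backup_datadir() {
    src="${1:-/var/lib/mysql}"
    dest_root="${2:-/root/backup}"
    dest="$dest_root/$(basename "$src")-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest_root" || return 1
    cp -a "$src" "$dest" || return 1
    diff -r "$src" "$dest" >/dev/null || return 1   # verify the copy
    printf '%s\n' "$dest"                           # report where it went
}

# Example use on the server, between the stop and start commands:
#   backup_datadir /var/lib/mysql /root/backup
```

Keeping each backup in its own timestamped directory means a later attempt cannot overwrite the first copy of the damaged files.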
Post a picture of the output from the last command as well. I really hope I got all the commands right for your pretty dated CentOS 6.9 install.