SOLVED Database update failure after image capture
Please could I get some assistance to resolve this issue? Some brief context: the issue first appeared after extended power outages, when the server shut down because the UPS died. Previously, I ran yum updates and the problem went away.
This time it did not work. I ran yum updates and updated FOG from version 1.5.7 to 1.5.9, and this has not solved the issue. I can deploy images without error; only capturing is an issue at present, which I urgently need to resolve. The error happens after the image has been captured, when the database needs to be updated.
Server is running CentOS 7, GNOME 3.28.2.
I have ensured that the user credentials under TFTP in FOG Settings and under Storage Management > Storage Nodes are identical.
I would really appreciate a fix/solution so as to mitigate further problems when power outages happen again in future.
@george1421 Thank you. I will delete them shortly to avoid any further issues.
@MikeBC In your last picture of the /images/dev directory, there are several directories with names that look like MAC addresses. These are botched uploads. You can delete the directories and their contents; they are just taking up space.
    rm -rf /images/dev/082e5f1dface
    rm -rf /images/dev/64006a486167
    rm -rf /images/dev/ecf4bb70a96c
Unless you are currently capturing an image, you should never have MAC-style named directories in /images/dev.
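Before deleting anything, it may help to list which entries in /images/dev actually look like leftover capture directories. The following is a minimal sketch, assuming that a "MAC-style" name means exactly 12 hexadecimal characters with no separators (as in the three directories above). The function name `list_mac_dirs` is made up for illustration; it only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Print directories whose names look like a MAC address without separators
# (assumption: exactly 12 hexadecimal characters). Nothing is deleted.
list_mac_dirs() {
    dev_dir="${1:-/images/dev}"   # default path taken from the thread
    for entry in "$dev_dir"/*; do
        # Skip plain files and the unexpanded glob of an empty directory.
        [ -d "$entry" ] || continue
        name=$(basename "$entry")
        if printf '%s\n' "$name" | grep -Eq '^[0-9A-Fa-f]{12}$'; then
            printf '%s\n' "$entry"
        fi
    done
}
```

Running `list_mac_dirs /images/dev` while no capture task is active should print only directories that are safe to review for removal; directories such as postinitscripts are not matched.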
Looks like there was finally a solution. I ran yum update -y today, and after a server reboot the image capture with database update worked 100%. I must thank @Sebastian-Roth for assisting me in the chat bubble, running various scripts and looking at many possible fixes to remedy the issue. The support received was fantastic, thank you kindly.
    [root@imagingout ~]# ls -al /images/dev
    total 0
    drwxrwxrwx.  6 fogproject root 130 Oct  6 08:20 .
    drwxrwxrwx. 10 fogproject root 186 Oct  6 08:21 ..
    drwxrwxrwx   2 fogproject root 190 Mar 20  2020 082e5f1dface
    drwxrwxrwx   2 fogproject root 222 Jan 14  2020 64006a486167
    drwxrwxrwx   2 fogproject root   6 Apr 30 15:39 ecf4bb70a96c
    -rwxrwxrwx.  1 fogproject root   0 Oct 11  2019 .mntcheck
    drwxrwxrwx.  2 fogproject root  34 Oct 11  2019 postinitscripts
    [root@imagingout ~]# cat /etc/exports
    /images *(ro,sync,no_wdelay,no_subtree_check,insecure_locks,no_root_squash,insecure,fsid=0)
    /images/dev *(rw,async,no_wdelay,no_subtree_check,no_root_squash,insecure,fsid=1)
    [root@imagingout ~]#
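In the exports output above, /images is exported read-only and /images/dev read-write, which is what a capture needs in order to write its temporary files. As a rough sketch (not part of FOG itself), the check could be scripted as below. The helper name `export_is_rw` is hypothetical, and it assumes the simple `path host(options)` layout shown in this thread, with one host entry per line.

```shell
#!/bin/sh
# Succeed (exit 0) if the given path is exported with the "rw" option.
# $1 = exported path, $2 = exports file (defaults to /etc/exports).
# Assumes one "path host(options)" entry per line, as in the thread.
export_is_rw() {
    awk -v p="$1" '
        $1 == p && $2 ~ /\((.*,)?rw(,|\))/ { found = 1 }
        END { exit !found }
    ' "${2:-/etc/exports}"
}
```

For example, `export_is_rw /images/dev && echo "capture export is writable"` would print the message for the exports file posted here.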
@MikeBC Seems like it’s not creating the upload directory or maybe it does but in a different place. Please run these commands (again) and post output here:
    ls -al /images/dev
    cat /etc/exports
We can also switch to chat for faster turnaround - chat bubble in the right top corner.
@Sebastian-Roth I ran the capture twice, once to the /images/Client directory and then to the /images/ directory. Both times the same error occurred that I have been experiencing. I have attached the output.
@MikeBC Looking through the logs again I saw this at the bottom end:
[Mon Oct 05 10:29:844700 2020] [proxy_fcgi:error] [pid 55573] (70007)The timeout specified has expired: [client 10.0.3.129:45100] AH01075: Error dispatching request to :
I can’t make sense of this, as you don’t seem to have many clients or others sending in requests that would cause a high load on Apache and PHP-FPM. Searching the forum a bit, I found this similar-sounding topic that seems to never have been solved: https://forums.fogproject.org/topic/14563/snapin-problem-on-fog-1-5-9-rc1-and-1-5-9-rc2
The other error messages we see from today are related to the FTP commands used to rename `/images/dev/f8bc1265c913` to its destination in `/images/...`. I really wonder if `/images/dev/f8bc1265c913` really exists at this point? It should be created when the machine runs the task, to temporarily store the uploaded image files. Please run `ls -al /images/dev/f8bc1265c913` and post output here.
@Sebastian-Roth I ran the task at about 10H30 or so. I wasn’t in front of the PC at the time to verify error on completion. I will run it again shortly and provide error message again.
@MikeBC What time did you run the task? I only see a few log messages for today. Which error did you get on the screen this time?
@Sebastian-Roth I have attached logs as requested
@MikeBC We need to figure out why you get HTTP 503. As you can obviously use the FOG web UI to schedule new tasks it doesn’t seem to be a general Apache/PHP issue. So please schedule another task and let the host run the task. Meanwhile open a root shell on your FOG server and run
`tail -f /var/log/php-fpm/* /var/log/httpd/error_log` (note there is a space between the two) to see what messages come in when the task finishes and shows the error message.
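The combined log output can be noisy, so it may be easier to narrow it down to the PHP-FPM proxy error quoted earlier in this thread. The sketch below is only an illustration: the function name `filter_fcgi_errors` is made up, and it assumes the relevant lines carry the `proxy_fcgi:error` tag or the `AH01075` code, as the one log line posted here does.

```shell
#!/bin/sh
# Read log lines on stdin and keep only those that mention the
# mod_proxy_fcgi error tag or the AH01075 code seen in the thread.
filter_fcgi_errors() {
    grep -E 'proxy_fcgi:error|AH01075'
}
```

After the task has failed, something like `filter_fcgi_errors < /var/log/httpd/error_log` would show only the matching lines.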
@Sebastian-Roth I have supplied output as requested
@MikeBC Please run
`find /var/log -name "*php*"` as root and post output here.
@Sebastian-Roth Incidentally, yes they did. However, I deleted that host and used a separate machine to test, and the 503 Service Unavailable problem still persists. I also tried the dash and the same 503 error came up.
The image path listed is to a Client folder for client specific images.
Ok, got me. I never used sub-directories in `/images` and didn’t even know this would work.
Then we should look into why you get the `503 Service unavailable` and `No Active Task found for Host`. Did those two messages happen on the same machine one after the other?
@Sebastian-Roth Thank you for assisting. The image path listed is to a Client folder for client specific images. Some context on this: we have client-specific images captured to a Client folder, and we have OEM images for a wide variety of OEM vendors, by make and model, in an oem folder. I also have a Test folder for R&D/development purposes. I will try the dash, but I have captured over 300 images using this method with the slash. I only capture as needed. We focus on deployment of captured images.
Regarding the hosts query: my techs boot via PXE to deploy, so apart from the server itself I don’t have any other hosts running, if I have answered this correctly. We do have up to 50 different PCs having an image deployed at a time; not sure if this is the load issue? I have a separate server as a back-up that runs fine with both capture and deployment; it is running version 1.5.8.
We don’t use a proxy server for our network. Here is the version detail on login page:
Latest version: 1.5.9
Latest development version: 18.104.22.168
Also, I have updated kernel: bzImage version: 5.6.18 & bzImage32 version: 5.6.18
I have uploaded output of commands requested.
@MikeBC There seem to be a few things coming together in your case. We’ll try to look into those one by one. The first and most important one is the name you gave the image, `Client/Skillspro_Win10...`! It contains a slash. The slash is used on Linux systems as the directory separator and it does cause trouble if you use it. We had someone else with this just a week ago, and we are looking into preventing it in the FOG web UI so people simply cannot use a slash in the image name. So as a quick fix, edit the image and change name and path to something without `/`; maybe just use a simple dash.
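To illustrate the suggested rename, here is a small sketch that turns every slash in an image name into a dash. The function name `sanitize_image_name` and the sample name `Client/Example_Win10` are made up for illustration; the actual rename still has to be done by editing the image in the FOG web UI.

```shell
#!/bin/sh
# Replace every "/" in an image name with "-" so the name can no longer
# be interpreted as a sub-directory path.
sanitize_image_name() {
    printf '%s\n' "$1" | tr '/' '-'
}
```

For instance, `sanitize_image_name 'Client/Example_Win10'` prints `Client-Example_Win10`.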
Please run the following commands and post output here:
    ls -al /images
    ls -al /images/dev
    mount
    df -h
The first picture showing a `503 Service Unavailable` might mean that you have many fog-clients sending requests and putting too much load on your server. How many hosts with fog-client installed do you have?
The `ftp_put` warnings can be ignored because they are caused by the slash in your image name/path.
The series of `ftp_login(): login incorrect` messages can be ignored as well, I would think. My guess is you played with the credentials at some point when trying to solve this issue. As we see an `ftp_rename` error again after that, I am fairly sure you have the credentials correct again.
The PHP warnings `Invalid argument supplied...` stem from some code I added not long ago that pulls the current version information from github.com. Though I have tested this stuff on my setup, it seems like it doesn’t work in all cases. Do you see the version information on the FOG web UI login page? Do you use a proxy server in your network?