FOG Client stops reporting & working
-
Server
- FOG Version: 1.3.0 RC-14
- OS: Fedora 21
Client
- Service Version: 0.11.5
- OS: Win10 LTSB x64
Description
I noticed this issue because our computers stopped shutting down overnight. Looking into it, I found that the log file for my computer hasn’t been written to since the 19th, 5 days ago. This started happening when we upgraded from RC-10 to RC-14. Yes, that is a big jump, but I was away from work for a month and nobody upgraded while I was gone. From the logs, it looks like my computer has been on since the 19th of this month. Attached are the log and some screenshots.
Looking at the fog user log, it looks like the display manager is at fault.
-
@michael_f @Wayne-Workman @Hanz A patch has been developed and verified for this issue. It will be released with 0.11.6 (along with a few other new features) once release candidate testing finishes for the client.
-
After restarting my computer, the file C:\fog.log is still not being written to.
However, .fog_user did get written to. Here are the last entries in the file:
------------------------------------------------------------------------------
--------------------------------DisplayManager--------------------------------
------------------------------------------------------------------------------
10/19/2016 3:24 PM Client-Info Client Version: 0.11.5
10/19/2016 3:24 PM Client-Info Client OS: Windows
10/19/2016 3:24 PM Client-Info Server Version: 1.3.0-RC-14
10/19/2016 3:24 PM Middleware::Response Success
10/19/2016 3:24 PM DisplayManager ERROR: Invalid settings provided
------------------------------------------------------------------------------
10/19/2016 3:24 PM Service Sleeping for 118 seconds
10/20/2016 6:50 AM UserService Initializing - phase 1
10/20/2016 6:50 AM Bus Registering ParseBus in channel Power
10/20/2016 6:50 AM Middleware::Configuration ERROR: Invalid parameters
10/20/2016 6:50 AM UserService Initializing - phase 2
10/20/2016 6:50 AM Zazzles Creating main thread
10/20/2016 6:50 AM Zazzles Service construction complete
10/20/2016 6:50 AM Service Starting service
10/20/2016 6:50 AM Service ERROR: ServerAddress not found! Exiting.
10/24/2016 6:53 AM UserService Initializing - phase 1
10/24/2016 6:53 AM Bus Registering ParseBus in channel Power
10/24/2016 6:53 AM Middleware::Configuration ERROR: Invalid parameters
10/24/2016 6:53 AM UserService Initializing - phase 2
10/24/2016 6:53 AM Zazzles Creating main thread
10/24/2016 6:53 AM Zazzles Service construction complete
10/24/2016 6:53 AM Service Starting service
10/24/2016 6:53 AM Service ERROR: ServerAddress not found! Exiting.
10/24/2016 7:24 AM UserService Initializing - phase 1
10/24/2016 7:24 AM Bus Registering ParseBus in channel Power
10/24/2016 7:24 AM Middleware::Configuration ERROR: Invalid parameters
10/24/2016 7:24 AM UserService Initializing - phase 2
10/24/2016 7:24 AM Zazzles Creating main thread
10/24/2016 7:24 AM Zazzles Service construction complete
10/24/2016 7:24 AM Service Starting service
10/24/2016 7:24 AM Service ERROR: ServerAddress not found! Exiting.
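For anyone who wants to pull just the failures out of a long user log, here is a rough Python sketch. It assumes every entry has the shape "date time AM/PM module message", which is only inferred from the excerpt in this post; the function name find_errors is made up for illustration and is not part of the FOG client.

```python
import re

# Assumed entry shape, inferred from the log excerpt in this thread:
#   MM/DD/YYYY H:MM AM|PM <Module> <message>
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}) "
    r"(?P<time>\d{1,2}:\d{2} [AP]M) "
    r"(?P<module>\S+) "
    r"(?P<message>.*)$"
)


def find_errors(log_text):
    """Return (date, time, module, message) for each entry whose message starts with ERROR."""
    errors = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        # Separator lines made of dashes do not match the pattern and are skipped.
        if match and match.group("message").startswith("ERROR"):
            errors.append(
                (
                    match.group("date"),
                    match.group("time"),
                    match.group("module"),
                    match.group("message"),
                )
            )
    return errors


# A few lines copied from the excerpt above.
sample = """\
10/19/2016 3:24 PM Middleware::Response Success
10/19/2016 3:24 PM DisplayManager ERROR: Invalid settings provided
10/20/2016 6:50 AM Middleware::Configuration ERROR: Invalid parameters
10/20/2016 6:50 AM Service ERROR: ServerAddress not found! Exiting.
"""

for entry in find_errors(sample):
    print(*entry)
```

Run against the full file, this makes it easy to see that the same two errors (Invalid parameters, then ServerAddress not found) repeat on every service start.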
-
@Wayne-Workman what is the content of settings.json?
-
@Joe-Schmitt It’s full of just a bunch of white spaces.
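If it helps anyone check other machines, a small Python sketch like the one below can tell a blank settings file apart from a missing or malformed one. The path in the usage line is an assumption (I am guessing the file sits next to fog.log in the install folder), the function name settings_state is made up, and treating NUL bytes as blank is also an assumption about what the "white spaces" might be.

```python
import json
from pathlib import Path


def settings_state(path):
    """Classify a settings file as 'missing', 'blank', 'invalid', or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    # utf-8-sig drops a byte order mark if one is present.
    text = p.read_text(encoding="utf-8-sig", errors="replace")
    # Assumption: a file of only whitespace and/or NUL bytes counts as blank.
    if not text.replace("\x00", "").strip():
        return "blank"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "invalid"
    return "ok"


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the client is installed.
    print(settings_state(r"C:\Program Files (x86)\FOG\settings.json"))
```

A result of "blank" would match what I am seeing here; "invalid" would point to a partially written file instead.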
-
I do have a snapshot of our FOG Server that is from RC-10 that I am going to test briefly, I’ll also make a snapshot of the current Server state, and then move to RC-15.
-
I rolled back to RC-10 via snapshot, reset encryption data for every host, and then rebooted my computer. The problem persisted. Same exact error in the user log, the fog.log still isn’t being written to.
At this point I doubt it’s a server issue. @Joe-Schmitt
I also believe all computers in my building are affected…
I’ll send out a snapin to do a ‘roll-call’ of sorts and see what happens.
I’m going to go back to my snapshot of RC-15, reset encryption data on all hosts again, re-install the FOG Client on my computer, and move forward.
Here is C:\Program Files (x86)\FOG\fog.log from my computer before I uninstall.
-
@Wayne-Workman the problem is what I mentioned earlier. Settings.json has been cleared by something.
-
@Joe-Schmitt But by what? I clearly didn’t do it. What could possibly do it?
-
An update on this, at least 140 out of 450 computers in my building are not affected… The rest, I don’t know. Maybe they were off, maybe they don’t have the fog client installed, or maybe they have the issue.
-
I’m guessing this is still in a “we have no idea yet” state. I’m just trying to keep this near the top of the list so it doesn’t get lost.
-
Still no idea. I’ve been running the debugger client that Joe gave me. It’s not yet experienced the issue, and therefore has not created the information file yet.
-
I strongly suspect this is related to a Windows update, particularly one for the .NET Framework.
-
@Tom-Elliott Between our summer deployment and now, we’ve run zero Windows updates. We can’t risk downtime, Windows updates are not trustworthy enough, and the summer gives us too large a window of downtime not to use it for deploying an updated image.
-
@Wayne-Workman I’m only giving the information as I’m seeing it.
This appears to have started in late July, early August, correct? Which would’ve been around the time (potentially) an image was being updated? I don’t know.
I am still leaning towards a Windows update, particularly one for the .NET Framework.
It’s the only guess I have at this point, and with the “debug client” we should’ve seen something by now if it was server related. I suppose there could still be an issue in the client, but considering the running client has probably surpassed the number of cycles it took before the issue appeared, I would imagine this is not the issue either.
-
I don’t know if this is related:
I had an issue where the content of the settings.json got corrupted on half of the computers and I had to reinstall the agent manually. I don’t know what killed the contents of the file. That happened with the stable version of fog (1.2.0). After reinstalling the agent I upgraded to trunk and am now waiting to see what happens next.
-
@jhuebner The new client never worked with version 1.2.0 of FOG. The new client was ONLY introduced during the trunk builds. It’s no wonder they got corrupted; they had nothing to work from if you really did have the new client and FOG 1.2.0.
-
Thanks for your quick reply.
I was using the Agent 0.11.0 with 1.2.0; I think that’s the one that came with 1.2.0. I am of the opinion that, whatever the version, nothing should kill the settings file so that the agent/zazzles stops working. I am in the fortunate position that our production environment is more of a lab-style setup (university campus), so the world does not end when the agent is not working. I just have to do more manual work, which I of course try to avoid.
Besides that: I really like the project and appreciate the work you put into this!
-
@jhuebner 0.11.0 was NOT by any means the version that shipped with FOG 1.2.0.
The reason a file might go corrupt in the new client, particularly with 0.11 on 1.2.0 (and potentially others), is that the client gets configuration information from the server. This configuration information has the potential to change each cycle, but to ensure the client operates while the server may be unreachable, we update the information stored internally. If this data is corrupted, the client usually cleans itself up to ensure nothing goes wrong, but if it can’t do that and the file is “open”, there is a possible way for the file to become corrupted.
We haven’t figured out WHAT is causing this and have been working to try to replicate the problem. To me, it seems like a Windows update, but this is only my opinion.
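To illustrate the “file is open” risk in general terms: if a process is interrupted while rewriting a file in place, the file can be left empty or half written. A common way to reduce that risk is to write to a temporary file and then swap it into place. The Python sketch below shows that general technique only; it is not taken from the client’s code, and the function name write_settings_atomically and the example keys are made up for illustration.

```python
import json
import os
import tempfile


def write_settings_atomically(path, settings):
    """Write settings as JSON to a temp file, then replace the target in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(settings, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before swapping
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    write_settings_atomically("settings.json", {"Server": "fog.example", "Sleep": 60})
```

On Windows, os.replace can still fail if another process holds the target file open, so this narrows the window rather than closing it completely.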
-
@Tom-Elliott Thanks for the explanation. I am now in the process of upgrading all four fog instances I manage to trunk. This way I might be helpful to advance this further. Keep it rolling!