FOG service on 0.10.6 not restarting after reboot
-
@Joe-Schmitt - Trunk 8257 - Joe, I had client 11.2 deploy to 25 Dell 790 machines today as a test and only 2 of them joined the domain. They all renamed but the service doesn’t appear to restart after the first reboot. No logs on PC after first shutdown command. I am attaching a fog.log from one of the machines and the apache error log during the time of the task. Hopefully this will tell you something:
APACHE ERROR LOG IMMED AFTER IMAGING COMPLETED
1_1467056255448_apache error 11.2 client non-restart.txt
APACHE ERROR LOG NEXT 60 MINUTES
0_1467072182110_apache err 11.2 next 60 minutes.txt
MACHINE THAT DIDN’T JOIN DOMAIN - 192.168.132.165
0_1467056255444_fog.log
0_1467060379059_zazzles.log
0_1467060828192_.fog_user.log
MACHINE THAT DID JOIN DOMAIN - 192.168.132.154
0_1467071798888_fog.log
0_1467071815418_zazzles.log
0_1467071823241_.fog_user.log
Oddly, I don’t have this issue on the other machines I’ve been working on, mainly Dell 7020 and later. Maybe it’s a problem with the older hardware keeping up??
Edits: I found the zazzles log and user log and thought they might also help, so I just uploaded them. Seems like a clue here, but fog.log hasn’t changed since it stopped at 1:36.
-
@Joe-Schmitt The post below this one, that’s the issue I was referring to here: https://forums.fogproject.org/topic/7912/potential-client-issue
I wasn’t able to find a log though, yet. Good thing @gwhitfield posted one.
-
@Joe-Schmitt Joe, I edited my post (a couple times now) of logs to include a machine that DID join the domain, but then the service didn’t restart after the second reboot. These have both been manually rebooted a couple times since and the service refuses to start.
There IS one machine which seems to have experienced the same failures (non-joining domain and service start failure), but after a single manual reboot, the service did restart and the machine joined the domain and the service has continued to successfully restart during multiple manual reboots since. Unfortunately I will have to wait until tomorrow to get those logs since this machine is sleeping soundly and won’t wake-on-LAN.
EDIT: reading back early in the thread I find it may help to note these machines are WIN 10x64 LTSB (10240)
Thanks for your help and Wayne’s input as well!!
-
@gwhitfield Those logs indicate that the client is performing perfectly. Here is our current theory on what is happening:
You have the issue on Dell machines. Dell machines are notorious for having slight hardware configuration differences even on the same model (wifi, graphics, …). When deployed, Windows notices that driver XX doesn’t quite match the hardware. It applies the correct driver and schedules a restart. This scheduling of the restart locks Windows’ power operations.
The client then starts up, joins the domain, and asks Windows to restart. Windows refuses since there is already a lock on the power operations. The client doesn’t know this, and assumes the power operation is under way. It then gracefully exits.
This theory is consistent with the logs you provided. Here is a quick test for you:
Select one of your problem machines. Make sure all drivers are installed correctly on this one machine. Re-upload the image using this machine and re-deploy to all of your problem machines. Are there fewer occurrences of your issue? If so, then this is your problem.
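As an illustration of the theory above, a caller could check whether Windows actually accepted the restart request instead of assuming it did. This is only a sketch and not the FOG client's code: `request_restart` and its `run` parameter are hypothetical names, the arguments mirror the ones the client writes to fog.log, and it assumes that a refused request shows up as a nonzero exit code from `shutdown`.

```python
import subprocess
from typing import Callable, Sequence


def _run(args: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(args)).returncode


def request_restart(
    comment: str,
    run: Callable[[Sequence[str]], int] = _run,
) -> bool:
    """Ask Windows to restart; return True only if the request was accepted.

    Hypothetical helper for illustration, not part of the FOG client.
    """
    # Same switches as the "Power Parameters" entry in fog.log.
    args = ["shutdown", "/r", "/c", comment, "/t", "0"]
    # Assumption: a nonzero exit code means Windows rejected the request,
    # for example because another restart is already scheduled.
    return run(args) == 0
```

A caller that gets `False` back could keep running and retry later rather than exiting gracefully on the assumption that the reboot is under way.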
-
The driver thing is interesting - and possibly could be part of things.
So today a colleague deployed to ~60 computers.
Half of them failed to start fog service after reboot from FOGClientupdate/renaming/joining domain.
Half of them worked fine.
The difference between them is a different model motherboard (Both ASUS, H61 and B75 based respectively)
The image used was made on a VM, with no drivers pre-installed for either motherboard/chipset. In theory, both would need drivers to be installed during OOBE, so I’m not so sure what…
The H61 based systems should be slower, but not by a whole lot, at least not for imaging tasks.
Anyways, if it is a driver update/WU based reboot, then this reg key will have been created. Could the fog client detect that, and instead of exiting, accelerate the reboot schedule somehow?
HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\RebootPending
Here is the relevant section of the log file - which looks ok, but looks like for some reason (scheduled reboot?) it did not reboot.
------------------------------------------------------------------------------
--------------------------------HostnameChanger-------------------------------
------------------------------------------------------------------------------
6/27/2016 4:04 PM Client-Info Client Version: 0.11.2
6/27/2016 4:04 PM Client-Info Client OS: Windows
6/27/2016 4:04 PM Client-Info Server Version: 8263
6/27/2016 4:04 PM Middleware::Response Success
6/27/2016 4:04 PM HostnameChanger Checking Hostname
6/27/2016 4:04 PM HostnameChanger Removing host from active directory
6/27/2016 4:04 PM HostnameChanger The machine is not currently joined to a domain, code = 2692
6/27/2016 4:04 PM HostnameChanger Renaming host to HiLibLabBack4
6/27/2016 4:04 PM Power Creating shutdown request
6/27/2016 4:04 PM Power Parameters: /r /c "FOG needs to rename your computer" /t 0
6/27/2016 4:04 PM Bus { "self": true, "channel": "Power", "data": "{\r\n \"action\": \"shuttingdown\"\r\n}" }
6/27/2016 4:04 PM Bus Emmiting message on channel: Power
------------------------------------------------------------------------------
At this point, no snapins have run, and the computer does not reboot…I logged in via RDP:
6/27/2016 10:03 PM Main Overriding exception handling
6/27/2016 10:03 PM Main Bootstrapping Zazzles
6/27/2016 10:03 PM Controller Initialize
6/27/2016 10:03 PM Entry Creating obj
6/27/2016 10:03 PM Controller Start
6/27/2016 10:03 PM Service Starting service
6/27/2016 10:03 PM Bus Became bus server
6/27/2016 10:03 PM Bus { "self": true, "channel": "Status", "data": "{\r\n \"action\": \"load\"\r\n}" }
6/27/2016 10:03 PM Bus Emmiting message on channel: Status
------------------------------------------------------------------------------
--------------------------------Authentication--------------------------------
------------------------------------------------------------------------------
6/27/2016 10:03 PM Client-Info Version: 0.11.2
6/27/2016 10:03 PM Client-Info OS: Windows
6/27/2016 10:03 PM Middleware::Authentication Waiting for authentication timeout to pass
6/27/2016 10:03 PM Middleware::Communication Download: http://fog.XYZ.local/fog/management/other/ssl/srvpublic.crt
6/27/2016 10:03 PM Data::RSA FOG Server CA cert found
6/27/2016 10:03 PM Middleware::Authentication Cert OK
6/27/2016 10:03 PM Middleware::Communication POST URL: http://fog.XYZ.local/fog/management/index.php?sub=requestClientInfo&authorize&newService
6/27/2016 10:03 PM Middleware::Response Success
6/27/2016 10:03 PM Middleware::Authentication Authenticated
As you can see, for some reason (foguser?) the service restarted when a user logged in, and carried on - despite a pending reboot that never happened.
UPDATE: As of ~10:30PM, the machines that hadn’t finished the rename/reboot, rebooted on their own (must have been that scheduled power thing!) and started carrying on.
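For anyone comparing many machines, the gap between the client's shutdown request and the next service start can be pulled out of the log mechanically. The following is a rough sketch, assuming both excerpts come from the same fog.log in chronological order and that every entry follows the "date time AM/PM module message" layout shown above; `parse_entry` and `restart_gaps` are made-up names for illustration.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple

# Matches entries such as "6/27/2016 4:04 PM Power Creating shutdown request".
ENTRY = re.compile(r"^\s*(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2} [AP]M) (\S+) (.*)$")


def parse_entry(line: str) -> Optional[Tuple[datetime, str, str]]:
    """Split one log line into (timestamp, module, message), or None."""
    match = ENTRY.match(line)
    if match is None:
        return None  # banner lines and blank lines are skipped
    stamp = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M %p")
    return stamp, match.group(2), match.group(3).strip()


def restart_gaps(lines: Iterable[str]) -> List[timedelta]:
    """Time between each shutdown request and the next service start."""
    gaps: List[timedelta] = []
    requested_at: Optional[datetime] = None
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue
        stamp, module, message = entry
        if module == "Power" and message == "Creating shutdown request":
            requested_at = stamp
        elif (module == "Service" and message == "Starting service"
              and requested_at is not None):
            gaps.append(stamp - requested_at)
            requested_at = None
    return gaps


sample = [
    "6/27/2016 4:04 PM HostnameChanger Renaming host to HiLibLabBack4",
    "6/27/2016 4:04 PM Power Creating shutdown request",
    "6/27/2016 10:03 PM Service Starting service",
]
print(restart_gaps(sample)[0])  # 5:59:00
```

A gap of a minute or two would look like a normal reboot; a gap of hours, as in the sample taken from this post, points to a restart request that did not take effect.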
-
@Mentaloid quick question. You mentioned you used a universal image from a VM. Is it sysprepped? And if so, is the client service disabled on startup as described in our wiki?
-
@Mentaloid said in FOG service on 0.10.6 not restarting after reboot:
Anyways, if it is a driver update/WU based reboot, then this reg key will have been created. Could the fog client detect that, and instead of exiting, accelerate the reboot schedule somehow?
HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\RebootPending
Now there’s a great idea.
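A check for the quoted key could look roughly like the sketch below. It is not how the FOG client works; `reboot_pending` and the injectable `key_exists` parameter are hypothetical names, and the only thing taken from the thread is the registry path. The default lookup uses Python's standard `winreg` module, which exists only on Windows.

```python
from typing import Callable

# Registry path quoted in this thread, relative to HKEY_LOCAL_MACHINE.
REBOOT_PENDING_KEY = (
    r"SOFTWARE\Microsoft\Windows\CurrentVersion"
    r"\Component Based Servicing\RebootPending"
)


def _hklm_key_exists(subkey: str) -> bool:
    """Return True if the given HKLM subkey exists (Windows only)."""
    import winreg  # standard library, available only on Windows

    # Ask for the 64-bit registry view so a 32-bit Python is not redirected.
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0, access):
            return True
    except FileNotFoundError:
        return False


def reboot_pending(key_exists: Callable[[str], bool] = _hklm_key_exists) -> bool:
    """Report whether the RebootPending key is present.

    Hypothetical helper; key_exists can be swapped out for testing.
    """
    return key_exists(REBOOT_PENDING_KEY)
```

What to do when the key is present (wait, retry the restart, or force it) is a separate design decision that this sketch leaves open.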
-
Yes, the FOG service is disabled in the image; setupcomplete.cmd re-enables it.
-
@Joe-Schmitt Well, no good news to report at the moment. I remembered this morning that Dell doesn’t offer Win10 drivers for the 790 so it’s extremely difficult to tell whether the “correct” drivers were installed by Windows. There are no obvious problems in Device Manager. I have toyed with several ideas:
- Creating a scheduled task to “net start FOGService” that I can either set up in Task Scheduler prior to sysprepping the image, or to use GPO to push the task (which of course doesn’t work if the machines aren’t joined to the domain yet).
- Since I have been pushing drivers using Lee Rowlett’s scripts, maybe I’ll stop doing that for the 790 and let the Windows install figure them out, maybe it will come up with different ones than I was sending.
So many projects underway, it may be a while before I can get through these tests but I’ll let you know what happens as I do them.
-
@gwhitfield I’ve typically had success using Vista drivers for win7, and win 7 drivers for 8.1. You have to manually specify the drivers in device manager. You could try and it may work fine.
-
@Wayne-Workman @Joe-Schmitt I’ve had some initial success with using a “net start FOGService” batch file as a boot time task. 100% so far in 3-4 individual attempts, just haven’t proven it out as 100% successful on a large group of 790’s. If I find the crazy driver that’s doing this I will report back in this thread.
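For reference, a boot-time task like the one described could also be registered from a script instead of a batch file plus manual setup. This is a minimal sketch, not the exact setup used above: the task name `StartFOGService` is made up, and the function only builds the `schtasks` command line so it can be inspected before anything is run.

```python
from typing import List


def boot_task_command(task_name: str = "StartFOGService") -> List[str]:
    """Build a schtasks command that starts FOGService at every boot.

    The task name is a placeholder chosen for this example.
    """
    return [
        "schtasks", "/Create",
        "/TN", task_name,               # name of the scheduled task
        "/TR", "net start FOGService",  # command quoted in this thread
        "/SC", "ONSTART",               # trigger at system startup
        "/RU", "SYSTEM",                # run without a logged-in user
        "/F",                           # overwrite an existing task of that name
    ]


print(" ".join(boot_task_command()))
```

On a reference machine, the list could be passed to `subprocess.run(boot_task_command(), check=True)` from an elevated prompt before sysprepping the image.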
The client works on everything I have newer than the 790 both with LTSB and CBB images (both 10240 and 1511), and has been perfect in Win7 on every machine I’ve tried, even MUCH older ones. In the long run (for me anyway), as the legacy machines go away hopefully it becomes a distant memory. I don’t envy you trying to deal with demon ninja gremlins of various operating systems and hardware!
-
@gwhitfield said in FOG service on 0.10.6 not restarting after reboot:
demon ninja gremlins
We are certified demon ninja gremlin dispatchers.