FOG client weirdness
-
So there I am, looking at why some of our rooms aren’t taking snapin tasks, and when I look through the fog.log file I sometimes see this:
[QUOTE]
04/11/2014 13:07 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:07 FOG::SnapinClient General Error Returned:
04/11/2014 13:07 FOG::SnapinClient #!er:No Host Found
04/11/2014 13:12 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:12 FOG::SnapinClient General Error Returned:
04/11/2014 13:12 FOG::SnapinClient #!er:No Host Found
04/11/2014 13:18 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:18 FOG::SnapinClient General Error Returned:
04/11/2014 13:18 FOG::SnapinClient #!er:No Host Found
And so on…
[/QUOTE]
It seems consistent across the rooms I am looking at today but I can’t figure out what it means. When I restart, the log looks like this (from the startup, I am guessing, to when it has finished):
[CODE] 04/11/2014 13:19 FOG Service Engine Version: 3
04/11/2014 13:19 Starting all sub processes
04/11/2014 13:19 14 modules loaded
04/11/2014 13:19 * Starting FOG.AutoLogOut
04/11/2014 13:19 * Starting FOG.SnapinClient
04/11/2014 13:19 * Starting FOG.DirCleaner
04/11/2014 13:19 * Starting FOG.DisplayManager
04/11/2014 13:19 * Starting FOG.GreenFog
04/11/2014 13:19 * Starting FOG.GUIWatcher
04/11/2014 13:19 * Starting FOG.HostNameChanger
04/11/2014 13:19 FOG::GUIWatcher Starting GUI Watcher…
04/11/2014 13:19 FOG::AutoLogOut Starting process…
04/11/2014 13:19 FOG::DirCleaner Sleeping for 37 seconds.
04/11/2014 13:19 * Starting FOG.HostRegister
04/11/2014 13:19 FOG::ClientUpdater Starting client update process…
04/11/2014 13:19 FOG::ClientUpdater Sleeping for 326 seconds.
04/11/2014 13:19 * Starting FOG.MODDebug
04/11/2014 13:19 FOG::GreenFog Sleeping for 48 seconds.
04/11/2014 13:19 FOG::MODDebug Start Called
04/11/2014 13:19 FOG::DisplayManager Starting display manager process…
04/11/2014 13:19 FOG::MODDebug Sleeping for 100 Seconds
04/11/2014 13:19 * Starting FOG.PrinterManager
04/11/2014 13:19 * Starting FOG.SnapinClient
04/11/2014 13:19 * Starting FOG.TaskReboot
04/11/2014 13:19 FOG::PrinterManager Starting interprocess communication process…
04/11/2014 13:19 FOG::HostnameChanger Starting hostname change process…
04/11/2014 13:19 FOG::HostnameChanger Yielding to other subservices for 6 seconds.
04/11/2014 13:19 * Starting FOG.UserCleanup
04/11/2014 13:19 * Starting FOG.UserTracker
04/11/2014 13:19 FOG::TaskReboot Taskreboot in lazy mode.
04/11/2014 13:19 FOG::TaskReboot Starting Task Reboot…
04/11/2014 13:19 FOG::UserTracker Starting user tracking process…
04/11/2014 13:19 FOG::SnapinClient Starting snapin client process…
04/11/2014 13:19 FOG::UserCleanup Sleeping for 17 seconds.
04/11/2014 13:19 FOG::DisplayManager Attempting to connect to fog server…
04/11/2014 13:19 FOG::SnapinClient Sleeping for 386 seconds.
04/11/2014 13:19 FOG::HostRegister Attempting to connect to fog server…
04/11/2014 13:19 FOG::UserTracker Attempting to connect to fog server…
04/11/2014 13:19 FOG::TaskReboot Attempting to connect to fog server…
04/11/2014 13:19 FOG::HostnameChanger Attempting to connect to fog server…
04/11/2014 13:19 FOG::PrinterManager General Error Returned:
04/11/2014 13:19 FOG::PrinterManager #!er:No Host Found
04/11/2014 13:19 FOG::AutoLogOut Unknown error, module will exit.
04/11/2014 13:19 FOG::TaskReboot #!er:No Host Found
04/11/2014 13:19 FOG::TaskReboot No task found for client.
04/11/2014 13:19 FOG::UserTracker Unknown error, module will exit.
04/11/2014 13:19 FOG::DisplayManager Unknown error, module will exit.
04/11/2014 13:19 FOG::HostnameChanger Module is active…
04/11/2014 13:19 FOG::HostnameChanger AD mode requested, confirming settings.
04/11/2014 13:19 FOG::HostnameChanger Hostname is up to date
04/11/2014 13:19 FOG::HostnameChanger Attempting to join domain if not already a member…
04/11/2014 13:19 FOG::HostnameChanger Domain Error! (‘SetupAlreadyJoined’ Code: 2691)
04/11/2014 13:19 FOG::UserCleanup Starting user cleanup process…
04/11/2014 13:19 FOG::UserCleanup Attempting to connect to fog server…
04/11/2014 13:19 FOG::UserCleanup Module is disabled globally on the FOG Server, exiting.
04/11/2014 13:20 FOG::DirCleaner Starting directory cleaning process…
04/11/2014 13:20 FOG::DirCleaner Attempting to connect to fog server…
04/11/2014 13:20 FOG::DirCleaner Module is disabled globally on the FOG Server.
04/11/2014 13:20 FOG::GreenFog Starting green fog…
04/11/2014 13:20 FOG::GreenFog Attempting to connect to fog server…
04/11/2014 13:20 FOG::GreenFog Module is disabled globally on the FOG Server, exiting.
04/11/2014 13:21 FOG::MODDebug Reading config settings…
04/11/2014 13:21 FOG::MODDebug Reading of config settings passed.
04/11/2014 13:21 FOG::MODDebug Starting Core processing…
04/11/2014 13:21 FOG::MODDebug Operating System ID: 6
04/11/2014 13:21 FOG::MODDebug Operating System Minor: 1
04/11/2014 13:21 FOG::MODDebug MAC ID 0 00:50:56:C0:00:01
04/11/2014 13:21 FOG::MODDebug MAC ID 1 00:50:56:C0:00:08
04/11/2014 13:21 FOG::MODDebug MAC ID 2 54:BE:F7:37:B2:E1
04/11/2014 13:21 FOG::MODDebug MAC POST String: 00:50:56:C0:00:01|00:50:56:C0:00:08|54:BE:F7:37:B2:E1
04/11/2014 13:21 FOG::MODDebug A user is currently logged in
04/11/2014 13:21 FOG::MODDebug Username: (my username)
04/11/2014 13:21 FOG::MODDebug Hostname: (hostname)
04/11/2014 13:21 FOG::MODDebug Attempting to open connect to: http://fog.cst.beds.ac.uk/fog/service/debug.php
04/11/2014 13:21 FOG::MODDebug Server responded with: Hello FOG Client
04/11/2014 13:21 FOG::MODDebug Module has finished work and will now exit.
04/11/2014 13:25 FOG::ClientUpdater Attempting to connect to fog server…
04/11/2014 13:25 FOG::ClientUpdater Module is active…
04/11/2014 13:25 FOG::ClientUpdater Checking Status : AutoLogOut.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : ClientUpdater.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : DirCleaner.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : DisplayManager.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : GreenFog.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : GUIWatcher.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : HostRegister.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : MODDebug.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : PrinterManager.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : SnapinClient.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : TaskReboot.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : UserCleanup.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : UserTracker.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : config.ini
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater 0 new modules found!
04/11/2014 13:25 FOG::ClientUpdater Client update will be applied during next service startup.
04/11/2014 13:25 FOG::ClientUpdater Client update process complete, exiting…
04/11/2014 13:26 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:26 FOG::SnapinClient General Error Returned:
04/11/2014 13:26 FOG::SnapinClient #!er:No Host Found
[/CODE]
I don’t get what is wrong. Some additional notes:
[LIST]
[*]Other rooms work fine
[*]Host can ping fog through DNS and IP
[*]Hosts are all registered on FOG
[*]Boot menu shows it is registered
[*]Host is joined to AD (this is how I log onto the PCs)
[*]Hosts have successfully had snapin tasks applied before
[*]Out of ~500 PCs, these ~100 machines that don’t deploy their snapin all happen to have SSDs. The task I was trying to deploy to them was to extend their hard drive - manually, this same snapin works fine on the same PCs. And half of one of the rooms did seem to deploy fine. Just, bizarrely, about 100 haven’t and they all are in the SSD labs.
[/LIST]
I can well believe the cause isn’t anything complicated, but I would really appreciate any pointers anyone can give. Thanks!
-
Weird thing here also is that even after I cancelled a load of tasks, they still show up under “Active Multicast Tasks”. I can’t even cancel them from this page - cancel doesn’t actually do anything. But many of those tasks were deleted already by me from the all tasks page…
[url=“/_imported_xf_attachments/1/1480_er.png?:”]er.png[/url]
-
For the second problem: there are times when FOG doesn’t really clean up tasks in the actual database. For that, I use phpMyAdmin to go into the FOG database and clear (literally delete the entries in) the “multicastSessions” and “multicastSessionsAssoc” tables. Sometimes a session gets stuck in there. For example, you queue up a group of 30 computers, only 29 actually come online, and the session is still waiting for that last one when you delete the task, so it’s left ‘unfinished’ in the database. That causes some confusion with future tasks, so I just clear it out manually from time to time.
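If you would rather script that cleanup than click through phpMyAdmin, something along these lines might do it. This is only a rough sketch: the two table names are the ones mentioned above, but the `id` column and the in-memory SQLite database are stand-ins so the example runs anywhere. A real FOG install uses MySQL with its own schema, so back up the database before trying anything like this.

```python
import sqlite3

# Table names as mentioned in the post; everything else here is a stand-in.
STUCK_TABLES = ("multicastSessions", "multicastSessionsAssoc")


def clear_stuck_multicast(conn):
    """Delete every row from the multicast session tables.

    Returns a dict mapping table name -> number of rows that were removed.
    """
    removed = {}
    cur = conn.cursor()
    for table in STUCK_TABLES:
        # Table names come from the fixed tuple above, never from user input.
        count = cur.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        cur.execute(f"DELETE FROM {table}")
        removed[table] = count
    conn.commit()
    return removed


def demo():
    # Hypothetical minimal schema in an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    for table in STUCK_TABLES:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO multicastSessions (id) VALUES (?)", [(1,), (2,)]
    )
    conn.execute("INSERT INTO multicastSessionsAssoc (id) VALUES (1)")
    return clear_stuck_multicast(conn)
```

Calling `demo()` builds the two stand-in tables, adds a few rows and reports how many were cleared from each. Against a real server you would pass in a MySQL connection instead and make sure no multicast task is genuinely running first.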
The first problem, the “#!er:No Host Found” error, can be caused by:
- The MAC address the client reports not being associated with the host. That sometimes happens when a computer has a wifi MAC you don’t know about, is using a VPN / masking for some stupid reason, or has had its wifi card replaced. I’m sure that is not your problem though - you seem to have taken care of it and registered ALL possible MAC addresses for your hosts (all laptops have at least 2 MACs, one for the wifi card and one for the physical LAN port).
- A more interesting problem I ran into is multiple SSIDs. Yes, it’s weird. Basically, your laptop boots up (btw, this ONLY affects laptops) and the FOG client starts up as a service like normal. At my school we recently switched SSID passwords, so I ended up adding 2 wifi network profiles - one with the old password, one with the new. The idea was to let the laptop connect to whichever wifi accepts the password: it may connect directly to the new network, or it may try the old network, error out, then switch to the new one. Assigning multiple profiles to the wifi card works if you are going to switch the wifi password mid-school year. The problem for FOG is that when the client starts up, it looks for an active MAC address to send to the server. If your computer is unlucky, it will try to connect to the old network first. That fails, but during that time the FOG client can’t find a proper MAC address and errors out, in the sense that it looks for a “null” MAC address on the FOG server. The solution I found is to set the FOG service to “Delayed Start”, which gives the laptop plenty of time to work out which wifi network works and connect to it properly before the client looks for a MAC address.
TL;DR: FOG needs a ‘proper’ MAC address from Windows - if your wifi card is too slow to connect to wifi, the FOG client will not have a MAC address to look up on the FOG server.
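For pushing the “Delayed Start” change out to a lot of machines, a small script is one option. The sketch below just wraps the standard Windows `sc config ... start= delayed-auto` command. The service name “FOG Service” is an assumption on my part, so check the real name in services.msc first, and note that it has to run from an elevated prompt on the client.

```python
import subprocess

# Assumption: the client's Windows service is named "FOG Service".
# Verify the actual name in services.msc before using this.
DEFAULT_SERVICE = "FOG Service"


def delayed_start_command(service_name=DEFAULT_SERVICE):
    """Build the sc.exe call that sets a service to Automatic (Delayed Start)."""
    # sc.exe expects "start=" and its value as two separate tokens.
    return ["sc", "config", service_name, "start=", "delayed-auto"]


def apply_delayed_start(service_name=DEFAULT_SERVICE):
    """Run the command on the local Windows machine (needs admin rights)."""
    return subprocess.run(delayed_start_command(service_name), check=True)
```

`delayed_start_command()` only builds the argument list, so you can print and check it before calling `apply_delayed_start()` on a test machine.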
-
Not using wireless at all. And as you say, the hosts are all in the database (which is apparent from the fog log too). It is really confusing and kind of frustrating too to be honest as I have no idea really what to even look for to resolve this.
I’ll try the wiping of those sessions though, cheers!
-
Mental.
[url=“/_imported_xf_attachments/1/1491_mental.png?:”]mental.png[/url]
-
The weird thing is that in this room of 21 PCs, 2 apparently managed to take a snapin after I wiped the snapins, tasks and multicast sessions. No idea why though, as their fog.log files are spammed with GUI notification/dispatch failed entries (which apparently stop after you log in).
But, really bizarrely, manually restarting the FOG service on one of the PCs seemed to fix this issue. After a restart of the PC, though, the problem comes back. This just makes no sense at all!
-
Two log files here - the first is when I had restarted the FOG service (left side) and the second is after a restart (right side). It just makes no sense to me as to why errors are being returned like this. The only difference between hardware types in our rooms is the presence of an SSD in these labs that has never caused an issue before!
[url=“/_imported_xf_attachments/1/1492_worksdoesnt1.png?:”]worksdoesnt1.png[/url][url=“/_imported_xf_attachments/1/1493_worksdoesnt2.png?:”]worksdoesnt2.png[/url]
-
After trying some other PCs, I have a feeling that this might be to do with the speed at which the PCs using SSDs boot up. If anyone wants to correct me on that, please do. What I’ll do is, first, reimage one HDD and one SSD machine to see if they behave the same way and, second, set the FOG service to delayed start and then reimage some machines to see if that changes anything.
-
What version of FOG are you running?
Do you have VMware/VirtualBox etc. installed on these lab computers?
-
FOG 1.2
All our PCs have VMware Workstation installed (hence the two additional MAC addresses).
-
And since it was installed on the image you deployed, the randomized MAC address on the virtual network adapter is identical on every machine. Multiple hosts sharing the same MAC address results in the server treating it as invalid and not delivering tasks. This has been corrected in SVN. If you’re willing to update to the latest dev version, this problem will almost certainly go away.
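If you want to check whether this applies to you before upgrading, a quick script can list the MACs that more than one host reports. This is only a sketch of the idea, not how the server does it. The first host below uses the MAC POST string from the log earlier in the thread; the second host and both host names are made up for illustration.

```python
from collections import defaultdict


def parse_mac_post_string(post_string):
    """Split a pipe-separated MAC list like the one MODDebug logs."""
    return [mac for mac in post_string.split("|") if mac]


def shared_macs(hosts):
    """Return {mac: sorted host names} for MACs reported by more than one host."""
    seen = defaultdict(set)
    for host, macs in hosts.items():
        for mac in macs:
            seen[mac.strip().upper()].add(host)
    return {mac: sorted(names) for mac, names in seen.items() if len(names) > 1}


# Host names and the second host's physical MAC are hypothetical.
EXAMPLE_HOSTS = {
    "lab-pc-01": parse_mac_post_string(
        "00:50:56:C0:00:01|00:50:56:C0:00:08|54:BE:F7:37:B2:E1"
    ),
    "lab-pc-02": ["00:50:56:C0:00:01", "00:50:56:C0:00:08", "54:BE:F7:00:00:02"],
}
```

`shared_macs(EXAMPLE_HOSTS)` returns only the two `00:50:56` addresses (that prefix belongs to VMware), each listed against both hosts, while the physical MACs stay unique.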
(Make a backup and be ready to revert in case you have problems. The dev version is mostly stable but can have issues; it IS pre-release software.)
-
I’ll try and update to it - is it going to eventually be a part of the next FOG update then?
And what do you think would make this issue present on only some machines (which happen to all be machines with SSDs)? I never accepted any of the pending MAC addresses (as this did give an issue in 0.32 when I tried to accept one, once) and unless the machine boots from an SSD, it works fine.
Thanks!
-
Yes, this will be part of FOG 1.3.0.
I have never noticed any difference in behavior between computers with HDDs and SSDs so long as they image properly. (I have heard of some SSDs not being recognized and being unable to be imaged, but that has been a very rare issue.)
I also noticed that your hardware network card is coming up as “Local Area Connection 2”.
Is there more than one physical network adapter in these machines?
-
On most of our PCs there are, yes - DQ77MK motherboards have an Intel AMT management port and we avoid using this. I’ll reply back here once I’ve checked whether the delayed start works as I thought it might - but I don’t think having an SSD in the system should make any difference at all. But… it does seem to. Somehow.
-
It looks to me like the VMware MAC isn’t the issue then, since you say you’re aware of the related issues and you’ve never approved a pending MAC. The issue seems to be that the wrong network adapter is registered with FOG.
-
Yeah, we know about it and have worked around it with no hassle. Actually, it has been a way to tell if someone has plugged a machine into the wrong port, and we have started disabling the second network controller in the BIOS on many machines. I am pretty certain that the adapter isn’t the issue (if it was, it wouldn’t work in either case).
But the key point is that with the same image, on essentially the same hardware, the fog service acts differently - seemingly down to whether or not it uses an SSD. On the SSD machine, the fog service gives the errors - but if you restart the service (stop and start), it works fine.
I just tried setting it to “Automatic (Delayed Start)” and - bizarrely - it is now working fine on the machines I have done so far. Could it be that FOG relies on some service that hasn’t started by the point it is needed? It’s really all I can think of - and that could ultimately be down to my individual build. But it isn’t something we have ever seen as an issue before.
-
What version of Windows is installed on these machines? If it is Windows 8, then it could actually be somewhat related to the SSD. Windows 8, I am told, reports network adapters as disabled unless it can detect an attached network, which means we can’t get the MAC address for them. The current FOG client only polls the hardware once on startup, and it requests all active network adapters. If the system is starting up and the FOG service starts before the network adapter has initialized, that could be causing the issue.
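To illustrate the difference between polling once and waiting for the adapter, here is a rough sketch of a retry loop. To be clear, this is not how the current client behaves (it polls once, as described above); `poll` is a hypothetical callable standing in for whatever returns the MACs of the active adapters.

```python
import time


def wait_for_macs(poll, attempts=10, delay=5.0, sleep=time.sleep):
    """Call poll() until it returns a non-empty list of MACs.

    Gives up after `attempts` tries and returns an empty list, which is
    the situation a single early poll can end up in.
    """
    for attempt in range(attempts):
        macs = poll()
        if macs:
            return macs
        # No point sleeping after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return []
```

With `attempts=1` this behaves like a single poll at startup, so a machine that boots faster than its network adapter initializes would come back with nothing. Delaying the service start gets a similar effect to a longer retry window without touching the client.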
-
Windows 7. But this sounds kind of logical…
-
Windows 7 doesn’t have the same issue. What power saving features do you have enabled on these computers? And does WOL work? If the onboard network card is getting set to a low power mode, that could be related.
-
Unethically, we currently have everything set to not power down or use power saving modes. WOL might work, but for the last year or so it’s been on the back burner because getting it to work with systems we have limited control over was proving a hassle.