Weird Host behavior (some disappearing/losing primary mac, some suddenly needing approval)
-
@Sebastian-Roth I realize it’s been a while, but it’s started happening again. I found and fixed a bug in an internal script that was causing part of the issue during imaging (it had to do with automatically approving pending MACs after imaging a computer). But I am still getting hosts that are in production randomly becoming pending hosts.
-
@Sebastian-Roth I found that there was some need for database cleanup as I had some invalid entries. I’ll paste in my internal notes on what I did. Sadly I am still getting random pending hosts. I think that maybe something in the new version of the fog client (0.12.0, I believe) is causing some hosts that previously had some sort of invalid data in the database, which made them not viewable, to become pending and then get fixed. Some of them were valid beforehand, though. So I still don’t know the cause of the behavior, but I at least made some api commands to work around the issue when it happens during deployment.
I also found that there were int, string, and null data types in my hostPending field, i.e. some hosts had a blank/null field, some matched a where clause of
hostPending = '1'
some matched
hostPending = 1
some matched
hostPending = 0
and some matched
hostPending = '0'
After testing an approval of a host in the gui, I manually set all hosts in the database to
hostPending = '0'
to make it uniform. I tried to find the code in the fog project where hosts are set to pending/approved but don’t think I found every place. It did look like when they are added via pxe (the method I usually use) it sets pending to $null. But I was having trouble following it all perfectly. I’d be interested in seeing everything that determines whether a host should become pending or not if you’re willing to help me find all the code where it happens.
In the actual database there are 26 more entries than what we see in the gui or api.
The hostPending field in the database seems to have multiple ways of being set: some are 0, some are '0', some are 1, and some are '1'.
There must be something else that determines whether a host shows up in the gui or api as pending, or at all.
Currently seeing 7 hosts with hostPending of '1', but only one of them shows up in pending hosts in the gui/api at all.
MariaDB [fog]> select hostID,hostname,hostPending,hostCreateDate,hostLastDeploy,hostImage from hosts where hostPending in ('1');
+--------+-----------------+-------------+---------------------+---------------------+-----------+
| hostID | hostname        | hostPending | hostCreateDate      | hostLastDeploy      | hostImage |
+--------+-----------------+-------------+---------------------+---------------------+-----------+
|   1519 | DESKTOP-G4VN9T5 | 1           | 2016-11-11 10:22:53 | 0000-00-00 00:00:00 |         0 |
|   1710 | ARROWHE-AS1BQVC | 1           | 2019-11-14 15:51:54 | 0000-00-00 00:00:00 |         0 |
|   1719 | SCH-Wks-6077    | 1           | 2020-08-04 21:10:17 | 0000-00-00 00:00:00 |         0 |
|   1715 | mss-dcamTmill2  | 1           | 2020-08-19 07:44:06 | 0000-00-00 00:00:00 |         0 |
|   1729 | MD-Bcode-Test   | 1           | 2020-08-11 20:18:17 | 0000-00-00 00:00:00 |         0 |
|   1721 | MSS-LTS-6080    | 1           | 2020-08-07 09:11:26 | 0000-00-00 00:00:00 |         0 |
|   1725 | imp-wks-6085    | 1           | 2020-08-20 08:51:21 | 0000-00-00 00:00:00 |         0 |
+--------+-----------------+-------------+---------------------+---------------------+-----------+
7 rows in set (0.00 sec)
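For anyone wanting to audit the same thing on their own server, here is a minimal Python sketch that tallies how many different representations of hostPending are in play. It assumes the raw values have already been pulled from the database or the api; the function names and the sample list are made up for illustration, and treating blank/NULL as "unset" is my assumption, not confirmed FOG behavior.

```python
from collections import Counter


def classify_pending(value):
    """Map a raw hostPending value to 'pending', 'approved', 'unset' or 'unknown'."""
    if value is None:
        return "unset"
    text = str(value).strip()
    if text == "":
        return "unset"
    if text == "1":
        return "pending"
    if text == "0":
        return "approved"
    return "unknown"


def tally_representations(values):
    """Count each (type name, repr) pair so that 1 and '1' are reported separately."""
    return Counter((type(v).__name__, repr(v)) for v in values)


# Hypothetical sample mixing the int, string, blank and null forms described above.
sample = [1, "1", 0, "0", None, "", "0"]
states = Counter(classify_pending(v) for v in sample)
shapes = tally_representations(sample)

for (type_name, shown), count in sorted(shapes.items()):
    print(f"{type_name:8} {shown:6} x{count}")
print(dict(states))
```

Counting by type as well as by content is the point here: a plain comparison against '1' would hide whether the stored value was an int or a string.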
Ran
update hosts set hostPending = '0';
which updated all 286 rows. There was no change in the number of hosts the api returns, but the pending hosts list link has now disappeared from the gui. Going to set all hosts that have no image id set to the stable image and see if that changes anything.
Ran
update hosts set hostImage = 27 where hostImage != 29;
to set all hosts that aren’t set to the dev image to the stable image.
The api is still showing 257 hosts, the database shows 283 rows in the hosts table, and in the fog gui there are 274 members of the everyone group, which I just checked and found to indeed have all possible hosts in it.
Did some cleanup on groups that are no longer used and deleted hosts within them that couldn’t be brought up in the host view. Then emptied the everyone group and re-added all available hosts. The everyone group count and api group count are now the same, but the database is still showing 283 hosts.
Ran these commands to clean up the database of stale records:
DELETE FROM `hosts` WHERE `hostID` NOT IN (SELECT `hmhostID` FROM `hostMAC`);
DELETE FROM `hostMAC` WHERE `hmhostID` NOT IN (SELECT `hostID` FROM `hosts`);
DELETE FROM `snapinAssoc` WHERE `saHostID` NOT IN (SELECT `hostID` FROM `hosts`);
DELETE FROM `groupMembers` WHERE `gmHostID` NOT IN (SELECT `hostID` FROM `hosts`);
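Since those DELETEs are irreversible, it can help to count what each one would remove first. Below is a rough Python sketch of that preview, run against a throwaway in-memory SQLite database rather than the real fog MariaDB; the table and column names come from the statements above, while the toy rows and the helper name count_orphans are invented for the example.

```python
import sqlite3

# (child table, child column, parent table, parent column), one per DELETE above.
ORPHAN_CHECKS = [
    ("hosts", "hostID", "hostMAC", "hmhostID"),
    ("hostMAC", "hmhostID", "hosts", "hostID"),
    ("snapinAssoc", "saHostID", "hosts", "hostID"),
    ("groupMembers", "gmHostID", "hosts", "hostID"),
]


def count_orphans(conn):
    """Return {table: number of rows the matching DELETE would remove right now}."""
    counts = {}
    for table, column, parent, parent_column in ORPHAN_CHECKS:
        sql = (
            f"SELECT COUNT(*) FROM {table} "
            f"WHERE {column} NOT IN (SELECT {parent_column} FROM {parent})"
        )
        counts[table] = conn.execute(sql).fetchone()[0]
    return counts


# Toy data standing in for the real tables (assumption: IDs only, no other columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hosts (hostID INTEGER);
    CREATE TABLE hostMAC (hmhostID INTEGER);
    CREATE TABLE snapinAssoc (saHostID INTEGER);
    CREATE TABLE groupMembers (gmHostID INTEGER);
    INSERT INTO hosts VALUES (1), (2), (3);
    INSERT INTO hostMAC VALUES (1), (2), (99);
    INSERT INTO snapinAssoc VALUES (1), (42);
    INSERT INTO groupMembers VALUES (2), (3);
""")
orphans = count_orphans(conn)
print(orphans)
```

Two caveats: the counts reflect the state before anything is deleted, so removing MAC-less hosts first can orphan more snapinAssoc and groupMembers rows than the preview shows, and NOT IN matches nothing if the subquery returns a NULL.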
Got the code from this post when searching for forum posts concerning having a hostID of 0: https://forums.fogproject.org/topic/8359/fog-registration-error-hostname-with-that-name-already-exists/20
Database now shows 265 entries and api shows 256.
Used some Set-Clipboard and -join "," magic to create a csv list to paste into the mariadb console and search for all hosts that are not being found in the api. Found that one of the hosts is my tablet and that the fog log showed it as 'not encrypted'. Ran the following to do a database-level client encryption reset on the 9 missing hosts:
update hosts set hostsectoken="",hostpubkey="",hostsectime ="0000-00-00 00:00:00" where hostid not in (83,1645,1657,1500,1639,138,78,77,1596,1647,1469,1531,1604,1513,1605,1634,147,1543,1467,1607,1606,1466,1512,1490,1495,1631,109,1665,101,1534,1533,1659,1562,1660,1633,1628,154,1541,16,1552,82,1551,1706,1708,1714,151,156,152,157,150,1470,1574,1062,107,1679,76,72,85,137,1640,1520,86,1581,80,1540,1594,1676,130,1501,1526,1535,1545,1559,1590,1616,1627,1626,1527,1601,1686,1693,91,1644,1672,358,141,100,1553,302,1477,1479,1667,1646,93,19,92,1661,1670,155,1489,1664,142,124,1692,127,1554,1673,1724,1725,1598,24,1690,1578,1585,1705,1663,136,1656,1701,1577,1557,1736,1593,1674,103,22,1503,66,95,1652,1472,1635,133,1458,110,96,1532,1516,1678,1641,1536,1544,1481,1480,1658,111,395,1564,1563,1702,1599,1675,1682,1697,1698,1515,1671,1681,1685,1666,1560,1468,1695,1696,1720,1704,1632,1456,1497,1680,1653,1649,105,1619,1482,34,1618,140,131,25,1546,69,1688,1474,1473,1476,1475,1683,1700,1699,1711,1712,1713,1707,1735,1737,1738,1691,1517,448,1638,1614,280,1662,108,13,1502,1493,1576,21,31,1491,1630,128,1580,94,1494,1492,1629,29,1556,1510,1498,119,1558,1740,1588,1643,1496,1642,116,1612,37,1620,1739,1621,1624,129,1623,126,1622,1625,98,1689,1694,153,1709,1597,1548,1655,1684,1488,1611,90,1668,88);
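The same comparison can be done without the clipboard step. Here is a small Python sketch, assuming you already have one list of host IDs from the database and one from the api; the IDs below are a made-up subset and the function names are hypothetical. It produces the short list of missing hosts, which could then go into an IN (...) clause instead of a NOT IN (...) over every visible host.

```python
def missing_from_api(db_ids, api_ids):
    """Return sorted host IDs that exist in the database but not in the api list."""
    return sorted({int(i) for i in db_ids} - {int(i) for i in api_ids})


def as_sql_list(ids):
    """Render IDs as a comma-separated list; int() rejects anything non-numeric."""
    return ",".join(str(int(i)) for i in ids)


# Illustrative values only; the api tends to hand IDs back as strings.
db_ids = [1519, 1710, 83, 1645, 1703]
api_ids = ["83", "1645"]

missing = missing_from_api(db_ids, api_ids)
sql_list = as_sql_list(missing)
query = f"select hostID,hostname,hostPending from hosts where hostID in ({sql_list});"
print(query)
```

Normalizing both sides to int before the set difference avoids a string "83" and an integer 83 being treated as different hosts.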
There were 2 unknown hosts in the list of 9 issue hosts:
+--------+-----------------+-------------+
| hostid | hostname        | hostPending |
+--------+-----------------+-------------+
|   1519 | DESKTOP-G4VN9T5 | 0           |
|   1710 | ARROWHE-AS1BQVC | 0           |
|   1723 | imp-wks-6081    | 0           |
|   1719 | SCH-Wks-6077    | 0           |
|   1703 | IT-Tab-6026     | 1           |
|   1729 | MD-Bcode-Test   | 0           |
|   1718 | MSS-Lap-6076    | 0           |
|   1728 | ops-cb-6087     | 0           |
|   1721 | MSS-LTS-6080    | 0           |
+--------+-----------------+-------------+
So I ran these sql commands to remove them from the database completely:
DELETE FROM `hosts` WHERE `hostID` IN (1519,1710);
DELETE FROM `hostMAC` WHERE `hmhostID` IN (1519,1710);
DELETE FROM `snapinAssoc` WHERE `saHostID` IN (1519,1710);
DELETE FROM `groupMembers` WHERE `gmHostID` IN (1519,1710);
-
@JJ-Fullmer Thanks heaps for the details and all your work concerning this issue. I will have a bit of time in the next few days to look into this and get back to you.
-
@Sebastian-Roth I saw it happen again this morning. In at least one case it seems to happen right after a snapin hash doesn’t match (though I have other scenarios where it’s happening that this doesn’t match up with).
I attached the fog.log section where the host went through Authenticated -> start snapins -> Snapin hash mismatch -> Invalid host.
How my hashes keep getting mismatches is a whole other question, because I use the exact same file for every snapin, and said file hasn’t changed in 2 years.
------------------------------------------------------------------------------
--------------------------------Authentication--------------------------------
------------------------------------------------------------------------------
8/25/2020 7:12:24 AM Client-Info Version: 0.12.0
8/25/2020 7:12:24 AM Client-Info OS: Windows
8/25/2020 7:12:24 AM Middleware::Authentication Waiting for authentication timeout to pass
8/25/2020 7:12:24 AM Middleware::Communication Download: http://fogserver/fog/management/other/ssl/srvpublic.crt
8/25/2020 7:12:24 AM Data::RSA FOG Server CA cert found
8/25/2020 7:12:24 AM Middleware::Authentication Cert OK
8/25/2020 7:12:24 AM Middleware::Communication POST URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&authorize&newService
8/25/2020 7:12:24 AM Middleware::Response Success
8/25/2020 7:12:24 AM Middleware::Authentication Authenticated
8/25/2020 7:12:24 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&configure&newService&json
8/25/2020 7:12:24 AM Middleware::Response Success
8/25/2020 7:12:24 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
8/25/2020 7:12:25 AM Middleware::Response Success
8/25/2020 7:12:25 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?clientver&newService&json
8/25/2020 7:12:25 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?newService&json
8/25/2020 7:12:25 AM Service Creating user agent cache
8/25/2020 7:12:25 AM Middleware::Response Module is disabled globally on the FOG server
8/25/2020 7:12:25 AM Middleware::Response Module is disabled globally on the FOG server
8/25/2020 7:12:25 AM Middleware::Response Module is disabled globally on the FOG server
8/25/2020 7:12:25 AM Service Initializing modules
------------------------------------------------------------------------------
---------------------------------ClientUpdater--------------------------------
------------------------------------------------------------------------------
8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
8/25/2020 7:12:25 AM Client-Info Client OS: Windows
8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:12:25 AM Middleware::Response Success
------------------------------------------------------------------------------
------------------------------------------------------------------------------
----------------------------------TaskReboot----------------------------------
------------------------------------------------------------------------------
8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
8/25/2020 7:12:25 AM Client-Info Client OS: Windows
8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:12:25 AM Middleware::Response Success
------------------------------------------------------------------------------
------------------------------------------------------------------------------
--------------------------------HostnameChanger-------------------------------
------------------------------------------------------------------------------
8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
8/25/2020 7:12:25 AM Client-Info Client OS: Windows
8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:12:25 AM Middleware::Response Success
8/25/2020 7:12:25 AM HostnameChanger Checking Hostname
8/25/2020 7:12:25 AM HostnameChanger Hostname is correct
8/25/2020 7:12:25 AM HostnameChanger Attempting to join domain
8/25/2020 7:12:25 AM HostnameChanger Host already joined to target domain
------------------------------------------------------------------------------
------------------------------------------------------------------------------
---------------------------------SnapinClient---------------------------------
------------------------------------------------------------------------------
8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
8/25/2020 7:12:25 AM Client-Info Client OS: Windows
8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:12:25 AM Middleware::Response Success
8/25/2020 7:12:25 AM SnapinClient Running snapin DeviceForm - Desktop
8/25/2020 7:12:25 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5462
8/25/2020 7:12:26 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
8/25/2020 7:12:26 AM Bus Emmiting message on channel: Notification
8/25/2020 7:12:26 AM SnapinClient Starting snapin
8/25/2020 7:13:00 AM SnapinClient Snapin finished
8/25/2020 7:13:00 AM SnapinClient Return Code: 0
8/25/2020 7:13:00 AM Bus Emmiting message on channel: Notification
8/25/2020 7:13:00 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5462&exitcode=0&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
8/25/2020 7:13:00 AM SnapinClient Running snapin mitel-pm
8/25/2020 7:13:00 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5475
8/25/2020 7:13:01 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
8/25/2020 7:13:01 AM Bus Emmiting message on channel: Notification
8/25/2020 7:13:01 AM SnapinClient Starting snapin
8/25/2020 7:14:35 AM SnapinClient Snapin finished
8/25/2020 7:14:35 AM SnapinClient Return Code: 0
8/25/2020 7:14:35 AM Bus Emmiting message on channel: Notification
8/25/2020 7:14:35 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5475&exitcode=0&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
8/25/2020 7:14:35 AM SnapinClient Running snapin office2010-Std
8/25/2020 7:14:35 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5476
8/25/2020 7:14:36 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
8/25/2020 7:14:36 AM Bus Emmiting message on channel: Notification
8/25/2020 7:14:36 AM SnapinClient Starting snapin
8/25/2020 7:22:15 AM SnapinClient Snapin finished
8/25/2020 7:22:15 AM SnapinClient Return Code: 0
8/25/2020 7:22:15 AM Bus Emmiting message on channel: Notification
8/25/2020 7:22:15 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5476&exitcode=0&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
8/25/2020 7:22:15 AM SnapinClient Running snapin outlook2016
8/25/2020 7:22:15 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5477
8/25/2020 7:22:16 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
8/25/2020 7:22:16 AM SnapinClient ERROR: Hash does not match
8/25/2020 7:22:16 AM SnapinClient ERROR: --> Ideal: 817B881A4D07DCFF51F67A47D9595DDE59D010FE0C018BDF5B57C502CB83A14F3085AF48987B392E8BF38E2C6B29D094055BDFFB6BBCD5647697D64AA4BDD668
8/25/2020 7:22:16 AM SnapinClient ERROR: --> Actual: 5271D1B9F35EA9FC2F1E3CE474C20641A15B37A02035E7E8210DFB513518CBC79FDFD25092A41491C2AB02AF8E0913C99819C6066B9883F398D90C5D045E4A1D
8/25/2020 7:22:16 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5477&exitcode=-1&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
------------------------------------------------------------------------------
------------------------------------------------------------------------------
--------------------------------PrinterManager--------------------------------
------------------------------------------------------------------------------
8/25/2020 7:22:16 AM Client-Info Client Version: 0.12.0
8/25/2020 7:22:16 AM Client-Info Client OS: Windows
8/25/2020 7:22:16 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:22:16 AM Middleware::Response Module is disabled globally on the FOG server
------------------------------------------------------------------------------
------------------------------------------------------------------------------
--------------------------------PowerManagement-------------------------------
------------------------------------------------------------------------------
8/25/2020 7:22:16 AM Client-Info Client Version: 0.12.0
8/25/2020 7:22:16 AM Client-Info Client OS: Windows
8/25/2020 7:22:16 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:22:16 AM Middleware::Response Success
8/25/2020 7:22:16 AM PowerManagement Calculating tasks to unschedule
8/25/2020 7:22:16 AM PowerManagement Calculating tasks to schedule
------------------------------------------------------------------------------
------------------------------------------------------------------------------
----------------------------------UserTracker---------------------------------
------------------------------------------------------------------------------
8/25/2020 7:22:16 AM Client-Info Client Version: 0.12.0
8/25/2020 7:22:16 AM Client-Info Client OS: Windows
8/25/2020 7:22:16 AM Client-Info Server Version: 1.5.9-RC2
8/25/2020 7:22:16 AM Middleware::Response Success
8/25/2020 7:22:16 AM Middleware::Communication URL: http://fogserver/fog/service/usertracking.report.php?action=login&user=SHP-WKS-6096\AdlAdmin&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
------------------------------------------------------------------------------
8/25/2020 7:22:16 AM Service Sleeping for 94 seconds
8/25/2020 7:23:50 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&configure&newService&json
8/25/2020 7:23:50 AM Middleware::Response Success
8/25/2020 7:23:50 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
8/25/2020 7:23:50 AM Middleware::Response Invalid host
8/25/2020 7:23:50 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?clientver&newService&json
8/25/2020 7:23:51 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?newService&json
8/25/2020 7:23:51 AM Service Creating user agent cache
8/25/2020 7:23:51 AM Middleware::Response Module is disabled globally on the FOG server
8/25/2020 7:23:51 AM Middleware::Response Module is disabled globally on the FOG server
8/25/2020 7:23:51 AM Middleware::Response Module is disabled globally on the FOG server
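To check whether hash mismatches and "Invalid host" responses actually line up across many clients, a log scan along these lines might help. This is only a sketch: it assumes the fog.log layout shown above (one entry per line, US-style timestamp, then the source, then the message), and the function name, the marker list, and the three sample lines are picked just for the example.

```python
import re
from datetime import datetime

# Timestamp, source (no spaces), then the rest of the line as the message.
LINE = re.compile(
    r"^\s*(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) (\S+) (.*)$"
)
MARKERS = ("Hash does not match", "Invalid host")


def find_events(log_text):
    """Return (timestamp, source, message) for every line containing a marker."""
    events = []
    for line in log_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # separator or section header
        stamp, source, message = match.groups()
        if any(marker in message for marker in MARKERS):
            when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
            events.append((when, source, message.strip()))
    return events


sample_log = """\
------------------------------------------------------------------------------
8/25/2020 7:22:16 AM SnapinClient ERROR: Hash does not match
8/25/2020 7:22:16 AM Service Sleeping for 94 seconds
8/25/2020 7:23:50 AM Middleware::Response Invalid host
"""
events = find_events(sample_log)
gap_seconds = (events[1][0] - events[0][0]).total_seconds()
for when, source, message in events:
    print(when.isoformat(), source, message)
```

Two events landing close together in time only shows they are adjacent in the log; it says nothing about one causing the other.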
-
@Sebastian-Roth I am also getting hosts becoming pending for no apparent reason. My best guess at the moment is that it’s related to the service as it only seems to happen when the service is running. I’d rather like to keep it running.
-
@Sebastian-Roth
This is what I’m doing to fix the problem via the api, which seems to work most of the time. A couple of times either the fog service or my code caused the host to stop showing up in the api/gui list of hosts, though the host would still be in the database. When that happens I have to reset the host encryption fields and the pending field in the database hosts table for each affected host and then restart the fog service on the host, and it shows back up again.
Below is the PowerShell code:
$mac = get-activeMacAddress; #function that filters output of get-netadapter to the currently active adapter's mac
$hostObj = Get-FogHost -macAddr $mac; #function from fogapi module can be found here https://github.com/darksidemilk/FogApi/blob/master/FogApi/Public/Get-FogHost.ps1
if ($hostObj.pending -ne '0') {
    Write-Warning "The host is pending or not explicitly set to not pending in fog, adjust host to be approved";
    Reset-HostEncryption -fogHost $hostObj; #function from fogapi module that resets the encryption fields of a host see https://github.com/darksidemilk/FogApi/blob/master/FogApi/Public/Reset-HostEncryption.ps1
    Set-FogOU -hostObj $hostObj -force; #custom internal function, will share applicable contents below
}
This is what Set-FogOU does:
function Set-FogOU {
    <#
    .SYNOPSIS
    Set the proper OU on the fog host based on the hostname and group membership

    .DESCRIPTION
    Gets the current fog host information and then uses get-oustr to get the OU it should be in based on group membership and the hostname

    .PARAMETER jsonData
    The jsondata string used to set the ou with the fog api

    .PARAMETER hostObj
    the hostobject of the foghost gotten from get-foghost

    .PARAMETER OUstr
    The optional prewritten fogstr if you want to provide it
    #>
    [CmdletBinding()]
    param (
        [string]$jsonData,
        $hostObj,
        [string]$OUstr,
        $progressID=1,
        $parentID=-1,
        [switch]$force
    )
    process {
        if($null -eq $hostObj) {
            Write-Verbose 'no host given, getting current host...';
            $mac = Get-ActiveMacAddress
            $hostObj = Get-FogHost -macAddr $mac
        }
        Write-Verbose 'getting fog group...';
        $group = Get-FogGroup $hostObj.id; #function from fogapi see https://github.com/darksidemilk/FogApi/blob/master/FogApi/Public/Get-FogGroup.ps1
        $OUstr = Get-OUStr -group $group -compName $hostObj.Name; #function that creates correct OUstring from the group name
        $ADPass = ""; #edited for privacy, this is gotten with powershell pscredential encryption methods
        Write-Verbose "OU string is $OUstr, setting to properties of host object, also enabling and enforcing domain join...";
        $hostObj.ADDomain = "domain"; #edited for privacy
        $hostObj.ADUser = "domainUser"; #edited for privacy
        $hostObj.ADPass = $ADPass;
        $hostObj.ADOU = $OUstr;
        $hostObj.useAD = 1;
        $hostObj.enforce = 1;
        $hostObj.pending = '0'; #made it a string instead of an int as it seemed to make a difference
        Write-Verbose 'creating json string from new host ou settings...';
        $jsonData = $hostObj | Select-Object id,ADDomain,ADUser,ADPass,ADOU,useAD,enforce,pending | ConvertTo-Json;
        Update-FogObject -type object -coreObject host -IDofObject $hostObj.id -jsonData $jsonData; #fogapi function to run the update/set api command
    }
}
Basically I’m just doing a PUT api command on the host object to reset the encryption fields, and then another one to set the pending field to '0' and to put all the OU settings back. The json created by the command defining $jsonData looks like this (slightly edited for privacy):
{
    "id": "1593",
    "ADDomain": "domain",
    "ADUser": "domainUser",
    "ADPass": "secretPassword",
    "ADOU": "OU=IT,OU=Workstations,DC=domainName,DC=com",
    "useAD": 1,
    "enforce": 1,
    "pending": "0"
}
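For readers not on PowerShell, here is roughly the same pending check and payload construction sketched in Python. It only builds the JSON body and deliberately sends nothing; the function names, the sample host dictionary, and its "name" field are made up for illustration, and the field list is copied from the Select-Object call in Set-FogOU.

```python
import json

# Same fields, in the same order, that Set-FogOU selects before ConvertTo-Json.
APPROVAL_FIELDS = (
    "id", "ADDomain", "ADUser", "ADPass", "ADOU", "useAD", "enforce", "pending",
)


def needs_approval(host):
    """True unless pending is explicitly 0 or '0' (mirrors the -ne '0' check)."""
    return str(host.get("pending")).strip() != "0"


def build_approval_payload(host, ou):
    """Return the JSON body that re-applies the OU settings and clears pending."""
    updated = dict(host, ADOU=ou, useAD=1, enforce=1, pending="0")
    body = {key: updated[key] for key in APPROVAL_FIELDS if key in updated}
    return json.dumps(body, indent=4)


# Hypothetical host object, shaped like the privacy-edited example above.
host = {
    "id": "1593",
    "name": "example-host",
    "ADDomain": "domain",
    "ADUser": "domainUser",
    "ADPass": "secretPassword",
    "pending": 1,
}
payload = None
if needs_approval(host):
    payload = build_approval_payload(host, "OU=IT,OU=Workstations,DC=domainName,DC=com")
    print(payload)
```

Sending only the selected fields, rather than the whole host object, keeps the update from touching anything else on the host record.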
-
@Sebastian-Roth These weird suddenly pending asset issues are still happening. I’m about to start re-imaging all our computers. May provide more data to debug the problem, or maybe it’ll fix the problem, we’ll see.
-
@Sebastian-Roth All the computers I imaged in the last week seem to now think they’re pending.
Perhaps we could find some time next week to troubleshoot this together?
-
@jj-fullmer said in Weird Host behavior (some disappearing/losing primary mac, some suddenly needing approval):
Perhaps we could find some time next week to troubleshoot this together?
Absolutely. Let’s switch over to chat for this to not flood this topic but update it with our findings later on.
-
Ok, after lots of digging in the code and in @JJ-Fullmer’s database, we seem to have found out what happened. It turned out some of his PowerShell code messed things up by deleting the primary MAC address of some hosts, and the fog-client did its share by more or less resetting some of those hosts’ information and making them pending.
The PowerShell code is fixed already, and I will work on adding some checks so that the fog-client hopefully won’t mess things up as much as it did here.
Initially we thought the snapin hash problems were at fault but the evidence collected points to this being a coincidence.
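If you want to check your own server for hosts that have lost their primary MAC, a query along these lines may be a starting point. It is a sketch only: it assumes the hostMAC table marks the primary address with an hmPrimary column set to '1', which you should verify against your own schema, and it runs here against a throwaway in-memory SQLite database with invented rows rather than a real fog database.

```python
import sqlite3

# Hosts with no hostMAC row flagged as primary (assumed column: hmPrimary = '1').
NO_PRIMARY_MAC_SQL = """
SELECT h.hostID, h.hostname
FROM hosts AS h
LEFT JOIN hostMAC AS m
       ON m.hmHostID = h.hostID AND m.hmPrimary = '1'
WHERE m.hmHostID IS NULL
ORDER BY h.hostID
"""


def hosts_without_primary_mac(conn):
    """Return (hostID, hostname) rows for hosts lacking a primary MAC entry."""
    return conn.execute(NO_PRIMARY_MAC_SQL).fetchall()


# Toy data: host 1 is fine, host 2 only has a secondary MAC, host 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hosts (hostID INTEGER, hostname TEXT);
    CREATE TABLE hostMAC (hmHostID INTEGER, hmMAC TEXT, hmPrimary TEXT);
    INSERT INTO hosts VALUES (1, 'host-a'), (2, 'host-b'), (3, 'host-c');
    INSERT INTO hostMAC VALUES
        (1, '00:11:22:33:44:55', '1'),
        (2, '00:11:22:33:44:66', '0');
""")
rows = hosts_without_primary_mac(conn)
print(rows)
```

The LEFT JOIN with the primary flag in the join condition catches both cases at once: hosts with no MAC rows at all and hosts whose remaining MACs are all non-primary.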