Weird Host behavior (some disappearing/losing primary mac, some suddenly needing approval)


  • Testers

    TL;DR

    Changed the display name of my image.
    Deployed said image.
    Suddenly some hosts have to be approved.
    Hosts are typically added via pxe boot menu. There was a snapin hash error on one host and then it suddenly needed to be approved but didn’t show up in pending hosts list.
    Other hosts did show up in pending hosts lists, but all of them had been authenticated through the fog service and working previously.


    I am on fog 1.5.8.13 on centos 7.

    I have discovered some weird behavior and I’m not sure where it’s coming from.

    I’m still troubleshooting to find more exact parameters for when this is happening so I’ll do my best to explain.

    Scenario 1 - host losing primary mac

    I have a bundle of devices in my environment that don’t have integrated ethernet. So I have a solution for using usb/usb-c ethernet adapters that support pxe.
    I have a handful of these adapters I use for specific models and such. For example, I’m imaging a microsoft surface go 2, it requires a surface usb-c ethernet adapter to pxe boot.
    So I have 1 adapter (they don’t need ethernet once they’re deployed) and I have a function at the very end of my provisioning script that removes the mac address of this usb ethernet adapter, and then approves all the pending macs that the fog client would have found by that point. Thus making it so the host will be found over wifi by the fog client identified by its wifi mac(s). This has worked well for quite some time.

    However, right now I just imaged this surface go 2 and somehow before reaching the part of my provisioning process that would remove that, the host has lost its mac address. If I search for the host in the gui (or via the api) it doesn’t show up anywhere. If I try to make a new one, it says a host with that name already exists. So I searched the database and found that the host still exists there, but it has no matching record in the HostMac table. Which is where I’m getting the ‘lost its primary mac’ as the name of this problem.

    Some other notes: I create the hosts and queue them for imaging in the pxe boot menu (Add host full registration option). The fog client is embedded in my image but the service is disabled before capturing the image. The provisioning process starts it on demand as needed till it’s completely enabled for installing assigned snapins. I noticed that this happened when it was attempting to install snapins because it got stuck in a loop since the fog client log kept throwing an ‘invalid host’ error. I use the fogservice to rename the computer and join the domain, which did happen successfully

    This happened with fog client 0.11.9, I just saw that 0.12.0 is available and added it to my server so it will start updating my clients. I didn’t see anything in the release notes that leads me to believe it will fix anything though. My current hypothesis is that the client is the culprit of this problem, but I don’t know for sure.

    Edit

    I don’t want to remove my other notes but before I finished writing this out did a bit more log tracing.
    I found the time that these things changes via the fog.log

    ------------------------------------------------------------------------------
    ---------------------------------SnapinClient---------------------------------
    ------------------------------------------------------------------------------
     7/17/2020 1:44:51 PM Client-Info Client Version: 0.11.19
     7/17/2020 1:44:51 PM Client-Info Client OS:      Windows
     7/17/2020 1:44:51 PM Client-Info Server Version: 1.5.8.13
     7/17/2020 1:44:51 PM Middleware::Response Success
     7/17/2020 1:44:51 PM SnapinClient Running snapin 7zipZstd
     7/17/2020 1:44:51 PM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=60:6D:3C:3D:41:97|F8:E4:E3:FA:6D:E8|FA:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:EB&taskid=5397
     7/17/2020 1:44:52 PM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
     7/17/2020 1:44:52 PM Bus Emmiting message on channel: Notification
     7/17/2020 1:44:52 PM SnapinClient Starting snapin
     7/17/2020 1:46:27 PM SnapinClient Snapin finished
     7/17/2020 1:46:27 PM SnapinClient Return Code: 0
     7/17/2020 1:46:27 PM Bus Emmiting message on channel: Notification
     7/17/2020 1:46:27 PM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5397&exitcode=0&mac=60:6D:3C:3D:41:97|F8:E4:E3:FA:6D:E8|FA:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:EB&newService&json
     7/17/2020 1:46:28 PM SnapinClient Running snapin DeviceForm - Detachable
     7/17/2020 1:46:28 PM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=60:6D:3C:3D:41:97|F8:E4:E3:FA:6D:E8|FA:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:EB&taskid=5393
     7/17/2020 1:46:28 PM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
     7/17/2020 1:46:28 PM SnapinClient ERROR: Hash does not match
     7/17/2020 1:46:28 PM SnapinClient ERROR: --> Ideal: 817B881A4D07DCFF51F67A47D9595DDE59D010FE0C018BDF5B57C502CB83A14F3085AF48987B392E8BF38E2C6B29D094055BDFFB6BBCD5647697D64AA4BDD668
     7/17/2020 1:46:28 PM SnapinClient ERROR: --> Actual: 5271D1B9F35EA9FC2F1E3CE474C20641A15B37A02035E7E8210DFB513518CBC79FDFD25092A41491C2AB02AF8E0913C99819C6066B9883F398D90C5D045E4A1D
     7/17/2020 1:46:28 PM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5393&exitcode=-1&mac=60:6D:3C:3D:41:97|F8:E4:E3:FA:6D:E8|FA:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:EB&newService&json
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    --------------------------------PrinterManager--------------------------------
    ------------------------------------------------------------------------------
     7/17/2020 1:46:28 PM Client-Info Client Version: 0.11.19
     7/17/2020 1:46:28 PM Client-Info Client OS:      Windows
     7/17/2020 1:46:28 PM Client-Info Server Version: 1.5.8.13
     7/17/2020 1:46:28 PM Middleware::Response Module is disabled globally on the FOG server
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    --------------------------------PowerManagement-------------------------------
    ------------------------------------------------------------------------------
     7/17/2020 1:46:28 PM Client-Info Client Version: 0.11.19
     7/17/2020 1:46:28 PM Client-Info Client OS:      Windows
     7/17/2020 1:46:28 PM Client-Info Server Version: 1.5.8.13
     7/17/2020 1:46:28 PM Middleware::Response Success
     7/17/2020 1:46:28 PM PowerManagement Calculating tasks to unschedule
     7/17/2020 1:46:28 PM PowerManagement Calculating tasks to schedule
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    ----------------------------------UserTracker---------------------------------
    ------------------------------------------------------------------------------
     7/17/2020 1:46:28 PM Client-Info Client Version: 0.11.19
     7/17/2020 1:46:28 PM Client-Info Client OS:      Windows
     7/17/2020 1:46:28 PM Client-Info Server Version: 1.5.8.13
     7/17/2020 1:46:28 PM Middleware::Response Success
     7/17/2020 1:46:28 PM Middleware::Communication URL: http://fogserver/fog/service/usertracking.report.php?action=login&user=QA-TAB-6105\AdlAdmin&mac=60:6D:3C:3D:41:97|F8:E4:E3:FA:6D:E8|FA:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:EB&newService&json
    ------------------------------------------------------------------------------
    
     7/17/2020 1:46:28 PM Service Sleeping for 98 seconds
     7/17/2020 1:48:07 PM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&configure&newService&json
     7/17/2020 1:48:07 PM Middleware::Response Success
     7/17/2020 1:48:07 PM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&mac=60:6D:3C:3D:41:97|F8:E4:E3:FA:6D:E8|FA:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:E7|F8:E4:E3:FA:6D:EB&newService&json
     7/17/2020 1:48:07 PM Middleware::Response Invalid host
     7/17/2020 1:48:07 PM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?clientver&newService&json
     7/17/2020 1:48:07 PM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?newService&json
    
     7/17/2020 1:48:07 PM Service Creating user agent cache
     7/17/2020 1:48:07 PM Middleware::Response Module is disabled on the host
     7/17/2020 1:48:07 PM Middleware::Response Module is disabled globally on the FOG server
     7/17/2020 1:48:07 PM Middleware::Response Module is disabled globally on the FOG server
    
    

    So right after a snapin hash didn’t match suddenly everything is disabled and invalid. (the snapin hash not matching relates to a separate issue that makes no sense when updating snapins via the api that use a snapin file that already exists, but I have a separate failsafe that gets that installed so don’t worry about that right now). This error isn’t unfamiliar to me but I can usually ignore it. I looked at my personal provisioning process logs and there wasn’t anything running at that time as an api call to the fog server or anything like that.

    This lead me to think, wait is this task still active?
    So I checked the gui and lo and behold it was. For some reason I could get to host through the link in the tasks screen but not anywhere else. It said it was pending and needed to be approved…
    It did not show up in the pending hosts and the hostPending field in the database was empty as far as I could tell. Here’s the database host record query for reference (with some private info ‘redacted’)

     select * from hosts where hostname = "qa-tab-6105";
    +--------+-------------+-----------------------------------------------+--------+-----------+--------------+---------------------+---------------------+--------------+-----------+---------------------+-----------------------------------------------------------------------------+------------+------------+------------------+----------------+------------------+----------------+------------+------------+----------+-------------+------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------+--------------+-------------+-------------+
    | hostID | hostName    | hostDesc                                      | hostIP | hostImage | hostBuilding | hostCreateDate      | hostLastDeploy      | hostCreateBy | hostUseAD | hostADDomain        | hostADOU                                                                    | hostADUser | hostADPass | hostADPassLegacy | hostProductKey | hostPrinterLevel | hostKernelArgs | hostKernel | hostDevice | hostInit | hostPending | hostPubKey                                                       | hostSecToken                                                                                                                     | hostSecTime         | hostPingCode | hostExitBios | hostExitEfi | hostEnforce |
    +--------+-------------+-----------------------------------------------+--------+-----------+--------------+---------------------+---------------------+--------------+-----------+---------------------+-----------------------------------------------------------------------------+------------+------------+------------------+----------------+------------------+----------------+------------+------------+----------+-------------+------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------+--------------+-------------+-------------+
    |   1737 | qa-tab-6105 | Created by FOG Reg on July 17, 2020, 12:35 pm |        |        27 |            0 | 2020-07-17 12:35:12 | 2020-07-17 12:41:47 | fog          | 1         | domain.site | OUstring | user  | password  |                  |                |                  |                |            |            |          |             | secretKey | secretKey | 2020-07-17 14:09:29 | 0            |              |             | 1           |
    +--------+-------------+-----------------------------------------------+--------+-----------+--------------+---------------------+---------------------+--------------+-----------+---------------------+-----------------------------------------------------------------------------+------------+------------+------------------+----------------+------------------+----------------+------------+------------+----------+-------------+------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------+--------------+-------------+-------------+
    

    So I guess that this all relates to hosts suddenly being unapproved

    Other Hosts suddenly requiring approval

    During imaging

    Another host that we imaged yesterday is a more standard desktop. It did not lose its mac address but instead suddenly required approval in the gui. It was also giving the ‘invalid host’ error.
    This one appeared in the pending hosts lists, got approved and then continued through its provisioning via the fogservice as normal.

    Randomly (seemingly at least)

    While troubleshooting the surface go scenario, checked the pending hosts in the web gui again and found 3 active/deployed hosts suddenly needing approval.
    Sadly their fog logs didn’t go as far back to be able to see if they perhaps had the same snapin hash error before becoming invalid.

    Other notes

    The only other thing I can think of that might be worth noting is that I did change the display name of the image all of these hosts had associated with them. When each of them had to be approved they also had lost any image associations. Could changing the display name of an image cause such problems. I also overwrote the image file on the server manually but that’s something I’ve also done a million times before. We have 3 images, a dev, stable, and previous. New images are uploaded as dev, then tested, if it’s stable we move the image files to the stable folder and keep a previous version of stable as a fail safe. Then we just add what version of windows 10 is in the image on the display name. i.e. the current one is Base-Stable-1909. Before renaming it it was Base-Stable-1809. The image id and all that stayed the same so this shouldn’t have caused any issue, but that’s the only thing I can think of that we changed before this happened.


  • Testers

    @Sebastian-Roth
    This is what I’m doing to fix the problem via the api, which seems to work most of the time. A couple times either the fogservice or my code caused the host to stop showing up in the api/gui list of hosts but the host would still be in the database. When that happens I have to reset the host encryption fields and the pending field in the database hosts table for each affected host and then restart the fog service on the host and it shows back up again.

    below is powershell code

    $mac = get-activeMacAddress; #function that filters output of get-netadapter to the currently active adapter's mac
    $hostObj = Get-FogHost -macAddr $mac; #function from fogapi module can be found here https://github.com/darksidemilk/FogApi/blob/master/FogApi/Public/Get-FogHost.ps1
    if ($hostObj.pending -ne '0') { 
                                Write-Warning "The host is pending or not explicitly set to not pending in fog, adjust host to be approved";
                                Reset-HostEncryption -fogHost $hostObj; #function from fogapi module that resets the encryption fields of a host see https://github.com/darksidemilk/FogApi/blob/master/FogApi/Public/Reset-HostEncryption.ps1
                                Set-FogOU -hostObj $hostObj -force; #custom internal function, will share applicable contents below
                            }
    

    this is what Set-FogOU does

    function Set-FogOU {
    <#
        .SYNOPSIS
        Set the proper OU on the fog host based on the hostname and group membership
        
        .DESCRIPTION
        Gets the current fog host infomration and then uses get-oustr to get
        the OU it should be in based on group membership and the hostname
        
        .PARAMETER jsonData
        The jsondata string used to set the ou with the fog api
        
        .PARAMETER hostObj
        the hostobject of the foghost gotten from get-foghost
        
        .PARAMETER OUstr
        The optional prewritten fogstr if you want to provide it
        
    #>
        
        [CmdletBinding()]
        param (
            [string]$jsonData,
            $hostObj,
            [string]$OUstr,
            $progressID=1,
            $parentID=-1,
            [switch]$force
        )
    
        process {
            if($null -eq $hostObj) {
                Write-Verbose 'no host given, getting current host...';
                $mac = Get-ActiveMacAddress
                $hostObj = Get-FogHost -macAddr $mac
            }
            Write-Verbose 'getting fog group...';
            $group = Get-FogGroup $hostObj.id; #function from fogapi see https://github.com/darksidemilk/FogApi/blob/master/FogApi/Public/Get-FogGroup.ps1
            $OUstr = Get-OUStr -group $group -compName $hostObj.Name; #function that creates correct OUstring from the group name
                
                $ADPass = ""; #edited for privacy, this is gotten with powershell pscredential encryption methods
                Write-Verbose "OU string is $OUstr, setting to properties of host object, also enabling and enforcing domain join...";
                $hostObj.ADDomain = "domain"; #edited for privacy
                $hostObj.ADUser = "domainUser"; #edited for privacy
                $hostObj.ADPass = $ADPass;
                $hostObj.ADOU = $OUstr;
                $hostObj.useAD = 1;
                $hostObj.enforce = 1;
                $hostObj.pending = '0'; #made it a string instead of an int as it seemed to make a difference
                Write-Verbose 'creating json string from new host ou settings...';
                $jsonData = $hostObj | Select-Object id,ADDomain,ADUser,ADPass,ADOU,useAD,enforce,pending | ConvertTo-Json;
                Update-FogObject -type object -coreObject host -IDofObject $hostObj.id -jsonData $jsonData; #fogapi function to run the update/set api command     
        }
    
    

    Basically just doing a PUT api command on the host object to reset the encryption fields and then another one to set the pending field to '0' and to put all the OU settings back. The json created by the command defining $jsonData looks like this (slightly edited for privacy)

    {
        "id":  "1593",
        "ADDomain":  "domain",
        "ADUser":  "domainUser",
        "ADPass":  "secretPassword",
        "ADOU":  "OU=IT,OU=Workstations,DC=domainName,DC=com",
        "useAD":  1,
        "enforce":  1,
        "pending":  "0"
    }
    

  • Testers

    @Sebastian-Roth I am also getting hosts becoming pending for no apparent reason. My best guess at the moment is that it’s related to the service as it only seems to happen when the service is running. I’d rather like to keep it running.


  • Testers

    @Sebastian-Roth I saw it happen again this morning. In at least one case (I have scenarios where it’s happening that this doesn’t match up with) it seems to happen right after a snapin hash doesn’t match.
    I attached the fog.log section where the host went through Authenticated -> start snapins ->Snapin hash mismatch -> Invalid host

    How my hashes keep getting mismatches is a whole other question because I use the exact same file for every snapin. And said file hasn’t changed in 2 years.

    ------------------------------------------------------------------------------
    --------------------------------Authentication--------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:12:24 AM Client-Info Version: 0.12.0
     8/25/2020 7:12:24 AM Client-Info OS:      Windows
     8/25/2020 7:12:24 AM Middleware::Authentication Waiting for authentication timeout to pass
     8/25/2020 7:12:24 AM Middleware::Communication Download: http://fogserver/fog/management/other/ssl/srvpublic.crt
     8/25/2020 7:12:24 AM Data::RSA FOG Server CA cert found
     8/25/2020 7:12:24 AM Middleware::Authentication Cert OK
     8/25/2020 7:12:24 AM Middleware::Communication POST URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&authorize&newService
     8/25/2020 7:12:24 AM Middleware::Response Success
     8/25/2020 7:12:24 AM Middleware::Authentication Authenticated
    
    
     8/25/2020 7:12:24 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&configure&newService&json
     8/25/2020 7:12:24 AM Middleware::Response Success
     8/25/2020 7:12:24 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
     8/25/2020 7:12:25 AM Middleware::Response Success
     8/25/2020 7:12:25 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?clientver&newService&json
     8/25/2020 7:12:25 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?newService&json
    
     8/25/2020 7:12:25 AM Service Creating user agent cache
     8/25/2020 7:12:25 AM Middleware::Response Module is disabled globally on the FOG server
     8/25/2020 7:12:25 AM Middleware::Response Module is disabled globally on the FOG server
     8/25/2020 7:12:25 AM Middleware::Response Module is disabled globally on the FOG server
     8/25/2020 7:12:25 AM Service Initializing modules
    
    ------------------------------------------------------------------------------
    ---------------------------------ClientUpdater--------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:12:25 AM Client-Info Client OS:      Windows
     8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:12:25 AM Middleware::Response Success
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    ----------------------------------TaskReboot----------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:12:25 AM Client-Info Client OS:      Windows
     8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:12:25 AM Middleware::Response Success
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    --------------------------------HostnameChanger-------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:12:25 AM Client-Info Client OS:      Windows
     8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:12:25 AM Middleware::Response Success
     8/25/2020 7:12:25 AM HostnameChanger Checking Hostname
     8/25/2020 7:12:25 AM HostnameChanger Hostname is correct
     8/25/2020 7:12:25 AM HostnameChanger Attempting to join domain
     8/25/2020 7:12:25 AM HostnameChanger Host already joined to target domain
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    ---------------------------------SnapinClient---------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:12:25 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:12:25 AM Client-Info Client OS:      Windows
     8/25/2020 7:12:25 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:12:25 AM Middleware::Response Success
     8/25/2020 7:12:25 AM SnapinClient Running snapin DeviceForm - Desktop
     8/25/2020 7:12:25 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5462
     8/25/2020 7:12:26 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
     8/25/2020 7:12:26 AM Bus Emmiting message on channel: Notification
     8/25/2020 7:12:26 AM SnapinClient Starting snapin
     8/25/2020 7:13:00 AM SnapinClient Snapin finished
     8/25/2020 7:13:00 AM SnapinClient Return Code: 0
     8/25/2020 7:13:00 AM Bus Emmiting message on channel: Notification
     8/25/2020 7:13:00 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5462&exitcode=0&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
     8/25/2020 7:13:00 AM SnapinClient Running snapin mitel-pm
     8/25/2020 7:13:00 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5475
     8/25/2020 7:13:01 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
     8/25/2020 7:13:01 AM Bus Emmiting message on channel: Notification
     8/25/2020 7:13:01 AM SnapinClient Starting snapin
     8/25/2020 7:14:35 AM SnapinClient Snapin finished
     8/25/2020 7:14:35 AM SnapinClient Return Code: 0
     8/25/2020 7:14:35 AM Bus Emmiting message on channel: Notification
     8/25/2020 7:14:35 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5475&exitcode=0&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
     8/25/2020 7:14:35 AM SnapinClient Running snapin office2010-Std
     8/25/2020 7:14:35 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5476
     8/25/2020 7:14:36 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
     8/25/2020 7:14:36 AM Bus Emmiting message on channel: Notification
     8/25/2020 7:14:36 AM SnapinClient Starting snapin
     8/25/2020 7:22:15 AM SnapinClient Snapin finished
     8/25/2020 7:22:15 AM SnapinClient Return Code: 0
     8/25/2020 7:22:15 AM Bus Emmiting message on channel: Notification
     8/25/2020 7:22:15 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5476&exitcode=0&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
     8/25/2020 7:22:15 AM SnapinClient Running snapin outlook2016
     8/25/2020 7:22:15 AM Middleware::Communication Download: http://192.168.100.117//fog/service/snapins.file.php?mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&taskid=5477
     8/25/2020 7:22:16 AM SnapinClient C:\Program Files (x86)\FOG\tmp\chocoPkgSnapin.ps1
     8/25/2020 7:22:16 AM SnapinClient ERROR: Hash does not match
     8/25/2020 7:22:16 AM SnapinClient ERROR: --> Ideal: 817B881A4D07DCFF51F67A47D9595DDE59D010FE0C018BDF5B57C502CB83A14F3085AF48987B392E8BF38E2C6B29D094055BDFFB6BBCD5647697D64AA4BDD668
     8/25/2020 7:22:16 AM SnapinClient ERROR: --> Actual: 5271D1B9F35EA9FC2F1E3CE474C20641A15B37A02035E7E8210DFB513518CBC79FDFD25092A41491C2AB02AF8E0913C99819C6066B9883F398D90C5D045E4A1D
     8/25/2020 7:22:16 AM Middleware::Communication URL: http://fogserver/fog/service/snapins.checkin.php?taskid=5477&exitcode=-1&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    --------------------------------PrinterManager--------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:22:16 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:22:16 AM Client-Info Client OS:      Windows
     8/25/2020 7:22:16 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:22:16 AM Middleware::Response Module is disabled globally on the FOG server
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    --------------------------------PowerManagement-------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:22:16 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:22:16 AM Client-Info Client OS:      Windows
     8/25/2020 7:22:16 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:22:16 AM Middleware::Response Success
     8/25/2020 7:22:16 AM PowerManagement Calculating tasks to unschedule
     8/25/2020 7:22:16 AM PowerManagement Calculating tasks to schedule
    ------------------------------------------------------------------------------
    
    
    ------------------------------------------------------------------------------
    ----------------------------------UserTracker---------------------------------
    ------------------------------------------------------------------------------
     8/25/2020 7:22:16 AM Client-Info Client Version: 0.12.0
     8/25/2020 7:22:16 AM Client-Info Client OS:      Windows
     8/25/2020 7:22:16 AM Client-Info Server Version: 1.5.9-RC2
     8/25/2020 7:22:16 AM Middleware::Response Success
     8/25/2020 7:22:16 AM Middleware::Communication URL: http://fogserver/fog/service/usertracking.report.php?action=login&user=SHP-WKS-6096\AdlAdmin&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
    ------------------------------------------------------------------------------
    
     8/25/2020 7:22:16 AM Service Sleeping for 94 seconds
     8/25/2020 7:23:50 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&configure&newService&json
     8/25/2020 7:23:50 AM Middleware::Response Success
     8/25/2020 7:23:50 AM Middleware::Communication URL: http://fogserver/fog/management/index.php?sub=requestClientInfo&mac=9C:7B:EF:C3:90:84||00:00:00:00:00:00:00:E0&newService&json
     8/25/2020 7:23:50 AM Middleware::Response Invalid host
     8/25/2020 7:23:50 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?clientver&newService&json
     8/25/2020 7:23:51 AM Middleware::Communication URL: http://fogserver/fog/service/getversion.php?newService&json
    
     8/25/2020 7:23:51 AM Service Creating user agent cache
     8/25/2020 7:23:51 AM Middleware::Response Module is disabled globally on the FOG server
     8/25/2020 7:23:51 AM Middleware::Response Module is disabled globally on the FOG server
     8/25/2020 7:23:51 AM Middleware::Response Module is disabled globally on the FOG server
    
    

  • Senior Developer

    @JJ-Fullmer Thanks heaps for the details and all your work concerning this issue. I will have a bit of time in the next days to look into this and get back to you.


  • Testers

    @Sebastian-Roth I found that there was some need for database cleanup as I had some invalid entries. I’ll paste in my internal notes on what I did. Sadly I am still getting random pending hosts. I think that maybe something in the new version of the fog client, 0.12.0 I believe, is making it so some hosts that may have previously had some sort of invalid data in the database making them not viewable are becoming pending and then fixed. Though some of them were valid beforehand. So I still don’t know the cause of the behavior but I at least made some api commands to workaround the issue when it happens during deployment. I Also found that there were int, string, and null data types in my hostPending field. i.e. some hosts had a blank/null field, some matched a where hostPending = '1' some match hostPending = 1 some matched hostPending = 0 s and some matched hostPending = '0'
    After testing an approval of a host in the gui, I manually set all hosts in the database to hostPending = '0' to add universal. I tried to find the code in the fog project where hosts are set to pending/approved but don’t think I found every place. It did look like when they are added via pxe (the method I usually use) it sets the pending to $null. But I was having trouble following it all perfectly. I’d be interested in seeing everything that determines whether a host should become pending or not if you’re willing to help me find all the code where it happens.

    in the actual database there are 26 more entries than what we see in the gui or api.
    The hostPending field in the database seems to have multiple ways of being set. some are 0 some are ‘0’ some are 1 and some are ‘1’
    There must be something else that makes it show up or not in the gui or api as pending or at all.
    Currenty seeing 7 hosts with pending of ‘1’ but only one of them shows up in pending hosts/ in the gui/api at all.

    MariaDB [fog]> select hostID,hostname,hostPending,hostCreateDate,hostLastDeploy,hostImage from hosts where hostPending in ('1');
    +--------+-----------------+-------------+---------------------+---------------------+-----------+
    | hostID | hostname        | hostPending | hostCreateDate      | hostLastDeploy      | hostImage |
    +--------+-----------------+-------------+---------------------+---------------------+-----------+
    |   1519 | DESKTOP-G4VN9T5 | 1           | 2016-11-11 10:22:53 | 0000-00-00 00:00:00 |         0 |
    |   1710 | ARROWHE-AS1BQVC | 1           | 2019-11-14 15:51:54 | 0000-00-00 00:00:00 |         0 |
    |   1719 | SCH-Wks-6077    | 1           | 2020-08-04 21:10:17 | 0000-00-00 00:00:00 |         0 |
    |   1715 | mss-dcamTmill2  | 1           | 2020-08-19 07:44:06 | 0000-00-00 00:00:00 |         0 |
    |   1729 | MD-Bcode-Test   | 1           | 2020-08-11 20:18:17 | 0000-00-00 00:00:00 |         0 |
    |   1721 | MSS-LTS-6080    | 1           | 2020-08-07 09:11:26 | 0000-00-00 00:00:00 |         0 |
    |   1725 | imp-wks-6085    | 1           | 2020-08-20 08:51:21 | 0000-00-00 00:00:00 |         0 |
    +--------+-----------------+-------------+---------------------+---------------------+-----------+
    7 rows in set (0.00 sec) 
    

    ran

    update hosts set hostPending = '0'; 
    

    Which updated all 286 rows, but no change on the api number of hosts but the pending hosts list link has now disappeared on the gui. Going to set all hosts that have no image id set to the stable image and see if that changes anything

    Ran

    update hosts set hostImage = 27 where hostImage != 29; 
    

    to set all hosts that aren’t set to the dev image to the stable image

    The api is still showing 257 hosts, the database shows 283 rows in the hosts table, and then in the fog gui there are 274 members of the everyone group that was just checked and found to indeed have all possible hosts in it.

    did some clean up on groups that are no longer used and deleted hosts within that couldn’t be brought up in the host view. Then emptied the everyone group and re-added all available hosts. The everyone group count and api group count are now the same but the database is still showing 283 hosts.

    ran these commands to clean up the database of stale records

    DELETE FROM `hosts` WHERE `hostID` NOT IN (SELECT `hmhostID` FROM `hostMAC`);
    DELETE FROM `hostMAC` WHERE `hmhostID` NOT IN (SELECT `hostID` FROM `hosts`);
    DELETE FROM `snapinAssoc` WHERE `saHostID` NOT IN (SELECT `hostID` FROM `hosts`);
    DELETE FROM `groupMembers` WHERE `gmHostID` NOT IN (SELECT `hostID` FROM `hosts`); 
    

    Got the code from this post when searching for forum posts concerning having a hostID of 0 https://forums.fogproject.org/topic/8359/fog-registration-error-hostname-with-that-name-already-exists/20

    Database now shows 265 entries and api shows 256

    used some set-clipboard and -join “,” magic to create a csv list to paste into the mariadb console and search for all hosts that are not being found in the api. Found that one of the hosts is my tablet and found that the fog log showed it as ‘not encrypted’. ran the following to do a database level client encryption reset on the 9 missing hosts

    update hosts set hostsectoken="",hostpubkey="",hostsectime ="0000-00-00 00:00:00" where hostid not in (83,1645,1657,1500,1639,138,78,77,1596,1647,1469,1531,1604,1513,1605,1634,147,1543,1467,1607,1606,1466,1512,1490,1495,1631,109,1665,101,1534,1533,1659,1562,1660,1633,1628,154,1541,16,1552,82,1551,1706,1708,1714,151,156,152,157,150,1470,1574,1062,107,1679,76,72,85,137,1640,1520,86,1581,80,1540,1594,1676,130,1501,1526,1535,1545,1559,1590,1616,1627,1626,1527,1601,1686,1693,91,1644,1672,358,141,100,1553,302,1477,1479,1667,1646,93,19,92,1661,1670,155,1489,1664,142,124,1692,127,1554,1673,1724,1725,1598,24,1690,1578,1585,1705,1663,136,1656,1701,1577,1557,1736,1593,1674,103,22,1503,66,95,1652,1472,1635,133,1458,110,96,1532,1516,1678,1641,1536,1544,1481,1480,1658,111,395,1564,1563,1702,1599,1675,1682,1697,1698,1515,1671,1681,1685,1666,1560,1468,1695,1696,1720,1704,1632,1456,1497,1680,1653,1649,105,1619,1482,34,1618,140,131,25,1546,69,1688,1474,1473,1476,1475,1683,1700,1699,1711,1712,1713,1707,1735,1737,1738,1691,1517,448,1638,1614,280,1662,108,13,1502,1493,1576,21,31,1491,1630,128,1580,94,1494,1492,1629,29,1556,1510,1498,119,1558,1740,1588,1643,1496,1642,116,1612,37,1620,1739,1621,1624,129,1623,126,1622,1625,98,1689,1694,153,1709,1597,1548,1655,1684,1488,1611,90,1668,88); 
    

    there were 2 unknown hosts in the list of 9 issue hosts

    +--------+-----------------+-------------+
    | hostid | hostname        | hostPending |
    +--------+-----------------+-------------+
    |   1519 | DESKTOP-G4VN9T5 | 0           |
    |   1710 | ARROWHE-AS1BQVC | 0           |
    |   1723 | imp-wks-6081    | 0           |
    |   1719 | SCH-Wks-6077    | 0           |
    |   1703 | IT-Tab-6026     | 1           |
    |   1729 | MD-Bcode-Test   | 0           |
    |   1718 | MSS-Lap-6076    | 0           |
    |   1728 | ops-cb-6087     | 0           |
    |   1721 | MSS-LTS-6080    | 0           |
    +--------+-----------------+-------------+ 
    

    So ran these sql commands to remove them from the database completely.

    DELETE FROM `hosts` WHERE `hostID` IN (1519,1710);
    DELETE FROM `hostMAC` WHERE `hmhostID` IN (1519,1710);
    DELETE FROM `snapinAssoc` WHERE `saHostID` IN (1519,1710);
    DELETE FROM `groupMembers` WHERE `gmHostID` IN (1519,1710);  
    

  • Testers

    @Sebastian-Roth It’s been a while I realize but it’s started happening again. I think I found the issue in an internal script that was causing part of the issue during imaging but I fixed that. (Had to do with approving pending macs automatically after imaging a computer). But now I am still getting hosts that are in production randomly becoming pending hosts.


  • Testers

    @Sebastian-Roth I updated and imaged a computer. It didn’t happen during imaging again, yay!

    However, I just saw another host that was imaged back in February come up as pending for no apparent reason. It still has its image history, inventory, membership, snapin history, login history, etc. It is missing its assigned image (it was assigned before the image’s display name was changed) and it is missing its active directory OU settings. I have not taken it out of pending in case there’s something more to look at for debugging purposes. I’m working remotely at the moment and it seems the computer is turned off, otherwise I’d grab the fog.log in case there’s something in there that hasn’t been truncated that might help.


  • Testers

    @Sebastian-Roth I updated to the RC2 from the master branch. Gonna be imaging a machine (well a virtual machine) shortly. I’ll let you know how it goes


  • Testers

    This post is deleted!

  • Testers

    @Sebastian-Roth I will give that a go and see if it that eradicates the issue. I haven’t seen it happen since I fixed all the snapin hashes and updated the client, but I’ve also only imaged one computer since then.


  • Senior Developer

    @JJ-Fullmer Would you mind updating to the very latest RC version. While I am not sure I do remember having looked into a similar sounding issue and fixed that only a few weeks ago.


  • Testers

    So in my second test to see if this happened again it didn’t happen again. The second surface go 2 I’m imaging got past the part the other one magically became pending.

    So there were 2 differences one or both of which fixed the problem

    • Manually Re-uploaded the file for the snapin that had a mismatch hash error
    • Fog Client 0.12.0 was used instead of 0.11.19

    I need to image another different device before I can be really sure.

    So now I’m wondering why the new 0.12.0 fog client isn’t being auto updated to on all my clients through the client updater? As I recall from past updates, simply updating the installer on the fog server side should get all the clients to update. Am I forgetting a step? @Sebastian-Roth maybe you have some insights.
    I can deploy it through my own software deployment tools if I must.


  • Testers

    @JJ-Fullmer I just relooked at the database output and noted that the hostPending field was set to ‘1’. So it did at least say in the database that the host became pending, but it still wasn’t showing in the gui outside of the existing tasks.
    I’m currently testing to see if I can recreate the issue or if having fixed the snapin with the hash error solves the issue.



383
Online

7.4k
Users

14.5k
Topics

136.7k
Posts