SOLVED FOG Client stops reporting & working

  • Server
    • FOG Version: 1.3.0 RC-14
    • OS: Fedora 21
  • Client
    • Service Version: 0.11.5
    • OS: Win10 LTSB x64

    I noticed this issue because our computers stopped shutting down overnight. Looking into it, I found that the log file for my computer hasn’t been written to since the 19th, 5 days ago. This started happening when we upgraded from RC-10 to RC-14. Yes, that is a big jump, but I was away from work for a month and nobody upgraded while I was gone. From the logs, it looks like my computer has been on since the 19th of this month. Attached are the log and some screenshots.

    Looking at the fog user log, it looks like the display manager is at fault.

  • Senior Developer

    @michael_f @Wayne-Workman @Hanz a patch has been developed and verified for this issue. It will be released with 0.11.6 (along with a few other new features) once release candidate testing finishes for the client.

  • Senior Developer

    @michael_f A Windows crash would explain it (especially if the disk was in the middle of a write request). I’ll push out a patch to try to address this issue.

  • @Wayne-Workman Maybe I’ve got the clue. I too have a blanked settings.json file.

    The Windows event log shows an unexpected shutdown at 14:17:56.

    Windows Explorer shows that settings.json was last changed at the same time.

    So I guess the file was open for an update when Windows crashed.
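
    The failure mode guessed at above (a crash while the file is open for writing) is commonly avoided by writing to a temporary file and renaming it over the original. The following is only a minimal Python sketch of that general technique, not the FOG client’s actual code; the function name and the sample `ServerAddress` value are hypothetical.

```python
import json
import os
import tempfile


def write_settings_atomically(path, settings):
    """Write settings as JSON without ever truncating the existing file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final
    # rename stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(settings, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed: drop the temp file; the original is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

    With this pattern a crash in the middle of an update leaves either the old settings.json or the new one on disk, plus at worst a stray `.tmp` file.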

  • It’s all smoke and mirrors until someone finds a better clue.

  • @Hanz I still disagree. Seeing as the images were updated then deployed I still lean towards .NET being the culprit, in what way I have no idea.

  • @Wayne-Workman Then, without updates, it seems like that would rule out the .NET framework updates… I will add that I only noticed this after changing from RC-23 to Tom’s working RC_24 candidate. After an uninstall/reinstall the .json file was correct, then a reboot wiped it. So far, after reverting to RC_23, the client doesn’t seem to exhibit this behavior after several reboots.

  • @Hanz said in FOG Client stops reporting & working:

    isn’t it all unfortunately as I stated before, but I think most of the above is due to empty .json file.

    This is absolutely the same bug I and one other have experienced. The server address cannot be found because it is lost from the settings.json file, because the settings.json file gets completely wiped out.

    The only difference? I don’t have WSUS here, and these computers in this building do not automatically update. We make a new image every summer for every model, and image everything every summer - so because of this we don’t worry about updates during the school year.

  • @Joe-Schmitt Event Viewer. Unfortunately I lost that log after uninstalling and reinstalling the client…

    Here is the client log after I reset the token and uninstalled/reinstalled the client:
    11/16/2016 8:44 AM Main Overriding exception handling
    11/16/2016 8:44 AM Main Bootstrapping Zazzles
    11/16/2016 8:44 AM Controller Initialize
    11/16/2016 8:44 AM Zazzles Creating main thread
    11/16/2016 8:44 AM Zazzles Service construction complete
    11/16/2016 8:44 AM Controller Start
    11/16/2016 8:44 AM Service Starting service
    11/16/2016 8:44 AM Middleware::Configuration ERROR: Invalid parameters
    11/16/2016 8:44 AM Service ERROR: ServerAddress not found! Exiting.
    11/16/2016 8:52 AM Main Overriding exception handling
    11/16/2016 8:52 AM Main Bootstrapping Zazzles
    11/16/2016 8:52 AM Controller Initialize
    11/16/2016 8:52 AM Zazzles Creating main thread
    11/16/2016 8:52 AM Zazzles Service construction complete
    11/16/2016 8:52 AM Controller Start
    11/16/2016 8:52 AM Service Starting service
    11/16/2016 8:52 AM Middleware::Configuration ERROR: Invalid parameters
    11/16/2016 8:52 AM Service ERROR: ServerAddress not found! Exiting.

    This isn’t all of it, unfortunately, as I stated before, but I think most of the above is due to the empty .json file.

    All the “Chinese”/gibberish appeared during the authentication portion, when the client first starts up…
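
    To find affected machines, a collected client log can be scanned for the same error. This is a rough Python sketch; the line layout (date, time, AM/PM, component, message) is inferred only from the excerpt pasted above, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: "<M/D/YYYY> <H:MM> <AM|PM> <component> <message>"
LINE = re.compile(r"^\s*(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2} [AP]M) (\S+) (.*)$")


def find_errors(log_text):
    """Return (timestamp, component, message) for every ERROR line."""
    errors = []
    for raw in log_text.splitlines():
        match = LINE.match(raw)
        if match is None:
            continue  # blank line or a layout this sketch does not know
        stamp, component, message = match.groups()
        if message.startswith("ERROR:"):
            when = datetime.strptime(stamp, "%m/%d/%Y %I:%M %p")
            errors.append((when, component, message))
    return errors


def server_address_lost(log_text):
    """True if the log contains the 'ServerAddress not found' error."""
    return any("ServerAddress not found" in msg
               for _, _, msg in find_errors(log_text))
```

    A hit from `server_address_lost` suggests the machine’s settings.json is worth checking.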

  • Senior Developer

    @Hanz Where did that error log come from? Event Viewer? If it’s from the client’s log, please upload the entire log file.

  • @Wayne-Workman I also had this occur, oddly enough after running a snapin that forces the client to update via WSUS. I recently added .NET 4.6.1 to WSUS.

    Yup. Here’s the Event Viewer entry from that client:
    The performance counter name string value in the registry is not formatted correctly. The malformed string is 뢭傞碻�鬋쏄军⍮휒㾲籍봯禵ᦂ瞣㛂⯦⿟闚椂↛抋篾귩恫嗋᜸⸑蓣톽�쾐൐⨨莐潋㌍胡빌폍伤밧縒�鬋웘趲ප᥀巽뼉랥躎衙┱禌㾬됯㏨궒外㈦慙ﭯ潘ﯦ랢ꤎ懨䄩賩ꗼ�唹훐⩦�ẏ⾟疽褎⌼凴䞛枢ﴙᚅ옝햌Ш↊ᱰ貼Ӗ醑᥆㔥㎦Ὢ苻☍▩臄ꮨ僭烻�﫿釶⧐첨륗ꃕ⽷㢭�ᬓ욄䤘똎ꖷ뜕뷲훴洬勳꓅뛰“ⵏ敺铨苌哅쇇寒楣ᗳ땦샄쿆ﳸᚅᄽ鱔�⬈잃䯧浯厃ݖ�麳䋺꺋릎揍쟏ކꞯ봮蒵慻ኋ풭聕慻ꛓ젭ฦ崘ᆶ໰㾾ّ腳㗢⧹籛䖩蹯ඝ䎈﨨旑䮋굯䗙阪ꍢ佦빏桟᧑禛捲굃쥏�쑚胬�좮圠ߺ쭷䳭瓕㐕꺯墽䣣씳鳎綸ø쿹淗팅嫵䛕髠텊떪㠷醛ୟꯊ핚隂➼㋳쏔晶럤ﷲ顝༷㛲ㇿ灳ᰲ䷯矂퉷₾�䎡똪胉ᜬ馿ꗶ㏏켗ꔙ柄⽱៟ꉬ蟊贘톙鷤留閸ꗯ䀡膒릹㣩﫛㮃廹죘쓈⛛曬夐뙶빖摷帛룒䚄�됯㍍땫ᢦ戼㑦灰ﻻ�㻩䌲㔔뎤틕慻쪃鈐�㝚ิ뿮⯷浨リ齃鹸⿧ྵ᩽抃똭�ﹹ辗켟䛶渊㝛躴㙎匔ꋅ့畹⅟䅆汐攳섡증뿋蕖멥ꉘᏎ휻︇栿媼楔ꜣ転亥ി医�⿗뽟﷼꼆鹣㣎ᶉꌾ轭끿뙲師߿꼔檑ꢖꪶ羚緓晴埻욽得⡇넃¶鲱鶤᯾ﱘ뤗鈋裂煉⒑貵뎸כּ뉗욻푍붆杍촛깺㛯揝뮳ꄠꍢ饋㛎䳅輧뭩롫㳄䪄☙c豤ᖳ澷督㉁⶝る鞀멒곮綦묭驼浡⩐䘣᧣㿷묶䠗쩆ꎄл㻶ﻟ뉃ڻ耼萐葽쉅탺⻬顈譍钕Ć펰�员䰙⊰벹꨷槚糈줣搗끘贵�㇒ꮃ壕║㫇럐瓥摮˷猉ત榢懺뷵쏝銻㗝舂㊤๶咧썛嶳붯�अ劵庛╘杼ꩥ�㒽죉褓瀍䏨脖죜׮볚⒉案랴´뉗槻ʏ䑈㮜㍯쩏齣ௌ᪒嫖ᵤ솭挖錝쯝໰嚡燀⶚ꪌ뙼ྥ눷腻斤틠ᩆꊥ�껙⮇巙紼糭현䨙ᩰ涏஻⠒츞掕幪ᒻ찄晅⿨Ῐ�₼ⷩ�봔揗䑦౫덇쟞쨘춣ힼ⛏룛�䤅◩ᑐ宺ꦝ긵睤�薨䭆딐ﰶ췘ք唀劯텆辑䳙㉘骓篙帯㦛菇䑧ላ鳂ʢ͉糨賐൨僔蔱㇢䴝쎄㝣ᘓ찤䯕漣枾拂泇갦᧱な驒懑磷嫬苕➄뤦浧뷚ẋ嚳蚵ꡥ塅₌鈝㾒⁞껡ᤦ䝅握贚夑䁱ᣏꖶ込短␗咎꫁䢖랣䱾뮿愆衰↛赣刬㬞幦沐躻㔨쌷챴禘붍곈樝⠍툚띏瘶␗棐�ⶑ淯↮䤗邷骪톕힪뛇ם飉簯찍끑蕋�᫮遱ꉆᱭⓂþ们虈뛩䧏씂�襾焢ꆪꔔ썍괽寊氥瞭京�ਖ਼쟴྇茶�翯竽岽䡽ꢎ樳䓰砠શ뗳베宪㚷鹲懹戲�꥙囹鞾㭚ଫꈍ쵓╏系跿塀㋐봏뎲揝�붨䅃�㙥ῼ. The first DWORD in the Data section contains the index value to the malformed string while the second and third DWORDs in the Data section contain the last valid index values.

    I got a bunch of “Chinese” in my client log when this occurred.

    I also had another warning saying something about the registry being open and in use.

    Hope it helps guys.

  • Ever since deploying the debug 0.11.5 client that Joe gave me, the issue has not reappeared. It’s been about 20 or so days, which I think is plenty of time. It could be a bug in the client, but at this point, being unable to reproduce it, I think it’s a non-issue.

    For all those that experience the settings.json file going blank, just uninstall and reinstall the FOG client. You might do this manually or with a very carefully crafted startup script via GPO that checks for the presence of a dummy file to indicate whether it should uninstall/reinstall or not.
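
    The decision part of such a startup script could look roughly like the Python sketch below. It makes two assumptions that are not from this thread: the marker (dummy) file is taken to mean “a reinstall has already been done on this machine”, and a settings file counts as blank if it is missing, empty, or not a non-empty JSON object. Both paths are supplied by the caller, and the uninstall/reinstall step itself is left out.

```python
import json
import os


def settings_look_blank(settings_path):
    """Return True if the file is missing, empty, or not a JSON object."""
    try:
        with open(settings_path, "r", encoding="utf-8-sig") as handle:
            text = handle.read()
    except (OSError, UnicodeDecodeError):
        return True
    if not text.strip():
        return True
    try:
        data = json.loads(text)
    except ValueError:  # includes json.JSONDecodeError
        return True
    return not isinstance(data, dict) or not data


def should_reinstall(settings_path, marker_path):
    """Reinstall only if settings are blank and no marker file exists yet."""
    if os.path.exists(marker_path):
        return False  # assumed meaning: this machine was already repaired
    return settings_look_blank(settings_path)
```

    The script would create the marker file after a successful reinstall so that the repair does not repeat on every boot.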

  • @Tom-Elliott Thanks for the explanation. I am now in the process of upgrading all four fog instances I manage to trunk. This way I might be helpful to advance this further. Keep it rolling!

  • @jhuebner 0.11.0 was NOT by any means the version that shipped with FOG 1.2.0.

    The reason a file might become corrupted in the new client, particularly with 0.11 on 1.2.0 (and potentially others), is that the client gets its configuration information from the server. This configuration information has the potential to change each cycle, but to ensure the client operates while a server may be unreachable, we update the information internally. If this data is corrupted the client usually cleans itself up to ensure something doesn’t go wrong, but if it can’t do that and the file is “open”, there is a window in which the file can become corrupted.

    We haven’t figured out WHAT is causing this and have been working to try to replicate the problem. To me, it seems like a Windows update, but this is only my opinion.
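
    One generic way to survive a corrupted local settings cache is to keep a last-known-good copy and fall back to it. The Python sketch below only illustrates that idea under assumed names (`load_settings`, a caller-chosen backup path); it does not describe how the FOG client actually handles its settings.

```python
import json
import os
import shutil


def load_settings(path, backup_path):
    """Load JSON settings, falling back to a last-known-good backup."""
    for candidate in (path, backup_path):
        try:
            with open(candidate, "r", encoding="utf-8") as handle:
                data = json.load(handle)
        except (OSError, ValueError):
            continue  # missing, unreadable, or corrupt: try the next copy
        if isinstance(data, dict) and data:
            if candidate == path:
                # Main file is healthy: refresh the backup copy.
                shutil.copyfile(path, backup_path)
            else:
                # Main file was bad: restore it from the backup.
                shutil.copyfile(backup_path, path)
            return data
    return None  # neither copy is usable
```

    A return value of `None` means both copies are unusable, which matches the case in this thread where only a reinstall helps.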

  • Thanks for your quick reply.

    I was using the Agent 0.11.0 with 1.2.0; I think that’s the one that came with 1.2.0. I am of the opinion that, whatever the version, nothing should kill the settings file so that the agent/zazzles stops working. I am in the good position that our production environment is more of a lab-style setup (university campus), so the world does not end when the agent is not working. I just have to do more manual work, which I of course try to avoid.

    Besides that: I really like the project and appreciate the work you put into this!

  • @jhuebner The new client never worked with version 1.2.0 of FOG. The new client was ONLY introduced during the trunk builds. It’s no wonder they got corrupted; they had nothing to work from if you really did have the new client and FOG 1.2.0.

  • I don’t know if this is related:

    I had an issue where the content of settings.json got corrupted on half of the computers and I had to reinstall the agent manually. I don’t know what killed the contents of the file. That happened with the stable version of FOG (1.2.0). After reinstalling the agent I upgraded to trunk and am now waiting to see what happens next.

  • @Wayne-Workman I’m only giving the information as I’m seeing it.

    This appears to have started in late July, early August, correct? Which would’ve been around the time (potentially) an image was being updated? I don’t know.

    I am still leaning towards some Windows update, particularly a Windows update for the .NET Framework.

    These are the only guesses I have at this point, and with the “debug client” we should’ve seen something by now if it was server related. I suppose there could still be an issue in the client, but considering the running client has probably surpassed the number of cycles it took for the issue to appear, I would imagine this is not the issue either.

  • @Tom-Elliott Between our summer deployment and now, we’ve run zero Windows updates. We can’t risk downtime, Windows updates are not trustworthy enough, and our summer downtime is too large a window not to use for deploying an updated image.

  • I strongly suspect this is related to a Windows update, particularly one for the .NET framework.