FOG 1.5.6: Windows unattended.xml is intermittently failing to work



  • I am encountering an issue where FOG 1.5.6 is not fully overwriting the beginning of a target drive on deploy.

    I’m not entirely sure what’s going on; all I know is that sometimes my Windows 10 deployments fail with strange errors.

    To fix these machines, I have to boot a Linux rescue CD and use dd to zero the first 100MB of the target disk; after that, deployments work without issue. (Edit: that was incorrect. It’s just intermittently failing, and wiping the disk has no effect.)

    So FOG is leaving junk behind on the drives that is interfering with Windows 10 somehow. It would be nice to have a deployment option to zero a portion of the drive (I like to nuke the first 100MB, but making this flexible would be great).
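
The wipe described in this post can be sketched in a few lines of Python. This is only an illustration, assuming it runs as root from a rescue environment; `zero_head` and its `mib` parameter are hypothetical names and not part of FOG.

```python
import os

MIB = 1024 * 1024


def zero_head(path, mib=100):
    """Overwrite the first `mib` MiB of `path` with zeros, in place.

    `path` may be a block device (for example /dev/sda, an assumed name)
    or a regular file used for testing. Returns the bytes written.
    """
    chunk = bytes(MIB)  # one MiB of zero bytes
    # "r+b" writes in place and never truncates the target.
    with open(path, "r+b") as target:
        for _ in range(mib):
            target.write(chunk)
        target.flush()
        os.fsync(target.fileno())  # push the zeros out to the disk
    return mib * MIB
```

Calling `zero_head("/dev/sda")` would do roughly what `dd if=/dev/zero of=/dev/sda bs=1M count=100` does. It destroys the partition table, so the device name needs checking first.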



  • I’m back! New results!

    I prepared my same image again, just to be sure it’s ship-shape. I captured the image with ‘Multiple Partition Image - Single Disk (Not Resizable)’ setting on the image.

    4 out of 4 deploys worked flawlessly. Think we can call this case closed. Resizing NTFS partitions with the Linux-based tools is dubious and is causing intermittent issues with UEFI systems.

    Why this doesn’t affect legacy deployments, you got me. Ask Microsoft? I’m just going to use a PowerShell script to have Windows do my resize operations after setup completes.

    Thanks to all who helped me track this issue down to its cause. Back to work, for me!

    For developers looking to remedy this issue, I’d look into this error message (which I’m not getting anymore with that no-resize setting!!):

    The protective MBR's 0xEE partition is oversized!  Auto-repairing.
    

    I’d bet money this has something to do with it breaking on UEFI systems.
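
For anyone chasing that message: a protective MBR is the single type 0xEE entry in sector 0 that covers a GPT disk. The Python sketch below shows one plausible reading of "oversized", namely that the 0xEE entry claims more sectors than the disk has. The function name and the exact rule are assumptions for illustration; they are not taken from the tool that prints the message.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446  # the four 16-byte MBR partition entries start here


def protective_mbr_status(sector0, disk_sectors):
    """Classify the 0xEE entry found in the first sector of a disk.

    `sector0` is the raw first sector, `disk_sectors` the disk size in
    sectors. The "oversized" rule is an assumption, see the text above.
    """
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        return "no valid MBR signature"
    for index in range(4):
        start = TABLE_OFFSET + 16 * index
        entry = sector0[start:start + 16]
        partition_type = entry[4]
        # Start LBA and sector count are little-endian 32-bit values.
        first_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if partition_type == 0xEE:
            if first_lba + sector_count > disk_sectors:
                return "oversized"
            return "ok"
    return "no 0xEE partition"
```

Comparing the result on a source disk before capture and on a target after deploy might show whether the resize step changes this entry.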


  • Moderator

    @Cheetah2003 I didn’t look for my script from a few years ago, but the basic concept is here: https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/diskpart-scripts-and-examples You run diskpart with the /s parameter and give it a text file containing the extend commands.

    And this one shows you the extend commands: https://www.cloudsigma.com/windows-partition-auto-expand-script/

    No fancy PowerShell commands are needed, and you don’t have to deal with PowerShell execution security.

    I think with these two tests we’ll have a better idea of whether FOG’s resize is messing with the disk partition structure.
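
As a rough illustration of the diskpart approach, the script only has to select the volume and extend it. The Python sketch below generates such a script; the function name and the file name `extend_c.txt` are made up for this example, and only the two diskpart commands are meant to reach the target machine.

```python
import string


def build_extend_script(volume_letter="C"):
    """Return diskpart script text that grows one volume.

    `extend` without a size uses the contiguous free space that
    follows the selected volume.
    """
    letter = volume_letter.strip().rstrip(":").upper()
    if len(letter) != 1 or letter not in string.ascii_uppercase:
        raise ValueError("expected a single drive letter such as 'C'")
    lines = ["select volume " + letter, "extend"]
    # Windows text files conventionally use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # newline="" keeps the CRLF endings exactly as generated.
    with open("extend_c.txt", "w", newline="") as script:
        script.write(build_extend_script("C"))
```

Running `diskpart /s extend_c.txt` from SetupComplete.cmd would then expand C: once setup finishes, provided C: is the last partition on the disk.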



  • @george1421 said in FOG 1.5.6: Windows unattended.xml is intermittently failing to work:

    @Cheetah2003 Ok what happens if you create your master image on a 70GB disk (smaller than anything currently in a production environment) and make the last partition your C: drive. Then capture and deploy as single disk non-resizable. Maybe the FOG resize code is making things unstable for UEFI (not sure why your case is unique at the moment).

    Before FOG supported single disk resizable I had code in my SetupComplete.cmd file that would instruct Windows to expand the last partition to the size of the disk. I have to look to see if I can find that code again, but it basically automated diskpart to expand the partition.

    Yeah, I considered using SetupComplete.cmd to automate that resize myself. I was studying all the fancy PowerShell commands to manipulate partitions. Automating the resize with a PowerShell script should be mostly trivial.

    I’ll do you one better, I’ve squeezed down my image to 50GB. When it’s ready, I will try both some raw disk captures/deploys and no-resize partition capture/deploy as well and report back. Should have more information on Monday.


  • Moderator

    @Cheetah2003 Ok what happens if you create your master image on a 70GB disk (smaller than anything currently in a production environment) and make the last partition your C: drive. Then capture and deploy as single disk non-resizable. Maybe the FOG resize code is making things unstable for UEFI (not sure why your case is unique at the moment).

    Before FOG supported single disk resizable I had code in my SetupComplete.cmd file that would instruct Windows to expand the last partition to the size of the disk. I have to look to see if I can find that code again, but it basically automated diskpart to expand the partition.



  • New information, and alas it’s not good.

    On a hunch, given the circumstances, I decided to switch the images from partition captures to just a raw disk capture (dd.) Oh god it’s slow.

    But 5 out of 5 deployments worked, no hiccups. I’m afraid I believe FOG is damaging the UEFI partitions some of the time?

    What should I try next in my diagnosis? I can work with DD/raw disk images, just an extra step later in our process and it’s kind of slow. But it’s working?

    Could this be related to that strange error message I always get when capturing UEFI/GPT disks? I do get this message several times during a capture. It’s never caused a problem in the past, but maybe it has something to do with this?

    The protective MBR's 0xEE partition is oversized!  Auto-repairing.
    

    I’m presently preparing a new image that’s as small as I can get it, to do more deployments en masse and get a better sample of the success vs. failure rate (five isn’t a great sample size).
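
To put the sample-size worry into numbers: if every deploy failed independently half of the time (a rate assumed here purely for illustration), a clean run would still turn up by luck now and then. A minimal Python sketch, with a hypothetical function name:

```python
def chance_all_succeed(deploys, failure_rate=0.5):
    """Probability that every one of `deploys` independent deploys works."""
    return (1.0 - failure_rate) ** deploys


if __name__ == "__main__":
    print(chance_all_succeed(5))   # 0.03125
    print(chance_all_succeed(4))   # 0.0625
    print(chance_all_succeed(10))  # 0.0009765625
```

Under that assumption, five clean deploys in a row would happen by chance about 3% of the time, so a larger batch gives a much firmer answer.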



  • Same issue. Very consistently 50% of the computers fail, with the message in that screenshot.

    Deployed to 8 identical machines: half of them boot to setup normally, the other half fail with the error. Curious, I tried clicking ‘OK’ on some of the machines that failed, but they just reboot and fail again. Only redeploying gives it a chance to work.

    Got some different machines, by a different manufacturer, got 4 of those, deployed the image. 2 boot normally, 2 fail at the error.

    Anyone got any ideas what to try next? I’m stumped. I posted on M$ technet about this as well.



  • Just chiming in. That subtle tiny change to the unattended.xml made no difference. I’m starting to suspect the target machines, at this point.

    Today, I will try completely different target machines, and see if I still get the problem.


  • Moderator

    @Cheetah2003 said in FOG 1.5.6: Windows unattended.xml is intermittently failing to work:

    processorArchitecture="amd64" This is how it was in the old file.
    processorArchitecture="wow64" This is how it is in the new file.

    I’m in the middle of a meeting this afternoon, but those are not equivalent.
    amd64 refers to 64-bit, and wow64… maybe the 32-bit environment inside amd64.

    I’ve also used this site in the past to generate the answer files: https://www.windowsafg.com/

    Either way, when using FOG, remove the parts of the unattend.xml file that deal with disk partitioning; that’s FOG’s realm. Just remove those bits from the answer file.



  • @george1421 Initially I thought there’s no difference. Then I spotted one tiny little change:
    processorArchitecture="amd64" This is how it was in the old file.
    processorArchitecture="wow64" This is how it is in the new file.

    I will try regenerating images using this new file and see if it makes any difference.



  • @george1421 So I just got a patch for WSIM off Technet. I’ll regenerate my answer files and report back results.



  • @george1421 I understand what you’re saying.

    However, I did generate this answer file with WSIM. This is what it spits out when you generate a catalog from installation media and flip that copyprofile option to True in the specialize step, then tell it to generate the file. So I’m a little (hopefully understandably) skeptical when you tell me my file is incomplete. This is what WSIM spits out.

    I was thinking the publickeytokens are not quite correct and it might be confusing setup some of the time. Why it never breaks on legacy deployments, your guess is as good as mine. Unfortunately, I can’t get WSIM to build a catalog for the current installation media. I’m on technet trying to resolve that problem. It’s a fun one… WSIM complains I need a specific version of ADK, but when I click ‘help->about’ on WSIM, it is the exact version it says I need. Fun times. Have a screenshot of that joy:
    [screenshot not preserved]

    The generated file and its tokens have been an issue in the past, where a previously generated unattended.xml would not work with newer versions due to mismatches on those tokens. That hasn’t been an issue for about 3 or 4 update cycles, and previously it would not even sysprep at all if the tokens were bad. I was hoping that regenerating my unattended.xml with WSIM, using the 1903 image as a reference, would solve this issue. But I can’t get that going right now (see screenshot).

    Thank you for moving the thread to a more appropriate place. I edited the topic to reflect the issue more clearly, as well.


  • Moderator

    [MOD note]: I moved this topic out of the developer/bug forum to the Windows Problem forum, where it’s more closely aligned with the issues discussed.


  • Moderator

    @Cheetah2003 Don’t take this the wrong way, but at this time I don’t really care about the copy profile bit, that is not where your issue is. Actually if you look at my unattend.xml file it has the copy profile flag set to true anyway.

    My point is your unattend.xml file is incomplete (IMO). Whereas I asked if my unattend.xml file would work more reliably since it IS complete and I know from experience works with both bios and uefi platforms. And just for clarity I have 2 images that are built by MDT exactly the same. One is for bios target systems and one is for uefi target systems. They are exactly the same and use the same unattend.xml file. The only difference is the VM that MDT builds the golden image on. One is bios based and one is uefi based.



  • @george1421 said in FOG 1.5.6: Deploy is leaving remnants of previous data:

    @Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:

    If y’all don’t wanna help anymore, that’s perfectly understandable. I’ll eventually figure it out.

    I don’t believe anyone said anything about not helping; I wanted to clarify that at this point the actual issue is at odds with this thread’s subject line.

    As for the copyprofile… My experience is that it doesn’t work or at least appears to not work as it did with Win7 in how the profile was managed. But that, I guess is just my experience.

    Yeah. There was a point where it stopped working briefly; I think 1709 was the ugly release that broke CopyProfile, but it was fixed a few weeks later. Don’t quote me on that, this was a couple of years ago. Foggy memory.

    As to the topic. I agree, this should move elsewhere and be renamed. I’ve been editing the initial post as we’ve discovered things. It can probably just be entirely renamed and moved elsewhere. It’s not a FOG problem.

    However, my initial ‘request’ still stands. I think it would be useful if FOG’s deploy could be configured to zero a portion or all of the target drive before dumping the image onto the target.

    I would still be interested to know whether OOBE/WinSetup runs correctly in UEFI mode with my unattend.xml file.

    I appreciate this sentiment. But my process requires CopyProfile. I’m not really interested in reinventing my entire image creation process. I want to fix it so it works like it always has, not change to a completely new arrangement. You’ve no idea how loudly I screamed on technet for Microsoft to fix CopyProfile back when they broke it a couple of years ago. Gawd I hope they didn’t screw something up with it again.

    Not convinced it’s entirely CopyProfile’s problem, the fact things work half the time on UEFI deployments, and work 100% of the time on CSM deployments suggests something…else.

    Someday, I’ll migrate away from CopyProfile. But while it still works, I plan on using it. It really simplifies my entire workflow, and avoids having to author a bunch of ugly powershell scripts to replicate what CopyProfile does. I’ll cross that bridge when I absolutely have to.


  • Moderator

    @Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:

    If y’all don’t wanna help anymore, that’s perfectly understandable. I’ll eventually figure it out.

    I don’t believe anyone said anything about not helping; I wanted to clarify that at this point the actual issue is at odds with this thread’s subject line.

    As for the copyprofile… My experience is that it doesn’t work or at least appears to not work as it did with Win7 in how the profile was managed. But that, I guess is just my experience.

    I would still be interested to know whether OOBE/WinSetup runs correctly in UEFI mode with my unattend.xml file.



  • @george1421 said in FOG 1.5.6: Deploy is leaving remnants of previous data:

    Secondly, that unattend.xml file and its copyprofile tag do nothing with Windows 10. M$ removed the copyprofile feature when Windows 10 was released. This alone has caused so much angst with image creators.

    ??? It works. It’s always worked, since day 1 of Windows 10’s release. I did have some issues at one point with it malfunctioning, but updates from M$ fixed it again.

    It copies the profile exactly as it says it will. So I dunno man.

    But you are right about one thing, this is not a FOG issue. I said that in my previous post. We’ve figured that out.

    If y’all don’t wanna help anymore, that’s perfectly understandable. I’ll eventually figure it out.


  • Moderator

    @Cheetah2003 I can’t believe that is your unattend.xml file and you can actually build computers. That is incomplete in my mind for many reasons. Understand I’m not saying it won’t work, I just can’t believe that it worked until now.

    Secondly, that unattend.xml file and its copyprofile tag do nothing with Windows 10. M$ removed the copyprofile feature when Windows 10 was released. This alone has caused so much angst with image creators.

    With that said, here is an example of my sanitized unattend.xml file. It hasn’t changed since Win7 days and works correctly for both win7/win10/bios/uefi. https://forums.fogproject.org/topic/11920/windows-10-1803-sysprep-problem/7 Possibly there is something in my unattend.xml file that is needed for uefi deployments. I haven’t experimented with throwing things out until it broke since it worked, I really didn’t need to mess with it.

    It would be interesting to know whether you would still get similar results if you followed the directions in that post, including the sysprep command. I’m almost 100% sure your problem here is NOT a FOG issue, but rather a sysprep/M$ Windows problem.



  • @george1421 @Sebastian-Roth First, I want to thank both of you for helping me with these issues. You’ve both been great, despite this moving away from FOG issues to more of a Windows issue.

    Anyway, as per @george1421 's recommendation, here’s setuperr.log. As expected from Microsoft, it’s rather cryptic and doesn’t really tell me what’s gone wrong.

    setuperr.log:

    2019-07-02 09:49:33, Error                        [setup.exe] [Action Queue] : Unattend action failed with exit code 4
    2019-07-02 09:49:33, Error                        [setup.exe] Execution of unattend GCs failed; hr = 0x0; pResults->hrResult = 0x8030000b
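
When collecting these logs from several failed machines, it can help to pull the result codes out mechanically so they can be compared. The Python sketch below parses lines shaped like the two above; the regular expressions and the function name are assumptions based only on this excerpt, not on any documented log format.

```python
import re

# Timestamp, level, then the rest of the line as the message.
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\s+"
    r"(?P<level>\w+)\s+(?P<message>.*)$"
)
# "name = 0x..." pairs such as "pResults->hrResult = 0x8030000b".
CODE_RE = re.compile(r"([\w>-]+)\s*=\s*(0x[0-9a-fA-F]+)")


def parse_setup_log(text):
    """Return one dict per recognised line, with any hex codes as ints."""
    entries = []
    for raw_line in text.splitlines():
        match = LINE_RE.match(raw_line.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        entry = match.groupdict()
        entry["codes"] = {
            name: int(value, 16)
            for name, value in CODE_RE.findall(entry["message"])
        }
        entries.append(entry)
    return entries
```

Grouping the parsed entries by `pResults->hrResult` across machines would show whether every failure reports the same code.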
    


  • @george1421 Thanks for the info.

    However… Sysprep verifies the answer file at sysprep invocation. This is typically when a ‘problem’ will show up. This is partly why this problem is so frustrating.

    Additionally, this is the SAME exact answer file I use for six different Windows images. It works flawlessly on all 4 legacy images (for BIOS/Legacy/CSM targets.) It’s only the 2 UEFI images (Pro and Home) that have issues, and even then, only sometimes. Sometimes it works great. Sometimes it pukes during that first boot. It makes no sense.

    Lastly, my answer file is simplicity itself, the only thing in it is the directive to copy admin profile to the newly created user during setup. That’s it, there’s nothing else in there.

    Here it is, if you don’t believe me!

    <?xml version="1.0" encoding="utf-8"?>
    <unattend xmlns="urn:schemas-microsoft-com:unattend">
        <settings pass="specialize">
            <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                <CopyProfile>true</CopyProfile>
            </component>
        </settings>
        <cpi:offlineImage cpi:source="wim:c:/users/administrator/desktop/install.wim#Windows 10 Home" xmlns:cpi="urn:schemas-microsoft-com:cpi" />
    </unattend>
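
To compare two answer files (for example an old one and a regenerated one), a short script can list each component with its pass and processorArchitecture. This is a minimal Python sketch using only the standard library; the function name and the returned field names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <unattend> root element.
NS = "{urn:schemas-microsoft-com:unattend}"


def list_components(xml_text):
    """Summarise every <component> in an unattend answer file."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for settings in root.iter(NS + "settings"):
        for component in settings.findall(NS + "component"):
            copy_profile = component.find(NS + "CopyProfile")
            found.append({
                "pass": settings.get("pass"),
                "name": component.get("name"),
                "arch": component.get("processorArchitecture"),
                # None when the component has no CopyProfile element.
                "copy_profile": (
                    None if copy_profile is None
                    else (copy_profile.text or "").strip()
                ),
            })
    return found
```

Running it on both files and diffing the results would surface a change such as amd64 versus wow64 without reading the XML by eye.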
    

    I will however study the log files for an affected machine, because you’re right, there may be a clue there.

    @Sebastian-Roth Here’s the requested picture:

    Just an FYI, in order to fetch that picture, we deployed the UEFI image to 8 identical computers. 4 failed, 3 succeeded, and 1 was still working on it when I took the picture.

