FOG 1.5.6: Windows unattended.xml is intermittently failing to work
-
@Cheetah2003 Well that’s interesting to hear. I am not aware of anyone else having reported this lately and I really hope we can figure this out.
The FOS scripts run the following command to clear the old data from the drive:
sgdisk -Z /dev/...
Up until now I assumed that would be safe enough. Can you be more specific about what gives you the impression that disks are not cleaned (enough)? When you say “deployments are failing with strange errors”, what does that mean? Please give us more information.
-
It’s possible I’m incorrect in my assumption that there is data remnants causing issues.
Basically, the error I get during Windows first boot is that it’s complaining my unattended setup answer file is incorrect. But it’s not doing it for every deployment. Seems random. I think my image is just messed up.
However. It would still be nice to be able to nuke a portion of the drive with all zeros if needed for some reason. Maybe an optional place to insert a ‘dd’ command line during the deployment. I don’t think it’s entirely necessary, but could be handy.
I was honestly just going to pull at the threads of this thing to find the deployment scripts and slide in a dd drive nuking command myself. But it may be unnecessary. The randomness of my issue made me believe it’s data remnants. But hand-nuking drives yields no better result.
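A minimal sketch of the "zero a portion of the drive" idea, written in Python rather than as a dd one-liner. It is not part of FOG or its deployment scripts. It assumes that clearing the first and last few MiB of the target is what is wanted (GPT keeps a backup header at the end of the disk, which is why the tail is included), and the function names are hypothetical. It overwrites data irreversibly, so try it on a scratch file first.

```python
import os

CHUNK = 1024 * 1024  # write at most 1 MiB per call


def zero_range(path, offset, length):
    """Overwrite `length` bytes of `path` with zeros, starting at `offset`."""
    zeros = bytes(CHUNK)
    with open(path, "r+b") as target:
        target.seek(offset)
        remaining = length
        while remaining > 0:
            step = min(CHUNK, remaining)
            target.write(zeros[:step])
            remaining -= step
        target.flush()
        os.fsync(target.fileno())  # push the zeros out to the device


def zero_head_and_tail(path, span):
    """Zero the first and last `span` bytes of a file or block device."""
    with open(path, "rb") as target:
        size = target.seek(0, os.SEEK_END)  # total size in bytes
    span = min(span, size)
    zero_range(path, 0, span)
    zero_range(path, size - span, span)
```

Called as `zero_head_and_tail(target, 16 * 1024 * 1024)`, where `target` is a scratch image file or a block device you are certain about, it clears 16 MiB at each end and leaves everything in between untouched.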
I’ll mark this as solved since it appears to be me, not FOG.
EDIT: Can’t seem to change the solved/unsolved status of this topic. So, just tag it solved for now, since I’m pretty sure it’s my mistake, not FOG.
2nd EDIT: I did want to mention, I am only encountering this issue on UEFI deployments. Normal legacy (BIOS) deployments are unaffected. Which is partly why I was suspecting FOG of messing something up, or leaving junk behind that Windows was finding and choking on.
On UEFI, I will mention, I have not successfully created a PXE boot environment that works on UEFI machines, so I do have to flip target machines back to legacy BIOS mode to do the deployment, then flip them back to UEFI/Secure Boot afterwards to boot up the machine. This is when the issue comes up.
Since the error I’m getting is complaining about my unattended answer file being messed up, this seems relevant. I’m using the same answer file for all my images (there are 6 of them) and only the UEFI deployments are choking on it. So… not sure what’s going on there. Going to regenerate my answer file with the Windows ADK today to see if that will fix things. Maybe something changed in 1903 UEFI that the legacy deployments are unaffected by.
One more edit haha: On the unattended setup answer file, in the past, if this file was malformed in any way, sysprep would bitch right out of the gate and refuse to run. I am not encountering that, it’s only when the image is deployed and booted (UEFI only) that something goes weird.
-
@Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
I was honestly just going to pull at the threads of this thing to find the deployment scripts and slide in a dd drive nuking command myself.
Nice you found this already. I was gonna suggest you try this out to see if it works for you.
I have to admit that I am not much of a Windows guy really. Probably good to ask @george1421 and others here in the forums about issues with sysprep and stuff like that. But you will need to provide a bit more details as well I am afraid.
it’s only when the image is deployed and booted (UEFI only) that something goes weird.
What exactly goes wrong? Error message?!
-
@Sebastian-Roth said in FOG 1.5.6: Deploy is leaving remnants of previous data:
What exactly goes wrong? Error message?!
I told you before. It’s complaining my unattended setup answer file is invalid during the first boot of the deployed image (the part where Windows figures out what it’s running on.) And it doesn’t do it every time, only like… 70% of the time, and only on UEFI deployments. Legacy deployments are unaffected. Same answer file, same setup for the Windows image, except the UEFI/legacy partition layout differences, and DOS vs. GPT style partition tables.
I’m also attacking this from the Windows angle too, trying to make a new answer file to put into my images, but of course, Microsoft’s tools are malfunctioning on me as well.
-
@Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
It’s complaining my unattended setup answer file is invalid during the first boot of the deployed image
Please give us the exact error message. Best if you can take a picture and post here. The more precise the info the better we can help…
-
@Sebastian-Roth said in FOG 1.5.6: Deploy is leaving remnants of previous data:
@Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
It’s complaining my unattended setup answer file is invalid during the first boot of the deployed image
Please give us the exact error message. Best if you can take a picture and post here. The more precise the info the better we can help…
Sure I can do this. I’ll get a pic with my cell phone on one of the machines we deployed to. Should have it sometime tomorrow.
-
@Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
And it doesn’t do it every time, only like… 70% of the time, and only on UEFI deployments.
First let me say that 1903 is a very different beast than 1809. While it has the Windows 10 name, it’s as different as Win7 was to XP.
With that said, when the system errors out, it should tell you a bit more than that sysprep is damaged. There are log files in c:\windows\panther (where your unattend.xml file should be) that might give you a better clue as to what it doesn’t like. You might be able to inspect these, or at least copy them to a flash drive to look at them off-line, using the Shift-F10 method to call up a command window and notepad.
-
@george1421 Thanks for the info.
However… Sysprep verifies the answer file at sysprep invocation. This is typically when a ‘problem’ will show up. This is partly why this problem is so frustrating.
Additionally, this is the SAME exact answer file I use for six different Windows images. It works flawlessly on all 4 legacy images (for BIOS/Legacy/CSM targets.) It’s only the 2 UEFI images (Pro and Home) that have issues, and even then, only sometimes. Sometimes it works great. Sometimes it pukes during that first boot. It makes no sense.
Lastly, my answer file is simplicity itself, the only thing in it is the directive to copy admin profile to the newly created user during setup. That’s it, there’s nothing else in there.
Here it is, if you don’t believe me!
<?xml version="1.0" encoding="utf-8"?>
<unattend xmlns="urn:schemas-microsoft-com:unattend">
    <settings pass="specialize">
        <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <CopyProfile>true</CopyProfile>
        </component>
    </settings>
    <cpi:offlineImage cpi:source="wim:c:/users/administrator/desktop/install.wim#Windows 10 Home" xmlns:cpi="urn:schemas-microsoft-com:cpi" />
</unattend>
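For anyone wanting to sanity-check an answer file like the one above outside of sysprep, here is a small sketch that only confirms the XML is well-formed and lists which pass and component each setting belongs to. It does not validate against the Windows schema or a catalog, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Default namespace used by unattend.xml, as seen in the file above.
NS = {"u": "urn:schemas-microsoft-com:unattend"}


def summarize_answer_file(data):
    """Return (pass, component name, processorArchitecture) per component.

    `data` is the raw bytes of the answer file. ET.ParseError is raised
    if the XML is not well-formed.
    """
    root = ET.fromstring(data)
    rows = []
    for settings in root.findall("u:settings", NS):
        for component in settings.findall("u:component", NS):
            rows.append((
                settings.get("pass"),
                component.get("name"),
                component.get("processorArchitecture"),
            ))
    return rows
```

Reading the file with `open(path, "rb").read()` and passing the bytes in keeps the encoding declaration in the first line meaningful.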
I will however study the log files for an affected machine, because you’re right, there may be a clue there.
@Sebastian-Roth Here’s the requested picture:
Just an FYI, in order to fetch that picture, we deployed the UEFI image to 8 identical computers. 4 failed, 3 succeeded, and 1 was still working on it when I took the picture.
-
@george1421 @Sebastian-Roth First, want to thank both of you for helping me with these issues. You’ve both been great. Despite this moving away from FOG issues to more of a Windows issue.
Anyway, as per @george1421 's recommendation, here’s setuperr.log. As expected from Microsoft, it’s rather cryptic and doesn’t really tell me what’s gone wrong.
setuperr.log:
2019-07-02 09:49:33, Error [setup.exe] [Action Queue] : Unattend action failed with exit code 4
2019-07-02 09:49:33, Error [setup.exe] Execution of unattend GCs failed; hr = 0x0; pResults->hrResult = 0x8030000b
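When comparing logs from several failed and successful machines, it can help to pull out just the timestamps, exit codes and hex result codes. The sketch below assumes every entry follows the "timestamp, level, message" shape of the two lines above. It makes no attempt to interpret what a code means, and the function name is hypothetical.

```python
import re

# One entry per line: "YYYY-MM-DD HH:MM:SS, Level  message"
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\s+(\w+)\s+(.*)$")
HEX_CODE = re.compile(r"0x[0-9A-Fa-f]+")
EXIT_CODE = re.compile(r"exit code (\d+)")


def parse_setup_log(text):
    """Return one dict per recognizable log line; other lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue
        timestamp, level, message = match.groups()
        exit_match = EXIT_CODE.search(message)
        entries.append({
            "timestamp": timestamp,
            "level": level,
            "message": message,
            "exit_code": int(exit_match.group(1)) if exit_match else None,
            "hex_codes": HEX_CODE.findall(message),
        })
    return entries
```

Running it over setuperr.log from each machine and comparing the `hex_codes` lists shows quickly whether every failure reports the same result code.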
-
@Cheetah2003 I can’t believe that is your unattend.xml file and you can actually build computers. That is incomplete in my mind for many reasons. Understand I’m not saying it won’t work, I just can’t believe that it worked until now.
Secondly that unattend.xml file and its copyprofile tag does nothing with windows 10. M$ removed the copyprofile feature when windows 10 was released. This alone has caused so much angst with image creators.
With that said, here is an example of my sanitized unattend.xml file. It hasn’t changed since Win7 days and works correctly for both win7/win10/bios/uefi. https://forums.fogproject.org/topic/11920/windows-10-1803-sysprep-problem/7 Possibly there is something in my unattend.xml file that is needed for uefi deployments. I haven’t experimented with throwing things out until it broke since it worked, I really didn’t need to mess with it.
It would be interesting to know whether you would still have similar results if you followed the directions in that post, including the sysprep command. I’m almost 100% sure your problem here is NOT a FOG issue, but rather a sysprep/M$ Windows problem.
-
@george1421 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
Secondly that unattend.xml file and its copyprofile tag does nothing with windows 10. M$ removed the copyprofile feature when windows 10 was released. This alone has caused so much angst with image creators.
??? It works. It’s always worked, since day 1 of Windows 10’s release. I did have some issues at one point with it malfunctioning, but updates from M$ fixed it again.
It copies the profile exactly as it says it will. So I dunno man.
But you are right about one thing, this is not a FOG issue. I said that in my previous post. We’ve figured that out.
If y’all don’t wanna help anymore, that’s perfectly understandable. I’ll eventually figure it out.
-
@Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
If y’all don’t wanna help anymore, that’s perfectly understandable. I’ll eventually figure it out.
I don’t believe anyone said anything about not helping. I wanted to clarify that at this point the actual issue no longer matches this thread’s subject line.
As for the copyprofile… My experience is that it doesn’t work or at least appears to not work as it did with Win7 in how the profile was managed. But that, I guess is just my experience.
I would still be interested to know whether OOBE/WinSetup runs correctly in UEFI mode with my unattend.xml file.
-
@george1421 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
@Cheetah2003 said in FOG 1.5.6: Deploy is leaving remnants of previous data:
If y’all don’t wanna help anymore, that’s perfectly understandable. I’ll eventually figure it out.
I don’t believe anyone said anything about not helping. I wanted to clarify that at this point the actual issue no longer matches this thread’s subject line.
As for the copyprofile… My experience is that it doesn’t work or at least appears to not work as it did with Win7 in how the profile was managed. But that, I guess is just my experience.
Yeah. There was a point where it stopped working briefly. I think 1709 was the ugly release that broke CopyProfile, but it was fixed a few weeks later. Don’t quote me on that, this was a couple of years ago. Foggy memory.
As to the topic. I agree, this should move elsewhere and be renamed. I’ve been editing the initial post as we’ve discovered things. It can probably just be entirely renamed and moved elsewhere. It’s not a FOG problem.
However, my initial ‘request’ still stands. I think it would be useful if FOG’s deploy could be configured to zero a portion or all of the target drive before dumping the image onto the target.
I would still be interested to know whether OOBE/WinSetup runs correctly in UEFI mode with my unattend.xml file.
I appreciate this sentiment. But my process requires CopyProfile. I’m not really interested in reinventing my entire image creation process. I want to fix it so it works like it always has, not change to a completely new arrangement. You’ve no idea how loudly I screamed on technet for Microsoft to fix CopyProfile back when they broke it a couple of years ago. Gawd I hope they didn’t screw something up with it again.
Not convinced it’s entirely CopyProfile’s problem, the fact things work half the time on UEFI deployments, and work 100% of the time on CSM deployments suggests something…else.
Someday, I’ll migrate away from CopyProfile. But while it still works, I plan on using it. It really simplifies my entire workflow, and avoids having to author a bunch of ugly powershell scripts to replicate what CopyProfile does. I’ll cross that bridge when I absolutely have to.
-
@Cheetah2003 Don’t take this the wrong way, but at this time I don’t really care about the copy profile bit, that is not where your issue is. Actually if you look at my unattend.xml file it has the copy profile flag set to true anyway.
My point is your unattend.xml file is incomplete (IMO). Whereas I asked if my unattend.xml file would work more reliably since it IS complete and I know from experience works with both bios and uefi platforms. And just for clarity I have 2 images that are built by MDT exactly the same. One is for bios target systems and one is for uefi target systems. They are exactly the same and use the same unattend.xml file. The only difference is the VM that MDT builds the golden image on. One is bios based and one is uefi based.
-
[MOD note]: I moved this topic out of the developer/bug forum to the Windows Problem forum, where it’s more closely aligned with the issues discussed.
-
@george1421 I understand what you’re saying.
However, I did generate this answer file with WSIM. This is what it spits out when you generate a catalog from installation media and flip that copyprofile option to True in the specialize step, then tell it to generate the file. So I’m a little (hopefully understandably) skeptical when you tell me my file is incomplete. This is what WSIM spits out.
I was thinking the publickeytokens are not quite correct and it might be confusing setup some of the time. Why it never breaks on legacy deployments, your guess is as good as mine. Unfortunately, I can’t get WSIM to build a catalog for the current installation media. I’m on technet trying to resolve that problem. It’s a fun one… WSIM complains I need a specific version of ADK, but when I click ‘help->about’ on WSIM, it is the exact version it says I need. Fun times. Have a screenshot of that joy:
The generated file and its tokens have been an issue in the past, where a previously generated unattended.xml would not work with newer versions, due to mismatches on those tokens. That hasn’t been an issue for about 3 or 4 update cycles, and previously, it would not even sysprep at all if the tokens were bad. I was hoping re-generating my unattended.xml with WSIM, using the 1903 image as a reference, would solve this issue. But I can’t get that going right now (see screenshot).
Thank you for moving the thread to a more appropriate place. I edited the topic to reflect the issue more clearly, as well.
-
@george1421 So I just got a patch for WSIM off Technet. I’ll regenerate my answer files and report back results.
-
@george1421 Initially I thought there’s no difference. Then I spotted one tiny little change:
processorArchitecture="amd64" This is how it was in the old file.
processorArchitecture="wow64" This is how it is in the new file.
I will try regenerating images using this new file and see if it makes any difference.
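Spotting a one-attribute change like this by eye is easy to get wrong, so here is a rough sketch of diffing the component attributes of two answer files. It keys components by pass and name, which is an assumption: a file that lists the same component twice in one pass (for example once per architecture) would need a finer key. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"u": "urn:schemas-microsoft-com:unattend"}


def component_attrs(data):
    """Map (pass, component name) to that component's attributes."""
    root = ET.fromstring(data)
    found = {}
    for settings in root.findall("u:settings", NS):
        for component in settings.findall("u:component", NS):
            key = (settings.get("pass", ""), component.get("name", ""))
            # xmlns declarations are not part of .attrib, so only real
            # attributes such as processorArchitecture are compared.
            found[key] = dict(component.attrib)
    return found


def diff_answer_files(old, new):
    """Return (key, attribute, old value, new value) for each difference."""
    old_c, new_c = component_attrs(old), component_attrs(new)
    changes = []
    for key in sorted(set(old_c) | set(new_c)):
        before, after = old_c.get(key, {}), new_c.get(key, {})
        for attr in sorted(set(before) | set(after)):
            if before.get(attr) != after.get(attr):
                changes.append((key, attr, before.get(attr), after.get(attr)))
    return changes
```

Both arguments are the raw bytes of the respective files. An empty result means the component attributes are identical, although the settings inside the components may still differ.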
-
@Cheetah2003 said in FOG 1.5.6: Windows unattended.xml is intermittently failing to work:
processorArchitecture="amd64" This is how it was in the old file.
processorArchitecture="wow64" This is how it is in the new file.
I’m in the middle of a meeting this afternoon, but those are not equivalent.
amd64 refers to 64 bit and wow64… maybe the 32 bit environment inside amd64.
I’ve also used this site in the past to generate the answer files: https://www.windowsafg.com/
Either way when using fog remove the parts of the unattend.xml file that deal with disk partitioning, that’s fog’s realm. Just remove those bits from the answer file.
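As a rough way to check whether an answer file still carries partitioning settings, the sketch below reports every component that contains a DiskConfiguration element, which is where the Microsoft-Windows-Setup component normally keeps them. It only reports and leaves the removal to be done by hand. The function name is hypothetical, and other disk-related settings are not covered.

```python
import xml.etree.ElementTree as ET

NS = {"u": "urn:schemas-microsoft-com:unattend"}


def find_disk_configuration(data):
    """Return (pass, component name) for components holding DiskConfiguration."""
    root = ET.fromstring(data)
    hits = []
    for settings in root.findall("u:settings", NS):
        for component in settings.findall("u:component", NS):
            if component.find("u:DiskConfiguration", NS) is not None:
                hits.append((settings.get("pass"), component.get("name")))
    return hits
```

An empty list for the file that goes into the image means that no DiskConfiguration block is left to compete with the partition layout FOG writes.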
-
Just chiming in. That subtle tiny change to the unattended.xml made no difference. I’m starting to suspect the target machines, at this point.
Today, I will try completely different target machines, and see if I still get the problem.