Bluescreen/Corrupt Drive Issues Post Imaging

explosivo98

Hello, I’ve recently been having a lot of problems with systems being deployed with certain images on my server developing bluescreen issues either immediately after imaging, or anywhere from a week or more after deploying. I wasn’t initially thinking this was due to the image being used but it’s been happening pretty consistently with a few of our images and no others despite it all going on the same hardware. It’s actually become a pretty significant issue for us and I’ve been trying to chase down the cause of it but the ones that we get back are showing registry errors and I can’t even work with the drives, they’re totally shot.

I think right now my main theory is that since I’m updating and capturing the images on physical hardware that some of the systems I’m using for it are corrupt in some way, and those corrupt files are making their way onto the image which is killing some drives that are receiving it. On Friday I attempted to capture a new version of the image from an old backup I had of the problematic one and just found out this morning that it isn’t even able to be deployed right now with the <stdin> is not compressed error so I think that system I used to capture is completely gone and need to try again.

I did recently update to 1.5.9 in an effort to curb some of these problems but it doesn’t seem to have changed much. I have a few questions. First, I stumbled on a forum post from 2016 from someone looking to develop a plugin that allows scanning of images for potential corruption, but all the links were dead and I haven’t been able to find anything out about this by searching around. Is this something that can be done today? If there is a way to easily identify which images need to be re-done or uncover problems as they’re imaged that would be super helpful.

Secondly, I’ve been using FOG for a couple years now without too many problems but this is my first foray into imaging so I’ve sort of been doing all this on the fly. Is there a better way to be capturing these images to avoid these kinds of problems from happening? Like from a VM or something? I guess I should be doing SFC/DISM disk checks and all that before capturing images moving forward at least.

george1421

@explosivo98 said in Bluescreen/Corrupt Drive Issues Post Imaging:

certain images on my server developing bluescreen issues either immediately after imaging,

I’m trying to blame the FOG server here but I’m not able to say specifically it’s a FOG problem, especially since the system bluescreens some time in the future. Once the target computer reboots post image deployment FOG is out of the picture. It should either boot or not. So this makes me think it’s something in your image (or something that is happening to the system post OOBE).

I can say from my experience that I always develop my golden image on a virtual machine (for hardware independence as well as snap-shotting capabilities). This golden image is not allowed to access the internet and has delivery optimization restricted so it doesn’t try to get updates from other windows 10 systems on the network.

What OS (probably windows 10 since you said bluescreen) version are you deploying?

Are you sysprepping the image before capture with FOG?

explosivo98

@george1421 Windows 10, and yeah for the record I don’t think that FOG is doing anything wrong throughout the process, it seems to be working as it should be but I’m only now realizing that there’s likely issues with the hardware or images themselves that might’ve gotten passed down through multiple iterations of images. Developing on a VM sounds like a much better idea than what I’m doing and I may need to look into it, I basically have a rack of 10 or so systems that are supposed to only be used for imaging but hardware shortages mean sometimes we need them for other purposes so they get overwritten with images quite often. I always assumed there would be other issues related to not being created through a VM and not on the hardware it’s being deployed to but I guess not. No, I don’t do sysprep before capturing.

Sebastian Roth

@explosivo98 It’s hard to get a full picture from remote but we can still give it a try. First trying to answer your questions.

On Friday I attempted to capture a new version of the image from an old backup I had of the problematic one and just found out this morning that it isn’t even able to be deployed right now with the <stdin> is not compressed error so I think that system I used to capture is completely gone and need to try again.

More often than not there is more information on the error screen than you expect. Please take a picture of that particular error screen and post it here!

I stumbled on a forum post from 2016 from someone looking to develop a plugin that allows scanning of images for potential corruptions but all the links were dead and I haven’t been able to find anything out about this by searching around, is this something that can be done today?

Not as far as I know about the plugin. There is a fairly easy check you can do manually just to see if the image files are intact. Something like zcat /images/ImageName/d1p1.img > /tmp/testfile would at least show you if there is an obvious error in the file: it would fail with an error if the file is truncated or otherwise messed up.

Overall I don’t think this is of too much help. It’s either something you see while it deploys or it’s within the image and FOG is not to blame.
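For anyone who wants to run the manual zcat check over a whole image directory, here is a minimal Python sketch of the same idea. It assumes the image was captured with gzip compression (an image captured with zstd would be reported as failed here and would need something like zstd -t instead). The function names check_gzip_image and check_image_dir are made up for this example and are not part of FOG. Unlike the zcat redirect, it discards the decompressed data instead of writing it to /tmp.

```python
import gzip
import zlib
from pathlib import Path


def check_gzip_image(path, chunk_size=1024 * 1024):
    """Decompress one file and throw the data away.

    Returns (ok, message). Assumes gzip compression; a truncated or
    corrupted stream raises while reading and is reported as a failure.
    """
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end so the whole stream (and its trailer) is verified.
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


def check_image_dir(image_dir):
    """Check every *.img file in one image directory (hypothetical helper)."""
    results = {}
    for img in sorted(Path(image_dir).glob("*.img")):
        results[img.name] = check_gzip_image(img)
    return results
```

Calling check_image_dir("/images/ImageName") on the FOG server returns a dict mapping each partition file to an (ok, message) pair. A pass only means the compressed stream is intact; it says nothing about whether the Windows install inside the image is healthy.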

Secondly, I’ve been using FOG for a couple years now without too many problems but this is my first foray into imaging so I’ve sort of been doing all this on the fly. Is there a better way to be capturing these images to avoid these kinds of problems from happening? Like from a VM or something? I guess I should be doing SFC/DISM disk checks and all that before capturing images moving forward at least.

I’m probably getting this wrong, but what did you use FOG for if not for imaging?

As George mentioned, using a VM to build your golden master has advantages, but you might need to take care of drivers then. On the other hand, capturing from hardware is not wrong altogether. If your hosts are all the same make and model you might just use one of those as master. At my old job we did this for years and it worked great.

There are things you might want to check (incomplete list!):

Fast boot disabled
sfc /scannow
chkdsk
…
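The checklist above could be scripted on the reference machine before capture. This is only a rough sketch: the mapping from checklist items to commands is my assumption (powercfg /h off turns off hibernation, which also disables Fast Startup; chkdsk C: /scan assumes the system volume is C: and NTFS), and run_checks is a made-up helper name. It needs an elevated prompt to actually run the commands, so it defaults to a dry run that only lists them.

```python
import subprocess

# Assumed mapping of the checklist items to Windows commands.
PRE_CAPTURE_CHECKS = [
    ("Disable hibernation (also turns off Fast Startup)", ["powercfg", "/h", "off"]),
    ("Verify protected system files", ["sfc", "/scannow"]),
    ("Scan the system volume for file system errors", ["chkdsk", "C:", "/scan"]),
]


def run_checks(checks=PRE_CAPTURE_CHECKS, dry_run=True):
    """Return (label, command line, exit code) for each check.

    With dry_run=True nothing is executed and the exit code is None.
    """
    results = []
    for label, cmd in checks:
        if dry_run:
            results.append((label, " ".join(cmd), None))
            continue
        proc = subprocess.run(cmd)  # output goes straight to the console
        results.append((label, " ".join(cmd), proc.returncode))
    return results
```

With dry_run=False, a non-zero exit code from any step is a reason to look at that machine before capturing from it.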

explosivo98

@sebastian-roth thank you, there’s a lot of helpful info here. I know details are light and I apologize. I’m working from home so I don’t have the exact error codes right now, but I have someone in our lab now so hopefully I should get some info later. And to clarify, I’ve only used FOG for imaging (I think that’s all we can use it for :p) but I meant that in general this is the first time I’ve been doing imaging so I don’t really know about standards and practices that might be normal for imaging (like sysprep etc).

george1421

@explosivo98 said in Bluescreen/Corrupt Drive Issues Post Imaging:

No, I don’t do sysprep before capturing.

So then you can only deploy the image to the same model as you are capturing from. Just realize that M$ recommends using sysprep even when deploying to the same model computer.

For virtualization you can use VMware, VirtualBox, Proxmox, and Hyper-V (if you feel lucky). I do have a tutorial on how to deploy a single golden image to multiple different hardware models by injecting the model-specific drivers during FOG deployment.

explosivo98

@george1421 I’ve actually managed to deploy this image to a range of different hardware without doing the sysprep, although I guess the fact that this thread exists might indicate that’s not the best idea… I would very much be interested in the guide you mentioned, I think next time I’m in the office I’ll have to seriously look into doing it through vmware/vbox.
