Just chiming in. That subtle tiny change to the unattended.xml made no difference. I’m starting to suspect the target machines, at this point.
Today, I will try completely different target machines, and see if I still get the problem.
@george1421 said in FOG 1.5.6: Windows unattended.xml is intermittently failing to work:
@Cheetah2003 Ok, what happens if you create your master image on a 70GB disk (smaller than anything currently in a productive environment) and make the last partition your drive? Then capture and deploy as single disk non-resizable. Maybe the FOG resize code is making things unstable for UEFI (not sure why your case is unique at the moment).
Before FOG supported single disk resizable I had code in my SetupComplete.cmd file that would instruct Windows to expand the last partition to the size of the disk. I have to look to see if I can find that code again, but it basically automated diskpart to expand the partition.
Yeah, I considered using SetupComplete.cmd to automate that resize myself. I was studying all the fancy PowerShell commands to manipulate partitions. Automating the resize with a PowerShell script should be mostly trivial.
I’ll do you one better, I’ve squeezed down my image to 50GB. When it’s ready, I will try both some raw disk captures/deploys and no-resize partition capture/deploy as well and report back. Should have more information on Monday.
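For anyone wanting to try the diskpart route mentioned above, here is a minimal sketch of what such automation might generate. It is not the code from either setup discussed here. The disk and partition numbers are assumptions for illustration and must match the real layout; the only diskpart commands used are `select disk`, `select partition` and `extend`.

```python
def build_diskpart_script(disk_number: int, partition_number: int) -> str:
    """Return diskpart commands that grow one partition into the free space after it."""
    commands = [
        f"select disk {disk_number}",
        f"select partition {partition_number}",
        # With no size argument, extend takes all contiguous unallocated space.
        "extend",
    ]
    return "\n".join(commands) + "\n"


# Hypothetical layout: the OS partition is partition 4 on disk 0.
print(build_diskpart_script(0, 4), end="")
```

The resulting text could be saved to a file and run from SetupComplete.cmd with `diskpart /s <file>`. This only helps when the partition to grow is the last one on the disk, since `extend` needs the free space to sit directly after it.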
@Quazz said in FOG 1.5.6: Auto resize is unpredictable:
@Cheetah2003 OS-built recovery partitions have the partition flags that keep them fixed size.
@Quazz Argh. As I said several times, this isn’t an OS-built recovery partition. I built it myself. Are you even reading my posts???
@Eric-Johnson Welcome. And yeah, what you’re describing sounds very similar to the issue I had with the previous version of FOG that required I move my recovery partition to be before the OS partition, making the OS partition last on the disk for resize to work properly.
@Sebastian-Roth Sure sure. I’ll do some experiments and report back any findings. Might be a few days, so I hope you’re not in a hurry.
@Sebastian-Roth said in FOG 1.5.6: Auto resize is unpredictable:
@Cheetah2003 Are you still keen to look into this?
I’d be happy to. What do you want me to do?
Also, for what it’s worth, I’m not sure multi-partition resizing is really necessary. I can’t really think of any use cases for this ‘feature.’
The percentage thing described earlier sounds pretty dubious, especially if you’re capturing 5 partitions from a 50GB disk… and the recovery partition is 20% of that space (10GB)… you don’t need that taking 20% of a target drive. That would be kinda crazy.
So really, IMHO, a percentage of the original drive captured from seems kinda not-useful. I still think this should be controllable entirely from the image specification. But I think that would require the image specification to actually pull info out of the captured image to offer the user options for how to handle the partitions contained within that image. Probably a pretty big rewrite of that entire part of the system. I’d love to see this, but yeah, it’s going to be a big task from my perspective.
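To put numbers on why the percentage idea looks dubious, here is a small arithmetic sketch. It is only an illustration of proportional scaling as described in this thread, not FOG's actual code; the 500GB target size and the grouping of the other four partitions into one 40GB entry are assumptions.

```python
def scale_proportionally(sizes_gb: dict, target_gb: float) -> dict:
    """Give each partition the same share of the target disk that it had on the source disk."""
    source_total = sum(sizes_gb.values())
    return {name: size * target_gb / source_total for name, size in sizes_gb.items()}


# 50GB source disk: a 10GB recovery partition (20%) plus 40GB for everything else.
source = {"recovery": 10, "everything_else": 40}

# Assumed 500GB target disk.
print(scale_proportionally(source, 500))
```

Under those assumptions the recovery partition balloons to 100GB on the target, ten times what it needs.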
So I’ll be happy to peek/test whatever you need help with, as time permits, but I’m a little unsure of the goal.
I’m back! New results!
I prepared my same image again, just to be sure it’s ship-shape. I captured the image with ‘Multiple Partition Image - Single Disk (Not Resizable)’ setting on the image.
4 out of 4 deploys worked flawlessly. Think we can call this case closed. Resizing NTFS partitions is dubious with the Linux-based tools and is causing intermittent issues with UEFI systems.
Why this doesn’t affect legacy deployments, you got me. Ask Microsoft? I’m going to just put in a PowerShell script to have Windows do my resize operations after setup completes.
Thanks to all who helped me track this issue down to its cause. Back to work, for me!
For developers looking to remedy this issue, I’d look into this error message (which I’m not getting anymore with that no-resize setting!!):
The protective MBR's 0xEE partition is oversized! Auto-repairing.
I’d bet money this has something to do with it breaking on UEFI systems.
New information, and alas it’s not good.
On a hunch, given the circumstances, I decided to switch the images from partition captures to just a raw disk capture (dd). Oh god it’s slow.
But 5 out of 5 deployments worked, no hiccups. I’m afraid I believe FOG is damaging the UEFI partitions some of the time?
What should I try next in my diagnosis? I can work with DD/raw disk images, just an extra step later in our process and it’s kind of slow. But it’s working?
Could this be related to that strange error message I always get when capturing UEFI/GPT disks? I do get this message several times during a capture. It’s never caused a problem in the past, but maybe it has something to do with this?
The protective MBR's 0xEE partition is oversized! Auto-repairing.
I’m presently preparing a new image that’s as small as I can get it, to do more deployments en masse, to get a better sample of success vs. failure rate (five isn’t a great sample size).
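On the sample size point, a quick back-of-the-envelope sketch. It assumes each deployment fails independently and uses the roughly 50% failure rate reported for partition-based deploys elsewhere in this thread; both are assumptions, and the function names are made up for illustration.

```python
def chance_all_succeed(failure_rate: float, deployments: int) -> float:
    """Probability that every deployment works if each one fails independently."""
    return (1.0 - failure_rate) ** deployments


def runs_needed(failure_rate: float, max_chance: float) -> int:
    """Smallest number of clean runs needed before luck alone is rarer than max_chance."""
    if not 0.0 < failure_rate <= 1.0:
        raise ValueError("failure_rate must be in (0, 1]")
    runs = 1
    while chance_all_succeed(failure_rate, runs) > max_chance:
        runs += 1
    return runs


# Chance of 5 clean deploys in a row if the 50% failure rate still applied.
print(chance_all_succeed(0.5, 5))

# Clean deploys needed to push that chance below 1%.
print(runs_needed(0.5, 0.01))
```

So five clean raw-disk deploys would happen by luck about 3% of the time under those assumptions, which is suggestive but worth backing up with a few more runs.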
@Sebastian-Roth said in FOG 1.5.6: Auto resize is unpredictable:
Please go ahead, dive in the code and find out.
Yeah, I understand you’re just an open source project, and most likely short handed. If time permits, I would love to study under the hood and figure out how it all works.
But like everyone else, I have to survive first. So we’ll see. I have a non-production FOG server on my desktop at home already (I set that up to get all those screenshots for you guys!), so maybe as time and interest permit, I will do just that!
And I apologize for the bold. Just been asking that question ALL WEEK and never got a response until I asked again in bold. So… hard to buy ‘shouting not needed’ when that’s what finally got a reply.
Anyway, thanks for all your help, I do appreciate it!
@Quazz said in FOG 1.5.6: Auto resize is unpredictable:
There was some discussion about which direction to go forward in and ideally we’d do things in a more clever way, but for the time being we landed on using partition flags (such as boot, hidden, reserved) to determine whether or not a resize should be attempted.
Me, I’m rather unconcerned with auto-detection of partition types. It’s going to be dubious at best, under any circumstance. This is why I started this thread with the suggestion: We need to be able to specify how to handle partition resizing in the image specification, for better control over the behavior.
So putting aside the detection or lack thereof… a manual scheme would be wonderful. This is what I’ve been advocating for all along. I’ve already figured out how to ‘trick’ FOG into leaving my recovery partition alone. I mentioned that in the very first post, as well.
And I just have to ask again: What should it be doing when it encounters more than one resizeable partition? How does it decide how to expand these on a target disk? No one seems interested in answering this?
If I capture 5 partitions, and 2 are auto-resized… how should I expect them to be put onto a target disk? This is where the unpredictability comes in. In my experience, in that scenario, the target disk layout is pretty much random. Sometimes I get totally minimal partitions on both of them, sometimes it expands one to some odd size and leaves the other one at minimum. Sometimes it expands the last partition to fill the entire disk. It’s completely unpredictable! So what should it actually be doing?
I’m asking all this stuff, cuz at the end of the day… I actually would like my recovery partition resized to minimum size and left that way on the target disk. It’s not supposed to be written to ever again anyway. But until I understand how the resize mechanisms actually work, how to control their behavior, I’m pretty much stuck with an unpredictable result.
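To make the request concrete, here is a sketch of the kind of predictable, manually specified behavior I mean. This is a hypothetical policy, not how FOG works today: each partition is marked "fixed" or "grow" in the image specification, fixed partitions keep their captured size, and spare space is split evenly among the growers. The partition names and sizes are made up.

```python
def plan_layout(partitions, target_gb):
    """partitions: list of (name, size_gb, policy) with policy 'fixed' or 'grow'.

    Returns a list of (name, new_size_gb) for the target disk.
    """
    captured_total = sum(size for _, size, _ in partitions)
    spare = target_gb - captured_total
    if spare < 0:
        raise ValueError("target disk is too small for this image")
    growers = [name for name, _, policy in partitions if policy == "grow"]
    # Spare space is shared equally between every partition marked 'grow'.
    share = spare / len(growers) if growers else 0
    return [
        (name, size + share if policy == "grow" else size)
        for name, size, policy in partitions
    ]


# Hypothetical image: recovery stays at minimum size, only OS grows.
image = [("EFI", 1, "fixed"), ("RECOVERY", 10, "fixed"), ("OS", 39, "grow")]
for name, size in plan_layout(image, 500):
    print(name, size)
```

With a rule like this the answer to "what happens with two resizable partitions" is at least well defined: mark both as "grow" and each gets half of the spare space, every time.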
@Quazz said in FOG 1.5.6: Auto resize is unpredictable:
@Cheetah2003 So did your recovery partition have no label at all or what was it exactly? (should clarify that label is basically the user editable “name” of the partition such as “System Reserved”)
If it didn’t have a label, then I’m kind of confused about how it wouldn’t get resized since it most certainly should have.
It has a label. I set the label when I create the partition. It’s just called “RECOVERY”. However, perhaps you’re confused. “System reserved” would be a partition type. That is too long for a label anyway. Doesn’t seem like FOG is even looking at the volume labels. They’re not in any of the files? Like the operating system partition’s label is “OS” but you don’t see that anywhere either. The ‘names’ you’re seeing in those two files are the partition types. If you fire up ‘fdisk’ in Linux and change the type of a partition, you’ll note the UUID types line up precisely with the names FOG has in those two files. So I dunno what it’s doing there; the ‘name’ field seems redundant, since it’s just a text representation of the UUID partition type.
I will also clarify why I said ‘incorrect’ in my previous post. I was referring to the prior version of FOG doing anything based on a volume label. It did not. As I explained, in my original setup I put my Recovery partition LAST on the disk. It would get resized, and the operating system partition directly before it would not. So I moved my recovery partition to be BEFORE the operating system partition. This fixed my issue in the previous version of FOG: it would only attempt to resize the last partition, regardless of its label. Now it’s resizing both of them.
@Quazz said in FOG 1.5.6: Auto resize is unpredictable:
There’s actually something very strange in this case, which is that your partitions and min.partitions files are the same. (maybe you accidentally pasted the wrong one?)
Yes, I thought that looked strange and I did recheck I was pasting the right file. Alas, that is 100% accurate. That’s what was in my fog server’s folder.
EDIT: I notice there is a difference between the files now. But I swear, they were the same before. This is several captures later, so I dunno. I’ve been trying to debug another issue and have recaptured that image several times.
On another note: I’m not sure why I can’t get someone to answer this for me: What is the expected behavior when you have more than one resizeable partition?
Same issue. Very consistently 50% of the computers fail, with the message in that screenshot.
Deployed to 8 identical machines: half of them boot to setup normally, the other half fail with the error. Curious, I tried clicking ‘ok’ on some of the machines that failed, but they just reboot and fail again. Only redeploying will give a chance for it to work.
Got some different machines, by a different manufacturer, got 4 of those, deployed the image. 2 boot normally, 2 fail at the error.
Anyone got any ideas what to try next? I’m stumped. I posted on M$ technet about this as well.