PC unbootable after capture fails
-
@dolf I am not sure why resizing isn’t working for you. I’ve created hundreds of images with FOG, most of them resizable, for Windows 7, 8, 8.1, 10, Ubuntu, CentOS and Fedora, and I’ve not had the problems that you’ve had. All my co-workers use resizable. We have probably 30 different hardware models from various manufacturers at work, and they all work fine with FOG. Many community members here use resizable images, and issues with resizing seldom come up.
We need to troubleshoot what’s going on with your particular setup - and see what can be done.
I particularly think something is wrong with the MBR. After deploying a resizable image (captured by fog), you can boot to a linux live disk and likely be able to mount the HDD and read all the files just fine, copy to and fro, and run other diagnostics. I really doubt that the resizing is breaking it, I really think it’s something with the MBR.
As a sort of test, after capturing a resizable image with fog, you can trade out the mbr fog captured with the mbr that CloneZilla captured, set permissions, and try to deploy. See what happens.
-
@Wayne-Workman Good to hear that it works for you. The fact that it usually works, but didn’t work for me is the definition of an edge case. And things should not break when edge cases happen.
I just realized that I unknowingly tested exactly what you suggested, and that’s probably why it worked. When I try to resize the problematic image, however, I get this: gparted_details_bad.htm
Still, GParted wins, because it safely terminates before destroying the disk. FOG should, too.
This discussion shows that most people aren’t really sure why this happens. We could use the following algorithm to work around the problem (expanding on what GParted does):
    increment := "1GB or a certain percentage of the disk size"
    partition = /dev/sda2
    calibrate partition
    target_size := check file system on partition for errors and fix them and get estimate of smallest supported shrunken size
    if there are errors
        stop
    do
        simulate resizing to target_size
        target_size += increment
    while simulation fails and target_size < disk_size
    if target_size < disk_size
        // this means the simulation must have succeeded for the current value of target_size
        actually resize the file system
        actually resize the partition
        // note that file systems and partitions are not the same thing, and are not necessarily the same size... TODO: this is yet another edge case to consider
    // if all simulations failed, we just don't resize the disk, and the capture process can still continue uninterrupted
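A minimal Python sketch of the simulate-then-grow loop proposed above. This is not FOG's code. The names find_safe_size and ntfsresize_dry_run_cmd are hypothetical, and the simulation is passed in as a callable so the loop can be tried without touching a disk. Unlike the pseudocode as literally written, the target only grows after a failed simulation, so the size returned is the one that actually passed.

```python
from typing import Callable, List, Optional


def find_safe_size(estimate: int, disk_size: int, increment: int,
                   simulate: Callable[[int], bool]) -> Optional[int]:
    """Return the first size >= estimate whose simulated resize succeeds.

    Returns None if every simulation below disk_size fails, in which case
    the caller should skip resizing and leave the disk untouched.
    """
    target = estimate
    while target < disk_size:
        if simulate(target):
            return target
        target += increment  # grow only after a failed simulation
    return None


def ntfsresize_dry_run_cmd(device: str, size_bytes: int) -> List[str]:
    """Build (but do not run) a dry-run command line.

    -n and -s are the two options that the ntfsresize message quoted in this
    thread asks users to test with before a real resize.
    """
    return ["ntfsresize", "-n", "-s", str(size_bytes), device]


if __name__ == "__main__":
    GB = 1000 ** 3
    # Stand-in simulation: pretend anything below 70 GB fails, as reported here.
    fake_simulation = lambda size: size >= 70 * GB
    print(find_safe_size(66 * GB, 250 * GB, 1 * GB, fake_simulation))
    print(ntfsresize_dry_run_cmd("/dev/sda2", 70 * GB))
```

In a real capture script the callable would run the dry-run command and report whether it exited successfully. That wiring is left out here because it depends on how the capture environment is set up.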
-
Sorry, actually no, the image where the resize succeeded has the same mbr, but fewer files in sda2 (about 10GB less than the one that fails to resize).
The suggestion for making the capture process safer still holds, though
I even tested it: If I resize to 70GB instead of the minimum (about 66GB), it works just fine. I suspect that it isn’t possible to know exactly what the minimum size of an NTFS partition will be without simulating. That’s probably why the authors of ntfsresize include messages like this (emphasis mine):
- Estimating smallest shrunken size supported …
- You might resize at 71189536768 bytes or 71190 MB (freeing 178764 MB).
- Please make a test run using both the -n and -s options before real resizing!
Luckily, simulation takes about 10 seconds for a 250GB drive, so it won’t be a large performance hit.
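To feed that estimate into a script, the byte count can be pulled out of the message text. The sketch below assumes the wording is exactly as quoted in this post (other ntfsresize versions may phrase it differently), and parse_min_size is a hypothetical helper name.

```python
import re
from typing import Optional

# Pattern based on the message quoted in this thread; treat it as an assumption.
_ESTIMATE_RE = re.compile(r"You might resize at (\d+) bytes")


def parse_min_size(output: str) -> Optional[int]:
    """Return the estimated smallest size in bytes, or None if not found."""
    match = _ESTIMATE_RE.search(output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = ("Estimating smallest shrunken size supported ...\n"
              "You might resize at 71189536768 bytes or 71190 MB "
              "(freeing 178764 MB).\n")
    print(parse_min_size(sample))
```

Returning None instead of raising makes it easy for a caller to treat a missing estimate as "do not resize".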
-
@dolf I agree with all of that. How good are you with shell script?
-
@dolf While I understand what you’re saying, I don’t think it should continue going. I agree it should not, at the least, actually resize the partition unless we know absolutely all will continue fine down the road (which is not very practical, as I don’t know of a way to “dry_run” the FOG system before actually performing tasks to test for all these edge cases). The reason there are different image types (resize, non-resize, raw) is to allow people to use what will suit them best. If resize is going to cause issues, I think it wise to fail to upload, but not attempt altering the disk.
Can you post the contents of your image’s (broken please) d1.fixed_size_partitions file? I suspect what’s occurring is an unexpected partition is resizing, thus moving the start sector of the next partition. That I can fix, though I don’t know where to begin.
-
@Wayne-Workman I’m not great at shell scripting. I google about 5 pages for every line I write. I mostly do Python, PHP and C.
@Tom-Elliott I’ll have to disappoint you
1
-
Another thing comes to mind as well.
FOG does run some math to calculate the smallest size of the partition plus a little more (wiggle room, if you will). I may need to see an upload again using debug: at the point where it’s testing (once complete), break out and see what is showing for the ntfsresize variable.
lsblk and fdisk -l would also, possibly, be extremely helpful as well (before AND after).
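One way to compare the before and after output is to extract the start and end sectors from the fdisk -l partition rows and flag any partition whose start moved. This is a rough Python sketch that assumes the dos-label column layout shown elsewhere in this thread (Device, Boot, Start, End, ...). Both function names are hypothetical.

```python
from typing import Dict, List, Tuple

Layout = Dict[str, Tuple[int, int]]


def parse_fdisk_rows(text: str) -> Layout:
    """Map device name -> (start_sector, end_sector) from `fdisk -l` rows."""
    parts: Layout = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].startswith("/dev/"):
            continue  # skips headers and "Disk /dev/sda: ..." summary lines
        if len(tokens) > 1 and tokens[1] == "*":
            del tokens[1]  # drop the boot flag so the columns line up
        try:
            parts[tokens[0]] = (int(tokens[1]), int(tokens[2]))
        except (IndexError, ValueError):
            continue  # not a partition row in the assumed layout
    return parts


def moved_starts(before: Layout, after: Layout) -> List[str]:
    """Devices present in both layouts whose start sector changed."""
    return sorted(dev for dev in before
                  if dev in after and before[dev][0] != after[dev][0])
```

Usage would be to save the fdisk -l text before the capture and again after it, parse both, and check that moved_starts returns an empty list.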
-
I know it’s a long thread, but here it is: https://forums.fogproject.org/topic/8059/pc-unbootable-after-capture-fails/10
-
Just to show that it does work if you make the wiggle room a tad (where tad=4GB) bigger: gparted_details_70GB.htm
That’s using the same “broken” image. Everything works perfectly on that image, so I wouldn’t really call it broken. chkdsk agrees with me. It does, however, contain massive software packages with millions of files.
-
@dolf Was the system defragged before it was uploaded?
I ask because: … http://tuxera.com/forum/viewtopic.php?f=2&t=31012
I don’t know if this was/is the case, just may be worth a shot?
-
To add further: in the part where it talks about shifting the data on the drive in a strange format, the MFT segments are being moved around and possibly extend partially beyond the partition layout. Or so I believe. I don’t really know, but it would help explain why a slightly larger partition layout would work.
-
I would tend to think that those “massive” files you’re talking about just didn’t have the room to be shifted around with the target being 2GB free. Defragging before uploading could solve that - and make your image perform better too.
-
I didn’t defrag, but I analyzed the fragmentation, and it reported 1% fragmented.
However, last night the hard disk of the PC I originally used to develop this image started acting up. chkdsk /R /F /V /X on reboot returned no errors or bad sectors, but the DELL Pre-boot System Assessment reports that the HDD has Error Code 2000-0142. I couldn’t find what that code means, other than that the HDD has failed. I think it’s probably a problem with the HDD’s electronics, rather than the disk surface, because the diagnostic utility only took a minute, so it obviously didn’t scan the disk surface. I’m replacing the disk now, to check.
-
And the truth comes out.
-
Or does it? I just started from scratch on a new image with a new HDD, and I’m having the same problem… MFT gets corrupted. I’ll do some more thorough tests when I have time.
-
I’m still experiencing this problem. Running trunk these days. However, I have found another workaround to upload resizable images:
- Capture a non-resizable image
- Make a resizable image, and copy the files manually from the non-resizable image
- Create these files manually:
  - d1.fixed_size_partitions: just contains “:1”
  - d1.minimum.partitions: I make the minimum size of the resizable partition quite a bit larger than the uncompressed size of d1p2.img. Can check this with gzip -l d1p2.img
  - d1.original.fstypes: /dev/sda2 ntfs
  - d1.original.swapuuids: empty
- Deploy
It works. Therefore it seems that resizing the partition before and after capturing with partclone.ntfs is not necessary, even for resizable images.
-
This problem persists in 1.3.0-RC-11.
-
@dolf Can we recap what’s going on?
You have a Windows 7 (pro, home, enterprise?) image already, it is resizable.
You deploy this, and update the deployed machine.
You then recapture.
And it fails to recapture?
All of my images at work are resizable, and often we will deploy them, update them, and re-capture them without issue. We did this with Windows 7, Windows 8.1, and now with Windows 10.
What’s special about your image?
Boot into Linux somehow on a computer that has your image deployed to its disk. You can either use a FOG debug deploy task or just any live Linux disk.
Give us the output of these commands once you do:
lsblk
fdisk -l
lspci
-
Windows 7 Enterprise. I don’t know what is special about this image. The problem started after installing a bunch of large software packages on the master image. Fragmentation was less than 1%.
It fails when trying to resize the partition, as described in previous posts. When doing the same thing with CloneZilla, it works. The fixed-size option in FOG also works, because it doesn’t try to resize the partition. As far as I can see, resizing the partition is not necessary for capturing images with partclone, even when restoring to a smaller drive. You might as well remove that step from the capturing process, which will speed things up significantly.
So now I just capture it into a temporary image using the “fixed size” option, and then move the files manually to an image which is configured as “resizable”.
mint mint # lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 232.9G  0 disk
├─sda1   8:1    0   100M  0 part
└─sda2   8:2    0 232.8G  0 part
sdb      8:16   1   7.5G  0 disk /cdrom
├─sdb1   8:17   1   1.6G  0 part
└─sdb2   8:18   1   2.3M  0 part
sr0     11:0    1  1024M  0 rom
loop0    7:0    0   1.5G  1 loop /rofs
mint mint # fdisk -l
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x95d3684b
Device     Boot  Start       End   Sectors   Size Id Type
/dev/sda1  *      2048    206847    204800   100M  7 HPFS/NTFS/exFAT
/dev/sda2       206848 488396799 488189952 232.8G  7 HPFS/NTFS/exFAT
mint mint # lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:16.3 Serial controller: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b4)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation Q67 Express Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
I omitted the RAM drives and /dev/sdb, since I was booting from a LinuxMint USB drive.
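For the manual workaround, the uncompressed size of d1p2.img matters. As far as I know, the gzip trailer stores the uncompressed length modulo 2^32, so gzip -l can under-report images larger than 4 GiB on some gzip versions. A slower but safer check is to stream the file and count bytes. The sketch below is a hypothetical helper, not part of FOG, and the margin value is whatever wiggle room you choose (4GB was enough in the test earlier in this thread).

```python
import gzip


def uncompressed_size(path: str, chunk_size: int = 1 << 20) -> int:
    """Count the decompressed bytes of a gzip file by streaming through it."""
    total = 0
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def suggested_minimum(path: str, margin_bytes: int) -> int:
    """Uncompressed image size plus a caller-chosen safety margin, in bytes."""
    return uncompressed_size(path) + margin_bytes
```

For example, suggested_minimum("d1p2.img", 4 * 1000**3) would give a floor to compare against the value written by hand into d1.minimum.partitions. Streaming a large image takes a while, since the whole file has to be decompressed once.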
-
@dolf said:
As far as I can see, resizing the partition is not necessary for capturing images with partclone, even when restoring to a smaller drive.
I don’t think this is true.
You can try using a tool called testdisk, which might be able to fix the MFT issue for you. To find out what exactly caused this, we need the exact error message from when it happens (resize…?)!