Debian 9.0 capture fails AND destroys client image
-
@foguser438 Debian 9 has been out all of a week, so capturing an image has not yet been tested (from my side). I was under the impression you meant Debian 9 was the SERVER OS, not the OS of the system being captured (sorry about that).
Then might I recommend changing the “image type” from Resizable to Non-resizable? I’m getting the impression that, while FOS is reading one filesystem as ext4 (and allowing the partition to resize), the disk is actually partitioned using some other format, which the resize is writing over and failing. This is all just a guess, of course; I really don’t know the exact issue or a good means to validate it (particularly seeing as the error is happening immediately on the first partition).
-
Thanks @Tom-Elliott, yes non-resizable works but is obviously not desirable for the long-term. I appreciate the newness of Debian 9 but I assumed this is how this process works. New client distros are released/installed, problems are found, feedback is provided. I look forward to your input after you have had time to spin up a Debian 9 image.
TIA
-
@foguser438 It is, I’m just describing what I know of things at the moment.
When you reprep your Debian 9 image system, please just make sure you set the format of the partitions to ext rather than LVM, XFS, HFS, or anything else it may default to. /dev/sda1 is typically the “boot” partition and is normally set as ext, but the rest of the file systems would be set as LVM, which isn’t really allowable in resizable mode.
-
@Tom-Elliott I always manually partition my images for thin clients as ext4. Never use any other fstype, especially not LVM. Been burnt too many times with Clonezilla with that one… So the fstype is ext4 and has always been. I also only use one partition / and no swap.
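For anyone wanting to double-check the filesystem type on their own client, here is a minimal sketch. It assumes `blkid` is available and that the partition is /dev/sda1 as in this thread; `is_ext_fs` is a made-up helper name, not part of FOG.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): succeed only for ext2/ext3/ext4 type strings.
is_ext_fs() {
    case "$1" in
        ext2|ext3|ext4) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the client (needs root; /dev/sda1 is the partition from this thread):
#   fstype=$(blkid -o value -s TYPE /dev/sda1)
#   if is_ext_fs "$fstype"; then echo "ext family: $fstype"; else echo "not ext: $fstype"; fi
```

An LVM physical volume normally shows up in `blkid` as `LVM2_member`, so it would be reported as not ext.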
-
@foguser438 I’ve been following this thread very closely since the first post. I want to ask if you’ve tried to rebuild the Debian 9 image from scratch a second time and tried capturing that? The original error does relate to inconsistency and recommends a file system check; instead of doing that, I default to trying a freshly built image.
-
@Wayne-Workman @Tom-Elliott : In the hopes that it may help you, here is what I have done to eliminate the possibility that this problem is related to a corrupt ext4 filesystem:
- Rebuilt from scratch three times the same image, tried the capture, same result.
- Booted the client in rescue mode and ran e2fsck on /dev/sda1 of the image; it is clean. Ran a capture, same result.
- Used Clonezilla to capture/deploy the image several times. No problems. In fact, I use the CZ image to restore.
- Successfully captured the image using non-resizable disk. Deployed this image to real thin client. Rebooted thin client and verified filesystem integrity. Tried to capture newly deployed thin client using resizable option, same result.
Therefore, I don’t think the ext4 filesystem is corrupt before the capture process. But it sure has problems after the capture process fails. I hope this helps. Please let me know what your results are when you capture a Debian 9 client.
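For reference, a small sketch of how the e2fsck result from rescue mode can be read. The exit status of e2fsck is a bit mask (documented in its man page); the `describe_fsck_status` helper and its labels are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn an e2fsck exit status (a bit mask) into readable labels.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean"
        return 0
    fi
    out=""
    # bit:label pairs follow the EXIT CODE section of the e2fsck man page
    for pair in 1:errors-corrected 2:reboot-recommended 4:errors-left-uncorrected \
                8:operational-error 16:usage-error 32:cancelled 128:shared-library-error; do
        bit=${pair%%:*}
        label=${pair#*:}
        if [ $(( $status & $bit )) -ne 0 ]; then
            out="$out $label"
        fi
    done
    # Drop the leading space left by the first appended label.
    echo "${out# }"
}

# Usage from rescue mode (needs root, partition must not be mounted):
#   e2fsck -fn /dev/sda1      # -f forces a full check, -n keeps it read-only
#   describe_fsck_status $?
```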
Thanks
-
@foguser438 I have a debian 9.0 VM, I just haven’t tested uploading it yet. I prefer the VM method as I can more simply “break” and “fix” than having to reinstall every time. I’ll work on it this weekend.
-
I was able to, more or less, replicate the problem, albeit with a different sector failing. I don’t know why this is failing unless the ext4 utility used to build the filesystem differs from what is normally used on the likes of Redhat, Debian 8, etc…
I don’t like the way I’ve worked around this either. It leaves too much potential for unknown issues, but I think it might work for our needs either way.
I’m approaching the problem by:
Check the filesystem as normal. Copy the error so we can present it to the user later if needed.
If the filesystem check fails as it seems to do, attempt to forcibly fix the problem. If the forcible fix doesn’t seem to help, then we know there must be a real problem. I’m still waiting for the upload to complete to find out if it works. Then I’ll test if deploying the image works as well.
-
Good news Everyone (Futurama reference anybody?)…
The “fix” I’m testing seems to have worked on the capture side. Will test deploy shortly, but should work as well.
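For readers following along, the check-then-force-repair approach described earlier in the thread could be sketched roughly as below. This is only an illustration under assumptions, not the actual change in the FOG inits: the `check_then_repair` name, the `FSCK` override, and the choice of `e2fsck -fn` followed by `e2fsck -fy` are all mine.

```shell
#!/bin/sh
# Assumption: e2fsck is the checker. FSCK can be overridden to try the flow without a real disk.
FSCK=${FSCK:-e2fsck}

# Hypothetical helper illustrating "check, keep the error, then try a forced fix".
check_then_repair() {
    dev=$1
    # Step 1: forced, read-only check; keep the output so it can be shown to the user later.
    if first_report=$("$FSCK" -fn "$dev" 2>&1); then
        echo "check passed: $dev"
        return 0
    fi
    # Step 2: the check failed, so attempt a forced repair that answers "yes" to every prompt.
    repair_status=0
    "$FSCK" -fy "$dev" >/dev/null 2>&1 || repair_status=$?
    # e2fsck statuses below 4 mean any errors found were corrected.
    if [ "$repair_status" -lt 4 ]; then
        echo "repaired: $dev"
        return 0
    fi
    # The forced fix did not help, so treat it as a real problem and show the first report.
    echo "repair failed on $dev; original check output follows:" >&2
    echo "$first_report" >&2
    return 1
}

# Usage (needs root, partition must not be mounted):
#   check_then_repair /dev/sda1
```

A forced repair modifies the disk, so this is only something to try on a machine that can be rebuilt or restored.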
-
And success.
If you’d like to test for yourself, please download the newly built inits (these will be in 1.4.5 of course).
It does appear to take longer to boot than one would expect under normal conditions, but it DOES boot. I’m going to try recapturing now that the inits have been updated.
To test for yourself, please try:
wget -O /var/www/fog/service/ipxe/init.xz https://fogproject.org/inits/init.xz
wget -O /var/www/fog/service/ipxe/init_32.xz https://fogproject.org/inits/init_32.xz
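If you would rather not overwrite the working inits in place, here is a cautious sketch that keeps a backup first. The `swap_in_init` helper, the `.bak` suffix, and the /tmp download location are my own choices for illustration, and the `xz -t` step assumes the init is a regular xz archive.

```shell
#!/bin/sh
# Hypothetical helper: back up the current init, then move the new download into place.
swap_in_init() {
    new=$1
    dest=$2
    # Refuse an empty or missing download so a failed wget cannot replace a good init.
    if [ ! -s "$new" ]; then
        echo "refusing to install empty or missing file: $new" >&2
        return 1
    fi
    # Keep the previous init so it can be restored if the new one misbehaves.
    if [ -f "$dest" ]; then
        cp -p "$dest" "$dest.bak"
    fi
    mv "$new" "$dest"
}

# Usage on the FOG server (paths and URL as given in this thread):
#   wget -O /tmp/init.xz https://fogproject.org/inits/init.xz &&
#     xz -t /tmp/init.xz &&
#     swap_in_init /tmp/init.xz /var/www/fog/service/ipxe/init.xz
```

To roll back, copy init.xz.bak over init.xz again.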
-
Thanks @Tom-Elliott, I have successfully captured and deployed Debian 9 clients both in virtual space and to real hardware. So, it looks like your fix works. I did notice some slower booting as you mentioned, but that seemed to be minimized after I went through a full capture-deploy-capture sequence. Perhaps there were residual file system ambiguities that were affecting the boot times. In any event, the boot times smoothed out over time. The only slightly odd thing I did notice is that sometimes the client would boot twice before the capture would actually start. I cannot reproduce this consistently. Just thought I would mention it in case it makes you think of anything.
Thanks again; will this fix make it into 1.4.5?