The future of partclone and therefore FOG as it is
-
@Junkhacker They were corrupt. I could see the partition types as uefi boot and such, but could not mount them.
I think I have something else going on here. I just tried to recapture a uefi golden image and I have an error about getPartitionLabel: command not found on line 486, then partclone exited with error code 139. So I’m going to have to do a bit more research, maybe uefi mode is causing something if bios mode worked for you.
I may need to start with something simple like win10 bios mode and see where it starts to fall down.
-
@george1421 is it possible that you tried to deploy a legacy image with partclone 0.3 and the --ignore_crc flag set, then captured it again, or something like that? or maybe captured without checksums with 0.3 and redeployed it with --ignore_crc? i know for certain that the second of those two will result in corrupt partition without reporting any problem. (it tries to skip the blocks in the image file where the checksum would be, but there it skips actual data because there’s no checksum)
-
downloaded the init_p3 file and tried again. got the UUID error again, but it boots fine.
-
@Junkhacker What I’m currently testing is a legacy image captured with the original inits and then deployed with the test inits from this thread. In that case the test inits (partclone 0.3.12) should have all of the --ignore_crc switches removed. I’m doing this to confirm we won’t have issues with previously captured images and the new inits.
Well, after deploying a simple win7 bios mode image, it also failed on this same test computer. So at this time I need to start back from a known good state and change only one thing at a time.
-
@george1421 that’s the same as i’m doing. i just thought maybe in your testing an upload might have taken place with 0.3.12 by accident. i’ve been deploying images using my dev server, some of which i copied over from my production server. i haven’t seen any problems other than the UUID thing i posted.
-
@Junkhacker Well, this is all good information. In your environment (except for the UUID issue) it’s deploying correctly. The other thing (thinking about all of the variables here) is that I’m running 1.5.4 on my production server. I’m pretty sure the 1.5.4 inits were created with an earlier release of buildroot than the one I’m building against. It’s possible that buildroot updated applications too, but if everything appears to work in your environment, this is one step I can take to rule out buildroot differences as the cause of this issue. It could also be that this old laptop has issues. I’ll keep working on eliminating non-issues.
Thank you for your feedback.
-
@george1421 Great stuff you are working on this full on!! I try to follow up on what you post and test. Though I don’t have the time to engage in this at the moment.
It’s interesting you and Junkhacker seem to get different results and it would be very important to figure out why. I can’t imagine it being a UEFI vs legacy difference issue as partclone should just be deploying the actual contents of the partitions no matter what the boot type or partition layout looks like. But what do I know…
@george1421 said:
I just tried to recapture a uefi golden image and I have an error about getPartitionLabel: command not found on line 486, then partclone exited with error code 139.
Not sure which version you used because we removed getPartitionLabel stuff recently. Please make sure you use the very latest version (master branch).
@Junkhacker The UUID error you see stems from an issue we had in the inits when I created the one you use for testing. This was fixed some weeks ago.
i tried the init you supplied, and i got a "failed to set disk guid (sgdisk -U) (restoreUUIDInformation)"
Can you please post a picture of that?
-
@george1421 more info for you: my dev server is 1.5.5, my prod is 1.4.4. not sure what versions the image files would all have been captured on, but i’ve tested captures from 1.4.4 or earlier (not sure) through captures with 1.5.5.
-
-
That looks like an error code of some sort, weird.
-
@Tom-Elliott yeah, definitely doesn’t look like a UUID (sorry the image is shrunk and funny, was trying to get the image size down to upload)
-
@Junkhacker The following commit changed UUID stuff https://github.com/FOGProject/fos/commit/f49b0f7d5b0c90866a6dcdbbd5d59e529857242e
On MBR, the label-id is in the form of what appears to be an error code (0xrandomnumbersandletters) hence the output that you saw.
I don’t know for sure why it would fail, but something in that direction possibly.
edit: Ok, so, in d1.partitions it’s saved as 0xlabelid whereas sgdisk expects it as labelid.
The original method used blkid -po udev, which does not put 0x in front.
Shouldn’t be too hard to fix
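A minimal sketch of the kind of normalization involved, assuming the only difference between the two forms is the leading 0x. The helper name strip_hex_prefix and the sample line are made up for illustration; this is not the actual FOS fix, and it does not check what sgdisk itself will accept.

```shell
#!/bin/sh
# Hypothetical helper (not part of the FOS scripts): drop a leading "0x"
# or "0X" from an MBR label-id so it matches the bare form without the prefix.
strip_hex_prefix() {
    case "$1" in
        0x*|0X*) printf '%s\n' "${1#0[xX]}" ;;
        *)       printf '%s\n' "$1" ;;
    esac
}

# Made-up sample line in the style of the label-id line from sfdisk -d.
sample_line='label-id: 0x1234abcd'
label_id=${sample_line#label-id: }

# Print the id without the prefix.
strip_hex_prefix "$label_id"
```

A value that already lacks the prefix passes through unchanged, so the same helper could be applied to ids from either source before comparing them.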
-
@Junkhacker Thanks for the picture. @Quazz is exactly right about this commit having removed some of the UUID stuff. I noticed that we were creating an unnecessary file with UUID information that is already available in the sfdisk output, so I went ahead and removed all the extra UUID stuff.
Can you please do me a favor and do a debug capture on your master client? When you get to the shell, run
blkid -po udev /dev/sda | grep "PART_TABLE_UUID"
as well as
sfdisk -d /dev/sda | grep "label-id"
and post the outputs here.
-
The error itself is really just a warning, but what’s unsettling is that it appears that generating the registry file is what fails, due to lack of space.
We may need to add some additional free space.
-
In particular the BUILDROOT configs managing these:
BR2_TARGET_ROOTFS_EXT2_SIZE="100M"
BR2_TARGET_ROOTFS_EXT2_INODES=0
BR2_TARGET_ROOTFS_EXT2_RESBLKS=5
Maybe we adjust to say 200M and give 100 reserved blocks? (Just thinking here.) Inodes should be plenty in regards to what is defaulted I think.
-
@Tom-Elliott said in The future of partclone and therefore FOG as it is:
Maybe we adjust to say 200M and give 100 reserved blocks?
I really don’t see a negative impact from this. It shouldn’t affect anything (other than giving the filesystem a bit more room), since I believe this is the filesystem that gets allocated in RAM. I would say almost all systems today have at least 1GB of RAM, even tablets. You “might” run into issues with really old hardware that has 512MB of RAM, or with ARM-based systems, but even those usually start out with 1GB of RAM.
If you look at it from the initrd size, it still shouldn’t make an impact on the init.xz file, since this is “extra space” that gets squished out when it’s compressed.
-
@Tom-Elliott @george1421 Yes adding more room shouldn’t hurt. But could you please tell me where you see an error happening which leads you to think that there is a space issue within FOS? I can’t see it.
Edit: Never mind, I just saw the other topic. Building new inits with 256 MB size right now. Will take roughly four hours to complete. Will update the official binaries on the web server soon.
I wonder why I didn’t get (or notice) those errors when doing the tests on my VM?!
About the UUID issue, let’s first gather some more information before we decide what to do. @Junkhacker Would be great to get the two command outputs, thanks!
Please let us discuss the UUID things here so we don’t lose focus in this topic! @george1421 Did you figure out why you got different results than @Junkhacker when trying to deploy “old” images with partclone 0.3.12?
-
@Sebastian-Roth Sorry, I didn’t have time yesterday because of some other issues, and I’m traveling today. I’ll get back to the deployment differences on Monday. Rebuilding my inits with the 256MB size only added 15KB to the inits in my tests, so I don’t see a noticeable impact on size or delivery speeds. I don’t remember what the unpacked size differences were.
-
@george1421 Any news on this topic? I was hoping to engage more in this but as you see things keep popping up that need quick fixing…
-
@Sebastian-Roth Not with a resolution as of now. But here is an update with good to know stuff.
I did a git pull on the fos github and updated my local repository with the master. Then, without thinking, I overwrote all of the edits I had made to the scripts in my rootfs_overlay directory. Not what I wanted to do, but I rebuilt the inits without any of my edits and only with the 0.3.12 partclone build. Unfortunately it still does the same thing, with Win7 unable to boot.
BUT along the way I discovered something.
Per our earlier discussion I changed the buildroot setting of
BR2_TARGET_ROOTFS_EXT2_SIZE="100M"
to
BR2_TARGET_ROOTFS_EXT2_SIZE="256M"
When I pxe booted into FOS I received the following error:
RAMDISK: incomplete write (-28 != 4096)
XZ-compressed data is corrupt
Kernel panic - not syncing: VFS: unable to mount root fs on unknown
I went into FOG Configuration->FOG Settings->TFTP Server->KERNEL RAMDISK SIZE and checked the value. The default is 127000. I had the setting at 255000 because I was testing something earlier. That 255MB is of course smaller than the 256MB rootfs size I set in buildroot, so the decompression failed and the kernel panicked because it couldn’t mount the virtual hard drive. Setting KERNEL RAMDISK SIZE to 300000 resolved the issue. Due to time restrictions I did not test smaller values to see where it starts to fail.
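For anyone checking their own numbers, here is a rough sketch of the arithmetic behind that failure. It assumes KERNEL RAMDISK SIZE is given in KiB (as the kernel's ramdisk_size parameter is) and that the "M" in the buildroot setting means MiB. The function name rootfs_fits is made up for illustration and is not part of FOG.

```shell
#!/bin/sh
# Hypothetical check, not part of FOG: does a rootfs of N MiB fit into a
# ramdisk of K KiB? Assumes the ramdisk size setting is expressed in KiB.
rootfs_fits() {
    rootfs_mib=$1
    ramdisk_kib=$2
    rootfs_kib=$((rootfs_mib * 1024))
    [ "$ramdisk_kib" -ge "$rootfs_kib" ]
}

# The three ramdisk sizes mentioned in this post, against a 256M rootfs.
for size in 127000 255000 300000; do
    if rootfs_fits 256 "$size"; then
        echo "ramdisk size $size: fits a 256M rootfs"
    else
        echo "ramdisk size $size: too small for a 256M rootfs (needs $((256 * 1024)))"
    fi
done
```

Under those assumptions a 256M rootfs needs 262144, so 255000 falls just short while 300000 leaves some headroom. The old 100M rootfs needs 102400, which fits within the default of 127000.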