Failure in Capturing an image
-
@ismith-hpu said in Failure in Capturing an image:
This worked before changes were made to the post-init script. Any idea?
Can you please post the full post-init script here so we can see what modifications you made?
The error we see in the video is definitely caused by a partclone parameter, as mentioned by Tom. The issue is that you are using an init file from 14.07.2019 (see the video), which came with partclone 0.2.89, and modified the scripts to use parameters that only work with partclone 0.3.12 (which we updated to in September 2019).
So what you want to do is download the latest proper build of the inits and use those: https://dev.fogproject.org/blue/organizations/jenkins/fos/detail/master/105/artifacts
Not exactly sure about the error you posted on Mac OS X. Please post the full post-init script here so we know your mods.
Edit: Ahhh wait! The scripts only count NTFS and EXT2/3/4 as possible resizable filesystems. So on your Mac it doesn’t find any and bails out. Which version of Mac OS X do you have? Resizing is a huge problem on Mac OS - see here: https://forums.fogproject.org/topic/13881/help-with-mac-imaging
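To illustrate the kind of check described above, here is a minimal sketch. It is not the actual FOG code: the function name `is_resizable_fs` is hypothetical, and only the list of filesystems (NTFS, EXT2/3/4) comes from this post.

```shell
# Hypothetical helper, not taken from funcs.sh.
# Returns 0 if the filesystem type is one the scripts treat as
# resizable (NTFS, EXT2/3/4), and 1 for anything else.
is_resizable_fs() {
    # Lowercase the argument so "NTFS" and "ntfs" compare equal.
    fstype=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$fstype" in
        ntfs|ext2|ext3|ext4) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: typical Mac filesystems do not match any entry in the list.
for fs in ntfs ext4 apfs hfsplus; do
    if is_resizable_fs "$fs"; then
        echo "$fs: resizable"
    else
        echo "$fs: not resizable"
    fi
done
```

With a check like this, a Mac disk yields no resizable partitions at all, which would match the "bails out" behaviour described above.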
-
@Sebastian-Roth the post init is just placing the downloaded funcs.sh file in the init before the imaging task begins.
It’s literally:
cp ${postinitpath}funcs.sh /usr/share/fog/lib/funcs.sh
@ismith-hpu please note that if you download the artifacts init, you should no longer need to run the postinit script.
Thanks,
-
Removed the postinit
Replaced the inits with artifacts.
Linux is working with capturing, still need to test deployment with Linux.
Still needing to test deployment & capture with Mac after the above changes.
-Edit 1-
Linux deployment works.
Mac deployment works, still needing to test capture. Want to write the master image before rewriting the golden image on Mac.
-
Macs can deploy but cannot capture.
It instantly finishes writing the content but does not make any actual image; the size shows as 0.00KB on the file server:
Here is the boot process in order.
-
@ismith-hpu this looks like a deployment, but what does a capture look like?
-
@Tom-Elliott That is a capture task from the webUI, not a deployment.
Just did it via the ‘task selection’ from the host page, and same interaction happened.
-
@ismith-hpu But the machine, indeed, does have an OS on the disk? I can only assume the -a0 is a part of the problem. I don’t know what the argument is doing for the 0.3.12 version of partclone.
-
Yes, it’s a Mac.
The Mac will deploy an image fine (dd, raw, everything).
But capturing an image (dd, raw, everything) does not work.
-
@Tom-Elliott
https://github.com/Thomas-Tsai/partclone/blob/master/src/partclone.c
-aX --checksum-mode=X Checksum formula to use to add error detection\n"
" where X:\n"
" 0: No checksum (no slowdown, smallest image)\n"
" 1: CRC32 (Fast to compute, basic detection)\n"
-
@ismith-hpu In the partclone pictures we see /dev/nvme0n1, which is the whole disk. Shouldn’t it try to capture the partitions (e.g. /dev/nvme0n1p1 …)? Not sure what exactly is going wrong here. Just the first thing that jumped out at me.
Years ago at my old workplace we did capture and deploy Mac OS X perfectly fine, so I know this has worked at some point. But so many things have changed and I don’t have a Mac at hand to test.
Can you please schedule a debug capture task? Boot up the machine and hit ENTER twice to get to the shell. Now run fdisk -l, take a picture of the output and post it here.
-
That is what I told @Tom-Elliott in the priv-chat.
This is similar to what happened before and he fixed in a post-init but it’s not working again.
I am out of the office but will do that tomorrow.
Additionally, this WAS working before I updated using his post-init: I can still deploy the image I captured before, and I used to be able to capture fine. Now it’s broken.
-
@Sebastian-Roth the Macs are being captured as raw, which captures the entire disk, not the partitions.
-
But it still needs to be able to write the partitions and data in the correct order.
If that wasn’t the case then why did we make a post init to properly identify the nvme0n1p1 when we did the:
lsblk -dpno KNAME -I 3,8,9,179,202,253,259 | uniq | sort -
readlink /sys/class/block/nvme0n1p1
disk=$(readlink /sys/class/block/nvme0n1p1)
disk=${disk%/*}
disk=/dev/${disk##*/}
echo $disk
lsblk -no pkname /dev/nvme0n1p1
When my capture/hardware hasn’t changed.
Look back at the conversation in PM from 11 days ago and see if that maybe adds more context? I am a little confused myself x.x
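For anyone following along, the readlink steps above boil down to two parameter expansions on the sysfs link target. Here is a small sketch of just that string handling. The function name `parent_disk_from_link` and the example PCI path are made up for illustration; on a real machine the link target comes from `readlink /sys/class/block/<partition>`.

```shell
# Hypothetical helper: derive the parent disk from the sysfs link
# target of a partition, using only POSIX parameter expansion.
parent_disk_from_link() {
    link="$1"
    link="${link%/*}"                  # drop the last path component (the partition)
    printf '/dev/%s\n' "${link##*/}"   # keep the new last component (the disk)
}

# Example link target; the PCI portion of the path is invented.
parent_disk_from_link "../../devices/pci0000:00/0000:00:1d.0/nvme/nvme0/nvme0n1/nvme0n1p1"
```

The example call prints /dev/nvme0n1, the same value that `lsblk -no pkname` reports for that partition, only without the /dev/ prefix.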
-
The fixes we put in place were twofold.
First we put in the fix to address the issue of RAW imaging not passing a partition for partprobe. This was handled by passing a variable to test if the image is raw or not. If it’s raw, the flag gets set and the function returns the disk as it was sent to the function.
Second we are using a more implicit means to return the disk when the partition information is passed.
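As a rough sketch of those two fixes, assuming a flag value of 1 means "raw": the function name `disk_for_partprobe` and its argument order are hypothetical and are not copied from funcs.sh. Only the `lsblk -no pkname` call comes from this thread.

```shell
# Hypothetical sketch, not the real funcs.sh code.
# $1 = device that was passed in
# $2 = "1" if the image type is raw, anything else otherwise
disk_for_partprobe() {
    dev="$1"
    is_raw="$2"
    if [ "$is_raw" = "1" ]; then
        # Raw images work on the whole disk, so return it as it was sent.
        printf '%s\n' "$dev"
        return 0
    fi
    # Otherwise ask lsblk for the parent kernel name of the partition.
    parent=$(lsblk -no pkname "$dev" 2>/dev/null | head -n 1) || parent=""
    # An empty result means lsblk found no parent (for example, the
    # argument was already a whole disk or does not exist).
    [ -n "$parent" ] || return 1
    printf '/dev/%s\n' "$parent"
}

# Raw case: the disk comes back unchanged.
disk_for_partprobe /dev/nvme0n1 1
```

The non-raw branch needs a real block device to return anything, so it is best tried from a debug task shell.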
The only real difference I’m seeing here is that the 0.3.12 partclone doesn’t like doing the translation.
This is okay, leave your post init script in place. Edit the funcs.sh file and remove the -a0 (or -a1 if this is still in place) from the file.
Download the proper inits again using:
wget -O /var/www/fog/service/ipxe/init.xz https://fogproject.org/inits/init.xz
wget -O /var/www/fog/service/ipxe/init_32.xz https://fogproject.org/inits/init_32.xz
Then you should be all set. The inits will have the proper funcs.sh for 0.2.89 partclone as well as the changes that were in the inits you downloaded from our dev server.
-
@Tom-Elliott
Trying this now, will attempt to see if it deploys properly.
-
@Tom-Elliott said in Failure in Capturing an image:
wget -O /var/www/fog/service/ipxe/init_32.xz https://fogproject.org/inits/init_32.xz
It now works completely.
Thank you very much.
-
@Tom-Elliott said in Failure in Capturing an image:
@ismith-hpu But the machine, indeed, does have an OS on the disk? I can only assume the -a0 is a part of the problem. I don’t know what the argument is doing for the 0.3.12 version of partclone.
It’s to ensure compatibility with 0.2.89 images, allowing them to be deployed with 0.3.12.
0.2.89 had a broken checksum system, so we force it disabled on 0.3.12 in order to allow those images a normal deployment. It also makes imaging faster and the images slightly smaller; certainly not complaining about that part.
It is curious that 0.3.12 doesn’t seem to work in this instance, though I wonder if a non-resizable (not raw) capture has been tried?
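To make the version dependence concrete, here is a minimal sketch of choosing the flag by partclone version. The helper name `checksum_args_for` is hypothetical, and treating every 0.3.x release like 0.3.12 is an assumption; only 0.2.89 and 0.3.12 are discussed in this thread.

```shell
# Hypothetical helper: pick the extra partclone argument by version.
# -a0 is checksum mode 0 ("No checksum") per the partclone usage text
# quoted earlier in this thread.
checksum_args_for() {
    case "$1" in
        # Assumption: all 0.3.x builds accept -a the way 0.3.12 does.
        0.3.*) printf '%s\n' "-a0" ;;
        # 0.2.89 inits: the flag is left out entirely.
        *)     printf '\n' ;;
    esac
}

echo "0.3.12 -> '$(checksum_args_for 0.3.12)'"
echo "0.2.89 -> '$(checksum_args_for 0.2.89)'"
```

This mirrors the advice given above: with the 0.2.89 inits the -a0 has to come out of funcs.sh, while on 0.3.12 it is what keeps older images deployable.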