Development FOG not capturing image - PartClone update
-
Hey guys, since PartClone got updated in 1.5.7.86 I can no longer take images.
-
I am having a few issues and I don’t know how to resolve them. And they are making me wonder if this configuration is even viable for general usage.
One of the issues I am having is if I enable the Apache rewrite to HTTPS, when I try to inventory a machine or deploy an image and when FOS adds/checks the MAC address, I get an error that states “No viable mac to use.” If I disable HTTPS rewrite, it works first time, every time. I don’t know if a FOG URI needs to be excluded from the rewrite, like I had to do for iPXE to boot. I have a hunch this might be an easy fix?
The second major issue I am having is even if I disable HTTPS rewrite and the SSL certificate checks, I get the error in the attached image. I even just did a base Debian install using 1.5.7.88, none of my extra code, and I still got the error. Don’t know if I am doing something wrong? I know all the Linux installs I have been testing with have more than enough space to capture images, even a full disk, non-resized image. And it’s typically the ‘raw’ partition or the ‘ntfs’ partition of a Windows Server image. It’s all very strange…
-
@ty900000 Sorry we’ve lost track of this. Tom just pushed a change to the repo some days ago that might address your issue. Please download the latest init.xz/init_32.xz files from our build server and see if that works. Otherwise we need to do a debug capture task to get more of the error message.
-
No worries! The holidays took up so much time, I haven’t had much time to work on this to expand it to non-RedHat distros.
I updated both init.xz and init_32.xz and get a similar error. I had been seeing this before, too. I definitely have a large enough drive for the image: /images is 140-ish GB and the disk that needs to be captured is only 50GB. I noticed this when I updated to 1.5.7.86 and then subsequent updates. I noticed PartClone got updated and that’s when things started to break. If I do a new install of 1.5.7 with the old version of PartClone, everything works fine. How do I enable debug capture for iPXE? Thanks!!
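For anyone following along, the free-space claim can be checked from a shell on the FOG server. This is only a minimal sketch and not part of FOG: the function name `available_kib` is made up for illustration, and `/images` is assumed to be the image store as described in the post (the demo falls back to `/` if that directory does not exist).

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in KiB, of the filesystem holding $1.
available_kib() {
    # -P: POSIX output format (one line per filesystem); -k: 1024-byte blocks.
    # Column 4 of the second line is the "Available" figure.
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# /images is the store mentioned in the post; fall back to / on other machines.
target=/images
[ -d "$target" ] || target=/
echo "Available on $target: $(available_kib "$target") KiB"
```

Comparing that number with the size of the disk being captured shows whether a "not enough space" style message can be taken at face value.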
-
@ty900000 I just pushed another update, though it may be a little while before the artifacts are ready for testing.
I’m fairly sure the issue here has to be the FIFO. I’ve also gotten rid of the “Maybe check the fog server to ensure disk space is good to go” by providing the available disk space. It also adds the exact command that partclone is trying to use so we can see what’s going on.
2060 is just the case statement, so I don’t think it’s failing because of the case. I think it’s failing because the FIFO was still open. To combat this, I’ve added a 5 second wait to let the disk settle and release the information for the FIFO so we can remove it to recreate it later on.
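A rough sketch of the wait-then-recreate idea described above, for readers who want to see its shape. This is not the actual FOS code: the FIFO path, the function name `recreate_fifo`, and the one-second wait in the demo call are placeholders (the post describes a 5 second wait).

```shell
#!/bin/sh
# Hypothetical FIFO location; the real FOS scripts choose their own path.
fifo="${TMPDIR:-/tmp}/example_capture_fifo.$$"

recreate_fifo() {
    # $1: path of the FIFO
    # $2: seconds to wait so any process still holding the old FIFO can let go
    sleep "${2:-5}"
    rm -f "$1"      # drop the stale FIFO (no error if it is already gone)
    mkfifo "$1"     # create a fresh one for the next partition
}

recreate_fifo "$fifo" 1
[ -p "$fifo" ] && echo "fifo ready: $fifo"
rm -f "$fifo"
```

The wait only helps if the failure really comes from the old FIFO still being held open, which is the hypothesis in this post rather than a confirmed cause.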
-
I pulled the latest init and got a different error this time
-
@ty900000 Okay, do you mind running the capture using Debug? Cancel the task, and go to create it like you normally would, but before submitting it, there’s a checkbox that says Schedule as Debug.
It does mean a little extra work for you in that you will need to press enter twice to get to the shell.
At the shell type:
fog
Then you will need to press enter until the image completes. This method should at least allow you to capture the image. This is why I was adding the sleeps between. I see, now, that it’s not anything to do with that. I can’t imagine it’s the -a0 though. (I suppose maybe but I’m not quite sure right now).
-
@Tom-Elliott I think it’s more likely to be caused by partclone.imager being broken in current 0.3.12. Note how the size of the partition detected by partclone is 0.
-
@Quazz Yeah, but I think it’s broken due to the -a0 and quite possibly the -c option.
It’s strange as the -c seems almost redundant here.
Though, when I ran into the issue (which prompted me to try running in debug so I could more directly narrow down the issue), from debug everything worked without an issue.
-
@Tom-Elliott I am fairly confident the -a0 is a bug, since it is listed in its options, but isn’t picked up for use.
-c was removed for dd (it’s implied I guess??)
Interesting you should mention it not occurring in debug. I have seen this problem before, but that was on… unreliable devices, so I didn’t think much of it when I couldn’t replicate it on other devices.
-
@ty900000 said in FOG/Apache PKI/Certificate Authentication:
I pulled the latest init and got a different error this time
Wait a second. Where did you pull it from? Did you use these ones? https://dev.fogproject.org/blue/organizations/jenkins/fos/detail/master/113/artifacts
-
@Sebastian-Roth He did, I can see the changes I created in the output.
-
I stepped through everything until it halted. Pressing [Enter] here doesn’t do anything.
-
@Tom-Elliott I really wonder why we don’t see other people report this error. Were you actually able to replicate this? Maybe this is just some RAM issue that causes binaries to fail on this particular machine!?
By the way, @ty900000 would you mind opening a new topic for this? Better to keep things sorted. I can move all the related messages over…
-
Would you mind trying the latest inits from: https://dev.fogproject.org/job/fos/job/master/lastSuccessfulBuild/
The init.xz and init_32.xz should be good.
Essentially I’ve added a check on which partclone is being used and removed a couple of arguments, as they are not built during the configuration and build of partclone.
-
Yes! It worked perfectly. I’ve tested it a bunch of times and it works great. I do get this output after one of the partitions. It doesn’t affect anything it seems, but I’ve just noticed it.
-
@Tom-Elliott Are you able to replicate the issue as seen in the pictures?
@ty900000 Does this happen on several machines? All the same model or different ones?
-
To start, I am using Hyper-V for everything. Yes, I do get that above image when I try to capture other images - either Windows or Linux. When I attempt to deploy the Windows image (the original image I’ve been trying to take), I get this error. But it does seem to complete. It does something similar for the Linux image.
-
@ty900000 @Sebastian-Roth
I haven’t replicated, but to be fair I also haven’t watched that closely. We did image one machine yesterday and all seemed fine. Looking at my images folder, however, I do notice that I’m missing the “imager” partition from my image. Luckily I had another image of the machine that did have the missing partition.
I pushed another fix and believe the issue was, as @Quazz noted, that the -c argument was missing. Strange as that is, as the -c argument doesn’t appear to be a part of the spec list (unless somebody already added that to the patch for partclone and I didn’t know it?)
This will take a while to build of course as I only just pushed it.
-
@Tom-Elliott said in Development FOG not capturing image - PartClone update:
Strange as that is, as the -c argument doesn’t appear to be a part of the spec list
I think the -c is important to make partclone.imager actually use the partclone image format.
I haven’t replicated.
My guess is that this is something specific to Hyper-V or maybe even just @ty900000’s setup. Not saying we shouldn’t try to figure this out and eventually fix if it’s in the inits. My feeling is that this is not about partclone command line parameters or anything.
@ty900000 Please do me a favor and play with the image’s Image Manager setting. Try Partclone Zstd if you have used Gzip so far, and even more so try out Partclone Uncompressed! Capture the image with these changed settings once more and let us know if it makes any difference.