Deployment stuck in a loop, never finishes imaging?
-
Am I reading this wrong? This is weirding me out. Is there a reason I don’t have a partclone log?
-
@salted_cashews The log that gets generated during the imaging process lives on the client booted into FOS. It’s gone as soon as the computer reboots.
-
@Junkhacker Oh I see, is FOS the preboot environment?
-
@salted_cashews Yes. It’s the minimal Linux OS that loads over the network to do the imaging tasks.
-
@Junkhacker Thanks for the info!
-
@salted_cashews FOS is the FOG Operating System that runs on the target computer. It is Linux-based and is built from bzImage (kernel) and init.xz (virtual HD).
If you run a debug capture/deployment you can access this log file. It only exists on the virtual RAM drive that FOS uses.
-
@george1421 This is really interesting, is this why I’m able to almost-ssh into the box during an image/network boot?
-
@salted_cashews Yes. It IS a running Linux OS. If you boot into debug mode and then give root a password, you can ssh into the box as root and run the debug deployment/capture remotely. I use this method when debugging/developing post-install scripts.
-
@salted_cashews Are you sure the image was captured with Zstd as well? If you change that option in the image settings you need to re-capture it!
Running a debug deploy task and ssh'ing into it to look at the partclone.log is definitely a good idea. You need to set a root password within the booted FOS environment on your client machine first, using the passwd command.
-
@Sebastian-Roth I’m 100% positive it was captured using Zstd. The only thing I can think of is something we did to the image before capture, or a network issue during it.
-
@salted_cashews See if you can grab the partclone.log file; hopefully we get some more information from that.
-
@salted_cashews I find it really strange that the filesystem does not seem to be clean. Are you sure the filesystem was clean when you initially captured the image? And are you sure the machine was not in some kind of hibernation when it was PXE booted to be captured?
-
@Sebastian-Roth Indeed, the image was a CentOS 7 image that had been rebooted (hibernation on the OS is disabled via the GUI). This had happened with one other image as well, and I remember us running some basic “clean up” tasks beforehand. It’s possible these mucked up the file system or something. Let me see if I can trace back exactly what we did.
-
@salted_cashews said in Deployment stuck in a loop, never finishes imaging?:
Let me see if I can trace back exactly what we did.
Maybe .bash_history …?
-
@Sebastian-Roth To my dismay the host was just “nuked” this morning. On the bright side I’m testing another deploy debug and I’m SSHd into the guy. Is it possible to have the root password set via passwd by default on a deploy/capture? I’d love to just be able to jump in like this at will.
-
@salted_cashews Take a look at Tom’s post here: https://forums.fogproject.org/post/88286
Though I have not tested this myself lately, it should still work, I reckon.
-
@Sebastian-Roth Thank you sir, as far as the logs are concerned this is what they report:
Partclone v0.2.89 http://partclone.org
Starting to restore image (-) to device (/dev/sda3)
note: Storage Location 10.10.100.252:/images/, Image name PPS_v9.0R2-dev_CentOS
we need memory: 208468 bytes
image head 4160, bitmap 200208, crc 4100 bytes
Calculating bitmap... Please wait...
get device size 53687091200 by ioctl BLKGETSIZE64, done!
File system: EXTFS
Device size: 6.6 GB = 1601624 Blocks
Space in use: 4.5 GB = 1097323 Blocks
Free Space: 2.1 GB = 504301 Blocks
Block size: 4096 Byte
read ERROR:No such file or directory
Following this I get a bunch of errors about “inode” something or other, and then the eventual reboot.
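As a quick sanity check on the numbers in that log, here is a small sketch that redoes the arithmetic in plain shell. The variable names are made up for illustration and the values are copied by hand from the partclone output; it only checks that the block counts are consistent and that the filesystem is no larger than the device size partclone reported. It says nothing about why the read fails.

```shell
#!/bin/sh
# Values copied by hand from the partclone log in this post.
blocks_total=1601624          # "Device size" in blocks
blocks_used=1097323           # "Space in use" in blocks
blocks_free=504301            # "Free Space" in blocks
block_size=4096               # bytes per block
device_bytes=53687091200      # "get device size ... by ioctl BLKGETSIZE64"

# Filesystem size in bytes, and a consistency check on the block counts.
fs_bytes=$((blocks_total * block_size))
sum_blocks=$((blocks_used + blocks_free))

echo "filesystem bytes:   $fs_bytes"
echo "used + free blocks: $sum_blocks (total: $blocks_total)"

if [ "$fs_bytes" -le "$device_bytes" ]; then
    echo "filesystem is not larger than the target device"
else
    echo "filesystem is LARGER than the target device"
fi
```

The filesystem comes out at 6560251904 bytes (the 6.6 GB in the log) against a 53687091200-byte target, and used plus free blocks add up to the total, so the header values at least look consistent with each other.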
-
@salted_cashews Please run the following command on your FOG server:
file /images/PPS_v9.0R2-dev_CentOS/d1p3.img
Post output here.
-
/images/PPS_v9.0R2-dev_CentOS/d1p3.img: data
-
@salted_cashews Please run the same for all image files in that directory:
file /images/PPS_v9.0R2-dev_CentOS/d1p*
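Note that a plain "data" result from file may simply mean your version of file does not recognise the format. If you want a second opinion, here is a rough sketch that looks at the first bytes of each image file directly. It assumes the .img file is a raw compressed stream starting at byte 0, which I have not verified for every FOG version; the function names are made up for this example. The magic numbers themselves are the standard ones: 28 b5 2f fd for a Zstandard frame and 1f 8b for gzip.

```shell
#!/bin/sh
# Return the first four bytes of a file as lowercase hex without spaces.
magic_of() {
    od -An -tx1 -N4 "$1" | tr -d ' \n'
}

# Guess the compression from the leading magic bytes.
# Hypothetical helper for illustration, not part of FOG.
guess_compression() {
    case "$(magic_of "$1")" in
        28b52ffd) echo "zstd" ;;      # Zstandard frame magic
        1f8b*)    echo "gzip" ;;      # gzip member magic
        *)        echo "unknown" ;;
    esac
}

# Usage, with the image path from this thread (adjust to your setup):
# for f in /images/PPS_v9.0R2-dev_CentOS/d1p*; do
#     printf '%s: %s\n' "$f" "$(guess_compression "$f")"
# done
```

If a file that should be Zstd comes back as "unknown" or "gzip", that would be worth posting here along with the file output.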