FOG: Not detecting target disks correctly (/dev/sda vs /dev/xvda)

JonesDK

Hi guys,

I’m all new to the FOG universe, so please forgive me if I get something wrong.

Objective: Deploy one CentOS 8 golden image to new VMs on XCP-ng and to physical machines.

Env:
OS: All CentOS 8 based.
Hypervisor: XCP-ng 8
FOG: 1.5.7.89 (Dev branch due to CentOS)

Issue:
Capturing the image from the golden image VM went fine. However, the capture function seems to see (or have hardcoded) the disks as /dev/sda1 and /dev/sda2 instead of the correct /dev/xvda1 and /dev/xvda2.

/images/goldenimage8
-rwxrwxrwx. 1 root root         2 Dec 29 22:55 d1.fixed_size_partitions
-rwxrwxrwx. 1 root root         0 Dec 29 22:55 d1.has_grub
-rwxrwxrwx. 1 root root   1048576 Dec 29 22:55 d1.mbr
-rwxrwxrwx. 1 root root       194 Dec 30 00:32 d1.minimum.partitions
-rwxrwxrwx. 1 root root        17 Dec 30 00:34 d1.original.fstypes
-rwxrwxrwx. 1 root root         0 Dec 29 22:55 d1.original.swapuuids
-rwxrwxrwx. 1 root root 121741198 Dec 29 22:55 d1p1.img
-rwxrwxrwx. 1 root root        74 Dec 29 22:55 d1p2.img
-rwxrwxrwx. 1 root root       194 Dec 30 00:34 d1.partitions

The content of d1.minimum.partitions:

label: dos
label-id: 0x038fb017
device: /dev/**sda**
unit: sectors

/dev/**sda1** : start=        2048, size=      382149, type=83, bootable
/dev/**sda2** : start=     2099200, size=     8386560, type=8e
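
The sector counts in a dump like this are easier to compare as sizes. Below is a minimal sketch that prints each partition's size in MiB. It assumes 512-byte sectors (the dump only says "unit: sectors") and a file without the bold markers used above for emphasis. The function name partition_sizes is made up for illustration and is not part of FOG.

```shell
#!/usr/bin/env bash
# Print each partition listed in an sfdisk-style dump with its size in MiB.
# Assumption: one sector is 512 bytes.
partition_sizes() {
    awk '
        / : start=/ {
            dev = $1
            # Isolate the number that follows "size=" on this line.
            line = $0
            sub(/.*size= */, "", line)
            sub(/,.*/, "", line)
            # %d truncates, so sizes are rounded down to whole MiB.
            printf "%s %d MiB\n", dev, (line * 512) / (1024 * 1024)
        }
    ' "$1"
}
```

Called as partition_sizes /images/goldenimage8/d1.minimum.partitions, this would report the first partition at well under the 1G that lsblk shows on the golden image, which is expected for a "minimum" layout, and the second at roughly 4G.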

I have tried to edit the files to point to the correct /dev/ path as a quick fix, but it seems that FOG does not use these files during the deployment process.

The deployment process gives the following errors: error1.jpg and error3.jpg. error2.jpg is not so much an error as an “oh crap” moment: it seems to restore to /dev/sdaX.

How do I fix this? To a normal Windows admin, it looks like the detection of available disks on the target system is not working as expected, and the system is just defaulting to sda.

Any help would be appreciated
/Jonas

george1421

Well, this is an interesting one. It’s possible that FOG doesn’t fully support the new disk structure of CentOS 8.

With that said, we need to collect a bit of information on what FOS Linux is seeing vs what CentOS 8 is reporting.

On your golden / reference image run and post the outputs of these commands.
lsblk
df -h
cat /etc/fstab

Then schedule a deploy to a target system, but before you schedule the task, check the debug checkbox, then schedule the task.
PXE boot the target computer
After a few screens of text that you clear with the Enter key, you will be dropped to a FOS Linux command prompt
… (thinking)

OK, I think I’m heading down the wrong path. If this were hardware related then this would be the right path, but in this case it’s the logical (OS) side. So the difference between /dev/sda and /dev/xvda is how the OS sees the devices. FOS Linux sees /dev/sda and CentOS 8 sees /dev/xvda, which is OK. Now the hypervisor may be the “hardware” bit I’m speaking about (sorry about bouncing back and forth here).

… (thinking)

Let’s keep going with the above. Once you are at the FOS Linux command prompt, key in:
lsblk

And post it all here. Let’s see what FOS Linux sees on this hypervisor.

george1421

Additional research: partition type 8e is LVM. I think the issue here is related to the disk type being presented to FOS Linux. FOG can’t expand an LVM volume, so you can only deploy the golden image to hard disks the same size as or larger than the golden image’s disk. Also, if you deploy to a larger hard disk the LVM volume will not expand to the size of the disk. You will have to do that by hand or with a startup script in your golden image.
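
Doing that by hand usually comes down to three steps: grow the partition, grow the LVM physical volume, then grow the logical volume and its filesystem. The sketch below is only an outline under assumptions. The disk /dev/xvda, partition 2, and the logical volume /dev/cl/root are guesses for a default CentOS 8 install on Xen, so check yours with lsblk first. growpart comes from the cloud-utils-growpart package, which may not be installed. By default the script only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch: grow partition 2, its LVM physical volume, and the root LV.
# DISK, PARTNUM and LV are assumptions; override them to match your system.
# DRY_RUN=1 (the default here) only prints the commands.
DISK="${DISK:-/dev/xvda}"
PARTNUM="${PARTNUM:-2}"
LV="${LV:-/dev/cl/root}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

grow_root_lv() {
    run growpart "$DISK" "$PARTNUM"       # extend the partition to the end of the disk
    run pvresize "${DISK}${PARTNUM}"      # let LVM see the larger partition
    run lvextend -r -l +100%FREE "$LV"    # grow the LV and resize its filesystem
}

grow_root_lv
```

Run it as root with DRY_RUN=0 once the printed commands look right. Note that joining disk name and partition number directly only works for names like sda or xvda.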

JonesDK

Hi @george1421

Thanks for the quick reply!

lsblk

NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda        202:0    0    5G  0 disk
├─xvda1     202:1    0    1G  0 part /boot
└─xvda2     202:2    0    4G  0 part
  ├─cl-root 253:0    0  3.5G  0 lvm  /
  └─cl-swap 253:1    0  512M  0 lvm  [SWAP]

[root@localhost ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
devtmpfs             1.9G     0  1.9G   0% /dev
tmpfs                851M     0  851M   0% /dev/shm
tmpfs                851M   17M  835M   2% /run
tmpfs                851M     0  851M   0% /sys/fs/cgroup
/dev/mapper/cl-root  3.5G  1.3G  2.3G  36% /
/dev/xvda1           976M  126M  783M  14% /boot
tmpfs                171M     0  171M   0% /run/user/0

**cat /etc/fstab**
#
# /etc/fstab
# Created by anaconda on Sun Dec 29 20:44:58 2019
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/cl-root     /                       xfs     defaults        0 0
UUID=69d49bae-c6b4-49e9-afba-17a45b536e97 /boot                   ext4    defaults        1 2
/dev/mapper/cl-swap     swap                    swap    defaults        0 0

lsblk from FOG debug command line on target system

Reply to #2:
That’s fine. The golden image is super small and contains only the bare minimum. The rest will be set up with scripts, I hope. Normally I only work with PowerShell and SCCM…

george1421

@JonesDK OK, in the last picture from the FOS Linux engine you still need to get to the command prompt before you key in lsblk. Press Enter until it no longer prompts you to press Enter, then key in lsblk.

JonesDK

@george1421 ok, 2 sec…
Here u go

JonesDK

Ok, just checked: Golden image disk size 5GB, so 10GB on the target seems to be okay.

george1421

@JonesDK OK, on this target system there is a 10GB disk with 2 partitions, a 6GB and a 4GB. To FOS Linux the disk is known as /dev/sda. So it “should” image correctly.

This is probably going to fail again when you image it, but it will give us a chance to look at the very first error that gets generated.

Because you picked a debug deploy, you can start imaging by keying in fog. In this mode you will single step through the deployment process. The deployment will stop at each breakpoint set up in the code, and you will need to press the [enter] key to go on to the next step. Start the deployment process, stop at the first error, and give us a screenshot. Going by what you have given us so far, it “should” deploy OK. So there is something else going on here that the developers will need to understand.

JonesDK

All good so far…

JonesDK

The smallest sda2 ever:

Is that normal, or an error during the capture process?

And we have an error:

JonesDK

And that was it. No other errors or strange things (says the guy with 0 successful deployments).

george1421

@JonesDK Now at this point you are at an error and a breakpoint. If you hit ctrl-c it should toss you back to the fos linux command prompt. At this time key in the lsblk command again so we can see the partitions as it created them on disk.

The second front is back on the FOG server. The developers will need to see the output of these commands posted here.

cat /images/goldenimage8/d1.partitions
cat /images/goldenimage8/d1.fixed_size_partitions

Now looking at your original post I’m confused, because the partition “content” sizes on the golden image don’t match what FOG captured. These files are the compressed contents of the partitions: partition 1 is 121.7MB compressed and partition 2 is 74 bytes. So something happened during image capture.

-rwxrwxrwx. 1 root root 121741198 Dec 29 22:55 d1p1.img
-rwxrwxrwx. 1 root root 74 Dec 29 22:55 d1p2.img

If you look at the output of the df -h command on the reference image partition 1 [ d1p1.img ] should be ~120MB and partition 2 [d1p2.img] should be ~1GB.

So based on this I’m going to say something happened with the image capture when partition 2 was being uploaded to the fog server.
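
A quick way to spot this kind of thing after a capture is to list the partition image files and flag any that are implausibly small. This is a minimal sketch, not something FOG ships. The function name check_image_sizes and the 1 MiB default threshold are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# List partition image files (d<disk>p<part>.img) in an image directory and
# flag any that are smaller than a threshold in bytes (default 1 MiB).
check_image_sizes() {
    local dir="$1" min_bytes="${2:-1048576}" f size
    for f in "$dir"/d*p*.img; do
        [ -e "$f" ] || continue               # glob matched nothing
        size=$(wc -c < "$f" | tr -d ' ')      # strip padding some wc versions add
        if [ "$size" -lt "$min_bytes" ]; then
            echo "SUSPICIOUS $(basename "$f") $size bytes"
        else
            echo "ok $(basename "$f") $size bytes"
        fi
    done
}
```

On the FOG server, check_image_sizes /images/goldenimage8 would flag the 74-byte d1p2.img from the listing above.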

So what I’ll recommend is that you abort the debug deploy and terminate the task in the FOG server web UI, then schedule another image capture of your reference image. Go ahead and tick the debug box when you schedule the capture task against your reference image. Once at the FOS Linux command prompt, key in fog and single step through the capture process. I’m expecting an error to be thrown on the second partclone screen. When partclone throws an error it just writes the error wherever the cursor is on the screen. It’s a bit of a mess, but the developers will need to see that error.

Just be aware the developers are on holiday until after the first of the year. They do check in now and again, but not on any consistent basis, so if it’s something they need to dig into, their response may be delayed. I can say they are getting ready to release FOG 1.5.8 in January, so if this is a problem in the code they will want to get it sorted out before 1.5.8 is released.

JonesDK

first reply

JonesDK

cat d1.partitions

label: dos
label-id: 0x038fb017
device: /dev/sda
unit: sectors

/dev/xvda1 : start=        2048, size=     2097152, type=83, bootable
/dev/xvda2 : start=     2099200, size=     8386560, type=8e

cat d1.fixed_size_partitions

JonesDK

@george1421 said in FOG: Not detecting target disks correctly (/dev/sda vs /dev/xvda):

df -h

Here u go

Filesystem           Size  Used Avail Use% Mounted on
devtmpfs             1.9G     0  1.9G   0% /dev
tmpfs                851M     0  851M   0% /dev/shm
tmpfs                851M   17M  835M   2% /run
tmpfs                851M     0  851M   0% /sys/fs/cgroup
/dev/mapper/cl-root  3.5G  1.3G  2.3G  36% /
/dev/xvda1           976M  126M  783M  14% /boot
tmpfs                171M     0  171M   0% /run/user/0

george1421

@JonesDK As I posted last, let’s go ahead and recapture your golden image in debug mode to try to understand why it only copied 74 bytes of your root partition.

Also, a few comments that are outside of the issue you have at the moment:

If you recreate your golden image but manually provision the disk using standard partitions instead of an LVM volume, FOG will be able to expand the root partition to the size of the target drive. The CentOS 8 default is to create an LVM volume (kind of a partition inside of a partition). But for a dedicated VM, standard partitions will work just fine if you only have 4 or fewer partitions (including the boot partition).
CentOS as well as RHEL have a configuration script mechanism called kickstart (not to be confused with the Windows KiXtart script processor). If you close one eye and squint with the other it kind of works like MDT does for Windows, but not really. It may not fit your needs here, but just be aware it’s available for CentOS.
Since you are a Windows guy: Linux has a much older equivalent batch processor called the shell. There are a number of command shells out there, but the most common is called bash. Think of the bash shell as a cross between DOS batch file programming and PowerShell. You can use bash as well as some of the other command line utilities to dynamically configure your CentOS server. If you use a FOG postinstall script, you can write FOG variables to a config file during imaging and then have your configuration script read that config file to aid in configuring the target computer. Linux has something equivalent to the Windows services applet structure. The current command line version is called systemctl, which is for systemd type systems. There is an older services manager for SysV type systems. (Hang with me here, I am getting to a point.) If you place a bash file in /etc/init.d it will be executed when the system boots. So if you name the file in /etc/init.d something like S99Configure (the name is not arbitrary), the S means to run at startup. The 99 is just a sequence number. The scripts in that directory are run alphabetically, so S99 will be run after a script that starts with S30. The rest of the text is just used to identify the script and to also aid in running order. It is allowed to have 2 or more scripts that start with S99… as long as the entire name is unique. So to configure the system on the first startup after imaging, create a bash script /etc/init.d/S99Configure and put the bash commands in it to customize your target image. The last line of the bash script should remove the S99Configure script so it only runs once.
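
The run-once idea from the last point could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the KEY=VALUE config file format, the hostname key, and the helper names conf_get and first_boot_configure are not defined by FOG. How the script gets triggered at boot also depends on the init system, so treat this only as the body of such a script.

```shell
#!/usr/bin/env bash
# Return the value of one KEY=VALUE line from a config file
# (hypothetical format, assumed to be written by a FOG postinstall script).
conf_get() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Apply first-boot settings, then delete the calling script so it runs once.
#   $1 = path of the script to remove afterwards
#   $2 = path of the config file
first_boot_configure() {
    local self="$1" conf="$2" name
    name=$(conf_get "$conf" hostname)
    if [ -n "$name" ]; then
        # HOSTNAME_CMD can be overridden to try this out without root.
        ${HOSTNAME_CMD:-hostnamectl set-hostname} "$name"
    fi
    rm -f -- "$self"
}
```

The last line of the first-boot script would then be something like first_boot_configure "$0" followed by the path of the config file, so the script removes itself after its first run.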

Sebastian Roth

@JonesDK Great you have added so many details in your initial post already, well done! Reading through the whole topic over and over I find it a bit hard to figure out where some of the information (results from commands executed) stem from. So please bear with me if I get some things wrong.

From what I see the main issue here is that FOG/FOS does not handle LVM. This is something on the list that we never found the time to actually implement and most users work around it by using/switching to standard partition layout. That said, I still fancy the idea of properly implementing LVM support some time in the future.

Now about “/dev/sda vs /dev/xvda”: device names on Linux systems generally depend on the subsystem used. If you run a VM in VirtualBox you usually have sda, the same as with SCSI/SATA/HD/SSD drives in physical machines. I am not sure, but I think this is the same for VMware and maybe Hyper-V as well, as they all use some kind of (emulated) SCSI subsystem layer. Xen (XCP-ng) uses a different subsystem (xen_blk) and therefore has different device names like xvda. And there are more, like /dev/nvme0n1 and /dev/mmcblk0… FOS should be able to “convert” from one to the other and vice versa. It does not really convert; it simply ignores the sda/xvda/nvme/mmcblk information and only enumerates disks and partitions. There is no need to manually edit d1.partitions or d1.minimum.partitions!
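
To illustrate why the name stored in the image does not matter, here is a small sketch of how a partition's device name can be derived from whatever disk is found on the target plus a partition number. This is not FOS code, and the function name part_dev is made up. It only shows the usual Linux naming rule: disks whose name ends in a digit get a "p" before the partition number.

```shell
#!/usr/bin/env bash
# Build a partition device name from a disk device and a partition number.
part_dev() {
    local disk="$1" num="$2"
    case "$disk" in
        *[0-9]) echo "${disk}p${num}" ;;   # nvme0n1 -> nvme0n1p1, mmcblk0 -> mmcblk0p1
        *)      echo "${disk}${num}" ;;    # sda -> sda1, xvda -> xvda1
    esac
}
```

With this, "disk 1, partition 2" from an image captured on /dev/xvda maps to /dev/sda2 on a target that presents its disk as /dev/sda, and to /dev/nvme0n1p2 on an NVMe target.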

-rwxrwxrwx. 1 root root 74 Dec 29 22:55 d1p2.img

The size of the image file shows that FOS is not able to capture the LVM partitions housed within the second partition of your golden master.

All in all I’d say it would be nice to at least detect LVM and print out an error on capture already! I will see if I can squeeze that in before we push out the next FOG release.

george1421

Cross linking posts since the issues are almost the same: https://forums.fogproject.org/topic/14078/1-5-7-89-partclone-doesn-t-capture-an-image-in-dd-mode-wrong-options-in-fog-upload

shruggy

Update. Image updated and can be tested now.

@JonesDK I’m the topic starter from the post cross-linked by @george1421. Here is the FOS image I built. Could you please test it and report if it works for you?

I’ve only built a 64-bit image. If your FOG server is on CentOS, put it into the /var/www/fog/service/ipxe directory (back up the old init.xz beforehand). Change the ownership to match that of the init_32.xz file in there, and don’t forget to restore the SELinux context:

cd /var/www/fog/service/ipxe
sudo mv init.xz init.xz.bak
sudo mv ~/Downloads/init.xz . 
sudo chown --reference=init_32.xz init.xz
sudo restorecon init.xz

Now try to capture the image with FOG. On my system I was using the image type “Multiple Partition Image - Single Disk (Not Resizable)” and as image manager “Partclone Zstd” with compression level 7.

Quazz

@shruggy partclone.imager will generate partclone like files using a dd-like approach and is the one used in FOG when selecting raw image.
