FOG: Not detecting target disks correctly (/dev/sda vs /dev/xvda)



  • Hi guys,

    All new to the FOG universe, so please forgive me if I’m all wrong 🙂

    Objective: Trying to deploy one CentOS 8 golden image to new VMs on XCP-ng and to physical machines.

    Env:
    OS: All CentOS 8 based.
    Hypervisor: XCP-ng 8
    FOG: 1.5.7.89 (dev branch due to CentOS 8)

    Issue:
    Capturing the image from the golden image VM went fine. However, the capture function seems to see the disks as /dev/sda1 and /dev/sda2 (or has those names hardcoded) instead of the correct /dev/xvda1 and /dev/xvda2.

    /images/goldenimage8
    -rwxrwxrwx. 1 root root         2 Dec 29 22:55 d1.fixed_size_partitions
    -rwxrwxrwx. 1 root root         0 Dec 29 22:55 d1.has_grub
    -rwxrwxrwx. 1 root root   1048576 Dec 29 22:55 d1.mbr
    -rwxrwxrwx. 1 root root       194 Dec 30 00:32 d1.minimum.partitions
    -rwxrwxrwx. 1 root root        17 Dec 30 00:34 d1.original.fstypes
    -rwxrwxrwx. 1 root root         0 Dec 29 22:55 d1.original.swapuuids
    -rwxrwxrwx. 1 root root 121741198 Dec 29 22:55 d1p1.img
    -rwxrwxrwx. 1 root root        74 Dec 29 22:55 d1p2.img
    -rwxrwxrwx. 1 root root       194 Dec 30 00:34 d1.partitions
    

    The content of d1.minimum.partitions:

    label: dos
    label-id: 0x038fb017
    device: /dev/**sda**
    unit: sectors
    
    /dev/**sda1** : start=        2048, size=      382149, type=83, bootable
    /dev/**sda2** : start=     2099200, size=     8386560, type=8e
    

    I have tried to edit the files to point to the correct /dev/ path as a quick fix, but it seems that FOG does not use these files during the deployment process.

    The deployment process gives the following errors: error1.jpg and error3.jpg. error2.jpg is not so much an error as an "oh crap" moment: it seems to restore to /dev/sdaX.

    How do I fix this? To a normal Windows admin, it looks like the detection of available disks on the target system is not working as expected, and the system is just defaulting to sda.

    Any help would be appreciated 🙂
    /Jonas

    error3.jpg error2.png error1.png


  • Senior Developer

    Latest inits should have all the fixes included. Marking as solved.



  • @JonesDK Sorry for the delay. You can try it out now (as described below).


  • Senior Developer

    @JonesDK Thanks to @shruggy we found out that the new partclone version we use introduced a bug that breaks raw (!) capturing of partitions. While I didn’t even know people use FOG to capture LVM in raw mode, it’s definitely good we found this. More details here: https://github.com/FOGProject/fos/issues/35


  • Moderator

    @shruggy partclone.imager will generate partclone like files using a dd-like approach and is the one used in FOG when selecting raw image.



  • Update. Image updated and can be tested now.

    @JonesDK I’m the topic starter from the post cross-linked by @george1421. Here is the FOS image I built. Could you please test it and report if it works for you?

    I’ve only built a 64-bit image. If your FOG server is on CentOS, put it into the /var/www/fog/service/ipxe directory (back up the old init.xz beforehand). Change the ownership to match that of the init_32.xz file in there and don’t forget to restore the SELinux context:

    cd /var/www/fog/service/ipxe
    sudo mv init.xz init.xz.bak
    sudo mv ~/Downloads/init.xz . 
    sudo chown --reference=init_32.xz init.xz
    sudo restorecon init.xz
    

    Now try to capture the image with FOG. On my system I was using the image type “Multiple Partition Image - Single Disk (Not Resizable)” and as image manager “Partclone Zstd” with compression level 7.


  • Senior Developer

    @JonesDK Great that you added so many details in your initial post already, well done! Reading through the whole topic over and over, I find it a bit hard to figure out where some of the information (results from commands executed) stems from. So please bear with me if I get some things wrong.

    From what I see the main issue here is that FOG/FOS does not handle LVM. This is something on the list that we never found the time to actually implement and most users work around it by using/switching to standard partition layout. That said, I still fancy the idea of properly implementing LVM support some time in the future.

    Now about “/dev/sda vs /dev/xvda”: The device names on Linux systems generally depend on the subsystem used. If you run a VM in VirtualBox you usually have sda, the same as with SCSI/SATA/HD/SSD drives in physical machines. I am not sure, but I think this is the same for VMware and maybe for Hyper-V as well, as they all use some kind of (emulated) SCSI subsystem layer. Xen (XCP-ng) uses a different subsystem (xen_blk) and therefore has different device names like xvda. And there are more, like /dev/nvme0n1 and /dev/mmcblk0… FOS should be able to “convert” from one to the other and vice versa. It does not really convert anything: it simply ignores the sda/xvda/nvme/mmcblk information and only enumerates disks and partitions. There is no need to manually edit d1.partitions or d1.minimum.partitions!
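
    A minimal sketch of what "only enumerating disks and partitions" can look like in bash. This is not taken from the FOS source; the helper names part_number and part_device are hypothetical. The only rule it relies on is the kernel naming convention that a "p" separator is inserted when the disk name itself ends in a digit (nvme0n1p1, mmcblk0p1).

```shell
#!/bin/bash
# Hypothetical helpers for illustration only, not the actual FOS code.

# Print the partition number encoded at the end of a partition device name.
# /dev/sda1 -> 1, /dev/xvda2 -> 2, /dev/nvme0n1p3 -> 3
part_number() {
    local dev=${1##*/}              # drop the /dev/ prefix
    printf '%s\n' "${dev##*[!0-9]}" # keep only the trailing digits
}

# Build the partition device name for a given disk and partition number.
# Disks whose name ends in a digit get a "p" separator.
part_device() {
    local disk=$1 num=$2
    case $disk in
        *[0-9]) printf '%sp%s\n' "$disk" "$num" ;;
        *)      printf '%s%s\n'  "$disk" "$num" ;;
    esac
}
```

    With helpers like these, an image captured from /dev/sda2 is just "disk 1, partition 2" and can be mapped onto /dev/xvda2 or /dev/nvme0n1p2 on the target.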

    -rwxrwxrwx. 1 root root 74 Dec 29 22:55 d1p2.img

    The size of the image file shows that FOS is not able to capture the LVM partitions housed within the second partition of your golden master.

    All in all I’d say it would be nice to at least detect LVM and print out an error on capture already! I will see if I can squeeze that in before we push out the next FOG release.
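
    As a rough idea of what such a check could look like (an assumption for illustration, not the actual FOS implementation): in an sfdisk-style dump like d1.partitions, MBR partition type 8e marks a Linux LVM partition, so a capture script could scan the dump for it. The function name lvm_partitions is hypothetical.

```shell
#!/bin/bash
# Hypothetical check, not the actual FOS implementation.
# Print every partition in an sfdisk-style dump whose MBR type is 8e (Linux LVM).
lvm_partitions() {
    local dump=$1
    # Partition lines look like: "/dev/xvda2 : start=..., size=..., type=8e"
    awk '$2 == ":" && /type=8e(,|[[:space:]]|$)/ { print $1 }' "$dump"
}
```

    A capture script could then print a warning and stop if lvm_partitions returns anything, instead of silently writing a tiny image file.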


  • Moderator

    @JonesDK As I posted last, let's go ahead and recapture your golden image in debug mode to try to understand why it only copied 74 bytes of your root partition.

    Also this is outside of the issue you have at the moment and just a few comments

    1. If you recreate your golden image but manually provision the disk using standard partitions instead of an LVM volume, FOG will be able to expand the root partition to the size of the target drive. The CentOS 8 default is to create an LVM volume (kind of a partition inside of a partition). But for a dedicated VM, standard partitions will work just fine if you only have four or fewer partitions (including the boot partition).

    2. CentOS as well as RHEL have a configuration script called kickstart (not to be confused with the Windows KiXtart batch processor). If you close one eye and squint with the other it kind of works like MDT does for Windows, but not really. It may not fit your needs here, but just be aware it's available for CentOS.

    3. Since you are a Windows guy: Linux has a much older equivalent batch processor called the shell. There are a number of command shells out there, but the most common is called bash. Think of the bash shell as a cross between DOS batch file programming and PowerShell. You can use bash as well as some of the other command line utilities to dynamically configure your CentOS server. If you use a FOG postinstall script, you can write FOG variables to a config file during imaging and then have your configuration script read that config file to aid in configuring the target computer.

    Linux has something equivalent to the Windows services applet structure. The current command line version is called systemctl, which is for systemd type systems. There is an older services manager for SysV type systems. (Hang with me here, I am getting to a point.) If you place a bash file in /etc/init.d it will be executed when the system boots. So if you name the file in /etc/init.d something like S99Configure (the name is not arbitrary), the S means to run at startup and the 99 is just a sequence number. The scripts in that directory are run alphabetically, so S99 will run after a script that starts with S30. The rest of the name is just used to identify the script and to aid in the running order. It is allowed to have two or more scripts that start with S99… as long as the entire name is unique.

    So to configure the system on the first startup after imaging, create a bash script /etc/init.d/S99Configure and put the bash commands in it to customize your target image. The last line of the bash script should remove the S99Configure script so it only runs once.
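
    To make point 3 a bit more concrete, here is a minimal sketch of such a run-once script body. Everything specific in it is an assumption for illustration: the function names, the "hostname=" key, and the config file layout are hypothetical and not defined by FOG. It also assumes the init system actually runs the script at boot as described above.

```shell
#!/bin/bash
# Hypothetical first-boot helper. The file names and the "hostname=" key are
# assumptions for illustration; they are not defined by FOG.

# Read KEY=value from a simple config file and print the value.
config_value() {
    local key=$1 file=$2
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Apply the configuration once, then delete the calling script.
first_boot_configure() {
    local config=$1 hostname_file=$2 self=$3 name
    name=$(config_value hostname "$config")
    if [ -n "$name" ]; then
        # Writing the hostname file takes effect on the next boot.
        printf '%s\n' "$name" > "$hostname_file"
    fi
    rm -f -- "$self"    # last step: remove the script so it only runs once
}
```

    Inside S99Configure the last line could then be something like first_boot_configure /etc/fog-deploy.conf /etc/hostname "$0", where /etc/fog-deploy.conf stands for whatever file your postinstall script wrote during imaging.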



  • @george1421 said in FOG: Not detecting target disks correctly (/dev/sda vs /dev/xvda):

    df -h

    Here u go 🙂

    Filesystem           Size  Used Avail Use% Mounted on
    devtmpfs             1.9G     0  1.9G   0% /dev
    tmpfs                851M     0  851M   0% /dev/shm
    tmpfs                851M   17M  835M   2% /run
    tmpfs                851M     0  851M   0% /sys/fs/cgroup
    /dev/mapper/cl-root  3.5G  1.3G  2.3G  36% /
    /dev/xvda1           976M  126M  783M  14% /boot
    tmpfs                171M     0  171M   0% /run/user/0
    


  • cat d1.partitions

    label: dos
    label-id: 0x038fb017
    device: /dev/sda
    unit: sectors
    
    /dev/xvda1 : start=        2048, size=     2097152, type=83, bootable
    /dev/xvda2 : start=     2099200, size=     8386560, type=8e
    

    cat d1.fixed_size_partitions

    2
    


  • First reply: FogDebugCommandLine6.png


  • Moderator

    @JonesDK Now at this point you are at an error and a breakpoint. If you hit ctrl-c it should toss you back to the fos linux command prompt. At this time key in the lsblk command again so we can see the partitions as it created them on disk.

    The second front is back on the FOG server: the developers will need to see the output of these commands posted here.

    cat /images/goldenimage8/d1.partitions
    cat /images/goldenimage8/d1.fixed_size_partitions
    

    Now looking at your original post I’m confused because on the golden image the partition “content” size doesn’t match what FOG captured. These are the compressed contents of the partitions. You see partition 1 has 121.7MB in compressed size. Partition 2 has 74bytes in size. So something happened during image capture.

    -rwxrwxrwx. 1 root root 121741198 Dec 29 22:55 d1p1.img
    -rwxrwxrwx. 1 root root 74 Dec 29 22:55 d1p2.img
    

    If you look at the output of the df -h command on the reference image partition 1 [ d1p1.img ] should be ~120MB and partition 2 [d1p2.img] should be ~1GB.

    So based on this I’m going to say something happened with the image capture when partition 2 was being uploaded to the fog server.
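
    A quick way to spot this kind of problem on the FOG server is to list partition image files that are suspiciously small. This is only a sketch: the function name tiny_images and the 1024-byte default threshold are arbitrary assumptions, not FOG settings.

```shell
#!/bin/bash
# Hypothetical helper: list partition image files smaller than a threshold.
# The 1024-byte default is an arbitrary assumption, not a FOG setting.
tiny_images() {
    local dir=$1 min_bytes=${2:-1024} f size
    for f in "$dir"/d*p*.img; do
        [ -f "$f" ] || continue
        size=$(( $(wc -c < "$f") ))   # arithmetic expansion strips padding
        if [ "$size" -lt "$min_bytes" ]; then
            printf '%s %s\n' "${f##*/}" "$size"
        fi
    done
}
```

    Run as tiny_images /images/goldenimage8, it prints the name and size in bytes of each image file below the threshold.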

    So what I’ll recommend is that you abort the debug deploy and terminate the task in the FOG server web UI, then schedule another image capture of your reference image. Go ahead and tick the debug box when you schedule the capture task against your reference image. Once at the FOS Linux command prompt, key in fog and single step through the capture process. I’m expecting an error to be thrown on the second partclone screen. When partclone throws an error it just writes the error wherever the cursor is on the screen. It's a bit of a mess, but the developers will need to see that error.

    Just be aware the developers are on holiday until after the first of the year. They do check in now and again, but not on any consistent basis, so if it's something they need to dig into, their response may be delayed. I can say they are getting ready to release FOG 1.5.8 in January, so if this is a problem in the code they will want to get it sorted out before 1.5.8 is released.



  • And that was it - no other errors or strange things (says the guy with 0 successful deployments)



  • The smallest sda2 ever:
    FogDebugCommandLine4.png

    Is that normal or an error during the capture process?

    And we have an error:
    FogDebugCommandLine5.png



  • All good so far…

    FogDebugCommandLine3.png


  • Moderator

    @JonesDK OK, on this target system there is a 10GB disk with 2 partitions, a 6GB and a 4GB one. To FOS Linux the disk is known as /dev/sda. So it “should” image correctly.

    This is probably going to fail again when you image it, but it will give us a chance to look at the very first error that gets generated.

    Because you picked a debug deploy you can start imaging by keying in fog. In this mode you will single step through the deployment process. The deployment will stop at each breakpoint set up in the code. You will need to press the [enter] key to go on to the next step. Start the deployment process, stop at the first error, and give us a screenshot. From what you have given us so far, it “should” deploy OK. So there is something else going on here that the developers will need to understand.



  • Ok, just checked: Golden image disk size 5GB, so 10GB on the target seems to be okay.



  • @george1421 ok, 2 sec…
    Here u go 🙂 FogDebugCommandLine2.png

