Centos 7 UUID not updated during imaging - will not boot
-
@gerrit-anderson said in Centos 7 UUID not updated during imaging - will not boot:
/dev/sda1 : start= 1, size=209715199, Id=ee
Looks like the CentOS 7
sfdisk
command is not able to read the GPT partition layout. This is just the protective MBR entry.
Can you please schedule a debug capture task for the master, boot it up and when you get to the shell run
sfdisk -d /dev/sda
again. Take a picture and post it here.
-
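For anyone following along with a saved dump rather than a screenshot, here is a minimal sketch of how the two cases could be told apart. The function name `dump_kind` is made up for illustration, and it assumes that newer sfdisk versions print a `label: gpt` header line in the dump, while older ones only show the `Id=ee` protective entry quoted above.

```shell
#!/bin/sh
# dump_kind (hypothetical helper): classify an `sfdisk -d` dump read from stdin.
#   gpt            - dump carries a "label: gpt" header (assumed for newer sfdisk)
#   protective-mbr - only the 0xEE protective MBR entry is visible
#   other          - anything else (e.g. a plain DOS/MBR layout)
dump_kind() {
    dump=$(cat)
    if printf '%s\n' "$dump" | grep -q '^label: gpt'; then
        echo gpt
    elif printf '%s\n' "$dump" | grep -Eqi '(Id|type)= *ee([^0-9a-f]|$)'; then
        echo protective-mbr
    else
        echo other
    fi
}

# Real use, in the FOS debug shell:
#   sfdisk -d /dev/sda | dump_kind
```

If this reports `protective-mbr`, the sfdisk in use is most likely not reading the GPT at all.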
@sebastian-roth So it looks like sfdisk may not support GPT? I ran a couple more commands, not sure if this shows exactly what you are looking for… These were run in the debug shell.
-
@gerrit-anderson said in Centos 7 UUID not updated during imaging - will not boot:
These were run in the debug shell.
Can’t get my head around this. You say your FOG server is 1.5.9. Do you use a custom FOS init?
Please run
sfdisk --version
in the debug shell and post the version here.
-
@sebastian-roth The sfdisk version is
sfdisk from util-linux 2.23.2
Yes, FOG 1.5.9 most recent download, CentOS 7.9 (latest version 7)
-
@gerrit-anderson said in Centos 7 UUID not updated during imaging - will not boot:
The sfdisk version is
sfdisk from util-linux 2.23.2
Hmm, this version is really old. We are at 2.35.1 at the moment and 2.23.x was used in buildroot back in 2013. I don’t think we had our FOS image built with such an old version at any point, really. According to the man page, you need at least 2.26.x for GPT support. I have no idea how you can have a FOS init with such an old version of the sfdisk command.
I suggest you download the latest inits from our server and try with those:
sudo -i
cd /var/www/fog/service/ipxe/
mv init.xz init.xz.old
mv init_32.xz init_32.xz.old
wget https://fogproject.org/inits/init.xz
wget https://fogproject.org/inits/init_32.xz
chown fogproject:apache init*
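To check the version requirement mentioned above without reading the numbers by eye, something like the following could be used. This is only a sketch: `version_ge` and `sfdisk_version` are made-up helper names, the 2.26 threshold is the one quoted from the man page in this post, and the comparison relies on `sort -V`, which is assumed to be available in the shell you run it in.

```shell
#!/bin/sh
# version_ge A B (hypothetical helper): succeed if version A >= version B.
# Relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# sfdisk_version (hypothetical helper): take `sfdisk --version` text on stdin,
# e.g. "sfdisk from util-linux 2.23.2", and print the last word ("2.23.2").
sfdisk_version() {
    awk '{ print $NF }'
}

# Real use: v=$(sfdisk --version | sfdisk_version)
# Here the string reported earlier in this thread is used instead.
v=$(echo 'sfdisk from util-linux 2.23.2' | sfdisk_version)
if version_ge "$v" 2.26; then
    echo "$v: new enough for GPT"
else
    echo "$v: too old for GPT"
fi
```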
-
@Gerrit-Anderson Just checked on the official FOS init used with FOG 1.5.9. Version of sfdisk/util-linux is 2.35.1. No idea where you got yours from.
Are you sure you ran this in the FOG debug command shell, i.e. by scheduling a debug capture task and PXE booting the machine into it?!
-
@sebastian-roth I may not be understanding this fully, but I ran this command on my master image. That shouldn’t have any impact on anything related to FOG, right? The sfdisk that I am using must be what ships with CentOS 7… I also ran this command on my FOG server, and it’s the same version, but that is also running CentOS 7.
-
@sebastian-roth Ahhh, no. I will do that now. I did not do this in the FOG debug shell. This was all done within CentOS. I apologize… Doing this now!
-
@gerrit-anderson said in Centos 7 UUID not updated during imaging - will not boot:
The sfdisk that I am using must be what ships with CentOS 7… I also ran this command on my FOG server, and it’s the same version, but that is also running CentOS 7.
Ahhh, now we are talking!! What I am asking you to do is, schedule a debug capture task and PXE boot the machine into it. Then you get to a command shell and run the
sfdisk -d /dev/sda
command there.
-
@sebastian-roth sfdisk version is 2.35.1 Below is output of sfdisk -d /dev/sda
-
@Gerrit-Anderson Ok, those seem to match the ones we see in
d1.partitions
so it’s not something that is being messed up when capturing the UUIDs, but it looks like the IDs read/set by sfdisk are not the ones used by CentOS.
While in the FOS debug command mode, run
blkid -po udev /dev/sda5
, take a picture and post that here.
-
@sebastian-roth Results of blkid -po udev /dev/sda5 are below!
-
@Gerrit-Anderson Please tell us which UUID do you see in the CentOS /etc/fstab for this master VM machine?
-
@sebastian-roth Here is my fstab
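For readers who want to do the same comparison in text form, here is a rough sketch that lists the `UUID=` entries from an fstab file and checks each one against the device links. The helper names `fstab_uuids` and `check_uuids` are made up for this example, and it only looks at entries that mount by `UUID=` (LVM or `/dev/...` entries are skipped).

```shell
#!/bin/sh
# fstab_uuids FILE (hypothetical helper): print the UUID of every entry
# whose first field starts with UUID=. Comment lines never match.
fstab_uuids() {
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); gsub(/"/, "", $1); print $1 }' "$1"
}

# check_uuids FILE [DIR] (hypothetical helper): for each UUID from FILE,
# report whether a matching link exists in DIR (default /dev/disk/by-uuid).
check_uuids() {
    dir=${2:-/dev/disk/by-uuid}
    fstab_uuids "$1" | while read -r u; do
        if [ -e "$dir/$u" ]; then
            echo "found   $u"
        else
            echo "missing $u"
        fi
    done
}

# Real use on the booted CentOS system:
#   check_uuids /etc/fstab
```

Any line marked `missing` is a filesystem UUID that fstab expects but the running system does not currently see.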
-
@Gerrit-Anderson I have not looked into the UUID stuff in a long time, obviously. The
/dev/disk/by-uuid/...
and
/etc/fstab
are both using the filesystem UUID. What we capture in
d1.partitions
are partition UUIDs - which are different to FS UUIDs!
So we need to look at partclone (used to clone the actual filesystem data) to see if it’s messing up the filesystem UUIDs or not. In general partclone is meant to clone everything including the filesystem UUID. So it really shouldn’t mess with it as far as I know (ref).
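To make the difference between the two kinds of UUID visible, the `blkid -po udev` output requested earlier can be filtered for both values. This is a sketch only: `udev_value` is a made-up helper, and it assumes the output contains `ID_FS_UUID` (filesystem UUID) and `ID_PART_ENTRY_UUID` (partition UUID) lines in `KEY=VALUE` form.

```shell
#!/bin/sh
# udev_value KEY (hypothetical helper): read KEY=VALUE lines on stdin, as
# printed by `blkid -p -o udev`, and print the value of the first exact KEY match.
udev_value() {
    sed -n "s/^$1=//p" | head -n 1
}

# Real use, in the FOS debug shell:
#   blkid -po udev /dev/sda5 | udev_value ID_FS_UUID          # filesystem UUID, the one fstab's UUID= refers to
#   blkid -po udev /dev/sda5 | udev_value ID_PART_ENTRY_UUID  # partition UUID from the partition table entry
```

The first value is what should match /etc/fstab; the second is the kind of ID that sfdisk reads and sets.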
Please schedule a debug deploy task on a physical machine (as you said this doesn’t happen when deploying to a VM). PXE boot that machine and when you get to the command shell start the deployment using the command
fog
. You step through the whole process by pressing ENTER every so often and when it’s all done you will get back to a command shell. Here I need you to run the command
blkid
, take a picture and post that here.
If you are keen, do the same debug deploy but on a VM. Step through the process and run
blkid
in the end.
-
@sebastian-roth Looks like the mystery continues… Below are results!
Creating the master image on a physical machine allows me to send it to any other physical machine; they don’t need to be the same model. Sending the master image built on the physical machine to a Hyper-V VM causes the same issue as sending a Hyper-V master image to a physical machine. The message is below, warning that the UUID doesn’t exist and the system cannot boot.
When running debug deploy tasks, below are the results for the physical machine and virtual machine UUIDs respectively. They look identical… This makes me wonder why CentOS doesn’t think the disk UUID exists…
Physical Machine
Virtual Machine
The physical machine boots fine, virtual machine tries to boot and then goes to emergency mode.
From what I can tell, FOG is handling the UUIDs correctly…
-
@gerrit-anderson said in Centos 7 UUID not updated during imaging - will not boot:
From what I can tell, FOG is handling the UUIDs correctly…
Would say so too from what we discovered so far. It’s interesting you can capture/deploy from/to VM<->VM and machine<->machine but not “across”.
For further debugging I suggest digging into the dracut emergency shell/mode more. First run
ls -al /dev/disk/by-uuid/
on the dracut command prompt and post a picture of that here. You might also follow the instructions printed, run
journalctl
and look through the log for hints on why it fails booting. Also take a look at the file
/run/initramfs/rdsosreport.txt
mentioned. Feel free to share the file here and I’ll take a look as well.
Edit:
Searching the web I found this: https://unix.stackexchange.com/questions/183859/initramfs-uuid-problems-after-cloning
Sounds like the initramfs might be generated differently on your VM and bare metal install. Each is missing a driver needed by the other one. If that turns out to be true you need to find out which one is missing and manually regenerate initramfs with dracut e.g. on your VM before capturing the image to be deployed to hardware.
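If the missing-driver theory holds, one way to find out which module is absent is to search the initramfs file listing for it. The following is a sketch under assumptions: `missing_drivers` is a made-up helper, the Hyper-V module names `hv_vmbus` and `hv_storvsc` are only examples of what a VM might need, and the right list for your hardware has to be worked out separately.

```shell
#!/bin/sh
# missing_drivers DRIVER... (hypothetical helper): read an `lsinitrd` file
# listing on stdin and print every named kernel module that has no
# matching "<name>.ko" file in it.
missing_drivers() {
    listing=$(cat)
    for drv in "$@"; do
        printf '%s\n' "$listing" | grep -q "/$drv\.ko" || echo "$drv"
    done
}

# Real use on the CentOS master (module names are assumptions for a Hyper-V guest):
#   lsinitrd | missing_drivers hv_vmbus hv_storvsc
#
# If something is reported, the initramfs for the running kernel could be
# rebuilt with the extra modules added:
#   dracut --force --add-drivers "hv_vmbus hv_storvsc"
# or, less targeted, built generically instead of for this host only:
#   dracut --force --no-hostonly
```

The generic build is larger but avoids having to guess the driver list for every target machine.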