deploy issue after copying image to an other FOG server
-
@boombasstic When you do a debug deploy, what Init Version does it show? See the first picture you posted last time so you know what I mean.
-
@sebastian-roth said in deploy issue after copying image to an other FOG server:
@boombasstic When updating the inits today I just noticed that I was on the wrong track. So far the inits on the webserver were still the ones not having the fix. I am sorry, didn’t remember I actually uploaded the fixed one separately: https://fogproject.org/inits/init-1.5.9-ignore_crc-fix.xz
As well you can re-download the current official inits that have the fix in them as well - as of now.
https://fogproject.org/inits/init.xz
https://fogproject.org/inits/init_32.xz
With the init-1.5.9-ignore_crc-fix.xz init, the Version was: 2020927
With the current official inits, the Version was: 20200906
With both of them the problem still occurs.
-
So I ran some tests following my intuition that the “Windows not loading” symptom came from a lack of disk space, with FOG not correctly resizing the Windows data partition.
Among other things, the output of sfdisk -d /dev/nvme0n1 led me to think this. It shows a size of 68387676 blocks for the C: system partition, which is the same as that partition's size in the d1.minimum.partitions file.
That is about 32 GB, which is the used size on the partition.
I booted with a Live Gparted, and expanded the partition, then the computer boots normally.
So I think the issue lies with the resizing, not with the CRC check.
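As a rough check of the "about 32 GB" figure, the block count from the sfdisk output can be converted to a size. This is only a sketch: it assumes the blocks are 512-byte sectors, which is not stated anywhere in this thread, and the function name is made up for illustration.

```python
# Assumption: sfdisk reports sizes in 512-byte sectors on this disk.
SECTOR_BYTES = 512


def sectors_to_gib(sectors: int, sector_bytes: int = SECTOR_BYTES) -> float:
    """Convert a sector count from an sfdisk dump to GiB (2**30 bytes)."""
    return sectors * sector_bytes / 2**30


if __name__ == "__main__":
    # Size of the C: partition as seen after the deploy.
    print(f"{sectors_to_gib(68387676):.1f} GiB")
```

Under that assumption the partition comes out at roughly 32.6 GiB, which fits the used size mentioned above.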
-
@boombasstic said:
I booted with a Live Gparted, and expanded the partition, then the computer boots normally.
So I think the issue lies with the resizing, not with CRC check.
Ok, that is a valid point. So we would need to find out why it does not expand on deploy. Do you see any errors or warnings? Maybe do another debug deploy and take a close look at the point where it says “Filling the disk”.
With the current official inits , the Version was : 20200906
That still looks like you are not using the very latest inits I uploaded two days ago! Did you re-download the files after I posted this? I can’t verify right now but you might run the following command to get us the SHA1 sums (checking both locations just in case there is something wrong with the link):
sha1sum /var/www/html/fog/service/ipxe/init*
sha1sum /var/www/fog/service/ipxe/init*
-
@sebastian-roth
there is no “filling disk” step in debug deploy task.
See the picture below.
I copied the latest inits once again and the Version shown is still 20200906.
Here are the SHA1 sums:
0d60fc165643a57c10120e0892f2a1e520b53809 /var/www/html/fog/service/ipxe/init_32.xz
bea70a25cd1f0ddfd15bd1124acafaed79408d1e /var/www/html/fog/service/ipxe/init.xz
0d60fc165643a57c10120e0892f2a1e520b53809 /var/www/fog/service/ipxe/init_32.xz
bea70a25cd1f0ddfd15bd1124acafaed79408d1e /var/www/fog/service/ipxe/init.xz
-
@boombasstic said in deploy issue after copying image to an other FOG server:
there is no “filling disk” step in debug deploy task.
That would be before the blue partclone screens. I just had a look at the code to see what message it should print at this stage. Should be “Attempting to expand/fill partitions…”.
To see even more information you can set ismajordebug=1 in Host Kernel Arguments for the host you want to deploy to. It should print the partition table after it’s expanded to your disk.
Something is going wrong with the init download on your side, I think. I just did the following and got different hash sums:
shell> wget https://fogproject.org/inits/init.xz
shell> wget https://fogproject.org/inits/init_32.xz
shell> sha1sum init*
e7a19cd587bd1d9f3a989308e66a421afcabe813 init_32.xz
350907ec31886fa2ca4f07ba203904d0c8efaf17 init.xz
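If you want to repeat that comparison for both web directories in one go, something like the following Python sketch could do it. The expected sums are the ones from my download above and will change whenever new inits are published; the function names are only for illustration and nothing here is part of FOG itself.

```python
import hashlib
from pathlib import Path

# SHA1 sums of the freshly downloaded inits quoted above (valid only for
# the inits published at the time of this post).
EXPECTED_SHA1 = {
    "init.xz": "350907ec31886fa2ca4f07ba203904d0c8efaf17",
    "init_32.xz": "e7a19cd587bd1d9f3a989308e66a421afcabe813",
}


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_inits(directory: str, expected: dict = EXPECTED_SHA1) -> dict:
    """Map each file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        results[name] = (sha1_of(path) == want) if path.is_file() else None
    return results


if __name__ == "__main__":
    # Both locations mentioned earlier in the thread.
    for d in ("/var/www/html/fog/service/ipxe", "/var/www/fog/service/ipxe"):
        print(d, check_inits(d))
```

A False for either file would point to a stale or cached copy in that directory.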
-
Getting the inits with wget, the SHA1 sums are correct. Maybe it was an issue with cached files.
The Version now shows 20210307.
It still does not work.
Here is the filling disk step.
The data partition after deploy is still 68387676 blocks while it should be 496336896.
-
@boombasstic To see even more information on the expand/fill operation, set ismajordebug=1 in Host Kernel Arguments for the host you want to deploy to and schedule a debug deploy. Take a picture of the output on screen again.
-
@sebastian-roth here are the pictures of the debug deploy task with the kernel argument set.
It shows an error message: "sfdisk failed in (applySfdiskPartitions)"
-
@boombasstic Not sure what is happening here or why it fails. Now that we have the exact table it tries to apply I can try to replicate the issue.
Just to help me set this up, you might post the contents of the current file d1.minimum.partitions you have in your image folder on the server.
-
Here are the contents of the partition files.
d1.partitions
label: gpt
label-id: 7D3BE2EA-722A-43A9-AEA0-1F36F17095D8
device: /dev/nvme0n1
unit: sectors
first-lba: 34
last-lba: 500118158
/dev/nvme0n1p1 : start= 2048, size= 532480, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=D1686215-19B9-4424-AE16-3405C9F3DEE1, name="EFI system partition", attrs="RequiredPartition GUID:63"
/dev/nvme0n1p2 : start= 534528, size= 32768, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=935DDB3D-1186-4C7F-A7D0-903A2F7831CE, name="Microsoft reserved partition", attrs="GUID:63"
/dev/nvme0n1p3 : start= 567296, size= 496336896, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=A1E84F32-54EE-481A-A40E-C8A6AE68AD64, name="Basic data partition"
/dev/nvme0n1p4 : start= 496904192, size= 1165312, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=5D48B483-9C7A-4094-BDA1-AD3DC4C1070A, name="attrs=\x22RequiredPartition GUID:63"
/dev/nvme0n1p5 : start= 498069504, size= 2048000, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=02B11D85-B439-4E37-9004-691A53DCD24F, name="Basic data partition", attrs="RequiredPartition GUID:63"
d1.minimum.partitions
label: gpt
label-id: 7D3BE2EA-722A-43A9-AEA0-1F36F17095D8
device: /dev/nvme0n1
unit: sectors
first-lba: 34
last-lba: 500118158
/dev/nvme0n1p1 : start= 2048, size= 532480, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=D1686215-19B9-4424-AE16-3405C9F3DEE1, name="EFI system partition", attrs="RequiredPartition GUID:63"
/dev/nvme0n1p2 : start= 534528, size= 32768, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=935DDB3D-1186-4C7F-A7D0-903A2F7831CE, name="Microsoft reserved partition", attrs="GUID:63"
/dev/nvme0n1p3 : start= 567296, size= 68387676, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=A1E84F32-54EE-481A-A40E-C8A6AE68AD64, name="Basic data partition"
/dev/nvme0n1p4 : start= 496904192, size= 1031462, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=5D48B483-9C7A-4094-BDA1-AD3DC4C1070A, name="attrs=\x22RequiredPartition GUID:63"
/dev/nvme0n1p5 : start= 498069504, size= 2048000, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=02B11D85-B439-4E37-9004-691A53DCD24F, name="Basic data partition", attrs="RequiredPartition GUID:63"
d1.fixed_size_partitions
:1:2:5:1:2:5
-
@boombasstic said in deploy issue after copying image to an other FOG server:
d1.partitions
label: gpt
...
/dev/nvme0n1p1 ... name="EFI system partition", attrs="RequiredPartition GUID:63"
/dev/nvme0n1p2 ... name="Microsoft reserved partition", attrs="GUID:63"
/dev/nvme0n1p3 ... name="Basic data partition"
/dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63"
/dev/nvme0n1p5 ... name="Basic data partition", attrs="RequiredPartition GUID:63"
I think I figured out what is causing the problem in your case, though I have no idea where this came from initially. Take a close look at the extra information of each partition above. All seem fine except partition number 4. The name and attrs parameters seem to be mixed up, with a strange \x22 (ASCII hex code for ") in it.
While sfdisk is happy to use scrambled input like that, our partition layout shrinking script will make
/dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63", attrs=\x22RequiredPartition GUID:63"
out of it, and that finally kills sfdisk when reading the partition layout file:
>>> Created a new GPT disklabel (GUID: 7D3BE2EA-722A-43A9-AEA0-1F36F17095D8).
/dev/sda1: Created a new partition 1 of type 'EFI System' and of size 260 MiB.
Partition #1 contains a vfat signature.
/dev/sda2: Created a new partition 2 of type 'Microsoft reserved' and of size 16 MiB.
/dev/sda3: Created a new partition 3 of type 'Microsoft basic data' and of size 75.4 MiB.
Partition #3 contains a ntfs signature.
/dev/sda4: line 12: unsupported command
-
@boombasstic As a quick workaround you should be able to deploy this particular image (and let it expand properly) by manually adjusting the d1.partitions and d1.minimum.partitions files:
/dev/nvme0n1p4 ... name="Basic data partition", attrs="RequiredPartition GUID:63"
I will look into a possible fix in the scripts to better handle such scrambled input in the first place.
Ok, found the change in the AWK script related to this. Will have to look into this deeper and do more testing.
-
@boombasstic Did you give that a try?
-
@sebastian-roth
I am not sure I understand what you asked me to do.
I tried changing the size of the data partition in the d1.minimum.partitions file and setting this partition to fixed by adding its number to d1.fixed_size_partitions, but it did not work.
-
@boombasstic I did not ask you to change the size or fixed information. What I meant is that the name and attrs information is somehow messed up. So please edit the d1.partitions and d1.minimum.partitions files and you should find this at the end of the line for partition number 4:
/dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63"
Change it so it looks like this:
/dev/nvme0n1p4 ... name="Basic data partition", attrs="RequiredPartition GUID:63"
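If you prefer not to edit the files by hand, the same change could be scripted roughly like this. Treat it as a sketch under assumptions: the helper name is made up, it handles only the exact scrambled pattern shown above, and you should keep a backup copy of both files before writing anything back.

```python
import re

# Hypothetical helper, not part of FOG. Matches only the exact scrambled
# pattern from this thread: name="attrs=\x22<attributes>"
SCRAMBLED_RE = re.compile(r'name="attrs=\\x22([^"]*)"')


def repair_partition_line(line: str, name: str = "Basic data partition") -> str:
    """Rewrite a scrambled name/attrs pair; other lines are returned unchanged."""
    return SCRAMBLED_RE.sub(
        lambda m: f'name="{name}", attrs="{m.group(1)}"', line
    )


if __name__ == "__main__":
    bad = r'/dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63"'
    print(repair_partition_line(bad))
```

Applying it to every line of d1.partitions and d1.minimum.partitions leaves the other four partition entries untouched, since they do not match the pattern.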
-
I changed the d1.partitions and d1.minimum.partitions and it works.