deploy issue after copying image to an other FOG server


  • Hi,

    I've been using FOG for a bit more than a year, and it works great. My FOG server runs FOG 1.5.7 on CentOS 7. I had to set up a new server and decided to try Debian this time. Installation went smoothly; I installed the latest version, 1.5.9, and copied my Windows images from the old server to the new one.
    However, when I deploy my images, the Windows OS seems to be “corrupted” after the reboot. Windows starts, but Explorer crashes and Windows is unusable.
    I tried different things, but always with the same result.
    I tried deploying images to different devices, and they all behave the same way. Deploying the same image from my old server works fine, though.
    Images captured on the Debian server and deployed from it are fine, but I would like to use my images from the CentOS server rather than re-create 50 images.
    Is there a reason why the same image would produce such a result when deployed with a different version of FOG (1.5.7 / 1.5.9) running on a different OS (CentOS 7 / Debian 10)?

    Thanks for your help.


  • @sebastian-roth

    I changed the d1.partitions and d1.minimum.partitions files and it works now.

  • Senior Developer

    @boombasstic I did not ask you to change the size or the fixed-size information. What I meant is that the name and attrs information is somehow messed up. So please edit the d1.partitions and d1.minimum.partitions files; you should find this at the end of the line for partition number 4:

    /dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63"
    

    Change it so it looks like this:

    /dev/nvme0n1p4 ... name="Basic data partition", attrs="RequiredPartition GUID:63"
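
For anyone who has to apply the same edit to many image folders, here is a minimal Python sketch of the rewrite described above. The function name `repair_line` and the default partition name are assumptions for illustration; the regular expression only targets the exact scrambled pattern shown in this thread.

```python
import re

# Pattern for the scrambled field seen on partition 4: the attrs value ended up
# inside name="...", with the opening quote encoded as the literal text \x22.
SCRAMBLED = re.compile(r'name="attrs=\\x22(?P<attrs>[^"]*)"')


def repair_line(line, name="Basic data partition"):
    """Return the line with a scrambled name/attrs field rewritten.

    `name` is only a guess at the lost partition name (hypothetical default).
    Lines that do not contain the scrambled pattern are returned unchanged.
    """
    def fix(match):
        return 'name="{}", attrs="{}"'.format(name, match.group("attrs"))

    return SCRAMBLED.sub(fix, line)


broken = r'/dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63"'
print(repair_line(broken))
```

The same line appears in both d1.partitions and d1.minimum.partitions, so both files would need the change; keeping a backup copy of each file first seems prudent.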
    

  • @sebastian-roth
    I am not sure I understand what you asked me to do.

    I tried changing the size of the data partition in the d1.minimum.partitions file and marking this partition as fixed by adding its number to d1.fixed_size_partitions, but it did not work.

  • Senior Developer

    @boombasstic Did you give that a try?

  • Senior Developer

    @boombasstic As a quick workaround you should be able to deploy this particular image (and let it expand properly) by manually adjusting the line for partition number 4 in the d1.partitions and d1.minimum.partitions files so it reads:

    /dev/nvme0n1p4 ... name="Basic data partition", attrs="RequiredPartition GUID:63"
    

    I will look into a possible fix in the scripts to better handle such scrambled input in the first place.

    Ok, found the change in the AWK script related to this. Will have to look into this deeper and do more testing.

  • Senior Developer

    @boombasstic said in deploy issue after copying image to an other FOG server:

    d1.partitions

    label: gpt
    ...
    /dev/nvme0n1p1 ... name="EFI system partition", attrs="RequiredPartition GUID:63"
    /dev/nvme0n1p2 ... name="Microsoft reserved partition", attrs="GUID:63"
    /dev/nvme0n1p3 ... name="Basic data partition"
    /dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63"
    /dev/nvme0n1p5 ... name="Basic data partition", attrs="RequiredPartition GUID:63"
    

    I think I figured out what is causing the problem in your case, though I have no idea where it came from initially. Take a close look at the extra information of each partition above. All seem fine except partition number 4, where the name and attrs parameters seem to be mixed up, with a strange \x22 (ASCII hex code for ") in it.

    While sfdisk is happy to use scrambled input like that, our partition layout shrinking script will make /dev/nvme0n1p4 ... name="attrs=\x22RequiredPartition GUID:63", attrs=\x22RequiredPartition GUID:63" out of it, and that finally kills sfdisk when it reads the partition layout file:

    >>> Created a new GPT disklabel (GUID: 7D3BE2EA-722A-43A9-AEA0-1F36F17095D8).
    /dev/sda1: Created a new partition 1 of type 'EFI System' and of size 260 MiB.
    Partition #1 contains a vfat signature.
    /dev/sda2: Created a new partition 2 of type 'Microsoft reserved' and of size 16 MiB.
    /dev/sda3: Created a new partition 3 of type 'Microsoft basic data' and of size 75.4 MiB.
    Partition #3 contains a ntfs signature.
    /dev/sda4: line 12: unsupported command
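
To spot this kind of damage before deploying, a rough check could scan a partition layout file for the two signs described above: a literal \x22 sequence, or a name value that begins with attrs=. This is only a sketch; the function name `find_suspicious_lines` is hypothetical, the sample lines are abbreviated from the files posted in this thread, and it does not reproduce what the FOG scripts do internally.

```python
def find_suspicious_lines(dump_text):
    """Return (line_number, line) pairs for partition lines that look scrambled."""
    hits = []
    for number, line in enumerate(dump_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("/dev/"):
            continue  # skip header lines such as "label: gpt"
        # A literal backslash-x22 or a name that starts with attrs= is suspect.
        if "\\x22" in stripped or 'name="attrs=' in stripped:
            hits.append((number, stripped))
    return hits


# Abbreviated sample (start, type and uuid fields omitted for brevity).
sample = "\n".join([
    "label: gpt",
    '/dev/nvme0n1p3 : size=496336896, name="Basic data partition"',
    r'/dev/nvme0n1p4 : size=1165312, name="attrs=\x22RequiredPartition GUID:63"',
])

for number, line in find_suspicious_lines(sample):
    print(number, line)
```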
    

  • Here are the contents of the partition files.

    d1.partitions

    label: gpt
    label-id: 7D3BE2EA-722A-43A9-AEA0-1F36F17095D8
    device: /dev/nvme0n1
    unit: sectors
    first-lba: 34
    last-lba: 500118158
    
    /dev/nvme0n1p1 : start=        2048, size=      532480, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=D1686215-19B9-4424-AE16-3405C9F3DEE1, name="EFI system partition", attrs="RequiredPartition GUID:63"
    /dev/nvme0n1p2 : start=      534528, size=       32768, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=935DDB3D-1186-4C7F-A7D0-903A2F7831CE, name="Microsoft reserved partition", attrs="GUID:63"
    /dev/nvme0n1p3 : start=      567296, size=   496336896, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=A1E84F32-54EE-481A-A40E-C8A6AE68AD64, name="Basic data partition"
    /dev/nvme0n1p4 : start=   496904192, size=     1165312, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=5D48B483-9C7A-4094-BDA1-AD3DC4C1070A, name="attrs=\x22RequiredPartition GUID:63"
    /dev/nvme0n1p5 : start=   498069504, size=     2048000, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=02B11D85-B439-4E37-9004-691A53DCD24F, name="Basic data partition", attrs="RequiredPartition GUID:63"
    

    d1.minimum.partitions

    label: gpt
    label-id: 7D3BE2EA-722A-43A9-AEA0-1F36F17095D8
    device: /dev/nvme0n1
    unit: sectors
    first-lba: 34
    last-lba: 500118158
    
    /dev/nvme0n1p1 : start=        2048, size=      532480, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=D1686215-19B9-4424-AE16-3405C9F3DEE1, name="EFI system partition", attrs="RequiredPartition GUID:63"
    /dev/nvme0n1p2 : start=      534528, size=       32768, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=935DDB3D-1186-4C7F-A7D0-903A2F7831CE, name="Microsoft reserved partition", attrs="GUID:63"
    /dev/nvme0n1p3 : start=      567296, size=    68387676, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=A1E84F32-54EE-481A-A40E-C8A6AE68AD64, name="Basic data partition"
    /dev/nvme0n1p4 : start=   496904192, size=     1031462, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=5D48B483-9C7A-4094-BDA1-AD3DC4C1070A, name="attrs=\x22RequiredPartition GUID:63"
    /dev/nvme0n1p5 : start=   498069504, size=     2048000, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=02B11D85-B439-4E37-9004-691A53DCD24F, name="Basic data partition", attrs="RequiredPartition GUID:63"
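
To see at a glance which partitions were shrunk in d1.minimum.partitions, the two dumps can be compared by size. The following is a sketch only: the helper names are made up for illustration, the parsing assumes the `size=` field format shown in the files above, and the sample lines are abbreviated (type, uuid and name omitted).

```python
import re

PARTITION_LINE = re.compile(r'^(?P<dev>/dev/\S+)\s*:\s*(?P<fields>.*)$')
SIZE_FIELD = re.compile(r'\bsize=\s*(\d+)')


def partition_sizes(dump_text):
    """Map device name to size in sectors for each partition line of a dump."""
    sizes = {}
    for line in dump_text.splitlines():
        match = PARTITION_LINE.match(line.strip())
        if not match:
            continue  # header lines such as "unit: sectors"
        size = SIZE_FIELD.search(match.group("fields"))
        if size:
            sizes[match.group("dev")] = int(size.group(1))
    return sizes


def shrunk_partitions(full_text, minimum_text):
    """Return {device: (full_size, minimum_size)} where the minimum is smaller."""
    full = partition_sizes(full_text)
    minimum = partition_sizes(minimum_text)
    return {
        dev: (full[dev], minimum[dev])
        for dev in full
        if dev in minimum and minimum[dev] < full[dev]
    }


full_dump = """\
/dev/nvme0n1p3 : start=      567296, size=   496336896
/dev/nvme0n1p4 : start=   496904192, size=     1165312
/dev/nvme0n1p5 : start=   498069504, size=     2048000
"""
minimum_dump = """\
/dev/nvme0n1p3 : start=      567296, size=    68387676
/dev/nvme0n1p4 : start=   496904192, size=     1031462
/dev/nvme0n1p5 : start=   498069504, size=     2048000
"""

for dev, (full, minimum) in shrunk_partitions(full_dump, minimum_dump).items():
    print(dev, full, "->", minimum)
```

With the values posted above, only partitions 3 and 4 differ between the two files.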
    
    

    d1.fixed_size_partitions

    :1:2:5:1:2:5
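
The content of d1.fixed_size_partitions can be read as a colon-separated list of partition numbers. A small sketch, assuming that interpretation (the function name is hypothetical), shows that the repeated entries reduce to partitions 1, 2 and 5:

```python
def fixed_partitions(text):
    """Return the sorted, de-duplicated partition numbers from the file content."""
    return sorted({int(part) for part in text.strip().split(":") if part})


print(fixed_partitions(":1:2:5:1:2:5"))  # prints [1, 2, 5]
```

Partitions 3 and 4 are not in the list, which matches the two partitions whose sizes differ between d1.partitions and d1.minimum.partitions.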
    
  • Senior Developer

    @boombasstic I am not sure what is happening here or why it fails. Now that we have the exact table it tries to apply, I can try to replicate the issue.

    To help me set this up, could you post the contents of the current d1.minimum.partitions file from your image folder on the server?


  • @sebastian-roth Here are the pictures of the debug deploy task with the kernel arguments set.
    It shows the error message "sfdisk failed in (applySfdiskPartitions)".

    WhatsApp Image 2021-03-12 at 17.10.28.jpeg

    WhatsApp Image 2021-03-12 at 17.10.36.jpeg

  • Senior Developer

    @boombasstic To see even more information on the expand/fill operation, set ismajordebug=1 in Host Kernel Arguments for the host you want to deploy to and schedule a debug deploy. Then take a picture of the on-screen output again.


  • After fetching the inits with wget, the SHA1 sums are correct. Maybe it was an issue with cached files.

    The version now shows 20210307

    WhatsApp Image 2021-03-11 at 09.53.14 (1).jpeg

    It still does not work.

    Here is the filling disk step:

    WhatsApp Image 2021-03-11 at 09.53.14.jpeg

    The data partition after deploy is still
    68387676 blocks, while it should be 496336896.

  • Senior Developer

    @boombasstic said in deploy issue after copying image to an other FOG server:

    There is no “filling disk” step in the debug deploy task.

    That would be before the blue partclone screens. I just had a look at the code to see what message it should print at this stage. It should be “Attempting to expand/fill partitions…”.

    To see even more information, you can set ismajordebug=1 in Host Kernel Arguments for the host you want to deploy to. It should print the partition table after it has been expanded to your disk.

    I think something is going wrong with the init download on your side. I just ran the following and got different hash sums:

    shell> wget https://fogproject.org/inits/init.xz
    shell> wget https://fogproject.org/inits/init_32.xz
    shell> sha1sum init*
    e7a19cd587bd1d9f3a989308e66a421afcabe813  init_32.xz
    350907ec31886fa2ca4f07ba203904d0c8efaf17  init.xz
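
Where sha1sum is not at hand, or several copies need to be compared in one go, the same check can be sketched in Python with the standard hashlib module. The helper names below are hypothetical; the idea is simply to group files by digest so that differing copies stand out.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA1 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(paths):
    """Map each SHA1 digest to the list of paths that share it."""
    groups = {}
    for path in paths:
        groups.setdefault(sha1_of_file(path), []).append(path)
    return groups
```

Calling `group_by_digest` with a freshly downloaded init.xz and the copy under the web server's ipxe directory would return a single key if the two files are identical, and two keys if the server is still serving an older file.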
    

  • @sebastian-roth
    There is no “filling disk” step in the debug deploy task.
    See the picture below:
    WhatsApp Image 2021-03-10 at 16.36.33.jpeg

    I copied the latest inits once again and the version shown is still 20200906.

    Here are the SHA1 sums:

    0d60fc165643a57c10120e0892f2a1e520b53809  /var/www/html/fog/service/ipxe/init_32.xz
    bea70a25cd1f0ddfd15bd1124acafaed79408d1e  /var/www/html/fog/service/ipxe/init.xz
    0d60fc165643a57c10120e0892f2a1e520b53809  /var/www/fog/service/ipxe/init_32.xz
    bea70a25cd1f0ddfd15bd1124acafaed79408d1e  /var/www/fog/service/ipxe/init.xz
    
    
  • Senior Developer

    @boombasstic said:

    I booted a live GParted, expanded the partition, and then the computer booted normally.
    So I think the issue lies with the resizing, not with the CRC check.

    Ok, that is a valid point. So we need to find out why it does not expand on deploy. Do you see any errors or warnings? Maybe do another debug deploy and take a close look at the point where it says “Filling the disk”.

    With the current official inits, the version was: 20200906

    That still looks like you are not using the very latest inits I uploaded two days ago! Did you re-download the files after I posted this? I can’t verify right now but you might run the following command to get us the SHA1 sums (checking both locations just in case there is something wrong with the link):

    sha1sum /var/www/html/fog/service/ipxe/init*
    sha1sum /var/www/fog/service/ipxe/init*
    

  • So I ran some tests, following my intuition that the “Windows not loading” symptom comes from a lack of disk space, with FOG not correctly resizing the Windows data partition.
    Among other things, the output of sfdisk -d /dev/nvme0n1 led me to think this.

    WhatsApp Image 2021-03-05 at 15.43.26.jpeg

    It shows a size of 68387676 blocks for the C: system partition, which is the same as the size of that partition in the d1.minimum.partitions file.
    That is about 32 GB, which is the used size on the partition.
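
As a quick sanity check of that figure, the sizes can be converted from sectors to GiB. This sketch assumes 512-byte sectors, which the dump itself does not state, and the function name is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def sectors_to_gib(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to GiB (2**30 bytes)."""
    return sectors * sector_bytes / 2**30


for label, sectors in [("after deploy", 68387676), ("expected", 496336896)]:
    print(f"{label}: {sectors_to_gib(sectors):.2f} GiB")
```

Under that assumption the deployed partition comes out at roughly 32.6 GiB, against roughly 236.7 GiB for the size recorded in d1.partitions.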

    I booted a live GParted, expanded the partition, and then the computer booted normally.

    So I think the issue lies with the resizing, not with the CRC check.


  • @sebastian-roth said in deploy issue after copying image to an other FOG server:

    @boombasstic When updating the inits today I just noticed that I was on the wrong track. So far the inits on the webserver were still the ones not having the fix. I am sorry, didn’t remember I actually uploaded the fixed one separately: https://fogproject.org/inits/init-1.5.9-ignore_crc-fix.xz

    You can also re-download the current official inits, which now include the fix as well.

    https://fogproject.org/inits/init.xz
    https://fogproject.org/inits/init_32.xz

    With the init-1.5.9-ignore_crc-fix.xz init, the version was: 2020927

    With the current official inits, the version was: 20200906

    With both of them the problem still occurs.

  • Senior Developer

    @boombasstic When you do a debug deploy what Init Version does it show? See the first picture you posted last time so you know what I mean.


  • Senior Developer

    @boombasstic When updating the inits today I just noticed that I was on the wrong track. So far the inits on the webserver were still the ones not having the fix. I am sorry, didn’t remember I actually uploaded the fixed one separately: https://fogproject.org/inits/init-1.5.9-ignore_crc-fix.xz

    You can also re-download the current official inits, which now include the fix as well.

    https://fogproject.org/inits/init.xz
    https://fogproject.org/inits/init_32.xz
