Dell 7730 precision laptop deploy GPT error message


  • Developer

    @george1421 On the other hand, I am wondering why other people have not reported this in the past. What if you have a PC with two drives, one for the OS and one for data? You only ever want to image the OS disk, but it could happen that you deploy to the data disk?! Just thinking out loud here.


  • Developer

    @jmason I can’t give you a reference on this, but it’s actually a likely cause (one that I have not thought of before, grrrhhh) that disk enumeration can put your two disks in reverse order. This is known in Linux and is usually circumvented through persistent block device naming.

    Try deploying a couple of times in a row always using the debug mode and run lsblk before starting the task. See if it’s exactly how we imagine it to be (changing disk order).
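
    A minimal sketch of one way to make that check repeatable: since the two drives differ in size, the larger one can be picked out by size rather than by name. The function name `largest_disk` is made up for illustration, and the byte counts in the usage comment are hypothetical, not taken from these laptops.

```shell
# Print the name of the largest disk.
# Expects "NAME SIZE_IN_BYTES" lines on stdin, for example from:
#   lsblk -b -d -n -o NAME,SIZE
largest_disk() {
    awk '$2 + 0 > max { max = $2 + 0; name = $1 }
         END { if (name != "") print name }'
}

# Usage on a live system (run once per boot and compare the results):
#   lsblk -b -d -n -o NAME,SIZE | largest_disk
#
# Usage with canned input (hypothetical sizes):
#   printf 'nvme0n1 1024209543168\nnvme1n1 512110190592\n' | largest_disk
```

    If the name printed changes between boots, the enumeration order is not stable on that machine.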



  • @george1421

    They are identical

    NAME            MAJ:MIN RM    SIZE RO TYPE MOUNTPOINT
    nvme0n1     259:0    0 953.9G  0 disk           
    |-nvme0n1p1 259:6    0   650M  0 part
    |-nvme0n1p2 259:7    0   128M  0 part
    |-nvme0n1p3 259:8    0 952.1G  0 part
    `-nvme0n1p4 259:9    0   990M  0 part
    nvme1n1     259:1    0   477G  0 disk           
    |-nvme1n1p1 259:2    0   128M  0 part
    |-nvme1n1p2 259:3    0   200M  0 part
    |-nvme1n1p3 259:4    0     1G  0 part
    `-nvme1n1p4 259:5    0 475.6G  0 part
    
    

    Interesting thing: right after my post, I attempted to set /dev/nvme1n1 as the Host Primary Disk, booted the task in debug mode, and got an error that it couldn’t find the hard drives. I cancelled the task, then set it to /dev/nvme0n1 and got the same message. I cleared the Host Primary Disk field and restarted again. This time there was no Failed message…

    Erasing current MBR/GPT Tables …Done
    Restoring Partition Tables (GPT)…Done
    Erasing current MBR/GPT Tables …Done
    Restoring Partition Tables (GPT)…Failed
    

    but

    Erasing current MBR/GPT Tables …Done
    Restoring Partition Tables (GPT)…Done
    Erasing current MBR/GPT Tables …Done
    Restoring Partition Tables (GPT)…Done
    

    and the deploy started. It has completed deploying to nvme0n1 (Windows drive) and the partitions to nvme1n1 (Linux drive).

    Though it seems to be working, I am puzzled as to what changed and when. All I did was change the Host Primary Disk parameter a few times, and after setting it back to empty and restarting, it worked.

    I guess it is possible that the machine being deployed to had something misaligned somewhere, as I have been attempting to deploy to it over and over. But I have also been restoring the original image using Macrium and testing it before each FOG deploy attempt.

    Will also now attempt to deploy again to the same laptop.


  • Moderator

    @jmason Will you schedule a capture/deploy to both your master image computer and target computer, but schedule it with the debug option?

    PXE boot both the source and destination computers. You will enter debug mode and be dropped to a Linux command prompt. At the Linux command prompt, key in lsblk and post the output from both systems. This will print the geometry of both the source and destination disk(s).



  • These are all identical systems, so could this mean some part of the capture process is possibly incorrect or something else?

    One thing I’ve noticed is that under Windows the devices are disk 0 (500GB Linux) and disk 1 (1TB Windows), while under PXE boot they are nvme0n1 (1TB Windows) and nvme1n1 (500GB Linux). Not sure if that would make any difference; I’m thinking not, since it’s UEFI.

    Would specifying one of these nvme drives as the Host Primary Disk make a difference?


  • Developer

    @jmason Well, that seems clear enough to me: is the source disk larger than the destination disk?



  • After stepping through the debug deploy as requested, I entered sgdisk -gl /images/DELL7730_Win10_Centos7/d2.mbr /dev/nvme1n1 and got some interesting output:

    Creating new GPT entries.
    Warning! Current disk size doesn't match that of the backup!
    Adjusting sizes to match, but subsequent problems are possible!
    
    Warning! Secondary partition table overlaps the last partition by 1000160625 blocks!
    You will need to delete this partition or resize it in another utility
    
    Problem: partition 3 is too big for the disk.
    
    Problem: partition 4 is too big for the disk.
    Aborting write operation!
    Aborting write of new partition table.
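
    As a sanity check, the overlap reported above can be converted into a size. This is a rough sketch that assumes the "blocks" are 512-byte logical sectors, which the thread does not confirm; the function name `blocks_to_gib` is made up for illustration.

```shell
# Convert a block count to GiB, rounded to one decimal place.
# $1 = number of blocks
# $2 = block size in bytes (optional; 512 is an assumption, not a value from the thread)
blocks_to_gib() {
    awk -v blocks="$1" -v size="${2:-512}" \
        'BEGIN { printf "%.1f\n", blocks * size / (1024 ^ 3) }'
}

# Overlap reported by sgdisk in the output above:
#   blocks_to_gib 1000160625    # prints 476.9
```

    Under that assumption the overlap is about 476.9 GiB, which is close to the difference between the two disks in the lsblk listing (953.9G and 477G). That would be consistent with a partition table captured from the larger disk being written to the smaller one.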


  • I just overlooked the debug checkbox…running now


  • Developer

    @jmason How did you schedule the task? Just go to the host’s settings in the web UI, click Basic Tasks, then Deploy, and just before you create the task, tick the checkbox for debug. This way it is very similar to a non-debug deploy, but the difference is that you have to start it manually (by running the very simple command fog, which should work if you schedule the task as described!) and you are asked to step through the whole process instead of it running without interaction. Give it a try.

    If you have already scheduled the task as described, then I am wondering whether you PXE boot the laptop or USB boot it?!



  • @Sebastian-Roth said in Dell 7730 precision laptop deploy GPT error message:

    Let’s try to tackle this. Please schedule another debug deploy task. Start up the client and fire up fog when you get to the shell. Step through, and when you are back at the console after the error, please type sgdisk -gl /images/DELL7730_Win10_Centos7/d2.mbr /dev/nvme1n1 (this will most probably return an error as well). Please take a picture or copy and paste the error message if you are connected via SSH to the client.

    Now as I mentioned I am not a linux person, so I typed fog at the prompt after booting into debug.

    I get:

    ####
    #   An Error has been detected !
    ####
    
    Fatal Error: Unknown request type :: Null
    
    Kernel variables and settings:
    bzImage loglevel=4 initrd=init.xz root=/dev/ram0 rw ramdisk_size=127000 web=http://192.168.0.1/fog/ consoleblank=0 rootfstype=ext4 shutdown=1 mac=macaddressoflaptop ftp=192.168.0.1 storage=192.168.0.1:images/dev storageip=192.168.0.1 osid=50 irqpoll hostname=mylaptop isdebug=yes shutdown=1
     * Press [Enter] key to continue
    

    Then back to command prompt #

    Is there something I’m missing to run deploy in debug mode?

    Looks like I have to type in more than just fog to get the deploy to run in debug.

    I found these directions but they are about 3 years old:

    https://wiki.fogproject.org/wiki/index.php/Debug_Mode#Deploy_Debug

    Are these still correct for the current version?
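
    One way to inspect boot parameters like the ones listed above from the debug shell is to look up individual keys. This is only a sketch: the helper name `cmdline_get` is made up, and the idea that the fog script expects a `type` parameter is an assumption based solely on the wording "Unknown request type", not something confirmed in this thread.

```shell
# Print the value of KEY from a kernel command line.
# $1 = key to look up
# $2 = command line string (optional; defaults to the running kernel's /proc/cmdline)
# Returns non-zero when the key is not present.
cmdline_get() {
    key=$1
    line=${2:-$(cat /proc/cmdline)}
    for arg in $line; do
        case $arg in
            "$key"=*)
                printf '%s\n' "${arg#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Usage in the debug shell:
#   cmdline_get isdebug                        # value of isdebug, if set
#   cmdline_get type || echo "no type given"   # "type" is an assumed parameter name
```

    With the parameters quoted above, a lookup for isdebug or osid finds a value, while a lookup for type finds nothing.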


  • Developer

    @jmason What I meant was that you cannot actually tell FOG you have a multi OS setup on your machine. Still doesn’t mean that FOG cannot handle it. In many cases it works pretty well. Even more nowadays when you really have a true UEFI install with GPT, EFI boot partition and no need for MBR loaders.

    I have not thought of it this way, but possibly you could make it two different images. I would not want to go that route, though, because you would then have to task your machines twice.



  • @Sebastian-Roth Thanks for the reply, I will remake the image with Partclone Gzip and run the debug session as instructed tomorrow when I’m back in the office.

    Since you mention FOG not being ready to easily do a multi-OS, multi-drive setup, could I get the same result by running a separate capture and deploy for the two drives? And would having the boot loader on only one drive cause a problem?


  • Developer

    @jmason You seem to get into the details of FOG fairly quickly! Well done. It’s great to see people reading the docs and forums and making their way. Let’s see, where shall I start.

    PartImage - switches on its own to Partclone Gzip after I attempt to deploy a captured image.

    Partclone Gzip is definitely the better choice. I know we still had an issue in FOG 1.5.5 which made it default to Partimage on a fresh install. You don’t really want that legacy stuff. Use Partclone Gzip or Zstd.

    but there is no OS Dropdown selection item for a multisetup

    Absolutely right and I’d really like to add that at some point in the future. But it makes things a lot more complex and therefore we have not added it yet. In very many cases one or the other OS choice is working for multisetup but sometimes needs a bit of adjustment. That said, it’s still possible that FOG is not able to handle your setup (combination of boot loaders, partition layouts and all that) yet but if we find out what’s wrong I am happy to add a fix!

    So during the deploy PXE boot with Linux(50) selected as the OS, I receive: …

    Let’s try to tackle this. Please schedule another debug deploy task. Start up the client and fire up fog when you get to the shell. Step through, and when you are back at the console after the error, please type sgdisk -gl /images/DELL7730_Win10_Centos7/d2.mbr /dev/nvme1n1 (this will most probably return an error as well). Please take a picture or copy and paste the error message if you are connected via SSH to the client.

    When selecting Windows as the OS for the image it appears to get further…

    That’s kind of interesting. While I have not had the time to actually look through the scripts to figure out the difference I find it funny that it proceeds but later on seems to have restored the partitions but is failing to actually push out the contents of the image files to those partitions. This part might be harder to debug so I’d stick to OS=Linux (50) for now. Let’s see how far we get.

    By the way, OS=Linux (50) should handle Linux legacy BIOS (MBR) boot loaders a bit better. You say you think it’s true UEFI, so maybe this is not very relevant at all. Well, let’s see what you get from the debug session mentioned above.



  • When selecting Windows as the OS for the image it appears to get further…it clears the screen…places a red on green Please wait… box and then displays:

    cat: /tmp/partclone.log: No such file or directory
    #
    # A warning has been detected
    #
    Image failed to restore and exited with code 1 (writeImage)
       Info:
       Args Passed: /images/DELL7730_Win10_Centos7/d2p1.img /dev/nvme1n1p1
    #
    # Will continue in 1 minute
    #
    

    After a minute it appeared to update the Args Passed d#p#.img and nvme#n#p# parameters.
    After a few times it appeared to move on to another process that scrolled by too fast to read.
    This apparently happened a few times, each time updating the Args Passed.
    It again cleared the screen, displayed Please wait, and updated the line:

    Args Passed:  /images/DELL7730_Win10_Centos7/d2p4.img* /dev/nvme1n1p4
    #
    # Will continue in 1 minute
    #
    * Clearing ntfs flag....Done
    * Resetting UUIDs for /dev/nvme1n1
    * Disk UUID being set to...
    * Partition type being set to....1:alphanumeric number
    * Partition uuid being set to....1:alphanumeric number
    

    …etc…
    then restarted.

    Afterwards Windows appeared to boot, but CentOS would not, reporting:

    error: no such device
    error: file '/vmlinuz-3.10...etc' not found
    error: you need to load the kernel first.
    

    Any ideas?

