Dell Precision 7730 laptop: GPT error message on deploy
First-time FOG user coming from a dead Clonezilla Ubuntu server. I am NOT a Linux expert; I inherited that server. FOG 1.5.5 is running on CentOS 7.6.1810, both new installations, on an older Dell 3620 desktop used as the server. I’ve set up an isolated FOG configuration, as I will only ever image our training systems.
The PCs are identical Dell Precision 7730 laptops that UEFI boot (I believe it’s true UEFI). PXE boot appears to be working great. Secure Boot is disabled. SATA operation in the BIOS is set to AHCI.
There are 2 GPT drives, which appear to FOG (using lsblk in debug mode) as /dev/nvme0n1 and /dev/nvme1n1. Both are apparently M.2 PCIe drives, with 4 partitions each.
Drives under Windows appear as (disk0 centos7) (disk1 windows10).
Under Linux I expected (nvme0n1 centos7) (nvme1n1 windows10), but the order is random, based on the init of the drives.
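Since the nvme0n1/nvme1n1 names can swap between boots, one way to tell the two drives apart is by a stable attribute such as the serial number. Below is a minimal Python sketch that parses the JSON form of lsblk output (`lsblk -J -d -o NAME,SERIAL`). It assumes the lsblk on the system supports JSON output and reports a serial for NVMe drives, which may not hold inside the FOG debug environment. The serial strings are hypothetical placeholders, not values from these laptops.

```python
import json

def map_disks_by_serial(lsblk_json: str) -> dict:
    """Return {serial: device_name} from `lsblk -J -d -o NAME,SERIAL` output."""
    data = json.loads(lsblk_json)
    return {
        dev["serial"]: dev["name"]
        for dev in data["blockdevices"]
        if dev.get("serial")  # skip devices that report no serial
    }

# Hypothetical sample: the serials are placeholders, and the two drives
# are listed in "swapped" order on purpose.
SAMPLE = json.dumps({
    "blockdevices": [
        {"name": "nvme1n1", "serial": "SERIAL_CENTOS"},
        {"name": "nvme0n1", "serial": "SERIAL_WIN10"},
    ]
})

disks = map_disks_by_serial(SAMPLE)
print("CentOS drive is currently /dev/" + disks["SERIAL_CENTOS"])
print("Windows drive is currently /dev/" + disks["SERIAL_WIN10"])
```

Looking a drive up by serial rather than by name gives the same answer regardless of which drive initialized first.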
It does have access to the internet via a second wireless NIC and a static subnet. I read lots of posts on the forum here to get to this point, and have since spent a few days attempting to resolve my issue with even more reading. I’m hoping I’ve just not understood something or haven’t been able to track down the correct settings.
I want to capture both drives at once and then deploy to all of the other laptops, but there is no OS dropdown item for a multi-OS setup, and I have been unable to find directions on how to determine which selection to make. I’ve tried selecting the options below, but get the error restoring GPT partition tables:
Linux - (50) and with Windows 10 - (9)
Multiple Partition Image - All Disks (Not Resizable) - (3)
Everything - (1)
PartImage - switches on its own to Partclone Gzip after I attempt to deploy a captured image.
All files in my /images directory appear to be the proper size and the uuids appear to be correct.
So during the deploy PXE boot with Linux(50) selected as the OS, I receive:
Erasing current MBR/GPT Tables …Done
Restoring Partition Tables (GPT)…Done
Erasing current MBR/GPT Tables …Done
Restoring Partition Tables (GPT)…Failed
Error trying to restore GPT partition tables (restorePartitionTablesAndBootLoaders)
Args Passed: /dev/nvme1n1 2 /images/DELL7730_Win10_Centos7 50 all
CMD Tried: sgdisk -gl /images/DELL7730_Win10_Centos7/d2.mbr /dev/nvme1n1
Exit returned code 4
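For reference when reading the failure above: in the command FOG tried, `-g` is sgdisk's MBR-to-GPT conversion option and `-l` loads a partition table backup from a file. The small Python lookup below maps sgdisk exit codes to their meanings as I recall them from the "RETURN VALUES" section of the sgdisk man page; treat the wording as an assumption and verify it with `man sgdisk` on your system. The helper name is hypothetical.

```python
# Exit codes as recalled from the sgdisk man page (RETURN VALUES).
# Assumption: verify against `man sgdisk` for the installed version.
SGDISK_EXIT_CODES = {
    0: "normal program execution",
    1: "too few arguments",
    2: "an error occurred while reading the partition table",
    3: "non-GPT disk detected and no -g option, but operation requires a write action",
    4: "an error prevented saving changes",
    5: "an error occurred while reading standard input",
    8: "disk replication operation (-R) failed",
}

def explain_sgdisk_exit(code: int) -> str:
    """Return a human-readable meaning for an sgdisk exit code."""
    return SGDISK_EXIT_CODES.get(code, "unknown exit code " + str(code))

print(explain_sgdisk_exit(4))
```

So the code 4 here says the table could not be written to /dev/nvme1n1; it does not by itself say why.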
Kernel Variables and settings:
bzImage loglevel=4 initrd=init.xz root=/dev/ram0 rw ramdisk_size=127000 web=http://192.168.0.1/fog/ consoleblank=0 rootfstype=ext4 mac=macaddressoflaptop ftp=192.168.0.1 storage=192.168.0.1:images/ storageip=192.168.0.1 osid=50 irqpoll hostname=mylaptop chkdsk=0 img=DELL7730_Win10_Centos7 imgType=mpa imgPartitionType=all imgid=6 imgFormat=0 PIGZ_COMP=-6 hostearly=1 type=down
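A long kernel argument line like this is easier to check once split into key/value pairs. The Python sketch below does that for a shortened copy of the line; the function name is hypothetical and it only illustrates how to pull out the settings that matter for this task (osid, imgType, imgPartitionType).

```python
def parse_kernel_args(cmdline: str) -> dict:
    """Split a kernel command line into a dict.

    key=value tokens map to their string value; bare flags
    (for example rw, irqpoll) map to True.
    """
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        args[key] = value if sep else True
    return args

# Shortened copy of the arguments shown above.
CMDLINE = (
    "bzImage loglevel=4 initrd=init.xz root=/dev/ram0 rw "
    "web=http://192.168.0.1/fog/ osid=50 irqpoll "
    "img=DELL7730_Win10_Centos7 imgType=mpa imgPartitionType=all PIGZ_COMP=-6"
)

kernel_args = parse_kernel_args(CMDLINE)
print(kernel_args["osid"], kernel_args["imgType"], kernel_args["imgPartitionType"])
```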
I’m not sure what to do now to resolve the issue.
@Sebastian-Roth I am now, as what I expected is the case.
@jmason Yeah, better open a new one.
@Sebastian-Roth I’m working on imaging just one of the disks by specifying the primary host disk field, to see if /dev/nvme0n1 and /dev/nvme1n1 map correctly to the physical disks for this process. I’m thinking there may be an issue, but I can open another thread for that.
@jmason So do I get this right. Any more issues?
@Sebastian-Roth It only did the process once. I just semi-panicked because it was slower with 10 running than when I was only running 1 at a time, and because the pesky NVMe drives start and end with different drives in the multi-NVMe system.
Weird that some of the laptops finish in the expected time (about 45 min was the average), but I have some sitting at over 2 hours on just one partition… and on a few the elapsed time is frozen now.
After turning off and restarting a few of the laptops the others that appeared frozen started going again.
@jmason May I ask you to pay very close attention to the partition names you see in the output? Possibly best if you schedule a debug deploy job and, whenever you get to a blue partclone screen, note down the partition (like sda1 or so) as well as the filesystem (NTFS I suppose).
Let us know if it really goes through sda1, sda2, sda3 and starts over with sda1 again. Can’t really imagine it is doing this.
Well, I am watching the systems more closely now and have started up some new ones. Perhaps it’s just much slower than I expected with 10 hooked up vs. the 2 I was testing with for a few months.
Potential BIG problem with hopefully an easy fix. I finally have a big training laptop update, so I captured my main image from the laptop again and then hooked up 10 laptops to the switch to deploy, as I have been doing… For some reason, when the deploy is complete it RESTARTS deploying all over again. I haven’t updated anything as far as the OS or FOG; all I did was create a BRAND new image to deploy. I thought it was taking an awfully long time for them to complete, and then I watched one hit the end of the deploy cycle, announce clone complete, display the UUIDs, and then start over with the first drive again. It appears to repeat the deploy process only one more time and then shut down as expected.
@Sebastian-Roth I was finally able to get everything set back up in the new office location. I downloaded and replaced the init.xz and the UUIDs appeared as expected when I performed a deploy from my original image.
I believe this may be finally solved. I can test more things if you need and somewhat faster now that I’m set back up again.
I can’t express my thanks enough! Kudos!!
@Sebastian-Roth We are in the middle of an office move, I will test and respond as soon as I have everything set back up. Hopefully before the end of the week.
@jmason Ok, I had a closer look at the UUID stuff and turns out that we had a general bug there as well as unneeded code. I did a bit of a cleanup while hopefully fixing the problem you saw with dual NVMe disk machines.
Be aware that I removed the need for dX.original.uuids altogether as we have all the information in other files available already. So when you upload the image again (which you don’t have to for the simple deploy test to see if the UUID stuff is fixed!) you won’t have
@Sebastian-Roth Was looking in the images directory here is the contents of the d1.original.uuids file
/dev/nvme0n1 c0c5d1ae-844a-476e-81c6-7df5e3996ef1 1:d8b3acfb-02fa-4da3-9eee-b42d59256a4f /dev/nvme0n1p2 2:2607-0C5E 2:ee3b91d5-4673-4e49-9011-7d3e10182cbe /dev/nvme0n1p3 3:bc88509e-b6ed-49c0-9106-dc7976a67b2a 3:9e96fd40-79f2-4d79-b2e7-574ddd2b5ce6 /dev/nvme0n1p4 4:HtWBPV-9Aom-jBpz-4pyv-qy3A-lf2o-x58axC 4:fea80442-d73a-494d-be20-1aeee1f51158
vs d2.original.uuids files
/dev/nvme1n1 5c273d41-1202-4874-8a69-9af1285c6d77 /dev/nvme1n1p1 1:DEFC-1910 1:2b0507fe-9371-463b-832d-63c3aa24795e 2:9667e751-1aee-4f09-b9cc-8e1c16b3010b /dev/nvme1n1p3 3:382631A826316850 3:3adca3cc-702f-4084-9f16-3b8f241cf81e /dev/nvme1n1p4 4:2C0C9D570C9D1D40 4:e12d4c98-026a-406c-8043-0498c66be933
Not sure if this is helpful in any way, but it looked slightly odd missing /dev/nvme1n1p2, though it may just be some debug info file you are using. Anyway, thanks for all your work on this and I’ll keep checking in.
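The gap noticed above can be checked mechanically. The Python sketch below treats the file contents as whitespace-separated tokens, where `/dev/...pN` tokens name a partition and `N:value` tokens carry that partition's IDs, and reports partition numbers that have IDs but no device token. This reading of the file layout is an assumption based only on the two dumps posted here, and the function name is hypothetical. It also assumes NVMe-style names ending in `pN`.

```python
import re

def partitions_missing_device_token(text: str) -> list:
    """Return partition numbers that have 'N:value' entries but no '/dev/...pN' token."""
    devices = set()
    numbered = set()
    for token in text.split():
        if token.startswith("/dev/"):
            match = re.search(r"p(\d+)$", token)  # whole-disk names do not match
            if match:
                devices.add(int(match.group(1)))
        else:
            num, sep, _ = token.partition(":")
            if sep and num.isdigit():
                numbered.add(int(num))
    return sorted(numbered - devices)

# Contents of the two files as posted in this thread.
D1 = (
    "/dev/nvme0n1 c0c5d1ae-844a-476e-81c6-7df5e3996ef1 "
    "1:d8b3acfb-02fa-4da3-9eee-b42d59256a4f "
    "/dev/nvme0n1p2 2:2607-0C5E 2:ee3b91d5-4673-4e49-9011-7d3e10182cbe "
    "/dev/nvme0n1p3 3:bc88509e-b6ed-49c0-9106-dc7976a67b2a 3:9e96fd40-79f2-4d79-b2e7-574ddd2b5ce6 "
    "/dev/nvme0n1p4 4:HtWBPV-9Aom-jBpz-4pyv-qy3A-lf2o-x58axC 4:fea80442-d73a-494d-be20-1aeee1f51158"
)
D2 = (
    "/dev/nvme1n1 5c273d41-1202-4874-8a69-9af1285c6d77 "
    "/dev/nvme1n1p1 1:DEFC-1910 1:2b0507fe-9371-463b-832d-63c3aa24795e "
    "2:9667e751-1aee-4f09-b9cc-8e1c16b3010b "
    "/dev/nvme1n1p3 3:382631A826316850 3:3adca3cc-702f-4084-9f16-3b8f241cf81e "
    "/dev/nvme1n1p4 4:2C0C9D570C9D1D40 4:e12d4c98-026a-406c-8043-0498c66be933"
)

print("d1 missing device tokens for partitions:", partitions_missing_device_token(D1))
print("d2 missing device tokens for partitions:", partitions_missing_device_token(D2))
```

Run against the two dumps, this flags partition 1 in d1 and partition 2 in d2, so the first file shows the same kind of gap as the second.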
@jmason Hmmmm, thanks for the heads up! Not sure what is going on there but I am fairly sure the init_nvme.xz file on our webserver has not changed since I posted the link last. It’s only Tom and me having access at the moment and I don’t think he’s done anything to it. To me this sounds more like there is still an issue within the scripts that only appears in certain situations. Will try to find it on the weekend.
@Sebastian-Roth I decided to run these all again after installing FOG on our permanent server with the init_nvme.xz file. I did a new capture and deploy.
I ran deploy in debug mode and regular mode, and only the "Disk UUID being set to........" line is still blank.
The partition type and partition UUID are being set for each partition. Not sure what would have caused that change, unless there was some quick update to the init_nvme.xz file between when I downloaded it to the test server on Tuesday morning and to the permanent server on Tuesday afternoon.
@Sebastian-Roth So I ran a deploy today not in debug mode and noticed that the UUID lines that were missing in debug mode actually showed what appeared to be UUIDs as the deployment concluded.
@jmason Will be a couple of days till I find enough time to re-examine this. Will let you know.
@Sebastian-Roth The *.size files were generated properly, but the Disk UUID, Partition type, and Partition UUID lines still show just "being set to.................................." after writing an image to a drive. It still appears to be writing the images.
See if the *.size files are being generated properly when capturing the image as well as properly setting UUID and type after deploying the partition images.
The only odd thing was after writing the windows disk…all of the UUID/partition related set lines were just
Thanks for mentioning this! Definitely something else I missed. The whole scripting code is a huge thing and very easy to miss things here and there. I am still working on this but will have a new version ready for test in the next days.