Dell 7730 precision laptop deploy GPT error message

jmason

@Sebastian-Roth said in Dell 7730 precision laptop deploy GPT error message:

blockdev --getsize64 /dev/nvme0n1

512110190592

blockdev --getsize64 /dev/nvme1n1

1024209543168

cat /images/mydiskimage/*.size

512110190592
1024209543168

Sebastian Roth

@jmason said in Dell 7730 precision laptop deploy GPT error message:

cat /images/mydiskimage/*.size

512110190592
1024209543168

Argggg, my fault here. Can you please edit the *.size text files and make those look like this:
d1.size:

1:512110190592

d2.size:

2:1024209543168

(just add the disk number and a colon at the beginning of each file; I will fix that in init_nvme.xz soon!)
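The manual edit described above could be scripted roughly as follows. This is only a sketch: it assumes the files are named d1.size, d2.size and so on, and that each one holds just the bare byte count on a single line. The function name `prefix_size_files` is made up for illustration and is not part of FOG.

```shell
# Hypothetical helper: prefix each dN.size file with "N:" if the prefix is missing.
# $1 = image directory, e.g. /images/mydiskimage
prefix_size_files() {
    dir="$1"
    for f in "$dir"/d*.size; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        num=$(basename "$f" .size)       # e.g. "d1"
        num=${num#d}                     # e.g. "1"
        content=$(cat "$f")
        case "$content" in
            *:*) ;;                      # already prefixed, leave untouched
            *) printf '%s:%s\n' "$num" "$content" > "$f" ;;
        esac
    done
}
```

Calling `prefix_size_files /images/mydiskimage` a second time changes nothing, because files that already contain a colon are skipped.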

jmason

@Sebastian-Roth No problem…it is running now

Recap of the Partclone windows for the Linux disk:

/dev/nvme0n1p1 raw 134.2 MB
/dev/nvme0n1p2 FAT16 209.7 MB
/dev/nvme0n1p3 XFS 1.1 GB
/dev/nvme0n1p4 raw 510.7 GB

Windows disk:

/dev/nvme1n1p1 FAT32 681.6 MB
/dev/nvme1n1p2 raw 134.2 MB
/dev/nvme1n1p3 NTFS 1.0 TB
/dev/nvme1n1p4 NTFS 1.0 GB

Sebastian Roth

@jmason We don’t actually need all the details (filesystem and size). The important part is that nvme0n1 was Linux this time. Please try the deployment a couple of times (it does not have to be a debug task). If the Linux kernel detects the disks in a different order, you should see it deploy Linux to nvme1n1 properly as well.
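To illustrate the idea, here is a rough sketch of how an image's disk number could be matched to whichever device currently has the recorded size, regardless of enumeration order. It is not the actual init_nvme.xz code, and it assumes size matching is what the *.size files are used for. The function name `find_disk_for_image` and the two input files are hypothetical.

```shell
# Hypothetical helper: print the device whose size matches image disk N.
# $1 = disk number
# $2 = file with "N:bytes" lines (the combined *.size contents)
# $3 = file with "device bytes" lines (one line per detected disk)
find_disk_for_image() {
    # Look up the byte count recorded for this disk number.
    want=$(awk -F: -v n="$1" '$1 == n { print $2 }' "$2")
    [ -n "$want" ] || return 1
    # Print the first device with that size; fail if none matches.
    awk -v s="$want" '$2 == s { print $1; found = 1; exit } END { exit !found }' "$3"
}
```

The device table could be built from the output of `blockdev --getsize64` for each disk. Note that this simple approach cannot tell two disks of identical size apart.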

jmason

@Sebastian-Roth Running it again in deploy debug so I can confirm with lsblk that the drives were initialized in a different order, rather than just hoping they were.

It came up with Windows as nvme0n1 this time and Linux as nvme1n1.

The only odd thing was that after writing the Windows disk, all of the UUID/partition-related "set" lines were just rows of dots without any values:

Resetting UUIDs for /dev/nvme0n1
Disk UUID being set to ........................................
Partition type being set to ..................................
Partition uuid being set to...................................
etc...
...
...
Resetting swap systems

However, it appears to have moved on to nvme1n1 and is writing the Linux image correctly.

Next steps?

Sebastian Roth

@jmason said in Dell 7730 precision laptop deploy GPT error message:

The only odd thing was after writing the windows disk…all of the UUID/partition related set lines were just

Thanks for mentioning this! Definitely something else I missed. The scripting code is large and it is very easy to miss things here and there. I am still working on this but will have a new version ready for testing in the next few days.

Sebastian Roth

@jmason Took a little while to get this all sorted. Can you please re-download init_nvme.xz? It should include all current changes and fixes.

Please check that the *.size files are generated properly when capturing the image, and that the UUIDs and partition types are set properly after deploying the partition images.

jmason

@Sebastian-Roth The *.size files were generated properly, but the Disk UUID, Partition type, and Partition UUID lines still show just "being set to" followed by dots after writing an image to a drive. The images themselves still appear to be written.

Sebastian Roth

@jmason Will be a couple of days till I find enough time to re-examine this. Will let you know.

jmason

@Sebastian-Roth So I ran a deploy today not in debug mode and noticed that the UUID lines that were missing in debug mode actually showed what appeared to be UUIDs as the deployment concluded.

jmason

@Sebastian-Roth I decided to run these all again after installing FOG on our permanent server with the init_nvme.xz file. I did a new capture and deploy.

I ran the deploy in both debug mode and regular mode, and only the "Disk UUID being set to" line is still blank.
The partition type and partition UUID are being set for each partition. Not sure what would have caused that change, unless there was some quick update to the init_nvme.xz file between when I downloaded it Tuesday morning to the test server and Tuesday afternoon to the permanent server.

Sebastian Roth

@jmason Hmmmm, thanks for the heads up! Not sure what is going on there, but I am fairly sure the init_nvme.xz file on our webserver has not changed since I last posted the link. Only Tom and I have access at the moment and I don’t think he’s done anything to it. To me this sounds more like there is still an issue within the scripts that only appears in certain situations. Will try to find it on the weekend.

jmason

@Sebastian-Roth I was looking in the images directory; here are the contents of the d1.original.uuids file:

/dev/nvme0n1 c0c5d1ae-844a-476e-81c6-7df5e3996ef1
 1:d8b3acfb-02fa-4da3-9eee-b42d59256a4f
/dev/nvme0n1p2 2:2607-0C5E 2:ee3b91d5-4673-4e49-9011-7d3e10182cbe
/dev/nvme0n1p3 3:bc88509e-b6ed-49c0-9106-dc7976a67b2a 3:9e96fd40-79f2-4d79-b2e7-574ddd2b5ce6
/dev/nvme0n1p4 4:HtWBPV-9Aom-jBpz-4pyv-qy3A-lf2o-x58axC 4:fea80442-d73a-494d-be20-1aeee1f51158

vs. the d2.original.uuids file:

/dev/nvme1n1 5c273d41-1202-4874-8a69-9af1285c6d77
/dev/nvme1n1p1 1:DEFC-1910 1:2b0507fe-9371-463b-832d-63c3aa24795e
 2:9667e751-1aee-4f09-b9cc-8e1c16b3010b
/dev/nvme1n1p3 3:382631A826316850 3:3adca3cc-702f-4084-9f16-3b8f241cf81e
/dev/nvme1n1p4 4:2C0C9D570C9D1D40 4:e12d4c98-026a-406c-8043-0498c66be933

Not sure if this is helpful in any way, but it looked slightly odd that the device names /dev/nvme0n1p1 and /dev/nvme1n1p2 are missing. It may just be some debug info file you are using. Anyway, thanks for all your work on this; I’ll keep checking in.
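For anyone wanting to spot such entries quickly, a small sketch like the one below lists the line numbers of entries that do not start with a /dev/ device name. It only assumes the layout shown in the two listings above; the function name `find_nameless_entries` is made up for illustration.

```shell
# Hypothetical helper: print the line numbers of non-empty lines in a
# dX.original.uuids file that do not begin with a /dev/ device name.
find_nameless_entries() {
    awk '!/^\/dev\// && NF > 0 { print NR }' "$1"
}
```

Run against the d1.original.uuids contents above it would report line 2, and against d2.original.uuids it would report line 3.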

Sebastian Roth

@jmason Ok, I had a closer look at the UUID stuff and it turns out that we had a general bug there as well as unneeded code. I did a bit of a cleanup while hopefully fixing the problem you saw with dual NVMe disk machines.

Can you please test the current init.xz (as well as the 32-bit version, if you need that) that our build server produced?

Be aware that I removed the need for dX.original.uuids altogether as we have all the information in other files available already. So when you upload the image again (which you don’t have to for the simple deploy test to see if the UUID stuff is fixed!) you won’t have dX.original.uuids files anymore.

jmason

@Sebastian-Roth We are in the middle of an office move, I will test and respond as soon as I have everything set back up. Hopefully before the end of the week.

jmason

@Sebastian-Roth I was finally able to get everything set back up in the new office location. I downloaded and replaced the init.xz and the UUIDs appeared as expected when I performed a deploy from my original image.

I believe this may be finally solved. I can test more things if you need and somewhat faster now that I’m set back up again.

I can’t express my thanks enough! Kudos!!

jmason

@Sebastian-Roth
Well, I am watching the systems more closely now and have started up some new ones. Perhaps it’s just much slower than I expected with 10 hooked up vs. the 2 I was testing with for a few months.

Potential BIG problem with hopefully an easy fix. I finally have a big training laptop update, so I captured my main image from the laptop again and then hooked up 10 laptops to the switch to deploy as I have been doing. For some reason, when the deploy is complete it RESTARTS deploying all over again. I haven’t updated anything as far as the OS or FOG; all I did was create a BRAND new image to deploy. I thought it was taking an awfully long time for them to complete when I watched one hit the end of the deploy cycle, announce clone complete, display the UUIDs, and then start over with the first drive again.

~~It appears to only repeat the deploy process one more time and then shut down as expected.~~

Sebastian Roth

@jmason May I ask you to pay very close attention to the partition names you see in the output? It is probably best if you schedule a debug deploy job, and whenever you get to a blue partclone screen, please note down the partition (sda1 or similar) as well as the filesystem (NTFS, I suppose).

Let us know if it really goes through sda1, sda2, sda3 and starts over with sda1 again. Can’t really imagine it is doing this.

jmason

@Sebastian-Roth It only ran the process once. I just semi-panicked because it was slower with 10 laptops than when I was only running 1 at a time, and because the pesky NVMe drives start and end with different drives in the multi-NVMe system.

Weird that some of the laptops finish in the expected time (about 45 min was the average), but I have some sitting at over 2 hours on just one partition, and on a few the elapsed time is frozen now.

After turning off and restarting a few of the laptops, the others that appeared frozen started going again.
