Dell 7730 precision laptop deploy GPT error message
-
@Sebastian-Roth said in Dell 7730 precision laptop deploy GPT error message:
@jmason @Tom-Elliott
Adding a simple sector count check is not much of a thing to implement and it would work in most situations (at least those I can think of so far). Even if the two disks are same size it wouldn’t hurt because deploying to the “wrong” one is not a problem.
This would definitely be true for me if my systems had two identically sized hard drives, as we would be imaging them both. I wouldn’t really care which one it picked as long as both were available at boot.
Could you make the functionality optional via some kind of checkbox if multi-disk non-resizeable is selected? Then it wouldn’t affect everyone using that selection unless they chose to enable it.
-
@jmason said in Dell 7730 precision laptop deploy GPT error message:
Could you make the functionality optional via some kind of check mark if multi-disk non-resizeable is selected?
Probably can, but I don’t see why this would affect other users at all. The “All Disk” option is non-resizable, and therefore trying to allocate the image to the right disk by using a sector count shouldn’t hurt anyone really.
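To make the idea concrete, here is a minimal sketch of such a size check. It is not the FOG implementation: `pick_disk_by_size` is a hypothetical helper, and the byte sizes are the ones reported later in this thread. On a live system the `DEV:BYTES` pairs could be built from `blockdev --getsize64` (bytes) or `blockdev --getsz` (512-byte sectors), as long as the image side is recorded in the same unit.

```shell
# pick_disk_by_size SIZE "DEV:BYTES" ...
# Hypothetical helper: prints the first device whose size equals SIZE,
# returns 1 if no device matches.
pick_disk_by_size() {
    want="$1"; shift
    for entry in "$@"; do
        dev="${entry%%:*}"    # text before the first colon
        bytes="${entry#*:}"   # text after the first colon
        if [ "$bytes" = "$want" ]; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}

# Example: the kernel enumerated the larger disk first on this boot,
# so the 512 GB image should go to nvme1n1 rather than nvme0n1.
pick_disk_by_size 512110190592 \
    /dev/nvme0n1:1024209543168 /dev/nvme1n1:512110190592
```

With two disks of identical size the first match wins, which is the harmless case described above.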
-
@Sebastian-Roth Well if you move forward with this just let me know when you want some testing.
-
@Sebastian-Roth @Tom-Elliott One thing I realized today is that when the deploy fails it reboots and that gives the system a chance to initialize the way the master image expects.
Initially I assumed the key to this working for my setup was in making sure that the smaller drive was the first drive in the master image captured, so that it didn’t attempt to deploy the smaller image onto the larger drive and then fail when attempting to image the larger image onto the smaller drive. I’m not sure that is actually necessary.
So I hooked up 10 of my laptops to the switch today and deployed the group, about half failed the first startup, but on the next reboot all of them initialized the drives as the master image expected.
This might not work well for a system with more than 2 nvme drives being imaged, so I’ll still help test anything you guys come up with and need testing. But I’m fairly satisfied with even the failure and reboot and hoping it will init correctly on the next boot.
-
@jmason Well, that is definitely not too bad of an idea: just let it try often enough until it doesn’t fail anymore. While this will take some time pressure off you, it’s not an ideal solution. I will let you know when I have something ready to test.
-
@jmason said in Dell 7730 precision laptop deploy GPT error message:
Ubuntu showed the behavior on the 3rd with lsblk and 5th reboot with dmesg, while reboot 7 was different than all previous, I’ll move on to the other 2 ISOs next.
Since this issue is happening with commercial versions of Linux… I wonder if there is any value in calling Dell tech support? This could be the Linux kernel doing this, or it could be the UEFI firmware. I think you have enough evidence to say it’s either the hardware, UEFI, or the Linux kernel doing it. Your Ubuntu test doesn’t use the latest kernel, but FOG does, so you have a range of kernels where this problem exists.
-
@jmason Ok, got a bit of time to code and test over the weekend. Here is a first try.
Download the init file from our website manually and put it in /var/www/html/fog/service/ipxe/ on your FOG server. Make sure the file is owned by the Apache web server user (see the user name on the other files in that directory)! Now edit the settings of one of the hosts you are trying to deploy to in the FOG web UI and set Host Init to init_nvme.xz. Schedule a deploy task and keep an eye on the blue partclone screen. It should tell you NTFS for Windows partitions and probably XFS or EXT4 for the CentOS Linux partitions. See which one it does first. Note that down and do another two or three rounds until you see it deploying the other OS/disk first.
-
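For anyone following along, the copy-and-ownership step could look roughly like the sketch below. `install_init` is a hypothetical helper, not part of FOG; it assumes GNU coreutils (for `chown --reference`) and that a file such as init.xz already sits in the ipxe directory with the correct owner.

```shell
# install_init SRC DEST_DIR REF_NAME
# Hypothetical helper: copies SRC into DEST_DIR and gives the copy the same
# owner and group as DEST_DIR/REF_NAME (a file that is already served fine).
install_init() {
    src="$1"; dest_dir="$2"; ref="$3"
    target="$dest_dir/${src##*/}"          # keep the original file name
    cp -- "$src" "$target" || return 1
    chown --reference="$dest_dir/$ref" -- "$target"
}

# Assumed usage on the FOG server, run as root; init.xz is assumed to be one
# of the existing files in that directory:
# install_init ./init_nvme.xz /var/www/html/fog/service/ipxe init.xz
```

Copying the owner from a neighbouring file avoids having to know whether the web server user is called apache, www-data or something else on a given distribution.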
@Sebastian-Roth set it up as described and did a debug deploy. There was a new message I hadn’t seen before:
* Preparing Partition layout
cat: '/images/myImageName/*.size' : No such file or directory
Same message appears after the Attempting to deploy image notice box.
In the Partclone window, the File System partitions for my linux ~500GB drive showed:
/dev/nvme0n1p1 as raw (134.2 MB)
/dev/nvme0n1p2 as FAT16 (209.7 MB)
/dev/nvme0n1p3 as XFS (1.1GB)
/dev/nvme0n1p4 as raw (510.7 GB)
Just checking to see if this is running as expected…will post nvme1n1 partition info from this first run once the current one completes.
-
@jmason Hmmm, forgot to tell you that you need to re-upload the image before deployment. Sorry! On upload the *.size files will be generated.
-
@Sebastian-Roth So with the newly captured image, also created with init_nvme.xz as Host Init, I brought the host up for deploy in debug mode.
I ensured that the init disk order nvme0n1 and nvme1n1 matched the same order for when I made the image (Just like I did last week when deploying to my 19 laptops).…well running again now with the proper init settings and host image.
-
@Sebastian-Roth Okay after attempting it the third time I’m sure that I have everything assigned appropriately. Still running in deploy-debug the message is confirmed.
After the Preparing Partition layout message I get the An error has been detected! box.
No drive number passed (restore PartitionTablesAndBootLoaders)
Args Passed: /dev/nvme0n1 /images/mydiskimage 50 all
Kernel variables and settings:
bzImage loglevel=4 initrd=init_nvme.xz root=dev/ram0 rw amdisk_size=127000 web=http://192.168.0.1/fog/ consoleblank=0 rootfstype=ext4 shutdown=1 mac=macaddressoflaptop ftp=192.168.0.1 storage=192.168.0.1:/images/ storageip=192.168.0.1 osid=50 irqpoll hostname=mylaptop chkdsk=0 img=mydiskimage imgType=mpa imgPartitionType=all imgid=11 imgFormat=0 PIGZ_COMP=-6 hostearly=1 isdebug=yes type=down shutdown=1
-
@jmason Ok, seems like I haven’t got it correct on the first try. Need your support now to figure out where I went wrong. In debug deploy, after it failed with the error message please run the following commands, take a picture or post the output here:
blockdev --getsize64 /dev/nvme0n1
blockdev --getsize64 /dev/nvme1n1
cat /images/mydiskimage/*.size
My guess is that the disks are not exactly the same size as the ones you have in the target machine, but we’ll see.
-
@Sebastian-Roth said in Dell 7730 precision laptop deploy GPT error message:
blockdev --getsize64 /dev/nvme0n1
512110190592
blockdev --getsize64 /dev/nvme1n1
1024209543168
cat /images/mydiskimage/*.size
512110190592
1024209543168
-
@jmason said in Dell 7730 precision laptop deploy GPT error message:
cat /images/mydiskimage/*.size
512110190592
1024209543168
Argggg, my fault here. Can you please edit the *.size text files and make them look like this:
d1.size:
1:512110190592
d2.size:
2:1024209543168
(just add the number and a colon at the beginning of each file; I will fix that in init_nvme.xz soon!)
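As a quick way to tell whether a .size file already has the disk-number prefix, a check along these lines could be run against each line. This is only a sketch of the `N:BYTES` layout shown above; `parse_size_line` is a hypothetical helper and says nothing about how the init script itself reads the files.

```shell
# parse_size_line "N:BYTES"
# Hypothetical helper: prints "N BYTES" when the line carries a disk-number
# prefix, returns 1 for anything else (including a bare byte count).
parse_size_line() {
    case "$1" in
        *[!0-9:]*|*:*:*|:*|*:) return 1 ;;   # stray characters, extra colons, empty field
        *:*) printf '%s %s\n' "${1%%:*}" "${1#*:}" ;;
        *) return 1 ;;                        # bare number without a prefix
    esac
}

parse_size_line "1:512110190592"
parse_size_line "512110190592" || echo "missing disk number prefix"
```

The second call uses the content the files had before the manual edit, so it takes the failure branch.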
-
@Sebastian-Roth No problem…it is running now
Partclone window recap for the Linux disk:
/dev/nvme0n1p1 raw 134.2 MB
/dev/nvme0n1p2 FAT16 209.7 MB
/dev/nvme0n1p3 XFS 1.1GB
/dev/nvme0n1p4 raw 510.7 GB
Windows disk:
/dev/nvme1n1p1 FAT32 681.6 MB
/dev/nvme1n1p2 raw 134.2 MB
/dev/nvme1n1p3 NTFS 1.0 TB
/dev/nvme1n1p4 NTFS 1.0 GB
-
@jmason I don’t actually need all the details (filesystem and size). It’s just important that we know nvme0n1 was Linux this time. Please try the deployment (it does not have to be debug) a couple more times; if the Linux kernel detects the disks in a different order, you should see it deploy Linux to nvme1n1 properly as well.
-
@Sebastian-Roth Running it again in deploy debug so I can make sure with lsblk that it initialized the drives in a different order and not have to hope it’s different.
It came up with windows as nvme0n1 this time and nvme1n1 as linux…
The only odd thing was after writing the windows disk…all of the UUID/partition related set lines were just . . . . . . . . . . . . . . . . . . . . . without any values
Resetting UUIDs for /dev/nvme0n1
Disk UUID being set to ........................................
Partition type being set to ..................................
Partition uuid being set to...................................
etc... ... ...
Resetting swap systems
but it appears to have moved on to nvme1n1 and is writing the linux image correctly.
Next steps?
-
@jmason said in Dell 7730 precision laptop deploy GPT error message:
The only odd thing was after writing the windows disk…all of the UUID/partition related set lines were just
Thanks for mentioning this! Definitely something else I missed. The scripting code is huge and it’s very easy to miss things here and there. I am still working on this but will have a new version ready to test in the next few days.