    Dell 7730 precision laptop deploy GPT error message

      george1421 Moderator @jmason

      @jmason First let me say thank you for being so detailed and helping debug this issue. I know this kind of iterative testing takes quite a bit of time, so thank you.

      So I see from the pictures you have a 20% rate where it looks like the nvme disk 1 inits before disk 0. Can we correlate the 6th and 9th order with the swap as shown by lsblk?

      I also see from the picture that the PCI addresses for the nvme disks are not changing, at least the location vs name. My intuition is telling me that this problem is probably rooted in the Linux kernel and/or a hardware/Linux kernel race condition.

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

        jmason @george1421

        @george1421

        Here are two images showing the correlation. I only had to attempt deploy twice this time to get the difference to appear.

        nvme1n1-nvme0n1.PNG

        nvme0n1-nvme1n1.PNG

        Well, I just ran it a third time and got another difference, but it appears it might be just like the first image, only with the output in a different order.
        nvme0n1-nvme1n1-1TB-500GB.PNG

          Sebastian Roth Moderator

          @jmason Great stuff. Thanks for that as well. From those pictures it looks like the order of initialization (dmesg output nvme0n1 before nvme1n1 or vice versa) does not correlate with the disks being in a different order.

          1. picture: init nvme1n1 before nvme0n1 - nvme0n1 954 GB disk / nvme1n1 477 GB disk
          2. picture: init nvme0n1 before nvme1n1 - nvme0n1 477 GB disk / nvme1n1 954 GB
          3. picture: init nvme0n1 before nvme1n1 - nvme0n1 954 GB disk / nvme1n1 477 GB
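
To make the missing correlation explicit, the three observations above can be tabulated as in this minimal Python sketch. The variable names and the tuple layout are made up for illustration; only the values come from the pictures.

```python
from collections import defaultdict

# (device that initialized first per dmesg, GB seen as nvme0n1, GB seen as nvme1n1)
observations = [
    ("nvme1n1", 954, 477),  # picture 1
    ("nvme0n1", 477, 954),  # picture 2
    ("nvme0n1", 954, 477),  # picture 3
]

# Group the observed name-to-size layouts by which device initialized first.
mapping_by_init_order = defaultdict(set)
for first_init, size_nvme0, size_nvme1 in observations:
    mapping_by_init_order[first_init].add((size_nvme0, size_nvme1))

# If init order decided the naming, every init order would map to exactly one layout.
init_order_predicts_layout = all(
    len(layouts) == 1 for layouts in mapping_by_init_order.values()
)

for first_init, layouts in sorted(mapping_by_init_order.items()):
    print(first_init, "initialized first ->", sorted(layouts))
print("init order predicts layout:", init_order_predicts_layout)
```

Pictures 2 and 3 share the same init order but show opposite layouts, so the final flag comes out as False.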

          I suppose if you’d have three disks it could be any combination… 😞

          Let’s see what we can do about this. Can you please get a couple of different Linux Live ISOs and do exactly the same testing on those.

          • Debian: https://cdimage.debian.org/debian-cd/current-live/amd64/iso-hybrid/debian-live-9.7.0-amd64-xfce.iso
          • Ubuntu: http://releases.ubuntu.com/18.04.1/ubuntu-18.04.1-desktop-amd64.iso
          • Arch: https://mirror.orbit-os.com/archlinux/iso/2019.01.01/archlinux-2019.01.01-x86_64.iso
          • SystemRescueCD: https://osdn.net/projects/systemrescuecd/storage/releases/6.0.1/systemrescuecd-6.0.1.iso

          Please see if all of those behave exactly the same (random change on every reboot) or if the disk order seems stable. Boot each OS at least ten times.
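
If typing the results down by hand gets tedious, something like the following Python sketch could record the name-to-size layout on each boot. It assumes `lsblk` from util-linux is available in the live system; the function names and the sample sizes are illustrative assumptions, not values taken from this thread.

```python
import subprocess
from typing import Dict


def parse_lsblk(output: str) -> Dict[str, int]:
    """Parse `lsblk -b -d -n -o NAME,SIZE` output into {name: size_in_bytes}."""
    disks = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            name, size = parts
            disks[name] = int(size)
    return disks


def current_nvme_layout() -> Dict[str, int]:
    """Run lsblk on the live system and keep only the NVMe namespaces."""
    result = subprocess.run(
        ["lsblk", "-b", "-d", "-n", "-o", "NAME,SIZE"],
        capture_output=True, text=True, check=True,
    )
    layout = parse_lsblk(result.stdout)
    return {name: size for name, size in layout.items() if name.startswith("nvme")}


# Illustrative input only: sizes standing in for a 1 TB and a 500 GB disk.
sample = "nvme0n1 1024209543168\nnvme1n1  512110190592\n"
print(parse_lsblk(sample))
```

Calling `current_nvme_layout()` after every reboot and appending the result to a file on a USB stick would give a log that can be compared across boots.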

          Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

          Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

            jmason @Sebastian Roth

            @Sebastian-Roth Grabbing the ISOs now, debian and ubuntu just updated their release.

            https://cdimage.debian.org/debian-cd/current-live/amd64/iso-hybrid/debian-live-9.8.0-amd64-xfce.iso
            http://releases.ubuntu.com/18.04.2/ubuntu-18.04.2-desktop-amd64.iso

              jmason @Sebastian Roth

              @Sebastian-Roth

              I am unable to get Debian live to start up after the initial run without install menu.

              Ubuntu showed the behavior on the 3rd reboot with lsblk and on the 5th reboot with dmesg, while reboot 7 was different from all previous ones. I’ll move on to the other 2 ISOs next.

              -1-ubuntulive-1.PNG

              -2-ubuntulive-2.PNG

              -3-ubuntulive-3.PNG

              -4-ubuntulive-4.PNG

              -5-ubuntulive-5.PNG

              -6- was like -3-

              -7-ubuntulive-7.PNG

                Tom Elliott @jmason

                @jmason What mode is SATA operation set to in the BIOS?

                I’m guessing it’s in RAID mode. Since RAID mode would want at least 2 HDDs regardless of RAID level, is it likely that the RAID controller is changing how the disks present to the OS?

                Essentially, because it wants RAID, the order in which the disks are listed wouldn’t matter for building the array and having things work properly. But because we aren’t in a RAID configuration, we are seeing the issue?

                  jmason

                  @Tom-Elliott I specifically have set the mode to AHCI for these systems.

                    jmason @Sebastian Roth

                    @Sebastian-Roth

                    The first 4 reboots in Arch Linux all appear to show different init orders.

                    archliux1-4.png reboot -5- was like -1-

                      jmason @Sebastian Roth

                      @Sebastian-Roth said in Dell 7730 precision laptop deploy GPT error message:

                      SystemRescueCD

                      5 reboots, 1st 4 times different init, 5th same as 1st

                      sysresccd1-5.png

                        Sebastian Roth Moderator

                        @jmason To me this seems to be enough evidence that it’s a general “issue” or known to work as intended. I suspect this to be “normal” as PCIe initialization probably returns the disks in different order. Weird thing is that I can’t find much about this being a particular issue with NVMe disks.

                          jmason @george1421

                          @george1421 said in Dell 7730 precision laptop deploy GPT error message:

                          @jmason said in Dell 7730 precision laptop deploy GPT error message:

                          There is a BIOS update with the following fixes, but don’t see anything related to our issue. I can go ahead and update the system with all the latest fixes available if requested.

                          I would do this no matter what even though the change log shows the fix primarily dealing with the usb-c dock.

                          I did load all available bios/firmware updates and retested the behavior and it is still the same.

                            jmason @Sebastian Roth

                            @Sebastian-Roth Is it feasible to have an option for multiple disk non-resizeable plus some kind of checkbox/option to tell FOG that the machines are identical drive-wise/hardware-wise, and would it make a difference? It’s been a long time since I did any coding, and it wasn’t related to this at all, so I’m just throwing a thought out.

                              Sebastian Roth Moderator

                              @jmason Sorry if it sounded like I’d leave you alone now that we are fairly sure it’s just “normal” behaviour. I am still thinking about how we can solve this for you and others. I have not come up with a great solution yet, so I am sort of postponing the implementation in the hope of a flash of genius.

                              What is your deadline to get those devices imaged?

                                Tom Elliott @Sebastian Roth

                                @Sebastian-Roth Everything I’ve found on this issue refers to using the disk’s UUID to identify which one to apply it to. That doesn’t help us much, as every drive on a system would have its own UUID. So how do we identify which is which? I know this doesn’t solve anything. None of the device names, from SATA to PATA to NVMe, are guaranteed to be persistent under Linux. Luckily SATA and PATA seem to follow the channel pattern in how they’re connected and named. With NVMe sitting on a PCIe channel, enumeration depends on how fast a disk feels like revealing itself to the system.
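
For anyone who wants to see such per-disk identifiers on a test machine: on udev-based Linux systems they are usually exposed as symlinks under `/dev/disk/by-id`. The Python sketch below only lists them. The function name is an assumption for illustration, and as noted above the identifiers are unique per drive, so this does not tell two machines' disks apart by role.

```python
import os
from typing import Dict


def stable_ids(by_id_dir: str = "/dev/disk/by-id") -> Dict[str, str]:
    """Map each symlink name in by_id_dir to the kernel device name it points at."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        # Directory is missing on systems without udev; report nothing.
        return mapping
    for entry in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, entry)
        if os.path.islink(path):
            # realpath follows the link; the basename is the kernel name (e.g. nvme0n1).
            mapping[entry] = os.path.basename(os.path.realpath(path))
    return mapping
```

Printing `stable_ids()` on two consecutive boots would show the same identifier pointing at a different kernel name whenever the enumeration order flips.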

                                  Sebastian Roth Moderator

                                  @Tom-Elliott You are spot on! The only thing I have come up with so far is saving the disks’ sector sizes (in multiple disk mode only) and trying to match those again on deployment. Kind of ugly and possibly error-prone, but I could give it a try.
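
A minimal Python sketch of that matching idea, reading "sector sizes" as each disk's total size in sectors (that reading is an assumption). The function name and data shapes are hypothetical and not FOG code; the sketch refuses to guess when two disks have the same size, which is exactly the error-prone case.

```python
from typing import Dict


def match_disks_by_size(captured: Dict[int, int], present: Dict[str, int]) -> Dict[int, str]:
    """Map image disk index -> device name currently holding the same sector count.

    captured: image disk index -> sector count saved at capture time.
    present:  kernel device name -> sector count seen on this boot.
    Raises ValueError if sizes are ambiguous or an image disk has no match.
    """
    if len(set(captured.values())) != len(captured):
        raise ValueError("captured disks share a size; cannot match by size alone")
    if len(set(present.values())) != len(present):
        raise ValueError("present disks share a size; cannot match by size alone")
    by_size = {size: name for name, size in present.items()}
    result = {}
    for index, size in captured.items():
        if size not in by_size:
            raise ValueError(f"no present disk with {size} sectors for image disk {index}")
        result[index] = by_size[size]
    return result


# Illustrative sector counts for a 500 GB and a 1 TB disk (512-byte sectors).
captured = {0: 1000215216, 1: 2000409264}
swapped_boot = {"nvme0n1": 2000409264, "nvme1n1": 1000215216}
print(match_disks_by_size(captured, swapped_boot))  # {0: 'nvme1n1', 1: 'nvme0n1'}
```

With two identically sized disks the function raises instead of picking one, so a different criterion would be needed for such machines.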

                                    jmason @Sebastian Roth

                                    @Sebastian-Roth said in Dell 7730 precision laptop deploy GPT error message:

                                    What is your deadline to get those devices imaged?

                                    I have until mid March before my first full implementation with these new training laptops. I can always image them individually via usb until a working solution is found (aka someone learns how to control the nvme and its feelings of revealing).

                                      jmason @Tom Elliott

                                      @Tom-Elliott said in Dell 7730 precision laptop deploy GPT error message:

                                      @Sebastian-Roth everything I’ve found on this issue refers to using the disks uuid to identify which one to apply it to. That doesn’t help us much as every drive on a system would have its own uuid.

                                      When registering a system Host into FOG, you’d have to store the UUIDs of the drives and then specify which one would be your disk0/sda and disk1/sdb, etc. … thinking out loud is all.

                                      Then on deploy if the UUID fields and their mappings are set you use that, otherwise operate as usual.

                                        Tom Elliott @jmason

                                        @jmason The problem isn’t finding the UUID, it’s that the UUID will be different for each disk.

                                        What do I mean?

                                        One 7730 with 2 NVME drives will have different UUIDs.

                                        Another 7730 with 2 NVME drives (identically sized of course) will also have different UUIDs.

                                        Does this make sense?

                                          jmason @Tom Elliott

                                          @Tom-Elliott said in Dell 7730 precision laptop deploy GPT error message:

                                          @jmason The problem isn’t finding the UUID, it’s that the UUID will be different for each disk.

                                          What do I mean?

                                          One 7730 with 2 NVME drives will have different UUIDs.

                                          Another 7730 with 2 NVME drives (identically sized of course) will also have different UUIDs.

                                          Does this make sense?

                                          Yes it makes sense, but I failed in conveying my thought.

                                          My thought was there might be some way when you do a full registration on each host machine to have an option (requiring user input) to designate each nvme drive and its UUID to a fog specific parameter/field ( disk0/sda disk1/sdb etc…) mapping stored in the database.

                                          Then during deploy, if the parameter(s) for the drives are present for the host machine, you would have the info needed to match the images up based on the actual UUIDs, and it wouldn’t matter what the init order of the nvme drives is.

                                          It would require user input to perform the mapping and be optional, and only checked/used for multi-disk non-resizeable.

                                          On registration, the prompt could be something like “Do you wish to register your drives for use in multi-disk capture/deploy operations?” There could maybe even be an option for the UUIDs to be entered manually from the web GUI, but it would be best to capture the UUIDs during host registration.

                                          So the needed info would not be saved with the image, but with the Host machine information in the database.

                                          Not sure if that’s feasible, but just a thought.
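
To make the thought a bit more concrete, here is a rough Python sketch of the lookup at deploy time. Everything in it is hypothetical (the function name, the identifier strings, the dictionary shapes); it is not FOG code and only shows the "use the mapping if present, otherwise operate as usual" logic.

```python
from typing import Dict, List, Optional


def resolve_deploy_order(
    host_mapping: Optional[Dict[str, int]],
    present: Dict[str, str],
    default_order: List[str],
) -> List[str]:
    """Return kernel device names in image-disk order.

    host_mapping:  disk identifier -> image disk slot (0, 1, ...), or None if
                   the host never registered its drives.
    present:       disk identifier -> kernel name seen on this boot.
    default_order: kernel names in the order used when no mapping applies.
    """
    if not host_mapping:
        return list(default_order)
    if set(host_mapping) - set(present):
        # A registered disk is missing on this boot; fall back rather than guess.
        return list(default_order)
    by_slot = sorted(host_mapping.items(), key=lambda item: item[1])
    return [present[ident] for ident, _slot in by_slot]


# Placeholder identifiers registered for one host: 500 GB disk is slot 0, 1 TB is slot 1.
mapping = {"id-500gb": 0, "id-1tb": 1}
boot_a = {"id-500gb": "nvme0n1", "id-1tb": "nvme1n1"}
boot_b = {"id-500gb": "nvme1n1", "id-1tb": "nvme0n1"}  # names swapped on this boot
print(resolve_deploy_order(mapping, boot_a, ["nvme0n1", "nvme1n1"]))
print(resolve_deploy_order(mapping, boot_b, ["nvme0n1", "nvme1n1"]))
```

On the swapped boot the slot order still follows the registered identifiers, so the 500 GB image would land on the 500 GB disk either way.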

                                            Tom Elliott @jmason

                                            @jmason

                                            The problem is the NVME drives are loading randomly. Essentially one time a drive is coming up as NVME0N1 and the next it’s NVME1N1.

                                            Using the UUID would work, but only for the machine on which you capture the image. Basically, if you go down this route, you would essentially require an image for each machine.

                                            Unless you manage to gather all machines’ UUID information, this just isn’t feasible.

                                            Basically, what I’m saying is:

                                            First: 7730 500GB SSD NVME and 1TB SSD NVME. 500GB UUID 0000-xxxx-0000-xxxx, 1TB UUID 0001-xxxx-0000-xxxx
                                            Second: 7730 500GB SSD NVME and 1 TB SSD NVME. 500GB UUID 0001-xxxa-0001-xxxa, 1TB UUID 0002-xxxz-0000-xxxz

                                            You see what I mean?

                                            Each machine’s drives will have their own UUIDs. So simply put, you would need to know every machine’s UUID information and insert it into the DB to clarify which drive is which.

                                            Of course, our code doesn’t support this yet either. I imagine it wouldn’t be too difficult to enable, but it basically removes the autonomous element, at least for these machines.

                                            With NVME, the device name is the part that changes, and that name is what determines the drive labeling. With SATA and PATA this was also possible, but the channels (SATA0 - SATA4, or however many you had on your machine) would enumerate to Linux in order of their channel number. This made /dev/sda always be on SATA0 and /dev/sde on SATA4 (with all five ports populated).

                                            In the case of PATA, the naming would also be adjusted based on enumeration, but the Master slot on channel 0 would be /dev/hda, while the Slave slot on channel 1 would be /dev/hdd
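
As a small illustration of why the old PATA names were predictable: under the legacy IDE driver the drive letter followed directly from the channel and the master/slave position. The Python helper below is a hypothetical name written for this post and just encodes that rule.

```python
def pata_device_name(channel: int, is_slave: bool) -> str:
    """Legacy IDE naming: the letter is fixed by channel and master/slave position."""
    if channel < 0:
        raise ValueError("channel must be non-negative")
    # Two positions per channel: master first, then slave.
    position = 2 * channel + (1 if is_slave else 0)
    return "/dev/hd" + chr(ord("a") + position)


print(pata_device_name(0, is_slave=False))  # /dev/hda
print(pata_device_name(1, is_slave=True))   # /dev/hdd
```

NVMe has no equivalent fixed slot-to-letter rule in the kernel name, which is why nvme0n1 and nvme1n1 can swap between boots.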

                                            Hopefully this helps clarify more what I was trying to get at.

                                            Copyright © 2012-2024 FOG Project