
    Error Restoring GPT Partition Tables

    • T
      tlehrian @Quazz
      last edited by

      @Quazz Hmmmm… very interesting. I guess I’m OK with it being a random thing. I can keep booting until it finally, randomly, picks the right drive to be the primary. At least it doesn’t seem to be a larger HW issue. I’ll be looking for a solution to this issue if one does eventually pop up. Thanks for the quick reply.

      • george1421G
        george1421 Moderator @tlehrian
        last edited by

        @tlehrian said in Error Restoring GPT Partition Tables:

        I’m suspecting maybe the drives are being switched around and the process is trying to use the 256GB drive as the primary OS drive, when it won’t be large enough.

        Yes, this is an issue we are tracking in the link Quazz provided. We are looking for someone with some Linux skills and a system with 2 NVMe drives to help us debug. Are you that person?

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

        • T
          tlehrian @george1421
          last edited by

          @george1421 Possibly. Unfortunately I have a small window of opportunity here where I could help test before we need to have all of our systems with two M2 drives up and running for our Fall semester (starting Aug 27). At that point, I’d no longer have any systems to play with.

          • george1421G
            george1421 Moderator @tlehrian
            last edited by

            @tlehrian I’m hoping that a short debugging session (less than an hour of on-and-off testing) will help us find a solution.

            • T
              tlehrian @george1421
              last edited by

              @george1421 Sure, I’d be happy to help where I can. After reading through the thread @Quazz shared, I guess I’m lucky that we’re trying to use the large NVME drive as our OS drive…

              • george1421G
                george1421 Moderator @tlehrian
                last edited by george1421

                @tlehrian OK, here is what we are seeing. When FOS Linux boots, the NVMe drives initialize (become ready to the OS) at different times. Sometimes drive A is ready first and other times drive B is ready first. When Linux boots, whichever drive initializes first becomes /dev/nvme0n1 and the second one becomes /dev/nvme1n1. This is not a bug in FOG or Linux as such; it’s a timing issue between the Linux OS and the hardware.

                So what we need is to run a utility with a few commands to help us detect which drive is which in each state. The switch is totally at random so we can’t predict the order using linux. So what I need you (as a tester) to do is to pxe boot the target computer multiple times to record the settings when the drives are in a normal and then reversed order. If we are lucky you will see this swap within 10 pxe boots.

                Here is what I want you to do:

                1. Download this updated init from here: https://fogproject.org/inits/init_nvme-cli.xz
                2. In `/var/www/html/fog/service/ipxe`, rename the original `init.xz` to `init.xz.sav`.
                3. Move the downloaded file to that directory and save as init.xz
                4. Pick one of these dual drive nvme computers and schedule a deploy task to it. But before you hit the schedule task button tick the debug checkbox then schedule the task.
                5. PXE boot the target computer. After a few screens of text where you need to press enter to clear you will be dropped to the FOS Linux command prompt.
                6. At the FOS Linux command prompt run this command lsblk to note the size and order of the nvme disk. Use disk size to be your guide in determining the order. So this is state 1.
                  6.1 You can use these steps if you want to set up remote debugging. It’s easier to copy and paste commands from PuTTY. You don’t need to; it’s just one option.
                  6.2 At the FOS Linux command prompt key in ip addr show and collect the IP address of the FOS Linux computer.
                  6.3 Give root a password with passwd. Just give it a simple password like hello. The password will be reset on the next reboot. So don’t worry.
                  6.4 From a Windows computer use PuTTY to ssh into the FOS Linux computer. Log in as root with the password you created in step 6.3.
                7. At the FOS Linux command prompt key in the following and post the results here nvme list. If the nvme command isn’t known then the downloaded inits are not in the right spot.
                8. Key in the following command and post the result(s) here nvme id-ctrl /dev/nvme0n1 -H and (I’m guessing at the name since I don’t have a dual nvme system, use the name from the lsblk command above) nvme id-ctrl /dev/nvme0n2 -H
                9. Now reboot the FOS Linux computer with ctrl-alt-del or key in reboot at the FOS Linux command prompt. The system should PXE boot right back into FOS Linux in debug mode.
                10. Use the lsblk command to determine the disk order. We are looking for the order of the drives when they switch places. If you can’t get them to switch, then power off the system instead of rebooting to see if we can get them to switch. The key is to capture the output of the nvme commands in both states.
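
Steps 2 and 3 above (swapping in the debug init on the FOG server) could be scripted roughly as follows. This is only a sketch: `install_debug_init` and `restore_original_init` are hypothetical helper names, not part of FOG, and the directory is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Sketch only: these helper names are made up for illustration.

# Back up the stock init.xz (once) and copy the debug init into its place.
#   $1 = ipxe directory, e.g. /var/www/html/fog/service/ipxe
#   $2 = downloaded debug init, e.g. init_nvme-cli.xz
install_debug_init() {
    ipxe_dir=$1
    new_init=$2

    [ -f "$new_init" ] || { echo "missing $new_init" >&2; return 1; }

    # Keep the first backup; never overwrite it with an already swapped init.
    if [ ! -e "$ipxe_dir/init.xz.sav" ]; then
        [ -f "$ipxe_dir/init.xz" ] || { echo "no init.xz in $ipxe_dir" >&2; return 1; }
        mv "$ipxe_dir/init.xz" "$ipxe_dir/init.xz.sav" || return 1
    fi
    cp "$new_init" "$ipxe_dir/init.xz"
}

# Put the original init back once testing is finished.
restore_original_init() {
    ipxe_dir=$1
    [ -f "$ipxe_dir/init.xz.sav" ] || { echo "no backup in $ipxe_dir" >&2; return 1; }
    mv -f "$ipxe_dir/init.xz.sav" "$ipxe_dir/init.xz"
}
```

Running `install_debug_init /var/www/html/fog/service/ipxe ./init_nvme-cli.xz` twice is harmless, because the backup is only taken when no `init.xz.sav` exists yet.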

                • T
                  tlehrian @george1421
                  last edited by

                  @george1421 Ok. I should have a chance to do this later today. I’m hopeful this leads to a fix for the issue.

                  • T
                    tlehrian @george1421
                    last edited by

                    @george1421 Here is the information from State 1:

                    State 1:
                    > nvme list
                    
                    Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
                    ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
                    /dev/nvme0n1     S499NX0M113634       SAMSUNG MZVLB256HAHQ-000H2               1           2.95  GB / 256.06  GB    512   B +  0 B   EXD71HAQ
                    /dev/nvme1n1     S498NA0M403426       SAMSUNG MZVLB512HAJQ-000H2               1         149.49  GB / 512.11  GB    512   B +  0 B   EXA71HAQ
                    
                    
                    > nvme id-ctrl /dev/nvme0n1 -H
                    
                    NVME Identify Controller:
                    vid     : 0x144d
                    ssvid   : 0x144d
                    sn      : S499NX0M113634
                    mn      : SAMSUNG MZVLB256HAHQ-000H2
                    fr      : EXD71HAQ
                    rab     : 2
                    ieee    : 002538
                    cmic    : 0
                      [2:2] : 0     PCI
                      [1:1] : 0     Single Controller
                      [0:0] : 0     Single Port
                    
                    mdts    : 9
                    cntlid  : 4
                    ver     : 10200
                    rtd3r   : 186a0
                    rtd3e   : 7a1200
                    oaes    : 0
                      [8:8] : 0     Namespace Attribute Changed Event Not Supported
                    
                    oacs    : 0x17
                     [15:4] : 0x1   Reserved
                      [3:3] : 0     NS Management and Attachment Not Supported
                      [2:2] : 0x1   FW Commit and Download Supported
                      [1:1] : 0x1   Format NVM Supported
                      [0:0] : 0x1   Sec. Send and Receive Supported
                    
                    acl     : 7
                    aerl    : 7
                    frmw    : 0x16
                      [4:4] : 0x1   Firmware Activate Without Reset Supported
                      [3:1] : 0x3   Number of Firmware Slots
                      [0:0] : 0     Firmware Slot 1 Read/Write
                    
                    lpa     : 0x3
                      [1:1] : 0x1   Command Effects Log Page Supported
                      [0:0] : 0x1   SMART/Health Log Page per NS Supported
                    
                    elpe    : 255
                    npss    : 4
                    avscc   : 0x1
                      [0:0] : 0x1   Admin Vendor Specific Commands uses NVMe Format
                    
                    apsta   : 0x1
                      [0:0] : 0x1   Autonomous Power State Transitions Supported
                    
                    wctemp  : 354
                    cctemp  : 355
                    mtfa    : 50
                    hmpre   : 0
                    hmmin   : 0
                    tnvmcap : 256060514304
                    unvmcap : 0
                    rpmbs   : 0
                     [31:24]: 0     Access Size
                     [23:16]: 0     Total Size
                      [5:3] : 0     Authentication Method
                      [2:0] : 0     Number of RPMB Units
                    
                    sqes    : 0x66
                      [7:4] : 0x6   Max SQ Entry Size (64)
                      [3:0] : 0x6   Min SQ Entry Size (64)
                    
                    cqes    : 0x44
                      [7:4] : 0x4   Max CQ Entry Size (16)
                      [3:0] : 0x4   Min CQ Entry Size (16)
                    
                    nn      : 1
                    oncs    : 0x1f
                      [5:5] : 0     Reservations Not Supported
                      [4:4] : 0x1   Save and Select Supported
                      [3:3] : 0x1   Write Zeroes Supported
                      [2:2] : 0x1   Data Set Management Supported
                      [1:1] : 0x1   Write Uncorrectable Supported
                      [0:0] : 0x1   Compare Supported
                    
                    fuses   : 0
                      [0:0] : 0     Fused Compare and Write Not Supported
                    
                    fna     : 0
                      [2:2] : 0     Crypto Erase Not Supported as part of Secure Erase
                      [1:1] : 0     Crypto Erase Applies to Single Namespace(s)
                      [0:0] : 0     Format Applies to Single Namespace(s)
                    
                    vwc     : 0x1
                      [0:0] : 0x1   Volatile Write Cache Present
                    
                    awun    : 1023
                    awupf   : 0
                    nvscc   : 1
                      [0:0] : 0x1   NVM Vendor Specific Commands uses NVMe Format
                    
                    acwu    : 0
                    sgls    : 0
                      [0:0] : 0     Scatter-Gather Lists Not Supported
                    
                    subnqn  :
                    ps    0 : mp:7.02W operational enlat:0 exlat:0 rrt:0 rrl:0
                              rwt:0 rwl:0 idle_power:- active_power:-
                    ps    1 : mp:6.30W operational enlat:0 exlat:0 rrt:1 rrl:1
                              rwt:1 rwl:1 idle_power:- active_power:-
                    ps    2 : mp:3.50W operational enlat:0 exlat:0 rrt:2 rrl:2
                              rwt:2 rwl:2 idle_power:- active_power:-
                    ps    3 : mp:0.0760W non-operational enlat:210 exlat:1200 rrt:3 rrl:3
                              rwt:3 rwl:3 idle_power:- active_power:-
                    ps    4 : mp:0.0050W non-operational enlat:2000 exlat:8000 rrt:4 rrl:4
                              rwt:4 rwl:4 idle_power:- active_power:-
                    
                    > nvme id-ctrl /dev/nvme1n1 -H
                    
                    NVME Identify Controller:
                    vid     : 0x144d
                    ssvid   : 0x144d
                    sn      : S498NA0M403426
                    mn      : SAMSUNG MZVLB512HAJQ-000H2
                    fr      : EXA71HAQ
                    rab     : 2
                    ieee    : 002538
                    cmic    : 0
                      [2:2] : 0     PCI
                      [1:1] : 0     Single Controller
                      [0:0] : 0     Single Port
                    
                    mdts    : 9
                    cntlid  : 4
                    ver     : 10200
                    rtd3r   : 186a0
                    rtd3e   : 7a1200
                    oaes    : 0
                      [8:8] : 0     Namespace Attribute Changed Event Not Supported
                    
                    oacs    : 0x17
                     [15:4] : 0x1   Reserved
                      [3:3] : 0     NS Management and Attachment Not Supported
                      [2:2] : 0x1   FW Commit and Download Supported
                      [1:1] : 0x1   Format NVM Supported
                      [0:0] : 0x1   Sec. Send and Receive Supported
                    
                    acl     : 7
                    aerl    : 7
                    frmw    : 0x16
                      [4:4] : 0x1   Firmware Activate Without Reset Supported
                      [3:1] : 0x3   Number of Firmware Slots
                      [0:0] : 0     Firmware Slot 1 Read/Write
                    
                    lpa     : 0x3
                      [1:1] : 0x1   Command Effects Log Page Supported
                      [0:0] : 0x1   SMART/Health Log Page per NS Supported
                    
                    elpe    : 255
                    npss    : 4
                    avscc   : 0x1
                      [0:0] : 0x1   Admin Vendor Specific Commands uses NVMe Format
                    
                    apsta   : 0x1
                      [0:0] : 0x1   Autonomous Power State Transitions Supported
                    
                    wctemp  : 354
                    cctemp  : 355
                    mtfa    : 50
                    hmpre   : 0
                    hmmin   : 0
                    tnvmcap : 512110190592
                    unvmcap : 0
                    rpmbs   : 0
                     [31:24]: 0     Access Size
                     [23:16]: 0     Total Size
                      [5:3] : 0     Authentication Method
                      [2:0] : 0     Number of RPMB Units
                    
                    sqes    : 0x66
                      [7:4] : 0x6   Max SQ Entry Size (64)
                      [3:0] : 0x6   Min SQ Entry Size (64)
                    
                    cqes    : 0x44
                      [7:4] : 0x4   Max CQ Entry Size (16)
                      [3:0] : 0x4   Min CQ Entry Size (16)
                    
                    nn      : 1
                    oncs    : 0x1f
                      [5:5] : 0     Reservations Not Supported
                      [4:4] : 0x1   Save and Select Supported
                      [3:3] : 0x1   Write Zeroes Supported
                      [2:2] : 0x1   Data Set Management Supported
                      [1:1] : 0x1   Write Uncorrectable Supported
                      [0:0] : 0x1   Compare Supported
                    
                    fuses   : 0
                      [0:0] : 0     Fused Compare and Write Not Supported
                    
                    fna     : 0
                      [2:2] : 0     Crypto Erase Not Supported as part of Secure Erase
                      [1:1] : 0     Crypto Erase Applies to Single Namespace(s)
                      [0:0] : 0     Format Applies to Single Namespace(s)
                    
                    vwc     : 0x1
                      [0:0] : 0x1   Volatile Write Cache Present
                    
                    awun    : 1023
                    awupf   : 0
                    nvscc   : 1
                      [0:0] : 0x1   NVM Vendor Specific Commands uses NVMe Format
                    
                    acwu    : 0
                    sgls    : 0
                      [0:0] : 0     Scatter-Gather Lists Not Supported
                    
                    subnqn  :
                    ps    0 : mp:7.02W operational enlat:0 exlat:0 rrt:0 rrl:0
                              rwt:0 rwl:0 idle_power:- active_power:-
                    ps    1 : mp:6.30W operational enlat:0 exlat:0 rrt:1 rrl:1
                              rwt:1 rwl:1 idle_power:- active_power:-
                    ps    2 : mp:3.50W operational enlat:0 exlat:0 rrt:2 rrl:2
                              rwt:2 rwl:2 idle_power:- active_power:-
                    ps    3 : mp:0.0760W non-operational enlat:210 exlat:1200 rrt:3 rrl:3
                              rwt:3 rwl:3 idle_power:- active_power:-
                    ps    4 : mp:0.0050W non-operational enlat:2000 exlat:8000 rrt:4 rrl:4
                              rwt:4 rwl:4 idle_power:- active_power:-
                    
                    • T
                      tlehrian @george1421
                      last edited by

                      @george1421 And from State 2 (reversed):

                      > nvme list
                      
                      Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
                      ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
                      /dev/nvme0n1     S498NA0M403426       SAMSUNG MZVLB512HAJQ-000H2               1         149.49  GB / 512.11  GB    512   B +  0 B   EXA71HAQ
                      /dev/nvme1n1     S499NX0M113634       SAMSUNG MZVLB256HAHQ-000H2               1           2.95  GB / 256.06  GB    512   B +  0 B   EXD71HAQ
                      
                      > nvme id-ctrl /dev/nvme0n1 -H
                      
                      NVME Identify Controller:
                      vid     : 0x144d
                      ssvid   : 0x144d
                      sn      : S498NA0M403426
                      mn      : SAMSUNG MZVLB512HAJQ-000H2
                      fr      : EXA71HAQ
                      rab     : 2
                      ieee    : 002538
                      cmic    : 0
                        [2:2] : 0     PCI
                        [1:1] : 0     Single Controller
                        [0:0] : 0     Single Port
                      
                      mdts    : 9
                      cntlid  : 4
                      ver     : 10200
                      rtd3r   : 186a0
                      rtd3e   : 7a1200
                      oaes    : 0
                        [8:8] : 0     Namespace Attribute Changed Event Not Supported
                      
                      oacs    : 0x17
                       [15:4] : 0x1   Reserved
                        [3:3] : 0     NS Management and Attachment Not Supported
                        [2:2] : 0x1   FW Commit and Download Supported
                        [1:1] : 0x1   Format NVM Supported
                        [0:0] : 0x1   Sec. Send and Receive Supported
                      
                      acl     : 7
                      aerl    : 7
                      frmw    : 0x16
                        [4:4] : 0x1   Firmware Activate Without Reset Supported
                        [3:1] : 0x3   Number of Firmware Slots
                        [0:0] : 0     Firmware Slot 1 Read/Write
                      
                      lpa     : 0x3
                        [1:1] : 0x1   Command Effects Log Page Supported
                        [0:0] : 0x1   SMART/Health Log Page per NS Supported
                      
                      elpe    : 255
                      npss    : 4
                      avscc   : 0x1
                        [0:0] : 0x1   Admin Vendor Specific Commands uses NVMe Format
                      
                      apsta   : 0x1
                        [0:0] : 0x1   Autonomous Power State Transitions Supported
                      
                      wctemp  : 354
                      cctemp  : 355
                      mtfa    : 50
                      hmpre   : 0
                      hmmin   : 0
                      tnvmcap : 512110190592
                      unvmcap : 0
                      rpmbs   : 0
                       [31:24]: 0     Access Size
                       [23:16]: 0     Total Size
                        [5:3] : 0     Authentication Method
                        [2:0] : 0     Number of RPMB Units
                      
                      sqes    : 0x66
                        [7:4] : 0x6   Max SQ Entry Size (64)
                        [3:0] : 0x6   Min SQ Entry Size (64)
                      
                      cqes    : 0x44
                        [7:4] : 0x4   Max CQ Entry Size (16)
                        [3:0] : 0x4   Min CQ Entry Size (16)
                      
                      nn      : 1
                      oncs    : 0x1f
                        [5:5] : 0     Reservations Not Supported
                        [4:4] : 0x1   Save and Select Supported
                        [3:3] : 0x1   Write Zeroes Supported
                        [2:2] : 0x1   Data Set Management Supported
                        [1:1] : 0x1   Write Uncorrectable Supported
                        [0:0] : 0x1   Compare Supported
                      
                      fuses   : 0
                        [0:0] : 0     Fused Compare and Write Not Supported
                      
                      fna     : 0
                        [2:2] : 0     Crypto Erase Not Supported as part of Secure Erase
                        [1:1] : 0     Crypto Erase Applies to Single Namespace(s)
                        [0:0] : 0     Format Applies to Single Namespace(s)
                      
                      vwc     : 0x1
                        [0:0] : 0x1   Volatile Write Cache Present
                      
                      awun    : 1023
                      awupf   : 0
                      nvscc   : 1
                        [0:0] : 0x1   NVM Vendor Specific Commands uses NVMe Format
                      
                      acwu    : 0
                      sgls    : 0
                        [0:0] : 0     Scatter-Gather Lists Not Supported
                      
                      subnqn  :
                      ps    0 : mp:7.02W operational enlat:0 exlat:0 rrt:0 rrl:0
                                rwt:0 rwl:0 idle_power:- active_power:-
                      ps    1 : mp:6.30W operational enlat:0 exlat:0 rrt:1 rrl:1
                                rwt:1 rwl:1 idle_power:- active_power:-
                      ps    2 : mp:3.50W operational enlat:0 exlat:0 rrt:2 rrl:2
                                rwt:2 rwl:2 idle_power:- active_power:-
                      ps    3 : mp:0.0760W non-operational enlat:210 exlat:1200 rrt:3 rrl:3
                                rwt:3 rwl:3 idle_power:- active_power:-
                      ps    4 : mp:0.0050W non-operational enlat:2000 exlat:8000 rrt:4 rrl:4
                                rwt:4 rwl:4 idle_power:- active_power:-
                      
                      > nvme id-ctrl /dev/nvme1n1 -H
                      
                      NVME Identify Controller:
                      vid     : 0x144d
                      ssvid   : 0x144d
                      sn      : S499NX0M113634
                      mn      : SAMSUNG MZVLB256HAHQ-000H2
                      fr      : EXD71HAQ
                      rab     : 2
                      ieee    : 002538
                      cmic    : 0
                        [2:2] : 0     PCI
                        [1:1] : 0     Single Controller
                        [0:0] : 0     Single Port
                      
                      mdts    : 9
                      cntlid  : 4
                      ver     : 10200
                      rtd3r   : 186a0
                      rtd3e   : 7a1200
                      oaes    : 0
                        [8:8] : 0     Namespace Attribute Changed Event Not Supported
                      
                      oacs    : 0x17
                       [15:4] : 0x1   Reserved
                        [3:3] : 0     NS Management and Attachment Not Supported
                        [2:2] : 0x1   FW Commit and Download Supported
                        [1:1] : 0x1   Format NVM Supported
                        [0:0] : 0x1   Sec. Send and Receive Supported
                      
                      acl     : 7
                      aerl    : 7
                      frmw    : 0x16
                        [4:4] : 0x1   Firmware Activate Without Reset Supported
                        [3:1] : 0x3   Number of Firmware Slots
                        [0:0] : 0     Firmware Slot 1 Read/Write
                      
                      lpa     : 0x3
                        [1:1] : 0x1   Command Effects Log Page Supported
                        [0:0] : 0x1   SMART/Health Log Page per NS Supported
                      
                      elpe    : 255
                      npss    : 4
                      avscc   : 0x1
                        [0:0] : 0x1   Admin Vendor Specific Commands uses NVMe Format
                      
                      apsta   : 0x1
                        [0:0] : 0x1   Autonomous Power State Transitions Supported
                      
                      wctemp  : 354
                      cctemp  : 355
                      mtfa    : 50
                      hmpre   : 0
                      hmmin   : 0
                      tnvmcap : 256060514304
                      unvmcap : 0
                      rpmbs   : 0
                       [31:24]: 0     Access Size
                       [23:16]: 0     Total Size
                        [5:3] : 0     Authentication Method
                        [2:0] : 0     Number of RPMB Units
                      
                      sqes    : 0x66
                        [7:4] : 0x6   Max SQ Entry Size (64)
                        [3:0] : 0x6   Min SQ Entry Size (64)
                      
                      cqes    : 0x44
                        [7:4] : 0x4   Max CQ Entry Size (16)
                        [3:0] : 0x4   Min CQ Entry Size (16)
                      
                      nn      : 1
                      oncs    : 0x1f
                        [5:5] : 0     Reservations Not Supported
                        [4:4] : 0x1   Save and Select Supported
                        [3:3] : 0x1   Write Zeroes Supported
                        [2:2] : 0x1   Data Set Management Supported
                        [1:1] : 0x1   Write Uncorrectable Supported
                        [0:0] : 0x1   Compare Supported
                      
                      fuses   : 0
                        [0:0] : 0     Fused Compare and Write Not Supported
                      
                      fna     : 0
                        [2:2] : 0     Crypto Erase Not Supported as part of Secure Erase
                        [1:1] : 0     Crypto Erase Applies to Single Namespace(s)
                        [0:0] : 0     Format Applies to Single Namespace(s)
                      
                      vwc     : 0x1
                        [0:0] : 0x1   Volatile Write Cache Present
                      
                      awun    : 1023
                      awupf   : 0
                      nvscc   : 1
                        [0:0] : 0x1   NVM Vendor Specific Commands uses NVMe Format
                      
                      acwu    : 0
                      sgls    : 0
                        [0:0] : 0     Scatter-Gather Lists Not Supported
                      
                      subnqn  :
                      ps    0 : mp:7.02W operational enlat:0 exlat:0 rrt:0 rrl:0
                                rwt:0 rwl:0 idle_power:- active_power:-
                      ps    1 : mp:6.30W operational enlat:0 exlat:0 rrt:1 rrl:1
                                rwt:1 rwl:1 idle_power:- active_power:-
                      ps    2 : mp:3.50W operational enlat:0 exlat:0 rrt:2 rrl:2
                                rwt:2 rwl:2 idle_power:- active_power:-
                      ps    3 : mp:0.0760W non-operational enlat:210 exlat:1200 rrt:3 rrl:3
                                rwt:3 rwl:3 idle_power:- active_power:-
                      ps    4 : mp:0.0050W non-operational enlat:2000 exlat:8000 rrt:4 rrl:4
                                rwt:4 rwl:4 idle_power:- active_power:-
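
Comparing the two `nvme list` captures by hand works, but the check can also be sketched in a few lines of shell. The function names and file names below are made up for illustration, and the sketch assumes each capture was saved to a text file somewhere that survives the reboot (for example on the workstation used for the SSH session).

```shell
#!/bin/sh
# Sketch only: serial_order and order_changed are hypothetical helper names.

# Print the serial numbers of an `nvme list` capture in device-node order.
# Column 1 is the node (/dev/nvmeXnY) and column 2 is the serial number.
serial_order() {
    awk 'index($1, "/dev/nvme") == 1 { print $1, $2 }' "$1" | sort | awk '{ print $2 }'
}

# Exit status 0 when the two captures list the drives in a different order.
order_changed() {
    [ "$(serial_order "$1")" != "$(serial_order "$2")" ]
}
```

With the State 1 and State 2 listings saved as `state1.txt` and `state2.txt`, `order_changed state1.txt state2.txt && echo swapped` would print `swapped`, since the serial attached to /dev/nvme0n1 differs between the two.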
                      
                      • george1421G
                        george1421 Moderator @tlehrian
                        last edited by

                        @tlehrian Excellent, you have been able to prove the case. Now we just need to paw through the data and see if we can find a unique key to identify the drives. We are done with the testing for now. We may need you to check a few more commands later if these two don’t give us a workable solution, but for now this is great!
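
In the captures posted above, the fields that differ between the two drives appear to be `sn`, `mn`, `fr` and `tnvmcap`, and they follow the physical drive rather than the device node. The serial number is the obvious candidate for a unique key. A minimal sketch of using it, assuming the `nvme list` column layout shown in this thread (`node_for_serial` is a hypothetical helper name, not an existing FOS function):

```shell
#!/bin/sh
# Sketch only: node_for_serial is a made-up helper name.

# Read `nvme list` output on stdin and print the device node that carries
# the serial number given as $1. Returns non-zero when no drive matches.
node_for_serial() {
    awk -v sn="$1" '
        index($1, "/dev/nvme") == 1 && $2 == sn { print $1; found = 1; exit }
        END { exit found ? 0 : 1 }
    '
}
```

Run as `nvme list | node_for_serial S498NA0M403426`, this would print /dev/nvme1n1 in State 1 and /dev/nvme0n1 in State 2, so the 512 GB drive is found whichever node it lands on.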

                        • T
                          tlehrian @george1421
                          last edited by

                          @george1421 Cool. Glad I could help. Let me know if/when you need any other testing done.

                          • Q
                            Quazz Moderator
                            last edited by

                            Interesting.

                            Reading up more on this subject, I suspect this issue isn’t actually specific to NVME drives, but rather to any multi drive system where one drive is larger than the other since they can initialize in ‘random’ order.

                            Of course, with traditional drives they tend to be slow enough that they initialize in a more predictable pattern I guess.
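
Where the intended target happens to be the largest disk, as in this thread, one possible stopgap is to pick the disk by size instead of by name. The sketch below is an assumption-laden illustration, not how FOS selects disks: `largest_disk` is a hypothetical helper, and it expects three whitespace-separated columns (name, size in bytes, type) such as `lsblk -b -d -n -o NAME,SIZE,TYPE` produces.

```shell
#!/bin/sh
# Sketch only: largest_disk is a made-up helper name.

# Read "NAME SIZE TYPE" lines on stdin and print the /dev path of the
# largest entry whose type is "disk". Prints nothing if there is none.
largest_disk() {
    awk '
        $3 == "disk" && $2 + 0 > max { max = $2 + 0; name = $1 }
        END { if (name != "") print "/dev/" name }
    '
}
```

Typical use would be `lsblk -b -d -n -o NAME,SIZE,TYPE | largest_disk`. The heuristic breaks down as soon as two disks have the same size or the target is the smaller disk, so it is no substitute for a real identifier.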

                            • george1421G
                              george1421 Moderator @Quazz
                              last edited by

                              @Quazz Right, since this is now proven to happen on more than one hardware type, this is something the Linux kernel developers should be working on. From the nvme printout we can clearly see the drives changing location. The Linux kernel developers may get around this by scanning the nvme drives, looking for a disk that has the boot blocks on it, and picking that one for disk 0. In FOS’ case we aren’t trying to boot from the media. But I’m only guessing here. We surely need to do a lot more research now that we have detailed info.
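
To make the "look for a disk that has the boot blocks on it" guess concrete, here is a small Python sketch that checks the first two sectors of a disk for the standard on-disk markers: the `0x55 0xAA` signature at bytes 510-511 of sector 0 and the `EFI PART` signature at the start of the GPT header in LBA 1. The function names are made up for illustration, and nothing here is taken from the kernel or from FOS. Finding these markers only shows that a disk is partitioned. On a machine where both drives carry a partition table it would not tell them apart.

```python
MBR_SIGNATURE = b"\x55\xaa"   # bytes 510-511 of sector 0 (also present in a protective MBR)
GPT_SIGNATURE = b"EFI PART"   # first 8 bytes of the GPT header, stored in LBA 1


def describe_boot_blocks(first_sectors: bytes, sector_size: int = 512) -> dict:
    """Report which partition-table markers appear in the first two sectors."""
    has_mbr = first_sectors[510:512] == MBR_SIGNATURE
    gpt_offset = sector_size  # LBA 1 begins one logical sector into the disk
    has_gpt = first_sectors[gpt_offset:gpt_offset + 8] == GPT_SIGNATURE
    return {"mbr_signature": has_mbr, "gpt_header": has_gpt}


def read_first_sectors(path: str, sector_size: int = 512) -> bytes:
    """Read the first two sectors of a block device or image file (needs read access)."""
    with open(path, "rb") as handle:
        return handle.read(2 * sector_size)


# Synthetic example: an empty disk versus one with a protective MBR and GPT header.
blank = bytes(1024)
gpt_disk = bytearray(1024)
gpt_disk[510:512] = MBR_SIGNATURE
gpt_disk[512:520] = GPT_SIGNATURE

print(describe_boot_blocks(blank))
print(describe_boot_blocks(bytes(gpt_disk)))
```

Against a real device this would be called as `describe_boot_blocks(read_first_sectors("/dev/nvme0n1"))`, passing `sector_size=4096` for drives that use 4K logical sectors.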

                              • Q
                                Quazz Moderator @george1421
                                last edited by Quazz

                                @george1421 As far as I understand, a normal Linux install simply refers to disks by UUID instead of relying on the order in which they show up (the device names themselves are still not predictable: if any device isn’t working properly, the devices after it move up and take the preceding names).

                                e.g.

                                3 disks
                                /dev/sda
                                /dev/sdb
                                /dev/sdc
                                
                                disk /dev/sdb fails to load properly.
                                
                                /dev/sda
                                /dev/sdb (was /dev/sdc previously!)
                                

                                Since no such data is ever stored on FOS, it’s basically a race between the devices and whoever wins is the one on top.
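
The renaming in the example above can be reproduced with a toy model in Python. It assumes names are simply handed out in the order disks finish initializing, which is a simplification of what the kernel actually does. The `assign_names` helper and the `disk1`/`disk2`/`disk3` labels are hypothetical.

```python
from string import ascii_lowercase
from typing import Dict, Iterable, Sequence


def assign_names(probe_order: Sequence[str],
                 failed: Iterable[str] = ()) -> Dict[str, str]:
    """Hand out /dev/sdX names to disks in the order they come up."""
    failed_set = set(failed)
    letters = iter(ascii_lowercase)
    names = {}
    for disk in probe_order:
        if disk in failed_set:
            continue  # a disk that never initializes does not reserve a letter
        names[disk] = "/dev/sd" + next(letters)
    return names


# All three disks come up in order.
print(assign_names(["disk1", "disk2", "disk3"]))

# The second disk fails, so the third one inherits /dev/sdb.
print(assign_names(["disk1", "disk2", "disk3"], failed=["disk2"]))

# Same hardware, different winner of the race.
print(assign_names(["disk3", "disk1", "disk2"]))
```

A name like `/dev/sdb` describes the outcome of one particular boot, while a UUID or serial number describes the disk itself.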

                                • Q
                                  Quazz Moderator @george1421
                                  last edited by

                                  @george1421 Created an issue over at GitHub:

                                  https://github.com/FOGProject/fos/issues/27

                                  This one will need some consideration, I think. It’s not really straightforward.

                                  • george1421G
                                    george1421 Moderator @tlehrian
                                    last edited by

                                    @tlehrian Ok I have a few more tasks for you, well kind of.

                                    1. Does this Lenovo computer have the latest firmware updates installed?
                                    2. Do this Lenovo computer’s nvme drives have the latest firmware installed? (You should be able to get disk firmware for this model of computer from the Lenovo site.)
                                    3. How many reboots did it take to get the drive order to flip? Did you have to do a hard boot to get them to flip, or was just a reboot enough?
                                    4. Does this laptop have legacy (bios) mode? If it does, do these drives change order in bios mode?

                                    So right now it’s not clear in my head whether it’s a hardware, firmware (bios), or Linux kernel error.

                                    • Q
                                      Quazz Moderator @george1421
                                      last edited by

                                      @george1421 After reading up a bit more, it seems the conclusion is that it’s not so much about the disks themselves. Rather, the assignment is arbitrary whenever the disks are connected to different controllers. With NVME that is always the case, while with SATA (on most modern consumer grade hardware at least) it is not.

                                      • T
                                        tlehrian @george1421
                                        last edited by

                                        @george1421 Ok. This is on an HP Z2 G4 workstation.

                                        1. and 2. I’m not sure about the firmware, but I will check and let you know which firmware is installed. This will probably be tomorrow, as there is a scheduled power outage for our campus tomorrow morning and we’ve shut these machines down for the day in preparation.

                                        3. It took 2 reboots to switch, and it did not require a hard reboot.

                                        4. The BIOS does have legacy mode…I have not tried this in legacy mode, mainly as we are moving our dual-boot setup using GRUB2 to EFI on boot. I’ve never booted this machine in legacy mode. Secure boot is disabled.

                                        • T
                                          tlehrian @george1421
                                          last edited by

                                          @george1421 I forgot I still had one of these up and running. It looks like the firmware for the MOBO is not at the latest version, and the BIOS is also not at the latest version. Not sure about the NVME firmware. I think this is the 981, which is an OEM version of the Samsung 970, but I’m not sure if the same firmware applies.

                                          • george1421G
                                            george1421 Moderator @tlehrian
                                            last edited by

                                            @tlehrian What I’m trying to rule out is whether it’s a Lenovo thing that is causing the drives to switch. I don’t think it is, but I wanted to rule out hardware (or stale firmware) as the culprit.

                                            As for the question about bios (legacy) mode: I wanted to see whether, on the same physical hardware, the device swapping is related to the uefi firmware, i.e. whether it does the same thing in bios mode.

                                            Right now I don’t know where the issue is, so I’m first trying to rule out where the problem isn’t, to get the number of possibilities down.

                                            Copyright © 2012-2024 FOG Project