Dell 7040 NVMe SSD Boot Issue



  • We are experiencing an issue with Dell Optiplex 7040s not booting to FOG. I’ve upgraded to the latest version of Trunk and also updated to the newest boot kernel, but still no luck. It goes by so quickly that it’s hard to catch an error message, but I’ve attached a screenshot from a video where it gets an IP and then stops. Other Optiplex models still work (780s, 790s, 7010s). Specs are as follows:

    Dell Optiplex 7040 SFF
    Intel i7-6700 3.4 GHz
    8GB DDR4 RAM
    M.2 256GB PCIe NVMe Class 40 SSD
    AMD Radeon R6 350X

    Thank you in advance for your assistance.

    [Attachments: Screen Shot 2017-01-24 at 1.46.58 PM.png, Screen Shot 2017-01-24 at 1.51.48 PM.png]


  • Moderator

    @jjacobs said in Dell 7040 NVMe SSD Boot Issue:

    @george1421, no, we are using the legacy mode. I didn’t know that the uefi mode worked in FOG :)

    OK thanks for clearing that up.

    Yes, uefi (working well) and nvme disk support were added in an early 1.2.0 trunk release of FOG. With FOG 1.3.x both are fully supported and work very well. UEFI (EFI) is here to stay, so it’s either support it or die. Many new systems are coming uefi only (e.g. MS Surface Pro).



  • @george1421, no, we are using the legacy mode. I didn’t know that the uefi mode worked in FOG :)


  • Moderator

    @jjacobs Just for clarity (understand it’s early here in the US): are you saying a kernel from early December worked correctly on a 7040 with an nvme disk running in uefi mode? And now you updated to svn 6061 and it fails?

    If this is the case it gives us a solid timeline.

    BUT I’m suspecting there are different circumstances since Win10, Mint 18, and Ubuntu 16.04 (actually what Mint 18 is based on) gave us the same results as the FOS Linux.



  • Hi,

    I just got a call from a colleague that we had the same problem today. I updated last Thursday to the latest svn 6061; before that we used an svn revision from the beginning of 2016 with a kernel from December 2016.
    Before the update, the Optiplex 7040 worked great without any changes to the RAID or AHCI setting.

    The change to AHCI works for us; it still gives some errors, but the boot continues.

    Kind regards,
    Johan



  • @george1421 I updated the BIOS on mine to 1.5.7. Same results.


  • Moderator

    @jburleson That is about the conclusion I came to.

    I used linux mint. It would not find the hard drive in raid-on mode but would in AHCI mode. The same held true for windows 10 in uefi mode. The nearest I can guess is that raid mode and uefi are not happy together.

    So at the end of the day if anyone needs to use an nvme disk in uefi mode they MUST switch the disk mode to AHCI, period.

    Thank you for testing in your environment too. I think we have a solid answer: use AHCI mode if you are running uefi on a Dell 7040.

    <edit> FWIW the firmware on this 7040 is 1.4.5 (which I know isn’t the latest).
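
    As a minimal sketch of that rule (not something from the FOG code base), the check below reads lspci -nn output and flags the failing combination seen in this thread: a controller in RAID mode with no NVMe controller listed. The function name check_storage_mode and its messages are made up for illustration; the class codes 0104, 0106 and 0108 are the ones that appear in the lspci listings posted here.

    ```shell
    # Classify the storage setup from `lspci -nn` output read on stdin.
    # PCI class codes as printed by lspci -nn in this thread:
    #   0104 = RAID bus controller, 0106 = SATA controller (AHCI mode),
    #   0108 = Non-Volatile memory controller (NVMe).
    # check_storage_mode is a hypothetical helper name, not a FOG command.
    check_storage_mode() {
        awk '
            /\[0104\]:/ { raid = 1 }
            /\[0106\]:/ { ahci = 1 }
            /\[0108\]:/ { nvme = 1 }
            END {
                if (raid && !nvme)      print "raid-on, nvme hidden: switch SATA Operation to AHCI"
                else if (raid && nvme)  print "raid-on, nvme visible"
                else if (ahci && nvme)  print "ahci, nvme visible"
                else                    print "unknown"
            }
        '
    }
    ```

    From a FOS debug task or a live USB it would be run as: lspci -nn | check_storage_mode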



  • @jburleson This seems to be a known issue, or at least it has been reported elsewhere.

    Here is an ArchLinux post from a year ago about the same issue.
    https://bbs.archlinux.org/viewtopic.php?id=204629

    I also found posts on superuser about linux not finding the nvme drive under UEFI with RAID on.

    Ultimately the solution was to switch to AHCI.



  • @jburleson Second test. Switched from UEFI to Legacy but left SATA Operation in RAID mode.

    You are going to like this.

    lsblk

    ubuntu@ubuntu:~$ lsblk
    NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
    sda           8:0    1  14.9G  0 disk /cdrom
    ├─sda1        8:1    1   1.4G  0 part 
    └─sda2        8:2    1   2.3M  0 part 
    loop0         7:0    0   1.4G  1 loop /rofs
    nvme0n1     259:0    0 238.5G  0 disk 
    ├─nvme0n1p1 259:1    0   450M  0 part 
    ├─nvme0n1p2 259:2    0   100M  0 part 
    ├─nvme0n1p3 259:3    0    16M  0 part 
    └─nvme0n1p4 259:4    0 237.9G  0 part 
    ubuntu@ubuntu:~$ 
    

    Onboard Hardware:

    ubuntu@ubuntu:~$ lspci -nn
    00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:191f] (rev 07)
    00:01.0 PCI bridge [0604]: Intel Corporation Sky Lake PCIe Controller (x16) [8086:1901] (rev 07)
    00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:1912] (rev 06)
    00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
    00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
    00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
    00:17.0 RAID bus controller [0104]: Intel Corporation SATA Controller [RAID mode] [8086:2822] (rev 31)
    00:1b.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Root Port #17 [8086:a167] (rev f1)
    00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
    00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
    00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
    00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
    00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
    02:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller [144d:a802] (rev 01)
    ubuntu@ubuntu:~$ 
    

    Notice the new addition

    02:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller [144d:a802] (rev 01)
    

    Kernel Drivers

    ubuntu@ubuntu:~$ lspci -k
    00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
    	Subsystem: Dell Skylake Host Bridge/DRAM Registers
    00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 07)
    	Kernel driver in use: pcieport
    	Kernel modules: shpchp
    00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06)
    	Subsystem: Dell Skylake Integrated Graphics
    	Kernel driver in use: i915_bpo
    	Kernel modules: i915_bpo
    00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
    	Subsystem: Dell Sunrise Point-H USB 3.0 xHCI Controller
    	Kernel driver in use: xhci_hcd
    00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
    	Subsystem: Dell Sunrise Point-H Thermal subsystem
    00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
    	Subsystem: Dell Sunrise Point-H CSME HECI
    	Kernel driver in use: mei_me
    	Kernel modules: mei_me
    00:17.0 RAID bus controller: Intel Corporation SATA Controller [RAID mode] (rev 31)
    	Subsystem: Dell SATA Controller [RAID mode]
    	Kernel driver in use: ahci
    	Kernel modules: ahci
    00:1b.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Root Port #17 (rev f1)
    	Kernel driver in use: pcieport
    	Kernel modules: shpchp
    00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
    	Subsystem: Dell Sunrise Point-H LPC Controller
    00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
    	Subsystem: Dell Sunrise Point-H PMC
    00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
    	Subsystem: Dell Sunrise Point-H HD Audio
    	Kernel driver in use: snd_hda_intel
    	Kernel modules: snd_hda_intel
    00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
    	Subsystem: Dell Sunrise Point-H SMBus
    	Kernel modules: i2c_i801
    00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
    	Subsystem: Dell Ethernet Connection (2) I219-LM
    	Kernel driver in use: e1000e
    	Kernel modules: e1000e
    02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
    	Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller
    	Kernel driver in use: nvme
    	Kernel modules: nvme
    ubuntu@ubuntu:~$ 
    

    Picked up the Samsung controller here as well.
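
    To compare the lsblk results between firmware modes without reading the listings by eye, a small filter like the one below could be used. It is only a sketch: nvme_disks is a made-up helper name, and it assumes the default lsblk column layout shown above, where TYPE is the sixth column.

    ```shell
    # Print the names of whole NVMe disks found in default `lsblk` output on stdin.
    # Assumes the default columns: NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT.
    # Partition rows have TYPE "part" and are skipped.
    nvme_disks() {
        awk '$6 == "disk" && $1 ~ /^nvme/ { print $1 }'
    }
    ```

    With the Legacy + RAID listing above, lsblk | nvme_disks would print nvme0n1; with the UEFI + RAID listing it prints nothing.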



  • @george1421

    Here is what I got when I booted Ubuntu 16.04 from USB. I ran through the same commands you ran previously in the thread.

    BIOS: UEFI
    SATA Operation: RAID

    lsblk still does not show the hard drive.

    ubuntu@ubuntu:~$ lsblk
    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda      8:0    1 14.9G  0 disk /cdrom
    ├─sda1   8:1    1  1.4G  0 part 
    └─sda2   8:2    1  2.3M  0 part 
    loop0    7:0    0  1.4G  1 loop /rofs
    ubuntu@ubuntu:~$ 
    

    Onboard Hardware:

    ubuntu@ubuntu:~$ lspci -nn
    00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:191f] (rev 07)
    00:01.0 PCI bridge [0604]: Intel Corporation Sky Lake PCIe Controller (x16) [8086:1901] (rev 07)
    00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:1912] (rev 06)
    00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
    00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
    00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
    00:17.0 RAID bus controller [0104]: Intel Corporation SATA Controller [RAID mode] [8086:2822] (rev 31)
    00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
    00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
    00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
    00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
    00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
    ubuntu@ubuntu:~$ 
    

    Kernel Drivers

    ubuntu@ubuntu:~$ lspci -k
    00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
    	Subsystem: Dell Skylake Host Bridge/DRAM Registers
    00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 07)
    	Kernel driver in use: pcieport
    	Kernel modules: shpchp
    00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06)
    	Subsystem: Dell Skylake Integrated Graphics
    	Kernel driver in use: i915_bpo
    	Kernel modules: i915_bpo
    00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
    	Subsystem: Dell Sunrise Point-H USB 3.0 xHCI Controller
    	Kernel driver in use: xhci_hcd
    00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
    	Subsystem: Dell Sunrise Point-H Thermal subsystem
    00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
    	Subsystem: Dell Sunrise Point-H CSME HECI
    	Kernel driver in use: mei_me
    	Kernel modules: mei_me
    00:17.0 RAID bus controller: Intel Corporation SATA Controller [RAID mode] (rev 31)
    	Subsystem: Dell SATA Controller [RAID mode]
    	Kernel driver in use: ahci
    	Kernel modules: ahci
    00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
    	Subsystem: Dell Sunrise Point-H LPC Controller
    00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
    	Subsystem: Dell Sunrise Point-H PMC
    00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
    	Subsystem: Dell Sunrise Point-H HD Audio
    	Kernel driver in use: snd_hda_intel
    	Kernel modules: snd_hda_intel
    00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
    	Subsystem: Dell Sunrise Point-H SMBus
    	Kernel modules: i2c_i801
    00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
    	Subsystem: Dell Ethernet Connection (2) I219-LM
    	Kernel driver in use: e1000e
    	Kernel modules: e1000e
    

  • Moderator

    @george1421 I had to do real work this afternoon so I had to stop.

    But the last thing I did was try to install Windows 10 from the recovery disk, and the Windows 10 installer would not see the nvme disk in uefi mode with raid-on. When I have a bit more time I’ll see if I can boot an Ubuntu live DVD and check whether that will see the nvme drive.


  • Moderator

    @george1421 Second test, switch raid-on mode to AHCI.

    lsblk reports:

    [Wed Jan 25 root@fogclient ~]# lsblk
    NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
    nvme0n1     259:0    0 238.5G  0 disk
    |-nvme0n1p1 259:1    0     3G  0 part
    `-nvme0n1p2 259:2    0 235.5G  0 part
    

    and lspci

    [Wed Jan 25 root@fogclient ~]# lspci -nn
    00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:191f] (rev 07)
    00:01.0 PCI bridge [0604]: Intel Corporation Sky Lake PCIe Controller (x16) [8086:1901] (rev 07)
    00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:1912] (rev 06)
    00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
    00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
    00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
    00:16.3 Serial controller [0700]: Intel Corporation Sunrise Point-H KT Redirection [8086:a13d] (rev 31)
    00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
    00:1b.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Root Port #17 [8086:a167] (rev f1)
    00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
    00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
    00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
    00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
    00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
    02:00.0 Non-Volatile memory controller [0108]: Toshiba America Info Systems Device [1179:010f] (rev 01)
    [Wed Jan 25 root@fogclient ~]#
    

  • Moderator

    @george1421 On my initial test

    1. Grab a 7040 from inventory, reset the bios back to factory defaults, and change the mode to uefi (raid-on by default). Note: the system was functional in bios mode with an mbr image.
    2. Schedule a debug deploy task
    3. lsblk shows no hard drive, period.
    [Wed Jan 25 root@fogclient ~]# lsblk
    [Wed Jan 25 root@fogclient ~]#
    

    and onboard hardware

    [Wed Jan 25 root@fogclient ~]# lspci -nn
    00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:191f] (rev 07)
    00:01.0 PCI bridge [0604]: Intel Corporation Sky Lake PCIe Controller (x16) [8086:1901] (rev 07)
    00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:1912] (rev 06)
    00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
    00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
    00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
    00:16.3 Serial controller [0700]: Intel Corporation Sunrise Point-H KT Redirection [8086:a13d] (rev 31)
    00:17.0 RAID bus controller [0104]: Intel Corporation SATA Controller [RAID mode] [8086:2822] (rev 31)
    00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
    00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
    00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
    00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
    00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
    [Wed Jan 25 root@fogclient ~]#
    

    Now kernel drivers associated with the hardware

    [Wed Jan 25 root@fogclient ~]# lspci -k
    00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
            Subsystem: Dell Device 06b9
            Kernel driver in use: skl_uncore
    lspci: Unable to load libkmod resources: error -12
    00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 07)
            Kernel driver in use: pcieport
    00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06)
            Subsystem: Dell Device 06b9
    00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
            Subsystem: Dell Device 06b9
            Kernel driver in use: xhci_hcd
    00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
            Subsystem: Dell Device 06b9
    00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
            Subsystem: Dell Device 06b9
    00:16.3 Serial controller: Intel Corporation Sunrise Point-H KT Redirection (rev 31)
            Subsystem: Dell Device 06b9
            Kernel driver in use: serial
    00:17.0 RAID bus controller: Intel Corporation SATA Controller [RAID mode] (rev 31)
            Subsystem: Dell Device 06b9
            Kernel driver in use: ahci
    00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
            Subsystem: Dell Device 06b9
    00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
            Subsystem: Dell Device 06b9
    00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
            Subsystem: Dell Device 06b9
    00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
            Subsystem: Dell Device 06b9
            Kernel driver in use: i801_smbus
    00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
            Subsystem: Dell Device 06b9
            Kernel driver in use: e1000e
    

    This tells me the linux kernel supports the raid controller. So the hardware IS supported by linux. Something else must be not happy.

    00:17.0 RAID bus controller: Intel Corporation SATA Controller [RAID mode] (rev 31)
            Subsystem: Dell Device 06b9
         >>   Kernel driver in use: ahci
    

    Speculation: If uefi / gpt disk is not found in system then no disk is displayed.
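
    The driver check above can be pulled out of the lspci -k listing with a short filter. This is a sketch only; driver_for_slot is a hypothetical helper name, and it assumes the usual lspci -k layout where each device line starts with its slot address and the detail lines are indented beneath it.

    ```shell
    # Print the "Kernel driver in use" value for one PCI slot,
    # given `lspci -k` output on stdin. Usage: lspci -k | driver_for_slot 00:17.0
    # Device header lines begin with a slot address such as 00:17.0;
    # the indented lines that follow belong to that device.
    driver_for_slot() {
        awk -v slot="$1" '
            /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { current = $1 }
            /Kernel driver in use:/ && current == slot { print $NF }
        '
    }
    ```

    A bound driver only shows that the kernel claimed the controller; as the empty lsblk output shows, it does not mean a disk is exposed behind it.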


  • Moderator

    @jburleson Well yes and no.

    While it’s using the raid controller, it’s not really a raid setup.

    What I find interesting in the picture is the partition name AND the device major and minor numbers. I don’t think device 43 is currently allowed.



  • Not sure if this will help any.

    mdadm -D /dev/md0 shows
    Raid Level 0
    Total Devices 0
    State Inactive

    cat /proc/mdstat shows
    Personalities: [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
    unused devices: <none>
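
    For reference, a quick way to confirm that no md array was assembled is to count the active entries in /proc/mdstat. The filter below is a sketch; active_md_arrays is a made-up name, and it assumes the usual mdstat layout where each array line looks like "md0 : active raid1 ...".

    ```shell
    # Count assembled (active) md arrays from /proc/mdstat content on stdin.
    # Usage: active_md_arrays < /proc/mdstat
    # Inactive arrays ("md0 : inactive ...") are not counted.
    active_md_arrays() {
        awk '/^md[^ ]* : active/ { n++ } END { print n + 0 }'
    }
    ```

    With the mdstat output above, which lists only personalities and no md lines, it prints 0.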


  • Moderator

    @jburleson Well that sure is a WTF kind of picture. It DOES tell us a bit more of what we need. Why there are so many partitions is interesting.

    I was just about to grab a 7040 and do the same. I’ll still do that; it will give me something to play with over the lunch hour.



  • @Tom-Elliott

    Here is the output of lsblk.

    [Attachment: WIN_20170125_09_33_13_Pro.jpg]



  • @george1421

    I modify the BIOS when the computers come in. One of the settings I change is to switch the SATA operation to AHCI.

    I switched from ipxe.efi since the Surface Pro 4 would not boot from it.

    ipxe7156.efi does not work for RAID mode either (just tested it).

    After my next appointment I will run debug and see if I can get you some additional information on it.


  • Moderator

    @Tom-Elliott IMO: The concern I have is that raid-on is the default for almost all Dell systems, uefi or bios. So for every 7040 in uefi mode, the OP or IT tech will need to change the disk support method. This can be automated with Dell’s CCTK; it’s just a pain and will continue to cause FOG support calls.
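
    A rough sketch of what that CCTK automation might look like is below. The --embsataraid=ahci option is recalled from the Dell Command | Configure documentation and should be treated as an assumption: check the option name and its allowed values against the version installed on your systems. The wrapper name set_sata_ahci and the DRY_RUN variable are made up for illustration, and by default the wrapper only prints the command instead of running it.

    ```shell
    # Switch the embedded SATA controller to AHCI with Dell CCTK.
    # ASSUMPTION: the option is spelled --embsataraid=ahci in the installed
    # CCTK version; verify before use. DRY_RUN=1 (the default) only prints
    # the command, DRY_RUN=0 actually runs it.
    set_sata_ahci() {
        cmd="cctk --embsataraid=ahci"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $cmd"
        else
            $cmd
        fi
    }
    ```

    Changing the SATA mode under an already installed Windows can leave it unable to boot, so a step like this is best applied before imaging.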

    I’ll grab a 7040 from our test lab and see if I can find a consistent answer.


  • Senior Developer

    @chrisdecker I’m going to mark the thread as solved, since we know that changing the HDD presentation type from RAID to AHCI will allow you to use the system.

    I agree with @george1421 however and would like to see what lsblk sees when the disk is in RAID mode.

    That said, I suspect it doesn’t find anything because the RAID utilities aren’t being called to even try to scan anything. That or the way the RAID is presented to the FOS System isn’t even recognized (could be driver based I suppose).

