Problem Capturing right Host Primary Disk with INTEL VROC RAID1
-
@nils98 There are a few interesting things in here, but nothing remarkable. I see this is a server chassis of some kind. I also see there are SATA and NVMe disks in this server. From a quick look at VROC, it is designed for NVMe drives rather than SATA, and it is on-CPU RAID.
Is your array with the sata drives /dev/sda and /dev/sdb or with the nvme drives?
I remember seeing something in the forums regarding the intel xscale processor and vmd. I need to see if I can find those posts.
For completeness, what is the manufacturer and model of this server? What is the target OS for this server? Did you set up the RAID configuration in the BMC or firmware, so the drive array is already configured?
And finally, if you boot a Linux live CD, does it properly see the RAID array?
Lastly, for debugging with FOS Linux, if you do the following you can remote into the FOS Linux system.
- PXE boot into debug mode (capture or deploy)
- Get the ip address of the target computer with
ip a s
- Give root a password with
passwd
just make it something simple like hello; it will be reset at next reboot.
- Now with putty or ssh you can connect to the FOS Linux engine to run commands remotely. This makes it easier to copy and paste into the FOS Linux engine.
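The remote-debug steps above can be sketched as a small helper run on the target's debug console. This is only a rough sketch: `first_ipv4` is a hypothetical helper name, and it assumes the `ip` command on the target accepts `-4 -o` the way iproute2 does.

```shell
#!/bin/sh
# first_ipv4: hypothetical helper. Reads `ip -4 -o addr show` output on stdin
# and prints the first non-loopback IPv4 address without the prefix length.
first_ipv4() {
    awk '$2 != "lo" && $3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Usage on the FOS debug console (after setting a root password with passwd):
#   ip -4 -o addr show | first_ipv4
# Then, from a workstation:
#   ssh root@<address printed above>
```

If the one-line output format is not available, reading the address from plain `ip a s` by eye works just as well.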
-
@nils98 I don’t know your prerequisites.
Our machines get delivered with a preinstalled windows.
The RAID1 is also already assembled.
We do not create a RAID 1 via mdadm in FOG. Also, I did not inject any drivers for VROC.
I think /dev/md0 gets created because of the kernel parameter “mdraid=true”, but it’s empty.
If you check in the BIOS/UEFI, is a RAID 1 shown? If not, can you create one? I never had problems seeing my preassembled VROC RAID1 with “lsblk” in debug mode.
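A minimal sketch of that `lsblk` check, for anyone who wants to script it. `raid_devices` is a hypothetical helper name; the sketch assumes the util-linux `lsblk` with its `-r`, `-n` and `-o` options and relies on md arrays being reported with a TYPE such as `raid1`.

```shell
#!/bin/sh
# raid_devices: hypothetical helper. Reads `lsblk -rno NAME,TYPE` output on
# stdin and prints each device whose TYPE starts with "raid", once.
raid_devices() {
    awk '$2 ~ /^raid/ { print $1 }' | sort -u
}

# Usage in the FOS debug console:
#   lsblk -rno NAME,TYPE | raid_devices
# No output means lsblk sees no assembled md array.
```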
-
@george1421 The VROC RAID is built from the SATA drives /dev/sda and /dev/sdb.
The NVMe drives are only content disks that are later connected to Windows.
The board is a Supermicro X12SPI-TF running Windows110.
Last time we created the RAID with the Windows install, but I had already created it via the BIOS. It already exists and I don’t want to touch it with FOG.
I will test a Linux live system later.
I had already read about connecting via PuTTY here, thanks.
@Ceregon That was exactly the same for us.
The BIOS shows me a RAID 1 with both SSDs and I can also create a new one there if necessary.
As you can see above, no RAID is listed via “lsblk”.
-
@george1421 With Debian 12 live, the RAID and VROC are recognized.
Any ideas what I can change in FOS to make it look exactly like this?
-
@nils98 Nice, this means it’s possible with the FOG FOS kernel. If the Linux live CD did not work then you would be SOL.
OK, so let’s start: under the live image, run these commands.
lsmod > /tmp/modules.txt
lspci -nnk > /tmp/pcidev.txt
Use scp (or WinSCP on Windows) to copy these tmp files out and post them here. Also grab /var/log/messages or /var/log/syslog and post them here. Let me take a look at them to see 1) what dynamic modules are loaded and 2) which kernel modules are linked to the PCIe devices.
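Once a module listing exists from both the live system and FOS, the comparison can be sketched roughly as below. `module_names` and `modules_only_in_first` are hypothetical helper names, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# module_names: print the sorted module names from a saved `lsmod` listing,
# skipping the header line.
module_names() {
    awk 'NR > 1 { print $1 }' "$1" | sort -u
}

# modules_only_in_first: hypothetical helper. Prints modules present in the
# first lsmod listing but absent from the second.
modules_only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    module_names "$1" > "$a"
    module_names "$2" > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage, with one listing saved on the live system and one in FOS debug mode:
#   modules_only_in_first live-modules.txt fos-modules.txt
```

A module that shows up only in the live listing is a candidate, not proof: the same driver may be built directly into the other kernel and so never appear in `lsmod`.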
-
@george1421 Here are the files.
Unfortunately I have not found a messages or syslog file; I have only found a boot log file in the folder.
-
@nils98 Nothing is jumping out at me as to the required module. The VMD module is required for vroc and that is part of the FOG FOS build. Something I hadn’t asked you before, what version of FOG are you using and what version of the FOS Linux kernel are you using? If you pxe boot into the FOS Linux console then run
uname -a
it will print the kernel version.
-
@george1421
FOG currently has version 1.5.10.16.
FOS 6.1.63
I set up the whole system a month ago. I only took over the clients from another system, which had FOG version 1.5.9.122.
The RAID PC has now been added.
-
@nils98 said in Problem Capturing right Host Primary Disk with INTEL VROC RAID1:
FOS 6.1.63
OK good deal I wanted to make sure you were on the latest kernel to ensure we weren’t dealing with something old.
I rebuilt the kernel last night with what I thought might be missing, then I saw that mdadm was updated so I rebuilt the entire FOS Linux system, but it failed on the updated mdadm program. It was getting late last night so I stopped.
With the Linux kernel 6.1.63, could you pxe boot it into debug mode and then give root a password with
passwd
and collect the ip address of the target computer with
ip a s
then connect to the target computer using root and the password you defined. Download /var/log/messages and/or syslog if they exist. I want to see if the 6.1.63 kernel is calling out for some firmware drivers that are not in the kernel by default. If I can do a side by side with what you posted from the live Linux kernel I might be able to find what’s missing.
-
@george1421 here is the message file
-
@nils98 OK, there are a few things I gleaned by looking over everything in detail.
The stock FOS linux kernel looks like its working because I see this in the messages file during boot. I do see all of the drives being detected.
Mar 1 15:46:40 fogclient kern.info kernel: md: Waiting for all devices to be available before autodetect
Mar 1 15:46:40 fogclient kern.info kernel: md: If you don't use raid, use raid=noautodetect
Mar 1 15:46:40 fogclient kern.info kernel: md: Autodetecting RAID arrays.
Mar 1 15:46:40 fogclient kern.info kernel: md: autorun ...
Mar 1 15:46:40 fogclient kern.info kernel: md: ... autorun DONE.
This tells me it’s scanning but not finding an existing array. It would be handy to have the live CD startup log to verify that is the case.
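For a side-by-side look at two boot logs, a filter along these lines may help. It is only a sketch: `md_and_firmware_lines` is a hypothetical helper name, the pattern is a guess at what is worth comparing (md driver lines and anything mentioning firmware), and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# md_and_firmware_lines: hypothetical helper. Prints lines from a saved kernel
# log that mention the md driver (" md:", " md/raid1", " md126:" and similar)
# or firmware loading.
md_and_firmware_lines() {
    grep -E ' md[/:0-9]|[Ff]irmware' "$1"
}

# Usage, with one log saved from FOS and one from the live system:
#   md_and_firmware_lines fos-messages.txt  > fos-md.txt
#   md_and_firmware_lines live-boot.log     > live-md.txt
#   diff fos-md.txt live-md.txt
```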
Intel VROC is the rebranded Intel Rapid Storage Technology enterprise [RSTe]
There is no setting for
CONFIG_INTEL_RST
in the current kernel configuration file: https://github.com/FOGProject/fos/blob/master/configs/kernelx64.config
It’s not clear if this is a problem or not, but just connecting the dots between VROC and RSTe: https://cateee.net/lkddb/web-lkddb/INTEL_RST.html
I did enable it in the test kernel below.
Test kernel based on linux kernel 6.6.18 (hint: newer kernel that is available via fog repo).
https://drive.google.com/file/d/12IOjoKmEwpCxumk9zF1vtQJt523t8Sps/view?usp=drive_link
To use this kernel, copy it to the /var/www/html/fog/service/ipxe directory and keep its existing name. This will not overwrite the FOG delivered kernel. Now go to the FOG Web UI, go to FOG Configuration->FOG Settings and hit the expand all button. Search for bzImage, replace the bzImage name with bzImage-6.6.18-vroc2, then save the settings. Note this will make all of your computers that boot into FOG load this new kernel. Understand this is untested, and you can always put things back by just replacing bzImage-6.6.18-vroc2 with bzImage in the FOG configuration.
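Since this post turns on whether particular options are set in the kernel configuration, here is a rough sketch for checking a config file by hand. `kconfig_state` is a hypothetical helper name, and the file name in the usage comment assumes a locally downloaded copy of the config linked above.

```shell
#!/bin/sh
# kconfig_state: hypothetical helper. Prints how a kernel config file sets a
# symbol: its value ("y", "m", ...), or "unset" if it is commented out or absent.
kconfig_state() {
    file=$1
    symbol=$2
    value=$(sed -n "s/^${symbol}=//p" "$file" | head -n 1)
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "unset"
    fi
}

# Usage against a downloaded copy of the FOS config:
#   kconfig_state kernelx64.config CONFIG_VMD
#   kconfig_state kernelx64.config CONFIG_INTEL_RST
```

The helper treats a `# CONFIG_... is not set` line the same as a missing symbol, which is how the kernel build treats it too.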
Now pxe boot into a debug console on the target computer.
Do the normal routine to see if lsblk and
cat /proc/mdstat
and
mdm --detailed-platform
return anything positive.
If the kernel doesn’t assemble the array correctly then we will have to see if we can manually assemble the array using the mdadm tool.
I should say that we need to ensure the array already exists before we perform these tests, because if the array is defunct or not created we will not see it with the above tests.
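The `/proc/mdstat` part of that routine can be sketched as a small check. `active_md_arrays` is a hypothetical helper name, and the sketch assumes the usual mdstat layout where each array line reads `mdN : active ...` or `mdN : inactive ...`.

```shell
#!/bin/sh
# active_md_arrays: hypothetical helper. Prints the names of arrays that an
# mdstat-format file (default /proc/mdstat) reports as active.
active_md_arrays() {
    awk '$2 == ":" && $3 == "active" { print $1 }' "${1:-/proc/mdstat}"
}

# Usage in the FOS debug console:
#   [ -n "$(active_md_arrays)" ] || echo "no active md array found"
```

An empty result matches the situation described in this thread, where the kernel scans for arrays but assembles none.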
-
@george1421 Unfortunately, nothing has changed.
“mdm --detailed-platform” does not find “mdm”; with “mdadm --detail-platform” it still shows the same error.
I have also searched the log files under the live system again but unfortunately found nothing.
-
@nils98 Well that’s not great news. I really thought that I had it with including the intel rst driver. Would you mind sending me the messages log from booting this new kernel? Also make sure when you are in debug mode that you run
uname -a
and make sure the kernel version is right.
-
@george1421 Here are the logs
-
I apologise for not getting in touch for so long.
But I was able to find startup logs with Ubuntu live and my raid is recognised directly.
Hope the logs help.
-
Hi everyone, reading the investigations already done gives me the feeling you got close to a fix for this.
I got the experimental VROC kernel file from the download link earlier in this topic.
I have exactly the same issue: Intel VROC / Optane with 2 NVMe drives in RAID1.
I can see the individual NVMe drives but not the RAID array/volume.
Is a fix for this expected any time soon?
-
@rdfeij For the record, what computer hardware do you have?
-
@george1421
SuperMicro X13SAE-F server board with Intel Optane / VROC in raid1 mode.
2x NVMe in RAID1.
-
With me yes: in the BIOS a RAID1 exists over 2 NVMe drives
mdraid=true is enabled
md0 indeed is empty
lsblk only shows content on the 2 NVMe drives but not on md0
I hope this will be fixed soon, otherwise we will be forced onto another (WindowsPE based?) imaging platform, since we get more and more VROC/Optane servers/workstations with RAID enabled (industrial/security usage).
I’m willing to help out to get this solved.