NVMe madness
-
@Sebastian-Roth
Thanks. Some of the images were initially captured on the 1.5.9 RC candidates 11 - 16 from the dev branch. One new image was captured on the 1.5.9 final dev branch just a couple of days ago; neither image directory had the .size files.
I have created d1.size and d2.size files inside the directories of the two NVMe-based images using the values from the blockdev --getsize64 output.
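Roughly, that boiled down to something like the following (the image directory names and device files here are only placeholders):
# on the client, read the byte count of each drive:
blockdev --getsize64 /dev/nvme0n1
blockdev --getsize64 /dev/nvme1n1
# then, on the FOG server, write each number into the matching image directory, e.g.:
echo 1000204886016 > /images/OSImage/d1.size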
-
Wow. After creating the .size files I captured an image from each NVMe drive to its corresponding existing image. The .size files are now gone. Did I need to change the owner of the .size files to root?
Anyway, I updated the FOG dev build to the latest as of today (1.5.9.29) and captured an image from the NVMe. The .size files weren’t created. Maybe this is indicative of another problem?
-
@Fog_Newb I obviously missed your last post from three days ago, whoops.
Some of the images were initially captured on the 1.5.9 RC candidates 11 - 16 from the dev branch. One new image was captured on the 1.5.9 final dev branch just a couple of days ago; neither image directory had the .size files.
So far the size handling is only implemented in “All disk” capture and deploy mode, which is probably why you don’t have any of the .size files. As I said, I will try to add it to single disk mode as well, but it will take a bit of time.
After creating the .size files I captured an image from each NVMe drive to its corresponding existing image. The .size files are now gone.
See my comment above. Every time you re-capture in single disk mode the size files are gone.
Did I need to change the owner of the .size files to root?
You might want to do that but it doesn’t make a difference.
Anyway, I updated the FOG dev build to the latest as of today (1.5.9.29) and captured an image from the NVMe. The .size files weren’t created.
You need to be a bit more patient. I will let you know in this topic as soon as I have something to test with.
-
No worries, I didn’t expect the fix to be available when I updated. I was just testing and wondering about the .size files and why they were missing and/or not being created automatically. I didn’t know they were only created in all disk mode and thought maybe something else was wrong. Thanks for clearing that up.
-
@Fog_Newb Not yet, sorry.
-
@Fog_Newb Finally found some time to work on this. At first I was hoping to come up with logic that would detect the disk size and select the correct disk accordingly. As I said, we have this working for images set to the multiple disk type.
But halfway into it I realized that my solution would only work for deployments where we already have the size information from capturing. It wouldn’t solve the issue of NVMe drives switching positions on capture in the first place.
The only way we can solve this is by telling FOS to look for a disk of a certain size when running a task. So far the host setting Host Primary Disk only allowed naming a Linux device file to use, e.g. /dev/nvme0n1. But now I have added the functionality for you to specify a disk size (byte count) using this same field. So all you need to do is download the modified init.xz, put it into the /var/www/html/fog/service/ipxe/ directory (rename the original file beforehand) and set Host Primary Disk to an integer value matching exactly the byte count of the disk (the value you get from blockdev --getsize64 /dev/...). Please give it a try and let me know what you think.
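On the server the swap looks roughly like this (the download path is just a placeholder):
cd /var/www/html/fog/service/ipxe/
mv init.xz init.xz.orig              # keep the original so you can roll back
cp /path/to/downloaded/init.xz .     # wherever you saved the modified file
# and on the client (a debug task works), read the exact byte count to put
# into the Host Primary Disk field:
blockdev --getsize64 /dev/nvme0n1    # device name is just an example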
-
Thanks. I will test as soon as I can. Probably middle of the night or early tomorrow.
-
I copied the updated init.xz over to /var/www/html/fog/service/ipxe/, then set the Host Primary Disk to 1000204886016 and attempted a capture.
It worked great.
Thank you very much.
-
@Fog_Newb said in NVMe madness:
Then set the Host Primary Disk to 1000204886016 and attempted a capture.
So this was capturing the one terabyte disk, right?
-
Yes.
-
So yes, this is a perfect solution since the Host Primary Disk can now be set by size. I have one image for the OS disk and one for the “D” drive; I just switch the Host Primary Disk setting depending on which image I want to capture or deploy.
-
This is committed to the fos repo and will be available in the next release.
@Tom-Elliott Should we rename that field or just leave it as is and add this information to the documentation?
-
@Sebastian-Roth I think leave as is but update documentation.
-
@JJ-Fullmer @Jurgen-Goedbloed Would one of you take care of adding this new feature to the documentation?
-
@Sebastian-Roth hmmm, what about a new field specifically designed for the size? If a hard drive is specified, default to that; else, if a size is specified, match the drive based on that?
-
@Tom-Elliott But why have another field if they can’t be used in combination anyway? It’s either/or!
-
@Sebastian-Roth That’s what I mean: if you have specified the HDD, use that. If the HDD is not found but you have a size defined, try to find a disk matching that size. If whatever is defined doesn’t exist, error out.
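A rough sketch of that order (hypothetical field names and example values, not actual FOS code):
primary="/dev/nvme0n1"   # example value: the existing Host Primary Disk field
size="1000204886016"     # example value: a hypothetical separate size field

hd=""
if [ -n "$primary" ] && [ -b "$primary" ]; then
    hd="$primary"                            # the named device exists, use it
elif [ -n "$size" ]; then
    for dev in /dev/sd? /dev/nvme?n?; do     # otherwise look for an exact byte-count match
        [ -b "$dev" ] || continue
        if [ "$(blockdev --getsize64 "$dev")" = "$size" ]; then
            hd="$dev"
            break
        fi
    done
fi
if [ -z "$hd" ]; then
    echo "No disk matches the given name or size" >&2
    exit 1
fi
echo "Using $hd"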
-
@Tom-Elliott I kind of like the idea of a fallback mechanism, but I fear it would partly nullify what this is trying to solve.
Few people know that the NVMe naming scheme is essentially random on each boot, so they would presumably fill in the name of the drive. That name would then get used and the size (if filled in) would be ignored.
If it were implemented that way, the GUI should at least advise using size over name for NVMe drives.
PS: What happens when you have 2 identical drives?
-
@Quazz Right, a second field would just allow people to confuse things. While we try to give people more responsibility instead of solving everything within the code, I don’t think adding another field is valuable in this case. It’s a simple either/or in one field, and there is no situation where specifying both size and device name at the same time makes sense.
PS: What happens when you have 2 identical drives?
I thought about this while implementing the multi-disk NVMe code and this new feature as well, but I don’t think there is any way we can help people with two identically sized disks. If we stored disk identifiers (globally unique), an image could not be deployed to a different target machine. As far as I can see, size is the only parameter that helps us solve this dilemma.