Feature request for FOG 1.6.x - Configure image capture to use NFSv4 instead of NFSv3
-
@george1421 I’ve worked with NFSv4 for about a week now. Many times I’ve stopped in irritation that the system doesn’t work like the examples, but I finally have a solid configuration for NFSv4. While NFSv4 is NFS, it’s very different in its setup from versions of NFS before v4. NFSv4 creates a logical file system that doesn’t necessarily align with the physical file system on the server. In some ways it makes things more complex; in other ways it allows for some pretty cool redirection.
I’ve spent about 5 evenings trying to figure out how to map our current directory structure over to the NFSv4 file system. The problem was that whatever the root was shared as, ro or rw, the child directory shares followed the root share permissions and ignored the specific share-level permissions that the directories were exported with. I won’t go into details on the tests, but let’s say I just ignored what I thought I knew and followed the examples, and it worked.
ref: https://cwiki.apache.org/confluence/display/DIRxINTEROP/NFSv4+Server+Export+Setup
Just to repeat it again, NFSv4 creates a logical file system whose root is whatever you define as the root directory in your exports. You assign fsid=0 only to what you are calling the root of your logical file system. For my example I’ve created the logical file system structure under the /opt/fog/data directory. In this example I’ve created bind mounts to the original FOG directories to maintain backwards compatibility.
Build the nfsv4 mount points
mkdir -p /opt/fog/data/capture
mkdir -p /opt/fog/data/images
Freebie: so we present the same postinitscript files for both capture and imaging.
mkdir -p /images/dev/postinitscripts
mkdir -p /images/postinitscripts
Create the bind loopback directories to map the physical file locations into the NFSv4 logical file system. Note this mapping can bring in directories from random locations but present them as a unified logical filesystem in NFS. This is the part that had me stuck for several days, where I thought I knew better and that this was unnecessary. But without this bind mount the child directories took on the share-level permission of the logical NFS root.
-> /etc/fstab
/images/dev /opt/fog/data/capture none bind 0 0
/images /opt/fog/data/images none bind 0 0
/images/postinitscripts /images/dev/postinitscripts none bind 0 0
Just a placeholder for when creating the fogproject user. I would recommend that the FOG installer actually set a unique user and group ID when the fogproject user is created. Then we can use only root squash to assign the files created by the root user to the unique group and user ID. It’s really only adding security through obscurity, so I did not add it into the install instructions here.
We’ll add the fogproject user
adduser fogproject
Then we will grab the uid and gid of the fogproject user to use in the script that adds the nfs exports
uid=$(grep '^fogproject:' /etc/passwd | cut -d ":" -f3)
gid=$(grep '^fogproject:' /etc/passwd | cut -d ":" -f4)
I’m only showing the product of the exports file not how $uid and $gid is set in the file
-> /etc/exports
/opt/fog/data/ *(fsid=0,no_subtree_check,crossmnt,all_squash,insecure,anonuid=1000,anongid=1000)
/opt/fog/data/capture *(rw,sync,no_subtree_check,no_wdelay,no_subtree_check,insecure_locks,all_squash,insecure,anonuid=1000,anongid=1000)
/opt/fog/data/images *(ro,sync,no_subtree_check,all_squash,insecure,anonuid=1000,anongid=1000)
I understand the above is a big change from what we had before. All users that connect to the NFS share will be squashed to the fogproject user. The lock options are only needed on a writable share.
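To tie the uid/gid lookup to the export lines above, here is a minimal sketch. The helper names lookup_ids and print_fog_exports are made up for illustration and are not part of the FOG installer; the export options are the same set shown above (with the duplicated no_subtree_check dropped). It only prints text, so nothing is written to /etc/exports.

```shell
#!/bin/sh
# Sketch only: lookup_ids and print_fog_exports are hypothetical helper names.

# Print "uid gid" for an exact account-name match in a passwd-format file.
lookup_ids() {
    # $1 = account name, $2 = passwd-format file (defaults to /etc/passwd)
    awk -F: -v user="$1" '$1 == user { print $3, $4; exit }' "${2:-/etc/passwd}"
}

# Print the three NFSv4 export lines with the given uid/gid filled in.
print_fog_exports() {
    # $1 = uid, $2 = gid
    squash="all_squash,insecure,anonuid=$1,anongid=$2"
    printf '%s\n' \
        "/opt/fog/data/ *(fsid=0,no_subtree_check,crossmnt,$squash)" \
        "/opt/fog/data/capture *(rw,sync,no_subtree_check,no_wdelay,insecure_locks,$squash)" \
        "/opt/fog/data/images *(ro,sync,no_subtree_check,$squash)"
}
```

Something like `set -- $(lookup_ids fogproject); print_fog_exports "$1" "$2"` would print the lines for review before anyone pastes them into /etc/exports. Matching the account name exactly avoids picking up other passwd lines that merely contain "fogproject".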
To configure the nfs server to only honor nfsv4 requests the following file needs to be edited
-> /etc/nfs.conf
[nfsd]
tcp=y
vers2=n
vers3=n
vers4=y
vers4.0=y
vers4.1=y
vers4.2=y
Once the services are restarted, the NFSv4 service will only accept NFS connections on TCP port 2049, so we can craft a single custom firewall rule to protect it without opening a large number of ports just for NFS.
There will be some changes needed in the FOS Linux OS to accommodate this new NFSv4 structure. Most notably, the mount command must be updated to request an NFSv4 connection.
To capture an image
mount -t nfs -o nfsvers=4,rw 192.168.112.14:/capture /mnt/images
To deploy an image
mount -t nfs -o nfsvers=4,rw 192.168.112.14:/images /mnt/images
As a side note you can connect to the nfs root directory with this
mount -t nfs -o nfsvers=4,rw 192.168.112.14:/ /mnt/images
And then move between the logical shares. The NFSv4 root share is exported as ro. When you cd into the capture share, the share permissions automatically change to rw. If you cd to /images, the share permissions switch to ro. If you cd up one directory, they change back to ro.
This is what you see when you execute these commands.
# mount -t nfs -o nfsvers=4,rw 192.168.112.14:/ /mnt/images
# cd images/
# ls -la
total 0
drwxr-xr-x. 4 root root 35 Dec 4 17:44 .
drwxr-xr-x. 3 root root 20 Nov 30 18:49 ..
drwxrwxrwx. 3 fogproject fogproject 85 Dec 4 17:52 capture
drwxrwxrwx. 5 fogproject fogproject 82 Nov 29 12:05 images
Traversing these directories will change the share-level permissions.
I’m still working on the ARM project so now I’ll update the FOS Linux client with this new file structure to see how well (if) it works.
-
As I continue down this rabbit hole, I find that in buildroot the nfs-utils package (nfs-utils.mak) has NFSv4 specifically disabled.
NFS_UTILS_CONF_OPTS = \
	--disable-nfsv4 \
	--disable-nfsv41 \
	--disable-gss \
	--disable-uuid \
	--enable-tirpc \
	--enable-ipv6 \
	--without-tcp-wrappers \
	--with-statedir=/run/nfs \
	--with-rpcgen=$(HOST_DIR)/bin/rpcgen

HOST_NFS_UTILS_CONF_OPTS = \
	--disable-nfsv4 \
	--disable-nfsv41 \
	--disable-gss \
	--disable-uuid \
	--disable-ipv6 \
	--without-tcp-wrappers \
	--with-statedir=/run/nfs \
	--disable-caps \
	--disable-tirpc \
	--without-systemd \
To enable NFSv4, these need to be changed to this
NFS_UTILS_CONF_OPTS = \
	--enable-nfsv4 \
	--enable-nfsv41 \
	--disable-gss \
	--disable-uuid \
	--enable-tirpc \
	--enable-ipv6 \
	--without-tcp-wrappers \
	--with-statedir=/run/nfs \
	--with-rpcgen=$(HOST_DIR)/bin/rpcgen

HOST_NFS_UTILS_CONF_OPTS = \
	--enable-nfsv4 \
	--enable-nfsv41 \
	--disable-gss \
	--disable-uuid \
	--disable-ipv6 \
	--without-tcp-wrappers \
	--with-statedir=/run/nfs \
	--disable-caps \
	--disable-tirpc \
	--without-systemd \
Once enabled the package and initrd need to be recompiled.
In the initrd in /bin/fog.mount in lines 17 and 19
ref: mount -o nolock,proto=tcp,rsize=32768,wsize=32768,intr,noatime "$storage" /images >/tmp/mount-output 2>&1
The mount command needs to be updated to create an NFSv4 mount.
from this
up)
    mount -o nolock,proto=tcp,rsize=32768,wsize=32768,intr,noatime "$storage" /images >/tmp/mount-output 2>&1
    ;;
down)
    mount -o nolock,proto=tcp,rsize=32768,intr,noatime "$storage" /images >/tmp/mount-output 2>&1
    ;;
to this
up)
    mount -o nolock,nfsvers=4,proto=tcp,rsize=32768,wsize=32768,intr,noatime "$storage" /images >/tmp/mount-output 2>&1
    ;;
down)
    mount -o nolock,nfsvers=4,proto=tcp,rsize=32768,intr,noatime "$storage" /images >/tmp/mount-output 2>&1
    ;;
So far, the FOG web code also needs to be updated because the NFSv4 shares are presented differently than in NFS versions before v4.
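As a rough sketch of the same change, the option strings for the two modes could be built in one place so that nfsvers=4 is only written once. The function name nfs4_mount_opts is hypothetical and is not something that exists in FOS; the option values are copied from the mount lines above.

```shell
#!/bin/sh
# Sketch only: nfs4_mount_opts is a hypothetical helper, not part of FOS.
nfs4_mount_opts() {
    # $1 = "up" (capture) or "down" (deploy), matching the case labels above
    base="nolock,nfsvers=4,proto=tcp,rsize=32768"
    case "$1" in
        up)   printf '%s\n' "$base,wsize=32768,intr,noatime" ;;
        down) printf '%s\n' "$base,intr,noatime" ;;
        *)    echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

It would be used as `mount -o "$(nfs4_mount_opts up)" "$storage" /images`. Whether that is tidier than editing the two mount lines directly is a matter of taste.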
nfsv3 share structure
/images/
/images/dev/
nfsv4 shares structure
/
/images/
/capture/
./lib/fog/bootmenu.class.php starting at 1497 https://github.com/FOGProject/fogproject/blob/171d63724131c396029992730660497d48410842/packages/web/lib/fog/bootmenu.class.php#L1497
Replacing
$storage = escapeshellcmd(
    sprintf(
        '%s:/%s/%s',
        $ip,
        trim($StorageNode->get('path'), '/'),
        ($TaskType->isCapture() ? 'dev/' : '')
    )
);
to this
$storage = escapeshellcmd(
    sprintf(
        '%s:/%s',
        $ip,
        ($TaskType->isCapture() ? 'capture/' : trim($StorageNode->get('path'), '/'))
    )
);
It’s still not clear whether the benefit is worth the cost of modifying the code to support NFSv4 over just keeping everything the same.
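To make the effect of that PHP change easier to see, here is the new path logic restated as a small shell sketch. The function name fog_storage_target is made up for illustration, the escapeshellcmd step is left out, and unlike PHP's trim() it strips only one leading and one trailing slash.

```shell
#!/bin/sh
# Sketch only: fog_storage_target is a hypothetical name restating the PHP logic.
fog_storage_target() {
    # $1 = storage node IP, $2 = storage node path (e.g. /images),
    # $3 = "capture" or anything else for deploy
    path=${2#/}      # strip one leading slash
    path=${path%/}   # strip one trailing slash
    if [ "$3" = "capture" ]; then
        # Capture tasks always go to the NFSv4 /capture share
        printf '%s:/%s\n' "$1" "capture/"
    else
        printf '%s:/%s\n' "$1" "$path"
    fi
}
```

With a storage node path of /images, a capture task gets 192.168.112.14:/capture/ and a deploy task gets 192.168.112.14:/images, where the NFSv3 code produced /images/dev/ and /images/ respectively.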
-
@george1421 We are now being required to get NFSv4 utilized in our environment; have about 3 months before they shut off our server…
I have tried to follow some of the steps that you have taken to get this working. What I am confused about is where to make modifications to the FOS linux OS?
Would you be able to provide a little more details into the steps you took to get this working?
Thank you!
-
@quinniedid Wow this has been 6 months already…
This thread really wasn’t intended to be a how-to document but more of an engineering working document to see if it is really possible to run NFSv4 with FOG. The intent was to minimize the number of open ports needed, to make firewall rule crafting a bit easier. NFSv4 brings some interesting but confusing additions to the FOG design. What I learned was that both the FOG server and FOS Linux needed some tweaks to get things working like with NFSv3.
Changes needed to FOS for NFSv4 support
Understand these instructions are for the FOG devs and not the general FOG admin. You need to know the insides of FOS Linux development to understand some of my notations. I did not test with the option port=2049. The hope is that it is the default, so it shouldn’t be needed. With a single defined port, firewall rules can be crafted much more easily than with NFSv3 and earlier.
- In buildroot, NFSv4 support must be enabled for the nfs-utils package in the nfs-utils.mak file.
NFS_UTILS_CONF_OPTS = \
	--enable-nfsv4 \
	--enable-nfsv41 \
	...
HOST_NFS_UTILS_CONF_OPTS = \
	--enable-nfsv4 \
	--enable-nfsv41 \
	...
- nfsvers=4 must be added to the mount command in the following files in the overlay fs directory
./rootfs_overlay/bin/fog line:14
./rootfs_overlay/bin/fog.mount line:17,20
./rootfs_overlay/bin/fog.av line:15
./rootfs_overlay/bin/fog.photorec
- Done. Now rebuild the initrd filesystem in buildroot.
I’ve compiled a FOG 1.5.9 NFSv4 version of the initrd here: https://drive.google.com/file/d/1EHLhmM9-kXpFO7kfk3H1ydEZF3q8lID1/view?usp=sharing
-
Changes needed on FOG server to support NFSv4
- Build the nfsv4 virtual fs mount points
mkdir -p /opt/fog/data/capture
mkdir -p /opt/fog/data/images
- (optional) Provide the same ./postinitscript files for both capture and deploy.
mkdir -p /images/dev/postinitscripts
mkdir -p /images/postinitscripts
- Edit the /etc/fstab to bind mount the virtual NFSv4 file system to the physical fog directories
/images/dev /opt/fog/data/capture none bind 0 0
/images /opt/fog/data/images none bind 0 0
If you included optional step #2 append this to the end of the /etc/fstab
/images/dev/postinitscripts /images/postinitscripts none bind 0 0
- Connect the virtual fs to the physical fs
mount -a
Now you should be able to run these commands to see if the mount works. Looking at /opt/fog/data/capture should give the same list as /images/dev, and looking at /opt/fog/data/images should give the same list as /images. If that is valid then move on to the next step.
- Now we need to get the gid and uid of the fog service account fogproject.
5.1 Run the following command to get fogproject’s uid:
grep fogproject /etc/passwd | cut -d ":" -f3
Note this value. It will most likely be 1000 or 1001 but could be anything; it depends on the host OS.
5.2 Run the following command to get fogproject’s gid:
grep fogproject /etc/passwd | cut -d ":" -f4
Note this value. It will most likely be 1000 or 1001. You will need the uid and gid values in the next step.
- Edit the /etc/exports file. In this case we are converting FOG to only operate in NFSv4 mode, so we will remove all of the FOG NFSv3 export lines and replace them with the NFSv4 export lines. Insert the following into the /etc/exports file.
/opt/fog/data/ *(fsid=0,no_subtree_check,insecure)
/opt/fog/data/images *(ro,sync,nohide,no_subtree_check,all_squash,insecure,anonuid=1001,anongid=1001)
/opt/fog/data/capture *(rw,sync,nohide,no_subtree_check,all_squash,insecure,insecure_locks,no_wdelay,anonuid=1001,anongid=1001)
Be sure to update the anonuid= and anongid= values above to match the values of the uid and gid you collected in the previous step.
- The last step depends on the FOG server’s Linux OS. We need to enable NFSv4 and disable all other NFS versions. For Debian variants you need to do this:
7.1 Edit /etc/default/nfs-common and make these adjustments
NEED_STATD="no"
NEED_IDMAPD="yes"
7.2 Edit /etc/default/nfs-kernel-server. Note that RPCNFSDOPTS is typically not included by default. Please add that option if it is not present.
RPCNFSDOPTS="-N 2 -N 3"
RPCMOUNTDOPTS="--manage-gids -N 2 -N 3"
ref: https://wiki.debian.org/NFSServerSetup
7.3 For RHEL-compatible Linux OSes, edit /etc/nfs.conf and make these changes:
[nfsd]
tcp=y
vers2=n
vers3=n
vers4=y
vers4.0=y
vers4.1=y
vers4.2=y
- Edit the following file on the FOG server. We need to have it tell FOS Linux to use /capture as the target directory instead of /images/dev. Use your favorite Linux editor and modify the file /var/www/html/fog/lib/fog/bootmenu.class.php starting at line 1497.
Changing:
$storage = escapeshellcmd(
    sprintf(
        '%s:/%s/%s',
        $ip,
        trim($StorageNode->get('path'), '/'),
        ($TaskType->isCapture() ? 'dev/' : '')
    )
);
to this
$storage = escapeshellcmd(
    sprintf(
        '%s:/%s',
        $ip,
        ($TaskType->isCapture() ? 'capture/' : trim($StorageNode->get('path'), '/'))
    )
);
- Reboot the FOG server. This will check your edits as well as restart the NFS server in v4 mode.
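The bind-mount check from the step after mount -a (each virtual directory should list the same entries as its physical directory) could be scripted roughly like this. The helper name same_listing is hypothetical, and it only compares top-level entry names, not file contents.

```shell
#!/bin/sh
# Sketch only: same_listing is a hypothetical helper name.
same_listing() {
    # $1 and $2 are directories; succeed only if their top-level entries match.
    a=$(ls -A "$1") || return 2
    b=$(ls -A "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_listing /images/dev /opt/fog/data/capture && same_listing /images /opt/fog/data/images && echo "bind mounts look right"` would print the message only when both pairs agree.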
-
@george1421 I am having a hard time understanding what is going on here. I have followed the above steps. I see that it checks the mounted file system, but once it gets to preparing the backup location, it fails…
Any ideas?
-
@quinniedid Looks like a permission issue to me. At this stage FOS just tries to create a directory and errors out if it’s not able to create the sub directory on the NFS share - see code reference.
-
@quinniedid There are two places where things could go wrong.
As the equivalent of Windows file share permissions we have the NFS shares, hopefully mapping to the fogproject user here
/opt/fog/data/ *(fsid=0,no_subtree_check,insecure)
/opt/fog/data/images *(ro,sync,nohide,no_subtree_check,all_squash,insecure,anonuid=1001,anongid=1001)
/opt/fog/data/capture *(rw,sync,nohide,no_subtree_check,all_squash,insecure,insecure_locks,no_wdelay,anonuid=1001,anongid=1001)
Hopefully the anonuid and anongid above point to the uid and gid of the fogproject user. Make sure you have the proper uid and gid for the fogproject user. What we are doing with the above export lines is making all users end up saving files as the fogproject user in the fogproject group. With NFSv3, FOS Linux would connect as root and write files as root (a bit dangerous from a security standpoint). Now all file writes and reads should happen as the non-privileged user fogproject.
The next thing we need to check is to ensure the fogproject user has read/write access to /images and everything below it. Let’s see the output of ls -la /images and ls -la /images/dev
Finally, what host OS is the FOG server running under?
-
@george1421 Here are my settings and OS information. I am using your initrd that you uploaded, don’t know if I mentioned that or not.
From the client:
With NFS debug enabled I see access denied… It seems like everything is configured correctly:
-
@quinniedid Well this is a bit troubling. From the FOG server side it looks like fogproject has access.
So let’s try this. Set up a debug capture (tick the debug check box before scheduling the capture) and PXE boot the target computer; that will drop you to the FOS Linux command prompt on the target computer. Start imaging by keying in fog. You will need to press enter at each break point. Eventually you will get to the error; press Ctrl-C to exit out. At this point /images should be mapped to /images/dev (actually the NFSv4 mount point /capture). See if you can touch a file in /images from the client computer.
If you can’t: I ran into an anomaly when setting my Debian system up. The instructions I posted earlier in this thread were created on a CentOS system, then I duplicated them on a Debian system. I found that the order of the records in the exports file changed the behavior of the read-only vs rw shares. I thought it was a fluke, but maybe there is something more to this. If you can’t create a file in /capture (actually /images/dev to the FOG OS) then swap the order of the definitions in the exports file to make it look like this:
/opt/fog/data/ *(fsid=0,no_subtree_check,insecure)
/opt/fog/data/capture *(rw,sync,nohide,no_subtree_check,all_squash,insecure,insecure_locks,no_wdelay,anonuid=1001,anongid=1001)
/opt/fog/data/images *(ro,sync,nohide,no_subtree_check,all_squash,insecure,anonuid=1001,anongid=1001)
Putting the rw directory first. Then run the command
exportfs -a
Let’s see if that has an impact on image capture. It shouldn’t matter, but we are still learning here. So…
@george1421 I have been testing quite a bit of the day with all sorts of different configurations on the server. Come to find out it appears that in the initrd that you uploaded, for some reason the NFS mount is being mounted as “Read Only File System”.
If I issue on the FOS client
mount -t nfs -o vers=4 10.0.0.5:/capture /images
I am then able to write to the /images directory no problem… The only issue is that I still can’t continue to run the capture, as it marks the /images directory as read-only again.
I don’t know what to look for in FOS for the reason why it is mounting the NFS share as a read-only file system?
-
@quinniedid Did you update the fog server boot.php file?
The fact that you can mount /capture and write to the share is a step in the right direction.
To help debugging on the target computer, when it’s in debug mode you can do this. Give root a password with passwd. Make it something simple like hello; this password will be reset on the next reboot. Get the IP address of the target computer with ip a s. From there you can remote into the target computer with Putty, logging in with root and hello. Once you are there you can run all of the commands, even imaging, from the remote session.
Now that you have a remote session, run this command and show me the output
cat /proc/cmdline
Make sure that /capture is listed in the settings and not /images. Use Putty to copy and paste the output of the command into this thread.
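For anyone who would rather check this from a script than by eye, here is a small sketch. It assumes the storage target is passed on the kernel command line as a storage= parameter, which is my assumption about how FOS receives it; storage_param and uses_nfs4_capture are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers; the "storage=" parameter name is an assumption.

# Print the value of the first storage=... word in a kernel command line string.
storage_param() {
    # $1 = a kernel command line, e.g. "$(cat /proc/cmdline)"
    for word in $1; do
        case "$word" in
            storage=*) printf '%s\n' "${word#storage=}"; return 0 ;;
        esac
    done
    return 1
}

# Succeed when the export path after "host:" is /capture (trailing slash optional).
uses_nfs4_capture() {
    target=$(storage_param "$1") || return 1
    case "${target#*:}" in
        /capture|/capture/) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the target computer this would be run as `uses_nfs4_capture "$(cat /proc/cmdline)" && echo "capture target is /capture"`. If it stays silent during a capture task, the server is most likely still handing out the old /images/dev path.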
@george1421 Thank you for that SSH tip!!! Saves me from having to go into the office almost everyday to test this stuff.
I don’t have any network access into my test environment but at least I can do it from a Windows or Linux client that’s already on the network instead of the console.
This is the command result:
Now what I am unsure of is if I did make any modification to the boot.php. I will try and find that file and make the necessary modifications.
UPDATE: I was able to change the boot.php file which is what you had mentioned at the very beginning. I was thinking that was part of the initrd file but clearly it is not. It is capturing right now. After it is done, I will be testing a deploy as well.
-
@quinniedid Ok, I think you did not update the FOG server file. It’s referencing /images/dev in the kernel parameters; it should be /capture.
WELL, for some silly reason I did not include these instructions in the steps. It was listed previously but not in the single post.
Edit /var/www/html/fog/lib/fog/bootmenu.class.php on the FOG server starting at line 1497
Replacing
$storage = escapeshellcmd(
    sprintf(
        '%s:/%s/%s',
        $ip,
        trim($StorageNode->get('path'), '/'),
        ($TaskType->isCapture() ? 'dev/' : '')
    )
);
to this
$storage = escapeshellcmd(
    sprintf(
        '%s:/%s',
        $ip,
        ($TaskType->isCapture() ? 'capture/' : trim($StorageNode->get('path'), '/'))
    )
);
-
@george1421 I was able to capture and deploy!!! Thank you for helping me through this process. I am being forced to use this starting in September now.
How much modification would it take to even just allow NFSv4 to work in FOG as it stands today?
-
@quinniedid said in Feature request for FOG 1.6.x - Configure image capture to use NFSv4 instead of NFSv3:
How much modification would it take to even just allow NFSv4 to work in FOG as it stands today?
I guess I don’t understand? You have NFSv4 running today.
From the perspective of FOG supporting NFSv4 out of the box: the request was for FOG 1.6.1 to move over to NFSv4. As you see, the tweaks are not that much to do (assuming someone includes all of the instructions).
NFSv4 does bring in some additional levels of security that are not available in v3. I guess if we have FOG admins that are willing to test NFSv4 to ensure there are no hidden gotchas it may be an easier sell. But ultimately it’s up to the developers to decide to include it or not.
-
@george1421 @quinniedid Regarding this I might point to the discussion we had about replacing NFSv3 - possibly even replacing NFS altogether: https://forums.fogproject.org/topic/14772/feature-request-for-fog-1-6-x-replace-nfsv3
I won’t find the time to lead this discussion on whether FOG should move to NFSv4 or change to an entirely different protocol. Though I think an in-depth discussion is worthwhile before heading down one or the other road.
-
@george1421 I was able to get it working, yes. I was able to do both a capture and a deploy with NFSv4 being setup.
I now have the ability to only open two ports in the firewall for all of my FOG clients: ports 2049/tcp and 80/tcp. I have a DHCP relay server that sits on each network, and it delivers both ipxe.efi and default.ipxe via TFTP; everything else is done with HTTP. I haven’t figured out a way to boot without TFTP, but it seems that only works if HTTP network boot is an option for the device. This way I do not have to expose a TFTP service on the FOG server or use a helper to get it where it needs to be.
@Sebastian-Roth Quickly looking at that discussion it seems that maybe SSH might be the future but it comes with some performance loss and some other struggles.
I would be more than happy to just have NFSv4 be the default standard because at least that is improvement until a more in depth development and assessment can be done to do something different.
Thank you all!!!
-
@quinniedid said in Feature request for FOG 1.6.x - Configure image capture to use NFSv4 instead of NFSv3:
I haven’t figured out a way to boot not using TFTP but it seems that only works if HTTP network boot is an option for the device.
The issue is the pxe rom on the target computers. They generally only speak tftp. Once iPXE gets loaded it speaks multiple languages (tftp, http, https, nfs, AoE). The problem is getting iPXE loaded in the first place over the network.
While this isn’t a sustainable solution, you can usb boot iPXE then go 100% http{s}/nfs