    Feature request for FOG 1.6.x - Replace NFSv3

    • Q
      Quazz Moderator
      last edited by Quazz

      I am personally a fan of an SSH/SCP solution. It’s a very familiar protocol, secure and pretty straightforward. SSH ports are likely already configured in firewalls as well. Also has pretty good error handling.

      Tools like socat are cool, but I think a lot of people are not very familiar with them and since you’d need SSH or the like to get it going anyway, it seems like an extra step without any clear benefit (unless I’m missing something).

      The only nod towards socat I’d give is that it is likely more reliable in network transfers, but this comes at the cost of needing another port open in the firewall.

      It also would be kind of ironic to move away from NFS because of insecure open ports only to then turn around and open an insecure port anyway lol.

      • george1421G
        george1421 Moderator @Quazz
        last edited by george1421

        @Quazz said in Feature request for FOG 1.6.x - Replace NFSv3:

        Tools like socat are cool, but I think a lot of people are not very familiar with them and since you’d need SSH or the like to get it going anyway, it seems like an extra step without any clear benefit (unless I’m missing something).

        In initial testing, performance between scp/socat/nfs was pretty much the same. Understand that I was working with a 10GB file of all zeros, so I don’t know the impact of real data on the transfer speeds.

        From FOS’ perspective I kind of put nfs and ssh in one camp and socat/netcat in another. With nfs and ssh the target computer can push or pull arbitrary files under the direction of the FOG code. With socat there needs to be coordination between the FOG server and the FOS Engine, because socat is a throw/catch program: one side has to be listening before the other side sends. I think it would be easier to use ssh, as it kind of parallels the way NFS is used today.

        It also would be kind of ironic to move away from NFS because of insecure open ports only to then turn around and open an insecure port anyway lol.

        One option is to move FOS/FOG to nfsv4 and that consolidates everything down to a single well known port. With nfsv4 we can also introduce authentication so the NFS share won’t be just open to the world for writing. NFSv4 won’t address data security in transit, but it will help protect data at rest.

        The downside with using port 22 ssh is there may be some policies where a certain encryption structure must be used and changing the sshd in certain circumstances will break imaging. The thought would be to then spin up a new sshd server on a different port so the sshd configuration could be tightly managed by FOG.

        I’m not saying there is a right answer yet, only that this is what I see. On protocol alone, all of the methods were within a few seconds of each other for pure data transfer.

        • S
          Sebastian Roth Moderator
          last edited by Sebastian Roth

          @george1421 said in Feature request for FOG 1.6.x - Replace NFSv3:

          The downside with using port 22 ssh is there may be some policies where a certain encryption structure must be used and changing the sshd in certain circumstances will break imaging. The thought would be to then spin up a new sshd server on a different port so the sshd configuration could be tightly managed by FOG.

          Hmmm, there are pros and cons on both sides, whether we use the default SSH on port 22 or spin up an extra one on another port. Whichever we decide, there will be setups that can’t handle it one way or the other. So I would suggest we default to port 22 but build the scripts in such a way that it’s fairly easy for anyone to switch to a non-standard SSH port if needed. @george1421 @Quazz What do you think?

          We’ll need to work out a proof of concept over the next weeks to see if it all works anyway.

          Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

          Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

          • Q
            Quazz Moderator
            last edited by

            Some reading to consider: https://www.linuxjournal.com/content/encrypting-nfsv4-stunnel-tls

            Mentions SSHFS as well (even faster than clear text NFS in their tests??)

            I can’t really decide, in the end. Each approach has its own set of downsides and upsides it looks like.

            What is most important? Reliability (eg NFS restarting TCP transactions), Security (encrypting the data stream), Maintainability (KISS), Performance (NFS likely slower than SSH pipe)

            Additionally, I wonder if we would see differences in performance when we compare transfer performance of a static file vs a data stream. Or perhaps this consideration is irrelevant since more than likely the bottleneck won’t be network transfer anyway, right?

            • Q
              Quazz Moderator @Sebastian Roth
              last edited by

              @Sebastian-Roth said in Feature request for FOG 1.6.x - Replace NFSv3:

              @george1421 said in Feature request for FOG 1.6.x - Replace NFSv3:

              The downside with using port 22 ssh is there may be some policies where a certain encryption structure must be used and changing the sshd in certain circumstances will break imaging. The thought would be to then spin up a new sshd server on a different port so the sshd configuration could be tightly managed by FOG.

              Hmmm, there are pros and cons on both sides, whether we use the default SSH on port 22 or spin up an extra one on another port. Whichever we decide, there will be setups that can’t handle it one way or the other. So I would suggest we default to port 22 but build the scripts in such a way that it’s fairly easy for anyone to switch to a non-standard SSH port if needed. @george1421 @Quazz What do you think?

              We’ll need to work out a proof of concept over the next weeks to see if it all works anyway.

              I agree with trying to stick to 22 where possible, but to make it configurable. I can imagine some environments have custom ports.
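A minimal sketch of what "default to 22, but configurable" could look like in a script. The names `FOG_SSH_PORT`, `FOG_SSH_USER`, and `fog_ssh_cmd` are made up for illustration and are not existing FOG settings; only `ssh -p` is a real OpenSSH option.

```shell
#!/bin/sh
# Hypothetical helper: FOG_SSH_PORT and FOG_SSH_USER are illustrative names,
# not settings that FOG defines.

# Print the ssh command prefix for a host, falling back to port 22 and root
# when the variables are unset or empty.
fog_ssh_cmd() {
    host=$1
    printf 'ssh -p %s %s@%s\n' "${FOG_SSH_PORT:-22}" "${FOG_SSH_USER:-root}" "$host"
}

# With nothing configured, the standard port is used.
fog_ssh_cmd 192.168.10.1

# A site with a dedicated sshd on another port only changes one variable.
# The subshell keeps the override from leaking into the rest of the script.
(FOG_SSH_PORT=2222; fog_ssh_cmd 192.168.10.1)
```

The printed prefix could then be placed in front of whatever remote command the imaging scripts run, so the port choice lives in one place instead of being repeated in every call.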

              • george1421G
                george1421 Moderator
                last edited by

                Updated benchmarks. FOG server 1.5.9 with kernel 4.19.145 (a guess) running on a Dell o7010. The target computer is also a Dell o7010; both server and target have SATA SSDs. All copy tests use a 10GB file.

                Create a 10GB file from the target computer on the FOG server’s hard drive over NFS

                # time dd if=/dev/zero of=r10-1gb.img count=1024 bs=10485760
                1024+0 records in
                1024+0 records out
                10737418240 bytes (11 GB, 10 GiB) copied, 93.0698 s, 115 MB/s
                real    1m33.072s
                user    0m0.013s
                sys     0m4.699s
                

                Copy the file to the FOG server using scp, three times. Each run includes entering the root password on the FOG server.

                # time scp /mnt/t2/r10gb.img root@192.168.10.1:/images/r11gb.img
                The authenticity of host '192.168.10.1 (192.168.10.1)' can't be established.
                ECDSA key fingerprint is SHA256:OpIsFYWVDCr/ovMlmPPSl46jpT332P3+BHnchdxzTCI.
                Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
                Warning: Permanently added '192.168.10.1' (ECDSA) to the list of known hosts.
                root@192.168.10.1's password:
                r10gb.img                                                    100%   10GB 111.1MB/s   01:32
                real    1m43.380s
                user    0m44.117s
                sys     0m12.580s
                
                # time scp /mnt/t2/r10gb.img root@192.168.10.1:/images/r11gb.img
                root@192.168.10.1's password:
                r10gb.img                                                    100%   10GB 111.1MB/s   01:32
                real    1m35.493s
                user    0m44.476s
                sys     0m12.223s
                
                # time scp /mnt/t2/r10gb.img root@192.168.10.1:/images/r11gb.img
                root@192.168.10.1's password:
                r10gb.img                                                    100%   10GB 111.1MB/s   01:32
                real    1m35.447s
                user    0m44.404s
                sys     0m11.946s
                

                Timing using piping over ssh instead of scp

                # time cat /mnt/t2/r10gb.img | ssh root@192.168.10.1 "cat > /images/r12gb.img"
                root@192.168.10.1's password:
                real    1m36.133s
                user    0m43.906s
                sys     0m11.090s
                
                # time cat /mnt/t2/r10gb.img | ssh root@192.168.10.1 "cat > /images/r12gb.img"
                root@192.168.10.1's password:
                real    1m36.794s
                user    0m43.751s
                sys     0m12.099s
                

                While the CPU load is heavier on both the target computer and the FOG server when using ssh, the actual copy times are almost identical between nfs, scp, and ssh. Only the CPU load increased when sshd was involved.
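To put the timings above on a common scale, the elapsed times can be converted into throughput. This is only a rough sketch: `mb_per_s` is an illustrative helper name, and the scp figure uses the wall-clock "real" time, which also includes typing the password.

```shell
#!/bin/sh
# Convert a byte count and an elapsed time in seconds into MB/s
# (1 MB = 1,000,000 bytes, the unit dd uses in its summary line).
mb_per_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

SIZE=10737418240            # the 10 GiB test file

mb_per_s "$SIZE" 93.0698    # dd over NFS
mb_per_s "$SIZE" 95.493     # second scp run, "real" time
```

Both results come out at roughly 112 to 115 MB/s, which is the "almost identical" picture described above.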

                • Tom ElliottT
                  Tom Elliott @george1421
                  last edited by

                  @george1421 Would it be better to use SCP or RSYNC?

                  Can you run an example using RSYNC to establish the “SSH” connection and transfer to see what the FOG Server and Client load looks like?

                  I think you’ll see the same types of speeds. I think part of the issue with the cat pipe cat “load” is due mostly to the 2 processes being opened plus the addition of the SSH establishment.

                  If we are just looking to test ssh, scp is the best tool for the job, though rsync will probably give us more configuration options.

                  Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

                  • george1421G
                    george1421 Moderator
                    last edited by george1421

                    Interesting, I repeated the same test with the 5.6.18 kernel and got faster transfer times.

                    Kernel 5.6.18
                    Straight file copy over NFS

                    # time cp r10gb.img /mnt/t2/                                
                    real    0m46.336s
                    user    0m0.052s
                    sys     0m7.169s
                    
                    # time cp r10gb.img /mnt/t2/
                    real    0m48.108s
                    user    0m0.045s
                    sys     0m8.881s
                    
                    

                    Now scp

                    # time scp /mnt/t2/r10gb.img root@192.168.10.1:/images/r11gb.img
                    root@192.168.10.1's password:
                    r10gb.img                                                    100% 6875MB 111.1MB/s   01:01
                    real    1m5.796s
                    user    0m29.704s
                    sys     0m6.750s
                    

                    Now piped over ssh

                    # time cat /mnt/t2/r10gb.img | ssh root@192.168.10.1 "cat > /images/r12gb.img"
                    root@192.168.10.1's password:
                    real    1m5.241s
                    user    0m29.134s
                    sys     0m6.849s
                    
                    # I had to repeat it a second time just to confirm it was actually 30 
                    #seconds improvement
                    #
                    # time cat /mnt/t2/r10gb.img | ssh root@192.168.10.1 "cat > /images/r12gb.img"
                    root@192.168.10.1's password:
                    
                    real    1m6.662s
                    user    0m29.833s
                    sys     0m6.966s
                    

                    So for a straight nfs copy, kernel 5.6.18 is about 45 seconds faster copying the file. For the ssh route it was about 30 seconds faster with 5.6.18 than with 4.19.145.

                    • george1421G
                      george1421 Moderator @Tom Elliott
                      last edited by george1421

                      @Tom-Elliott said in Feature request for FOG 1.6.x - Replace NFSv3:

                      Would it be better to use SCP or RSYNC?

                      I don’t know the answer at the moment but I can/will surely test it. I have some screenshots of CPU loading while doing these transfers with the 5.6.18 kernel. I set up rsyncd on one of my servers and I’m using it to evacuate a second physical server of data. It seems pretty fast moving 3.5GB image files. Just for disclosure, this is on a 10GbE network.

                      3,515,218,762,752  20%  176.05MB/s    5:17:22 (xfr#70, to-chk=213/284)
                      

                      If the ssh/encryption route is chosen, I want to look into the kernel to ensure it has all of the crypto APIs enabled and, if they are enabled, whether they have an impact on transport times.

                      • S
                        Sebastian Roth Moderator
                        last edited by

                        @george1421 said in Feature request for FOG 1.6.x - Replace NFSv3:

                        root@192.168.10.1's password:
                        r10gb.img                                                    100% 6875MB 111.1MB/s   01:01
                        real    1m5.796s
                        

                        I assume something went wrong with the test file here. You seem to get faster copy because the file is smaller - 6875 MB vs. 10 GB in the last tests. Transfer rate in scp was and is around 111 MB/s!
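The arithmetic behind this observation can be checked quickly. A small sketch, assuming the sizes scp printed are in units of 1024-based MB; `transfer_seconds` is an illustrative helper name.

```shell
#!/bin/sh
# Expected transfer time in seconds for a given size (MB) at a given rate (MB/s).
transfer_seconds() {
    awk -v mb="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", mb / rate }'
}

transfer_seconds 6875 111.1     # the size scp reported in the 5.6.18 run
transfer_seconds 10240 111.1    # a full 10GB file (10 * 1024 MB)
```

At the same 111.1MB/s, the smaller file needs about 62 seconds and the full file about 92 seconds, in line with the 01:01 and 01:32 that scp displayed. The shorter run is explained by the file size alone, not by a faster kernel.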

                        • S
                          Sebastian Roth Moderator
                          last edited by

                          @Tom-Elliott said in Feature request for FOG 1.6.x - Replace NFSv3:

                          Would it be better to use SCP or RSYNC?

                          In essence we need something that is able to pipe contents of a single file to partclone for writing to disk or the other way round. I don’t see how rsync (used for many files) or scp would help us to do this. While you can actually scp into/from stdin/out I can’t see this being much of a gain compared to using sshfs where we mount the remote filesystem directly.
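The single-stream requirement described here can be sketched with a pluggable transport. Everything named below (`TRANSPORT`, `stream_upload`, `stream_download`) is hypothetical and not FOG code. The transport defaults to a local `sh -c` so the sketch runs without a server; on a real setup it might be an ssh command prefix instead, with the imaging tool on the other end of the pipe.

```shell
#!/bin/sh
# TRANSPORT is the command that carries the byte stream to the far side.
# Assumption: on a real deployment this could be e.g. "ssh root@192.168.10.1".
# The local default below is only a stand-in for demonstration.
TRANSPORT=${TRANSPORT:-sh -c}

# Read an image stream on stdin and write it to a path on the far side.
stream_upload() {
    dest=$1
    # $TRANSPORT is deliberately unquoted so "sh -c" splits into two words.
    $TRANSPORT "cat > '$dest'"
}

# Emit a file from the far side on stdout, ready to pipe into an imaging tool.
stream_download() {
    src=$1
    $TRANSPORT "cat '$src'"
}

# Usage idea (commented out, since it needs a real producer and consumer):
#   some_imaging_tool_reading_disk | stream_upload /images/dev/d1p1.img
#   stream_download /images/d1p1.img | some_imaging_tool_writing_disk
```

This covers only the pipe-over-ssh variant that was benchmarked in this thread. With sshfs the remote directory would be mounted instead, and the imaging tool would read and write ordinary file paths.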

                          • george1421G
                            george1421 Moderator
                            last edited by george1421

                            This post is deleted!
                            • S
                              Sebastian Roth Moderator
                              last edited by Sebastian Roth

                              @george1421 said in Feature request for FOG 1.6.x - Replace NFSv3:

                              Their testing shows that Ubuntu 20.04 moves data the fastest, then 18.04, CentOS 8, and finally CentOS 7 is the slowest.

                              What protocol are they using? Some proprietary stuff, I’d imagine. That would break it down to subsystem IO being faster on newer kernels and Ubuntu leveraging some kind of optimized IO?!

                              FOG Server ssh pipeline

                              That picture shows both an scp and an ssh command. So either one is spawned from the other (kind of likely when I look at the many command line options and PIDs) or you ran two commands in parallel. The headline “ssh pipeline” doesn’t fit, I would think.

                              • george1421G
                                george1421 Moderator @Sebastian Roth
                                last edited by

                                @Sebastian-Roth I’m going to redo those stats this morning and delete the first ones. I think I botched getting the pictures aligned with the tests. I’ll fully document the testing protocol so it can be duplicated if we need verification.

                                Transfer rate in scp was and is around 111 MB/s!

                                Understand that both the fog server and target computer are on an isolated network, with file transfer as their main task rather than servicing 100s of client computers with the fog client installed. The 111MB/s tells me the scp speed is being bottlenecked by the network (1GbE ~= 125MB/s theoretical max). I’ll test local and remote write speeds on each system this morning.
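The 125MB/s figure and how close the observed rate comes to it can be worked out as follows. A quick sketch with illustrative helper names; it ignores Ethernet, IP, TCP, and ssh framing overhead, which is why real transfers stay below the ceiling.

```shell
#!/bin/sh
# Theoretical ceiling of a link in MB/s: megabits per second divided by 8.
link_ceiling_mb() {
    awk -v mbit="$1" 'BEGIN { printf "%.1f\n", mbit / 8 }'
}

# Share of that ceiling reached by an observed rate, as a percentage.
link_utilisation() {
    awk -v rate="$1" -v mbit="$2" 'BEGIN { printf "%.1f\n", rate / (mbit / 8) * 100 }'
}

link_ceiling_mb 1000            # 1GbE
link_utilisation 111.1 1000     # scp rate reported in the tests
```

An observed 111.1MB/s is close to 89% of the raw 1GbE ceiling, which supports reading the network as the limiting factor rather than the protocol.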

                                • george1421G
                                  george1421 Moderator
                                  last edited by george1421

                                  Testing systems: Dell o7010 for both the fog server and the client computer. Both systems have local SATA SSDs. The target computer is running a customized linux kernel 5.6.18 and a customized init, both based on FOG 1.5.9. The customization was done to aid in debugging and benchmarking the systems.

                                  Testing script

                                  mkdir /mnt/locdsk
                                  mount /dev/sda1 /mnt/locdsk
                                  mkdir /images
                                  mount -o nolock,proto=tcp,rsize=32768,wsize=32768,intr,noatime "192.168.10.1:/images/dev" /images 
                                  
                                  #Test 1 creation of local and remote file by target computer
                                  time dd if=/dev/zero of=/mnt/locdsk/L10gb.img count=1024 bs=10485760
                                  time dd if=/dev/zero of=/images/R10gb.img count=1024 bs=10485760
                                  
                                  #Test 2 cp files to and from server
                                  time cp /mnt/locdsk/L10gb.img /images
                                  time cp /mnt/locdsk/L10gb.img /images/L10gb-1.img
                                  
                                  time cp /images/R10gb.img /mnt/locdsk
                                  time cp /images/R10gb.img /mnt/locdsk/R10gb-1.img
                                  
                                  #Test 3 scp files to and from server
                                  time scp /mnt/locdsk/L10gb.img root@192.168.10.1:/images/L10gb-2.img
                                  time scp /mnt/locdsk/L10gb.img root@192.168.10.1:/images/L10gb-3.img
                                  
                                  time scp root@192.168.10.1:/images/dev/R10gb.img /mnt/locdsk/R10gb-2.img
                                  time scp root@192.168.10.1:/images/dev/R10gb.img /mnt/locdsk/R10gb-3.img
                                  
                                  #Test 4 ssh pipeline to and from server
                                  time cat /mnt/locdsk/L10gb.img | ssh root@192.168.10.1 "cat > /images/L10gb-4.img"
                                  time cat /mnt/locdsk/L10gb.img | ssh root@192.168.10.1 "cat > /images/L10gb-5.img"
                                  
                                  time ssh root@192.168.10.1 "cat /images/dev/R10gb.img" | cat > /mnt/locdsk/L10gb-6.img
                                  time ssh root@192.168.10.1 "cat /images/dev/R10gb.img" | cat > /mnt/locdsk/L10gb-7.img
                                  

                                  Testing results as captured.

                                  ## Building the test files both local and remote
                                  # time dd if=/dev/zero of=/mnt/locdsk/L10gb.img count=1024 bs=10485760
                                  10737418240 bytes (11 GB, 10 GiB) copied, 20.2216 s, 531 MB/s
                                  real    0m20.223s	user    0m0.001s	sys     0m6.460s

                                  # time dd if=/dev/zero of=/images/R10gb.img count=1024 bs=10485760
                                  10737418240 bytes (11 GB, 10 GiB) copied, 93.3867 s, 115 MB/s
                                  real    1m33.390s	user    0m0.003s	sys     0m5.369s
                                  
                                  ## Confirm that files exist and are properly sized
                                  # ls -la /mnt/locdsk/
                                  total 10485785
                                  drwxr-xr-x 3 root root        4096 Oct  9 08:25 .
                                  drwxr-xr-x 3 root root        1024 Oct  9 08:23 ..
                                  -rw-r--r-- 1 root root 10737418240 Oct  9 08:26 L10gb.img
                                  drwx------ 2 root root       16384 Jan 10  2013 lost+found
                                  
                                  # ls -la /images/
                                  total 10519109
                                  drwxrwxrwx  3 sshd root          63 Oct  9  2020 .
                                  drwxr-xr-x 19 root root        1024 Oct  9 08:23 ..
                                  -rwxrwxrwx  1 sshd root           0 Sep 28 13:36 .mntcheck
                                  -rw-r--r--  1 root root 10737418240 Oct  9  2020 R10gb.img
                                  drwxrwxrwx  2 sshd root          26 Sep 28 13:36 postinitscripts
                                  
                                  ### Copy Local to Remote ###
                                  # time cp /mnt/locdsk/L10gb.img /images
                                  real    1m34.821s	user    0m0.083s	sys     0m7.314s

                                  # time cp /mnt/locdsk/L10gb.img /images/L10gb-1.img
                                  real    1m34.759s	user    0m0.046s	sys     0m6.801s
                                  

                                  cp_local_remote_client.png
                                  cp_local_remote_server.png

                                  ### Copy Remote to Local ###
                                  # time cp /images/R10gb.img /mnt/locdsk
                                  real    1m41.710s	user    0m0.084s	sys     0m11.327s

                                  # time cp /images/R10gb.img /mnt/locdsk/R10gb-1.img
                                  real    1m41.520s	user    0m0.095s	sys     0m11.392s
                                  

                                  cp_remote_local_client.png
                                  cp_remote_local_server.png

                                  ### SCP Local to Remote ###
                                  # time scp /mnt/locdsk/L10gb.img root@192.168.10.1:/images/L10gb-2.img
                                  The authenticity of host '192.168.10.1 (192.168.10.1)' can't be established.
                                  ECDSA key fingerprint is SHA256:OpIsFYWVDCr/ovMlmPPSl46jpT332P3+BHnchdxzTCI.
                                  Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
                                  Warning: Permanently added '192.168.10.1' (ECDSA) to the list of known hosts.
                                  root@192.168.10.1's password:
                                  L10gb.img                                                      100%   10GB 110.0MB/s   01:33
                                  real    1m40.007s	user    0m44.460s	sys     0m13.378s

                                  # time scp /mnt/locdsk/L10gb.img root@192.168.10.1:/images/L10gb-3.img
                                  root@192.168.10.1's password:
                                  L10gb.img                                                      100%   10GB 109.5MB/s   01:33
                                  real    1m37.404s	user    0m44.420s	sys     0m13.068s
                                  

                                  scp_local_remote_client.png
                                  scp_local_remote_server.png

                                  ### SCP Remote to Local ###
                                  # time scp root@192.168.10.1:/images/dev/R10gb.img /mnt/locdsk/R10gb-2.img
                                  root@192.168.10.1's password:
                                  R10gb.img                                                      100%   10GB 101.9MB/s   01:40
                                  real    1m44.166s	user    0m43.986s	sys     0m22.887s

                                  # time scp root@192.168.10.1:/images/dev/R10gb.img /mnt/locdsk/R10gb-3.img
                                  root@192.168.10.1's password:
                                  R10gb.img                                                      100%   10GB 102.0MB/s   01:40
                                  real    1m44.620s	user    0m43.437s	sys     0m23.061s
                                  

                                  scp_remote_local_client.png
                                  scp_remote_local_server.png

                                  ### SSH Pipeline Local to Remote ###
                                  # time cat /mnt/locdsk/L10gb.img | ssh root@192.168.10.1 "cat > /images/L10gb-4.img"
                                  root@192.168.10.1's password:
                                  real    1m35.562s	user    0m42.701s	sys     0m12.975s

                                  # time cat /mnt/locdsk/L10gb.img | ssh root@192.168.10.1 "cat > /images/L10gb-4.img"
                                  root@192.168.10.1's password:
                                  real    1m35.749s	user    0m43.478s	sys     0m11.166s
                                  

                                  ssh_local_remote_client.png
                                  ssh_local_remote_server.png

                                  ### SSH Pipeline Remote to Local ###
                                  # time ssh root@192.168.10.1 "cat /images/dev/R10gb.img" | cat > /mnt/locdsk/L10gb-6.img
                                  root@192.168.10.1's password:
                                  real    1m43.745s	user    0m44.738s	sys     0m20.828s

                                  # time ssh root@192.168.10.1 "cat /images/dev/R10gb.img" | cat > /mnt/locdsk/L10gb-7.img
                                  root@192.168.10.1's password:
                                  real    1m43.564s	user    0m43.976s	sys     0m21.966s
                                  

                                  ssh_remote_local_client.png
                                  ssh_remote_local_server.png
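To compare the captured runs side by side, the "real" times can be converted to seconds and averaged per test. This is just a convenience sketch: the helper names are illustrative, the input values are copied from the results above, and the scp and ssh figures include the time spent typing the password (plus the host key prompt on the first scp run).

```shell
#!/bin/sh
# Convert a time(1) style value such as "1m34.821s" into seconds.
to_seconds() {
    printf '%s\n' "$1" |
        awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Average any number of time(1) style values, printed in seconds.
average_seconds() {
    for t in "$@"; do to_seconds "$t"; done |
        awk '{ sum += $1 } END { if (NR) printf "%.1f\n", sum / NR }'
}

# Print one labelled line per test.
report() {
    label=$1
    shift
    printf '%-28s %s s\n' "$label" "$(average_seconds "$@")"
}

report "cp local->remote (NFS)"  1m34.821s 1m34.759s
report "cp remote->local (NFS)"  1m41.710s 1m41.520s
report "scp local->remote"       1m40.007s 1m37.404s
report "scp remote->local"       1m44.166s 1m44.620s
report "ssh pipe local->remote"  1m35.562s 1m35.749s
report "ssh pipe remote->local"  1m43.745s 1m43.564s
```

With two runs per test the averages are only indicative, but they make it easier to see that the three methods stay within a few seconds of each other in each direction.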

                                  Copyright © 2012-2025 FOG Project