
Replication Bandwidth

FOG Problems
  • H
    Hanz
    last edited by Jul 17, 2019, 12:15 PM

    I’m running FOG 1.5.7 on a Fedora 30 VM with 1 main server and 10 nodes, and I’ve been using FOG since trunk began. This is the first time I remember replication being throttled oddly. Replication bandwidth for the server and the nodes is set to 1Gb, but images are only transferring at 15-20 Mb. Any ideas on what could be causing this? Up until now the replication bandwidth has always been 1 Gb on the server and 0 on the nodes (which I thought meant unlimited), and replication ran at near-gigabit speeds.

    • G
      george1421 Moderator
      last edited by Jul 17, 2019, 12:40 PM

      I don’t think this is a FOG issue specifically, but let’s look at it a bit more.

      FOG uses FTP to move files between the master node and the storage nodes; specifically, it uses the lftp utility. One of lftp’s parameters is bandwidth throttling (which one I don’t remember off the top of my head). I “think” the FOG replicator logs show the lftp command with its calling parameters, so let’s see if that bandwidth switch is set. If you can’t find it in the logs, you can capture the lftp command and its parameters while the replicator is running with:

        ps aux|grep lftp

      An FTP transfer needs to be in progress at the exact moment you run ps for the parameters to show up.
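If the ps output is long, the settings can be pulled out of the captured command line programmatically. This is only a minimal sketch: the helper name `lftp_settings` and the shortened sample line are made up for illustration, and it assumes the `set name value;` form that lftp accepts after `-e`.

```python
import re

def lftp_settings(command_line):
    """Collect 'set name value;' pairs from an lftp -e command string."""
    settings = {}
    # Each setting looks like: set net:limit-rate 0:128000000;
    for name, value in re.findall(r"\bset\s+(\S+)\s+([^;]+);", command_line):
        settings[name] = value.strip()
    return settings

# Shortened, hypothetical sample in the shape that `ps aux` prints.
sample = (
    'lftp -e set xfer:log 1; set net:max-retries 10;set net:timeout 30; '
    'set net:limit-rate 0:128000000; mirror -c -R "/images/a" "/images/a";exit'
)

print(lftp_settings(sample).get("net:limit-rate"))  # 0:128000000
```

A missing `net:limit-rate` key would mean that no throttle was passed on the command line at all.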

      So the first step is just to see whether FOG is misbehaving. The next step is probably to test by manually FTPing a 100MB or larger file from the FOG server to a storage node. It’s possible that a change in the network infrastructure has slowed the FTP process.

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

      • H
        Hanz @george1421
        last edited by Hanz Jul 17, 2019, 7:00 AM Jul 17, 2019, 12:53 PM

        @george1421

        root      8682  0.1  0.1 227832  8956 ?        S    08:04   0:02 lftp -e set xfer:log 1; set xfer:log-file /opt/fog/log/fogreplicator.Sysprep-Win10EDU-X64.transfer.Shermanhigh.log;set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-rate 0:128000000; mirror -c --parallel=20 -R --ignore-time -vvv --exclude ".srvprivate" "/images/SysprepImagex64" "/images/SysprepImagex64";exit -u fogproject,wXdfeo5CxUKy x.x.x.x
        root      8734  0.1  0.1 227832  8892 ?        S    08:04   0:02 lftp -e set xfer:log 1; set xfer:log-file /opt/fog/log/fogreplicator.SysprepWin10EDU.transfer.Shermanhigh.log;set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-rate 0:128000000; mirror -c --parallel=20 -R --ignore-time -vvv --exclude ".srvprivate" "/images/SysprepWin10EDU" "/images/SysprepWin10EDU";exit -u fogproject,wXdfeo5CxUKy x.x.x.x
        root      9033  0.1  0.1 228060  9268 ?        S    08:04   0:03 lftp -e set xfer:log 1; set xfer:log-file /opt/fog/log/fogreplicator.Win10-1903-UEFI.transfer.Shermanhigh.log;set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-rate 0:128000000; mirror -c --parallel=20 -R --ignore-time -vvv --exclude ".srvprivate" "/images/Win10-1903-UEFI" "/images/Win10-1903-UEFI";exit -u fogproject,wXdfeo5CxUKy x.x.x.x
        bcs      39132  0.0  0.0 215744   896 pts/2    S+   08:52   0:00 grep --color=auto lftp
        

        This is the current state while moving an ~8Gb image to a single node.

        [Mod note] Fixed the entry for readability -Geo

        I’ll also note that this connection has been and still is gigabit, and there have been no changes to the infrastructure.

        • G
          george1421 Moderator @Hanz
          last edited by george1421 Jul 17, 2019, 7:04 AM Jul 17, 2019, 1:00 PM

          @Hanz said in Replication Bandwidth:

          set net:limit-rate 0:128000000;

          OK, from the call parameters we can see this rate limit is set; now I just have to decode it. It looks like it’s 128MB/s (128,000,000 bytes/s, or about 1.02 Gbit/s), which is just above the 1GbE theoretical maximum of 125MB/s.

          From: https://www.toysdesk.com/2013/11/lftp-limit-bandwidth-upload-download/

          set net:limit-rate 0:512000
          

          The first value in net:limit-rate is the download limit and the second value (after the colon) is the upload limit.

          So in your case there is no limit for download, and the upload limit is 128MB/s. Unless lftp is doing something strange, it should not be rate limiting the transfer. Since this lftp command runs from the perspective of the FOG master node, the upload rate limits the data speed coming out of the FOG master node.
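To double-check that arithmetic, here is a small sketch that decodes the value seen in the ps output. The helper names are hypothetical, and it assumes the two-number `download:upload` form in bytes per second, with 0 meaning unlimited.

```python
GIGABIT_BYTES_PER_S = 1_000_000_000 / 8  # 1GbE line rate: 125,000,000 bytes/s

def parse_limit_rate(value):
    """Split a 'download:upload' limit-rate value into two ints (bytes/s)."""
    download, upload = (int(part) for part in value.split(":"))
    return download, upload

def to_mbit(bytes_per_s):
    """Convert bytes per second to megabits per second."""
    return bytes_per_s * 8 / 1_000_000

download, upload = parse_limit_rate("0:128000000")
print(download)                      # 0, i.e. unlimited
print(to_mbit(upload))               # 1024.0
print(upload > GIGABIT_BYTES_PER_S)  # True: the cap sits above line rate
```

Since the cap works out to 1024 Mbit/s, a transfer crawling at 15-20 Mb cannot be explained by this setting alone.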

          • G
            george1421 Moderator
            last edited by Jul 17, 2019, 1:08 PM

            So the next question is: how can we prove whether or not the FOG server is at fault?

            Maybe by stopping the FOG replicator and killing off all of the lftp processes, then manually copying a file from the FOG server to the remote storage node. Then repeat the process from a Windows computer on the same subnet as the FOG server to the same remote storage node. Basically, it’s about creating a truth table of what works and what doesn’t. I can say that 3 concurrent FTP sessions are probably filling up the 1GbE link on your FOG server with replication traffic alone. I am a bit surprised to see 3 sessions running at once, since I thought the FOG replicator was serial in nature, not parallel.
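One way to fill in that truth table with numbers is to time a plain FTP upload from each source to each node. The sketch below uses Python's standard ftplib rather than lftp; the host, credentials, and file paths in the usage comment are placeholders, not values from this setup.

```python
import time
from ftplib import FTP

def mbit_per_s(num_bytes, seconds):
    """Average throughput in megabits per second."""
    return num_bytes * 8 / seconds / 1_000_000

def timed_upload(host, user, password, local_path, remote_name):
    """Upload one file over FTP and return the average rate in Mbit/s."""
    sent = 0

    def count(block):
        # Called by ftplib after each block has been sent.
        nonlocal sent
        sent += len(block)

    with FTP(host, user, password) as ftp, open(local_path, "rb") as fh:
        start = time.monotonic()
        ftp.storbinary(f"STOR {remote_name}", fh, blocksize=65536, callback=count)
        elapsed = time.monotonic() - start
    return mbit_per_s(sent, elapsed)

# Hypothetical usage; every argument here is a placeholder:
# print(timed_upload("storage-node.example", "fogproject", "secret",
#                    "/tmp/test-100MB.bin", "/images/test-100MB.bin"))

print(mbit_per_s(125_000_000, 1))  # 1000.0, i.e. a saturated gigabit link
```

Running the same upload against a known-good node and the suspect node gives two figures that can be compared directly, independent of the FOG replicator.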

            • H
              Hanz @george1421
              last edited by Hanz Jul 17, 2019, 7:26 AM Jul 17, 2019, 1:10 PM

              @george1421 The weird and suspicious part is that 1.5.5 was the last known working version, 1.5.7… I’ll also note that the parallel behavior has always worked that way, at least for quite some versions now. I could transfer a single newly captured image to all nodes at the same time and they would all run at their respective top speeds (I have 4 schools that only have a 100Mb connection; the rest are all gigabit). This speed is definitely new, though.

              Note: I just manually transferred the file via FTP to another node and the speed was back to normal… then I tried the node in question, and it is indeed a port going bad or the cable itself, only allowing a top speed of 10-15 Mb. This can be closed, but I appreciate the new command.

              Copyright © 2012-2024 FOG Project