
Storage Node Re-Writing Images Daily and Crushing My Network

FOG Problems
  • F
    FallingWax
    last edited by Jan 6, 2017, 8:11 PM

    Running Version 1.3.0
    SVN Revision: 6050

    While troubleshooting some network speed issues, I came across this error in Wireshark:

    [Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]

    This was communication between my main FOG server and a storage node. I looked at the image files, and it appears the same images are being rewritten. I haven’t created anything new, but the date on the images has changed every day. This is crushing my network speeds so badly that I had to stop the ImageReplicator service, which immediately fixed the problem. Any help would be appreciated!

    Thanks

    • T
      Tom Elliott
      last edited by Jan 6, 2017, 9:18 PM

      If your network is constantly erroring out, retransmissions would be expected. The FOG replication service is aware of what is replicating and what is not, so if a transfer stops before the files are fully copied, it might be rewriting them because something is killing the connections.

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

      • F
        FallingWax @Tom Elliott
        last edited by Jan 6, 2017, 9:20 PM

        @Tom-Elliott Is there any logging I might look at for the Image Replicator to see if that is what is happening?

        • T
          Tom Elliott @FallingWax
          last edited by Jan 6, 2017, 9:23 PM

          @FallingWax /var/log/fog/fogreplicator.log and/or /var/log/fog/fogsnapinrep.log

          Then there’s the nodes getting the files:

          /var/log/fog/fogreplicator.log.transfer.<nodename>.log
          And/or
          /var/log/fog/fogsnapinrep.log.transfer.<nodename>.log

          (There’s not much else logged with regard to replication.)

          • F
            FallingWax
            last edited by Jan 6, 2017, 10:01 PM

            It looks like I found three images that either don’t finish replicating or don’t replicate properly, and they are being written and deleted over and over. Those images work correctly on the main machine, so I would hesitate to remove them.

            This is what I see in the log:

            | Image Name: Dell_7040_Win10_x64
            [01-06-17 5:48:56 pm] | Dell_7040_Win10_x64: No need to sync d1.mbr file to 19$
            [01-06-17 5:48:56 pm] | Dell_7040_Win10_x64: No need to sync d1.partitions fil$
            [01-06-17 5:48:56 pm] | Dell_7040_Win10_x64: No need to sync d1p1.ebr file to $
            [01-06-17 5:48:57 pm] | Dell_7040_Win10_x64: No need to sync d1p2.img file to $
            [01-06-17 5:48:57 pm] | Files do not match.
            [01-06-17 5:48:57 pm] * Deleting remote file: /images/Dell7040Win10x64/d1p3.img
            [01-06-17 5:48:58 pm] | Files do not match.
            [01-06-17 5:48:58 pm] * Deleting remote file: /images/Dell7040Win10x64/d1p4.img
            [01-06-17 5:48:58 pm] | Dell_7040_Win10_x64: No need to sync d1p5.ebr file to $
            [01-06-17 5:48:59 pm] | Dell_7040_Win10_x64: No need to sync d1p5.img file to $
            [01-06-17 5:48:59 pm] * Starting Sync Actions
            [01-06-17 5:48:59 pm] | CMD:
            lftp -e 'set ftp:list-options -a;set net:max-retries 10$
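To see how often each file is being deleted and re-sent, the "Deleting remote file" lines can be tallied per path. This is only a rough sketch: it assumes every such line follows the format in the excerpt above, and `count_remote_deletes` is a made-up helper name, not part of FOG.

```python
import re
from collections import Counter

# Assumed line format, copied from the log excerpt in this thread:
# [01-06-17 5:48:57 pm] * Deleting remote file: /images/Dell7040Win10x64/d1p3.img
DELETE_RE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\]\s+\*\s+Deleting remote file:\s+(?P<path>\S+)"
)


def count_remote_deletes(lines):
    """Count how many times each remote file path shows up in a delete line."""
    counts = Counter()
    for line in lines:
        match = DELETE_RE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


def read_log(path):
    """Read a log file into a list of lines, tolerating odd bytes."""
    with open(path, "r", errors="replace") as handle:
        return handle.readlines()
```

For example, `count_remote_deletes(read_log("/var/log/fog/fogreplicator.log")).most_common()` would list the most frequently deleted files first. A file that is deleted on every replication cycle is a likely candidate for the loop described here.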

            • T
              Tom Elliott
              last edited by Jan 7, 2017, 12:01 AM

              What about disk usage? Is it possible your nodes (or your main server) are maxed out on disk space?

              • W
                Wayne Workman @FallingWax
                last edited by Wayne Workman Jan 6, 2017, 7:46 PM Jan 7, 2017, 1:43 AM

                First, check the disk usage on the remote node, as Tom said. Check it with this command: df -h. Look for partitions with 99% or 100% usage.
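If you would rather script the disk check than read the df output by eye, something like the following could work. It is a sketch, not a FOG tool: `percent_used` and `nearly_full` are made-up names, and the percentage is computed as used / (used + available), which is roughly how df arrives at its Use% column, so small rounding differences are possible.

```python
import shutil


def percent_used(path):
    """Return the percentage of space in use on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    denominator = usage.used + usage.free
    if denominator == 0:
        # Some pseudo-filesystems report zero sizes; treat them as empty.
        return 0.0
    return 100.0 * usage.used / denominator


def nearly_full(paths, threshold=99.0):
    """Return (path, percent) pairs for paths at or above `threshold` percent."""
    flagged = []
    for path in paths:
        pct = percent_used(path)
        if pct >= threshold:
            flagged.append((path, round(pct, 1)))
    return flagged
```

On a storage node you might call `nearly_full(["/", "/images"])`, where `/images` is the image directory that appears in the log earlier in this thread. An empty list means nothing is at or above the threshold.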

                @FallingWax I remember having this problem although I can’t remember what I called the thread title… it’s here in the forums somewhere.

                But, basically I figured out that very large images were not completing replication within the grace window and the fog image replicator would just kill the old replication task and start it again.

                I brought this issue up to @Tom-Elliott at the time and he coded a fix - the fix made the image replicator aware of prior spawned lftp instances, and it would wait for those instances to complete before trying to restart them.

                Maybe something in the code base is goofed, I’m not sure. But you need to look at this setting and write down what it is:
                Web Interface -> FOG Configuration -> FOG Settings -> FOG Linux Service Sleep Times -> IMAGEREPSLEEPTIME

                Write that value down; it’s in seconds. Next, go through your replication logs. Tom pointed out where they live in the filesystem, but they are also available via the web interface: Web Interface -> FOG Configuration -> Log Viewer -> Image Replicator.

                You need to figure out whether the interval at which the image replicator restarts the transfer is close to the image replication sleep time. If it is, this could mean there’s an issue with the image replicator keeping track of the lftp instances it created. There could of course be other issues that we don’t know about, so be extra observant when looking through all of this.
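The comparison between restart intervals and the sleep time can be roughed out in a few lines. This sketch assumes each replication pass logs a "Starting Sync Actions" line in the format shown in the log excerpt earlier in this thread, and that the timestamps are month-day-year. The function names and the 60-second tolerance are made up for illustration.

```python
import re
from datetime import datetime

# Assumed line format, taken from the log excerpt in this thread:
# [01-06-17 5:48:59 pm] * Starting Sync Actions
SYNC_RE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s+\*\s+Starting Sync Actions")
# Assumption: timestamps are month-day-year with a 12-hour clock.
STAMP_FORMAT = "%m-%d-%y %I:%M:%S %p"


def sync_start_times(lines):
    """Return the timestamp of every 'Starting Sync Actions' line, in order."""
    times = []
    for line in lines:
        match = SYNC_RE.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group("stamp"), STAMP_FORMAT))
    return times


def restart_gaps(times):
    """Return the number of seconds between consecutive sync starts."""
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


def gaps_near_sleep_time(gaps, sleep_seconds, tolerance=60):
    """Return the gaps that fall within `tolerance` seconds of the sleep time."""
    return [gap for gap in gaps if abs(gap - sleep_seconds) <= tolerance]
```

Feed it the lines of the Image Replicator log and the IMAGEREPSLEEPTIME value you wrote down. If most gaps land near the sleep time while the same large files keep getting deleted, that would point toward the restart behavior described above. Gaps that vary widely would suggest looking elsewhere.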

                Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
                Daily Clean Installation Results:
                https://fogtesting.fogproject.us/
                FOG Reporting:
                https://fog-external-reporting-results.fogproject.us/

                Copyright © 2012-2024 FOG Project