
    Storage Node Re-Writing Images Daily and Crushing My Network

    • FallingWaxF
      FallingWax
      last edited by

      Running Version 1.3.0
      SVN Revision: 6050

      While troubleshooting some network speed issues, I came across this error in Wireshark:

      [Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]

      This was communication between my main FOG server and a storage node. I looked at the image files and it appears the same images are being rewritten. I haven’t created anything new, but the date on the images changes every day. This is crushing my network speeds so badly that I had to stop the ImageReplicator service, which immediately fixed the problem. Any help would be appreciated!

      Thanks

      • Tom ElliottT
        Tom Elliott
        last edited by

        If your network is constantly erroring out, retransmission would be expected. The FOG replication code is aware of what’s replicating and what’s not, so if the transmission stops before the files are fully copied, it might be rewriting them because something is killing the connections.

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

        • FallingWaxF
          FallingWax @Tom Elliott
          last edited by

          @Tom-Elliott Is there any logging for the Image Replicator that I can look at to see if that is what is happening?

          • Tom ElliottT
            Tom Elliott @FallingWax
            last edited by

            @FallingWax /var/log/fog/fogreplicator.log and/or /var/log/fog/fogsnapinrep.log

            Then there’s the nodes getting the files:

            /var/log/fog/fogreplicator.log.transfer.<nodename>.log
            And/or
            /var/log/fog/fogsnapinrep.log.transfer.<nodename>.log

            (There’s not much in regards to replicating)
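To skim the most recent entries in those logs on the FOG server, something like the following could help. This is only a rough sketch: the helper names `tail_lines` and `show_replication_logs` are made up for illustration, and the glob patterns simply mirror the `<nodename>` paths listed above.

```python
from collections import deque
import glob

# Patterns mirror the log paths mentioned above; the "*" stands in for <nodename>.
LOG_PATTERNS = [
    "/var/log/fog/fogreplicator.log",
    "/var/log/fog/fogsnapinrep.log",
    "/var/log/fog/fogreplicator.log.transfer.*.log",
    "/var/log/fog/fogsnapinrep.log.transfer.*.log",
]


def tail_lines(path, n=20):
    """Return the last n lines of a text file (hypothetical helper)."""
    with open(path, errors="replace") as fh:
        return list(deque(fh, maxlen=n))


def show_replication_logs(n=20):
    """Print the last n lines of every replication log that exists."""
    for pattern in LOG_PATTERNS:
        for path in sorted(glob.glob(pattern)):
            print(f"==> {path} <==")
            for line in tail_lines(path, n):
                print(line.rstrip("\n"))
```

Calling `show_replication_logs()` as a user who can read /var/log/fog prints the tail of each log that is present and silently skips any that are missing.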

            • FallingWaxF
              FallingWax
              last edited by

              Looks like I found three images that don’t finish replicating, or don’t replicate properly, and are being written and deleted over and over. Those images work correctly on the main server, so I am hesitant to remove them.

              This is what I see in the log:

              | Image Name: Dell_7040_Win10_x64
              [01-06-17 5:48:56 pm] | Dell_7040_Win10_x64: No need to sync d1.mbr file to 19$
              [01-06-17 5:48:56 pm] | Dell_7040_Win10_x64: No need to sync d1.partitions fil$
              [01-06-17 5:48:56 pm] | Dell_7040_Win10_x64: No need to sync d1p1.ebr file to $
              [01-06-17 5:48:57 pm] | Dell_7040_Win10_x64: No need to sync d1p2.img file to $
              [01-06-17 5:48:57 pm] | Files do not match.
              [01-06-17 5:48:57 pm] * Deleting remote file: /images/Dell7040Win10x64/d1p3.img
              [01-06-17 5:48:58 pm] | Files do not match.
              [01-06-17 5:48:58 pm] * Deleting remote file: /images/Dell7040Win10x64/d1p4.img
              [01-06-17 5:48:58 pm] | Dell_7040_Win10_x64: No need to sync d1p5.ebr file to $
              [01-06-17 5:48:59 pm] | Dell_7040_Win10_x64: No need to sync d1p5.img file to $
              [01-06-17 5:48:59 pm] * Starting Sync Actions
              [01-06-17 5:48:59 pm] | CMD:
              lftp -e 'set ftp:list-options -a;set net:max-retries 10$
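To see which files keep getting deleted and re-sent, the "Deleting remote file" lines can be tallied per path. The sketch below assumes the log lines look like the excerpt above; `count_remote_deletes` and `report` are hypothetical helper names, not part of FOG.

```python
import re
from collections import Counter

# Matches lines such as:
# [01-06-17 5:48:57 pm] * Deleting remote file: /images/Dell7040Win10x64/d1p3.img
DELETE_RE = re.compile(r"\* Deleting remote file: (\S+)")


def count_remote_deletes(lines):
    """Count how many times each remote file path was deleted."""
    counts = Counter()
    for line in lines:
        match = DELETE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def report(log_path="/var/log/fog/fogreplicator.log"):
    """Print the deleted paths, most frequently deleted first."""
    with open(log_path, errors="replace") as fh:
        for path, hits in count_remote_deletes(fh).most_common():
            print(f"{hits:5d}  {path}")
```

A path that shows up once per replication cycle is a likely candidate for the delete-and-resend loop.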

              • Tom ElliottT
                Tom Elliott
                last edited by

                What about disk usage? Is it possible your nodes (or your main server) are maxed out on disk space?

                • Wayne WorkmanW
                  Wayne Workman @FallingWax
                  last edited by Wayne Workman

                  First, check the disk usage on the remote node, as Tom said. Check it with this command: df -h. Look for partitions with 99% or 100% usage.
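If you would rather script the same check, here is a minimal sketch using only the Python standard library. The 99% threshold comes from the advice above; the helper names are hypothetical. The figure may differ slightly from the Use% column of df, which accounts for reserved blocks differently.

```python
import shutil


def percent_used(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold=99.0):
    """True if the filesystem holding `path` is at or above the threshold."""
    return percent_used(path) >= threshold
```

For example, running `is_nearly_full("/images")` on the storage node checks the filesystem holding the image store, assuming images live under /images as the log paths in this thread suggest.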

                  @FallingWax I remember having this problem although I can’t remember what I called the thread title… it’s here in the forums somewhere.

                  Basically, I figured out that very large images were not completing replication within the grace window, so the FOG image replicator would kill the old replication task and start it again.

                  I brought this issue up to @Tom-Elliott at the time and he coded a fix: it made the image replicator aware of previously spawned lftp instances, so it would wait for those instances to complete before trying to restart them.

                  Maybe something in the code base is goofed, I’m not sure. But you need to look at this setting and write down what it is:
                  Web Interface -> FOG Configuration -> FOG Settings -> FOG Linux Service Sleep Times -> IMAGEREPSLEEPTIME

                  Write that value down; it’s in seconds. Next, go through your replication logs. Tom pointed out where they are in the filesystem, but they are also available via the web interface: Web Interface -> FOG Configuration -> Log Viewer -> Image Replicator.

                  You need to figure out whether the interval at which the image replicator restarts the transfer is close to the image replication sleep time. If it is, this could mean there’s an issue with the image replicator keeping track of the lftp instances it created. There could of course be other issues that we don’t know about, so be extra observant when looking through all of this.
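One way to make that comparison less tedious is to pull the timestamps of the "Starting Sync Actions" lines out of the replicator log and measure the gaps between them. The sketch below rests on several assumptions: the timestamps follow the format seen in the log excerpt earlier in this thread, the date is month-day-year, and "Starting Sync Actions" marks the start of a transfer round. The function names and the 60-second tolerance are made up for illustration.

```python
import re
from datetime import datetime

# Assumed timestamp layout, e.g. "[01-06-17 5:48:59 pm]" (month-day-year is a guess).
STAMP_RE = re.compile(r"^\[(\d{2}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} [ap]m)\]")
STAMP_FORMAT = "%m-%d-%y %I:%M:%S %p"


def sync_start_times(lines):
    """Collect the timestamps of every 'Starting Sync Actions' line."""
    times = []
    for line in lines:
        if "Starting Sync Actions" not in line:
            continue
        match = STAMP_RE.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), STAMP_FORMAT))
    return times


def gaps_in_seconds(times):
    """Seconds elapsed between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def gaps_near_sleep_time(gaps, sleep_seconds, tolerance=60):
    """Keep only the gaps within `tolerance` seconds of the sleep time."""
    return [g for g in gaps if abs(g - sleep_seconds) <= tolerance]
```

Pass the value you wrote down for IMAGEREPSLEEPTIME as `sleep_seconds`. If most gaps land near it while the large images never finish, that would fit the restart pattern described above, though it does not rule out other causes.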

                  Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
                  Daily Clean Installation Results:
                  https://fogtesting.fogproject.us/
                  FOG Reporting:
                  https://fog-external-reporting-results.fogproject.us/

                  Copyright © 2012-2024 FOG Project