• Recent
  • Unsolved
  • Tags
  • Popular
  • Users
  • Groups
  • Search
  • Register
  • Login
  • Recent
  • Unsolved
  • Tags
  • Popular
  • Users
  • Groups
  • Search
  • Register
  • Login

Replication deletes file before transfer is complete

Scheduled Pinned Locked Moved Solved
Bug Reports
2
10
2.1k
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • W
    Wayne Workman
    last edited by Wayne Workman Mar 5, 2016, 10:05 PM Mar 6, 2016, 3:21 AM

    r6575.

    What I’m noticing is that due to the slowness of my fog servers transfer rates, it can’t get done before the replicator checks again and determines the files aren’t the same, then the replicator deletes the remote file and starts over. It appears the time limit is 20 minutes.

    This particular file that is being re-replicated over and over happens to be the biggest one in the storage group.

    Here are some of the logs from the replicator:

    [03-05-16 9:58:55 pm] * Started sync for Image Win7
    [03-05-16 9:58:55 pm] | PID: 30391
    [03-05-16 9:58:55 pm] | Replication not complete
    [03-05-16 9:58:55 pm] * Deleting remote file: /images/Win7/d1p2.img
    [03-05-16 9:58:55 pm] | Files do not match
    [03-05-16 9:58:55 pm] | Remote File size: 7051149312
    [03-05-16 9:58:55 pm] | Local File size: 9291944405
    [03-05-16 9:58:55 pm] | Remote File: /images/Win7/d1p2.img
    [03-05-16 9:58:55 pm] | Local File: /images/Win7/d1p2.img
    
    [03-05-16 9:38:13 pm] * Started sync for Image Win7
    [03-05-16 9:38:13 pm] | PID: 15700
    [03-05-16 9:38:13 pm] | Replication not complete
    [03-05-16 9:38:13 pm] * Deleting remote file: /images/Win7/d1p2.img
    [03-05-16 9:38:13 pm] | Files do not match
    [03-05-16 9:38:13 pm] | Remote File size: 7149060096
    [03-05-16 9:38:13 pm] | Local File size: 9291944405
    [03-05-16 9:38:13 pm] | Remote File: /images/Win7/d1p2.img
    [03-05-16 9:38:13 pm] | Local File: /images/Win7/d1p2.img
    
    [03-05-16 9:07:15 pm] * Started sync for Image Win7
    [03-05-16 9:07:15 pm] | PID: 11812
    [03-05-16 9:07:15 pm] | Replication not complete
    [03-05-16 9:07:15 pm] * Deleting remote file: /images/Win7/d1p2.img
    [03-05-16 9:07:15 pm] | Files do not match
    [03-05-16 9:07:15 pm] | Remote File size: 4564320256
    [03-05-16 9:07:15 pm] | Local File size: 9291944405
    [03-05-16 9:07:15 pm] | Remote File: /images/Win7/d1p2.img
    [03-05-16 9:07:15 pm] | Local File: /images/Win7/d1p2.img
    
    [03-05-16 8:47:02 pm] * Started sync for Image Win7
    [03-05-16 8:47:02 pm] | PID: 11210
    [03-05-16 8:47:02 pm] | Replication not complete
    [03-05-16 8:47:02 pm] * Deleting remote file: /images/Win7/d1p2.img
    [03-05-16 8:47:02 pm] | Files do not match
    [03-05-16 8:47:02 pm] | Remote File size: 7483969536
    [03-05-16 8:47:02 pm] | Local File size: 9291944405
    [03-05-16 8:47:01 pm] | Remote File: /images/Win7/d1p2.img
    [03-05-16 8:47:01 pm] | Local File: /images/Win7/d1p2.img
    

    Here’s what the bandwidth looks like during the time of these logs:
    0_1457236867130_Replicator bandwidth.png

    Given, my servers are very old and slow, and given the bandwidth is capped at 100Mbps (I’ll look into that but it’s not a fog problem), this still is a potential issue for slow links, or for really large images. If the checksum is not getting done within the allowed time due to the image being 100 or 150GB in size (There is a site in my organization with an image that big), or if the image isn’t replicated in time due to a slow link, this same problem will crop up again.

    Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
    Daily Clean Installation Results:
    https://fogtesting.fogproject.us/
    FOG Reporting:
    https://fog-external-reporting-results.fogproject.us/

    1 Reply Last reply Reply Quote 0
    • T
      Tom Elliott
      last edited by Mar 6, 2016, 4:01 AM

      Should be fixed, thanks for reporting.

      I had the delete happening before checking if the prior replication was still running. Added a check to break out if it is already running.

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

      1 Reply Last reply Reply Quote 2
      • W
        Wayne Workman
        last edited by Mar 6, 2016, 2:55 PM

        I’m still seeing the same issue. Last night I updated both the master and slave and then went to bed. This morning I find that replication didn’t get done still. So I re-updated both again and I’m seeing this entry in the logs every 10 minutes:

        [03-06-16 8:41:19 am] * Started sync for Image Win7
        [03-06-16 8:41:19 am] | PID: 4079
        [03-06-16 8:41:19 am] | Replication not complete
        [03-06-16 8:41:18 am] * Deleting remote file: /images/Win7/d1p2.img
        [03-06-16 8:41:18 am] | Files do not match
        [03-06-16 8:41:18 am] | Remote File size: 6607823640
        [03-06-16 8:41:18 am] | Local File size: 9291944405
        [03-06-16 8:41:17 am] | Remote File: /images/Win7/d1p2.img
        [03-06-16 8:41:17 am] | Local File: /images/Win7/d1p2.img
        

        Another thing I noticed is that the bandwidth maxes out when the d1p2.img file is checksummed. At this point in the logs below, it’s not even deleted yet but this is the moment where the bandwidth maxes out:

        [03-06-16 8:51:25 am] * Started sync for Image Win7
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 8899842
        [03-06-16 8:51:25 am] | Local File size: 8899842
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1p1.img
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1p1.img
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 190
        [03-06-16 8:51:25 am] | Local File size: 190
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1.partitions
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1.partitions
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 0
        [03-06-16 8:51:25 am] | Local File size: 0
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1.original.swapuuids
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1.original.swapuuids
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 30
        [03-06-16 8:51:25 am] | Local File size: 30
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1.original.fstypes
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1.original.fstypes
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 190
        [03-06-16 8:51:25 am] | Local File size: 190
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1.minimum.partitions
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1.minimum.partitions
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 1048576
        [03-06-16 8:51:25 am] | Local File size: 1048576
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1.mbr
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1.mbr
        [03-06-16 8:51:25 am] | Files match
        [03-06-16 8:51:25 am] | Remote File size: 1
        [03-06-16 8:51:25 am] | Local File size: 1
        [03-06-16 8:51:25 am] | Remote File: /images/Win7/d1.fixed_size_partitions
        [03-06-16 8:51:25 am] | Local File: /images/Win7/d1.fixed_size_partitions
        [03-06-16 8:51:24 am] | Image name: Win7
        

        0_1457276104529_Replicator Bandwidth 2.png

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
        Daily Clean Installation Results:
        https://fogtesting.fogproject.us/
        FOG Reporting:
        https://fog-external-reporting-results.fogproject.us/

        1 Reply Last reply Reply Quote 0
        • W
          Wayne Workman
          last edited by Mar 6, 2016, 3:04 PM

          [03-06-16 9:02:01 am] * Started sync for Image Win7
          [03-06-16 9:02:01 am] | PID: 17781
          [03-06-16 9:02:01 am] | Replication not complete
          [03-06-16 9:02:01 am] * Deleting remote file: /images/Win7/d1p2.img
          [03-06-16 9:02:01 am] | Files do not match
          [03-06-16 9:02:01 am] | Remote File size: 7127236608
          [03-06-16 9:02:01 am] | Local File size: 9291944405
          [03-06-16 9:02:01 am] | Remote File: /images/Win7/d1p2.img
          [03-06-16 9:02:01 am] | Local File: /images/Win7/d1p2.img
          

          Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
          Daily Clean Installation Results:
          https://fogtesting.fogproject.us/
          FOG Reporting:
          https://fog-external-reporting-results.fogproject.us/

          1 Reply Last reply Reply Quote 0
          • W
            Wayne Workman
            last edited by Wayne Workman Mar 6, 2016, 9:35 AM Mar 6, 2016, 3:28 PM

            Ok more information on this.

            I was keeping an eye on the file itself, and right before the replicator re-runs, the file is actually still growing:

            [root@application Win7]# ls -la d1p2.img
            -rw-r--r--. 1 fog fog 6725181416 Mar  6 09:22 d1p2.img
            [root@application Win7]# ls -la d1p2.img
            -rw-r--r--. 1 fog fog 6803567240 Mar  6 09:22 d1p2.img
            [root@application Win7]# ls -la d1p2.img
            -rw-r--r--. 1 fog fog 6952289792 Mar  6 09:22 d1p2.img
            [root@application Win7]# ls -la d1p2.img
            -rw-r--r--. 1 fog fog 7065146216 Mar  6 09:22 d1p2.img
            [root@application Win7]# ls -la d1p2.img
            ls: cannot access d1p2.img: No such file or directory
            [root@application Win7]# ls -la d1p2.img
            ls: cannot access d1p2.img: No such file or directory
            [root@application Win7]# ls -la d1p2.img
            ls: cannot access d1p2.img: No such file or directory
            [root@application Win7]# ls -la d1p2.img
            ls: cannot access d1p2.img: No such file or directory
            [root@application Win7]# ls -la
            total 9732
            drwxr-xr-x.  2 fog  fog      174 Mar  6 09:23 .
            drwxrwxrwx. 10 root root     228 Mar  5 16:36 ..
            -rw-r--r--.  1 fog  fog        1 Mar  5 14:40 d1.fixed_size_partitions
            -rw-r--r--.  1 fog  fog  1048576 Mar  5 14:40 d1.mbr
            -rw-r--r--.  1 fog  fog      190 Mar  5 14:40 d1.minimum.partitions
            -rw-r--r--.  1 fog  fog       30 Mar  5 14:40 d1.original.fstypes
            -rw-r--r--.  1 fog  fog        0 Mar  5 14:40 d1.original.swapuuids
            -rw-r--r--.  1 fog  fog  8899842 Mar  5 14:40 d1p1.img
            -rw-r--r--.  1 fog  fog      190 Mar  5 14:40 d1.partitions
            

            Right where you see the file become unavailable, that’s when the replicator deleted it:

            [03-06-16 9:23:01 am] * Started sync for Image Win7
            [03-06-16 9:23:01 am] | PID: 31237
            [03-06-16 9:23:01 am] | Replication not complete
            [03-06-16 9:23:01 am] * Deleting remote file: /images/Win7/d1p2.img
            [03-06-16 9:23:01 am] | Files do not match
            [03-06-16 9:23:01 am] | Remote File size: 7170686976
            [03-06-16 9:23:01 am] | Local File size: 9291944405
            [03-06-16 9:23:01 am] | Remote File: /images/Win7/d1p2.img
            [03-06-16 9:23:01 am] | Local File: /images/Win7/d1p2.img
            

            Here is the bandwidth at this exact moment, there is no change in bandwidth. Later on, the bandwidth drops off to nothing:

            0_1457278501816_Replicator Bandwidth 3.png

            Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
            Daily Clean Installation Results:
            https://fogtesting.fogproject.us/
            FOG Reporting:
            https://fog-external-reporting-results.fogproject.us/

            1 Reply Last reply Reply Quote 0
            • T
              Tom Elliott
              last edited by Mar 6, 2016, 4:30 PM

              found there was a bad call (cause I’m a dumba$$), in that the checker would not work the first time around. It would actually break on the wrong file which would have caused some issues. Fixed by moving the check to the start of the main calling loop and not even do anything further down if the original item is already running.

              Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

              Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

              Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

              1 Reply Last reply Reply Quote 1
              • W
                Wayne Workman
                last edited by Wayne Workman Mar 6, 2016, 1:08 PM Mar 6, 2016, 7:07 PM

                I updated both servers to r6597 at about 10:55, I just got back home and found this on the bandwidth graph:

                0_1457291080274_Replicator Bandwidth 4.png

                I think the graph stopped working after an hour because my laptop went to sleep. But logging back into the fog web interface, the bandwidth is still riding at 100Mbps, which is the max on this particular link.

                This is what just now showed up in the logs, which is the same thing as before:

                [03-06-16 1:06:48 pm] * Started sync for Image Win7
                [03-06-16 1:06:46 pm] * Deleting remote file: /images/Win7/d1p2.img
                [03-06-16 1:06:46 pm] | Files do not match
                [03-06-16 1:06:46 pm] | Remote File size: 3003860200
                [03-06-16 1:06:46 pm] | Local File size: 9291944405
                [03-06-16 1:06:45 pm] | Remote File: /images/Win7/d1p2.img
                [03-06-16 1:06:45 pm] | Local File: /images/Win7/d1p2.img
                

                I’m going to reboot both servers and see if that does anything.

                Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
                Daily Clean Installation Results:
                https://fogtesting.fogproject.us/
                FOG Reporting:
                https://fog-external-reporting-results.fogproject.us/

                1 Reply Last reply Reply Quote 0
                • W
                  Wayne Workman
                  last edited by Mar 6, 2016, 7:21 PM

                  After rebooting both master and slave nodes, I watched the logs cycle through the replicator checks 3 times, watched it say it’s deleting the d1p2 file 3 times in a row.

                  [03-06-16 1:20:08 pm] * Started sync for Image Win7
                  [03-06-16 1:20:03 pm] * Deleting remote file: /images/Win7/d1p2.img
                  [03-06-16 1:20:03 pm] | Files do not match
                  [03-06-16 1:20:03 pm] | Remote File size: 2715025408
                  [03-06-16 1:20:03 pm] | Local File size: 9291944405
                  [03-06-16 1:20:03 pm] | Remote File: /images/Win7/d1p2.img
                  [03-06-16 1:20:03 pm] | Local File: /images/Win7/d1p2.img
                  

                  Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
                  Daily Clean Installation Results:
                  https://fogtesting.fogproject.us/
                  FOG Reporting:
                  https://fog-external-reporting-results.fogproject.us/

                  1 Reply Last reply Reply Quote 0
                  • T
                    Tom Elliott
                    last edited by Mar 6, 2016, 9:02 PM

                    Still me being an idiot.

                    I had an iterator being reused within an already existing iterator.

                    This should now be fixed, (based on my testing).

                    I’m solving again.

                    Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

                    Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

                    Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

                    W 1 Reply Last reply Mar 6, 2016, 9:13 PM Reply Quote 0
                    • W
                      Wayne Workman @Tom Elliott
                      last edited by Mar 6, 2016, 9:13 PM

                      @Tom-Elliott Confirmed fixed.

                      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!
                      Daily Clean Installation Results:
                      https://fogtesting.fogproject.us/
                      FOG Reporting:
                      https://fog-external-reporting-results.fogproject.us/

                      1 Reply Last reply Reply Quote 1
                      • 1 / 1
                      1 / 1
                      • First post
                        6/10
                        Last post

                      190

                      Online

                      12.1k

                      Users

                      17.3k

                      Topics

                      155.3k

                      Posts
                      Copyright © 2012-2024 FOG Project