
    Replication runaway on one storage node

      Kiweegie

      Hi all

      Apologies first of all - I’ve posted so many problems and questions lately that I’m beginning to feel like a spammer 🙂

      Thanks to lots of help from Tom and co I have a fully functional FOG Server once more, running on latest trunk. That one is now the single master node for our head office. What used to be the head office storage node I recently rebuilt (due to issues with the storage node installer) as a master node serving our remote sites.

      I have set up the remote sites as storage nodes within this master server and limited the bandwidth on each storage node to 500kbps to avoid saturating the links.

      There is only a single image on this server of 6.9GB which has already been replicated successfully to all 8 remote storage nodes.

      I noticed this evening on our PRTG server that there was really high network utilisation from the remote master node to a single storage node, which was eating up 50% of our head office link. Given that the image is already on the storage node, should logic not kick in to stop replication from overwriting the image again? I’m assuming that’s what’s happening.

      I bounced the master server with no change, then did the same on the node, and that seems to have done the trick. Curious to know how the replication code works so I can more accurately troubleshoot should this happen again. And how robust is everyone finding it to be?

      cheers, Kiweegie.

        Tom Elliott

        I’m half tempted to find out first if there was another problem altogether. I say this because there is checking. So the only way I can think of, assuming the actual replication service is fine based on the other nodes receiving the image, is that the image was transferred but got corrupted at the remote site. So every cycle would cause it to try to replace the file. Maybe the HDD on the other side was having an issue? Just thinking.

        Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

          Kiweegie @Tom Elliott

          @Tom-Elliott Thanks Tom - checked this morning when I came into work and while the bandwidth wasn’t spiking above the 500Kbps limit, it was still copying to the same storage node even though the image was there already. Server and disk appear ok so I’m running with the theory (courtesy of your suggestion) that the image on the node was corrupt. I’ve binned it and am letting replication upload a clean copy, and will check how that goes.

            Wayne Workman @Kiweegie

            I want to work on the replication stuff - I want it to be hash based.

            Daily Clean Installation Results:
            https://fogtesting.fogproject.us/
            FOG Reporting:
            https://fog-external-reporting-results.fogproject.us/
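To illustrate what a hash-based check could look like, here is a minimal Python sketch. It is not FOG's replication code, and the function names `file_sha256` and `needs_replication` are made up for this example. It also assumes both copies of the image can be read from the machine running the check (for example over a mount); across a real WAN link each side would more likely hash its own copy and only the digests would be compared.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a multi-GB image never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_replication(master_copy: Path, node_copy: Path) -> bool:
    """Return True when the node copy is missing or differs from the master."""
    if not node_copy.is_file():
        return True
    # Cheap check first: files of different sizes cannot be identical.
    if master_copy.stat().st_size != node_copy.stat().st_size:
        return True
    # Same size, so compare content digests to catch silent corruption.
    return file_sha256(master_copy) != file_sha256(node_copy)
```

A check like this would also be one way to test the "corrupt image on the node" theory by hand: if the two digests match, the copy on the node is intact and something else is triggering the transfer.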

              Kiweegie @Tom Elliott

              @Tom-Elliott @Wayne-Workman

              Well the server was still hogging a boat-load of bandwidth so I’m in the process of rebuilding it from scratch. Should know in about an hour or so if the bandwidth issue is sorted.

              regards Kiweegie.

                Kiweegie

                FOG Storage node is now rebuilt and iftop on the master node shows traffic to the storage node sitting at around 520Kb. I’ll check again in the morning to make sure that once the image has transferred in full the traffic dies off and doesn’t keep firing packets over.

                regards Kiweegie.

                  Kiweegie

                  Image is still copying this morning but is being accurately limited to the 500kbps limit set on the storage node. Not sure what the issue was but suspect it may even have been erroneous reporting within our monitoring tool PRTG. Marking as resolved.

                  cheers, Kiweegie.
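As a rough sanity check on why the image would still be copying the next morning, the expected transfer time for the 6.9GB image at the 500kbps limit can be estimated. The sketch below rests on assumptions the thread does not confirm: that 6.9GB means decimal gigabytes, that the link runs at exactly the limit with no protocol overhead, and it shows both readings of the unit (kilobits and kilobytes per second) since the thread does not say which one applies.

```python
def transfer_hours(size_bytes: float, rate_bits_per_second: float) -> float:
    """Hours needed to move size_bytes at a constant rate, ignoring overhead."""
    return size_bytes * 8 / rate_bits_per_second / 3600


IMAGE_BYTES = 6.9e9  # 6.9 GB image, assuming decimal gigabytes

# Reading 1 (assumption): the limit is 500 kilobits per second.
as_kilobits = transfer_hours(IMAGE_BYTES, 500 * 1000)

# Reading 2 (assumption): the limit is 500 kilobytes per second.
as_kilobytes = transfer_hours(IMAGE_BYTES, 500 * 1000 * 8)

print(f"500 kilobits/s:  {as_kilobits:.1f} hours")   # 30.7 hours
print(f"500 kilobytes/s: {as_kilobytes:.1f} hours")  # 3.8 hours
```

Under the kilobit reading a full copy takes more than a day, so steady traffic at the limit overnight would be expected for a fresh replication rather than a sign of a loop.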

                  Copyright © 2012-2024 FOG Project