Replication runaway on one storage node

  • K
    Kiweegie
    last edited by Jan 8, 2016, 1:20 AM

    Hi all

    apologies first of all - I’ve posted so many problems and questions lately I’m beginning to feel like a spammer 🙂

    Thanks to lots of help from Tom and co I have a fully functional FOG Server once more, running on the latest trunk. That one is now the single master node for our head office. What used to be the head office storage node I recently rebuilt (due to issues with the storage node installer) as a master node serving our remote sites.

    I have set up the remote sites as storage nodes within this master server and limited the bandwidth on each storage node to 500kbps to avoid saturating the links.

    There is only a single image on this server of 6.9GB which has already been replicated successfully to all 8 remote storage nodes.
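For scale, here is a rough back-of-the-envelope sketch of how long one full copy of that image takes at the configured limit. It assumes the 500kbps setting means kilobits per second, that 1 GB is 10^9 bytes, and that the link runs at the limit the whole time; the function name is made up for illustration and is not part of FOG.

```python
def transfer_hours(image_gb: float, limit_kbps: float) -> float:
    """Hours needed to send image_gb gigabytes at limit_kbps kilobits per second."""
    bits = image_gb * 1e9 * 8            # assumes 1 GB = 10^9 bytes
    seconds = bits / (limit_kbps * 1e3)  # assumes 1 kbps = 1000 bits per second
    return seconds / 3600


if __name__ == "__main__":
    # Image size and limit taken from the post above.
    print(f"{transfer_hours(6.9, 500):.1f} hours")
```

Under those assumptions a single 6.9GB copy to one node takes more than a day, so sustained traffic at the limit for many hours does not by itself mean the image is being sent twice.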

    I noticed this evening on our PRTG server that there was really high network utilisation from the remote master node to a single storage node which is eating up 50% of our head office link. Given that the image is already on the storage node should logic not kick in to stop replication from overwriting the image again? I’m assuming that’s what’s happening.

    I bounced the master server with no change, then did the same on the node, and that seems to have done the trick. Curious to know how the replication code works so I can troubleshoot more accurately should this happen again. And how robust is everyone finding it to be?

    cheers, Kiweegie.

    • T
      Tom Elliott
      last edited by Jan 8, 2016, 1:26 AM

      I’m half tempted to find out first if there was another problem altogether. I say this because there is checking. So the only way I can think of, assuming the actual replication service is fine based on the other nodes receiving the image, is that the image was transferred but got corrupted at the remote site. So every cycle would cause it to try to replace the file. Maybe the HDD on the other side was having an issue? Just thinking.
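To illustrate the loop described above, here is a minimal sketch of a check-then-copy cycle. It is not FOG's replication code: the thread does not say what the check compares, so this assumes a plain file-size comparison, uses local paths in place of a remote node, and all function names are hypothetical.

```python
import os
import shutil


def needs_copy(master_file: str, node_file: str) -> bool:
    """True when the node copy is missing or its size differs from the master's."""
    if not os.path.isfile(node_file):
        return True
    return os.path.getsize(master_file) != os.path.getsize(node_file)


def replication_cycle(master_file: str, node_file: str) -> bool:
    """Run one cycle; return True if the file was (re)sent."""
    if not needs_copy(master_file, node_file):
        return False
    # Stand-in for the real network transfer (assumption for this sketch).
    shutil.copyfile(master_file, node_file)
    return True
```

With a check like this, a healthy node is skipped on every later cycle. If the copy on the node keeps ending up different from the master, for example because of a failing disk, `replication_cycle` returns True each time and the image is sent again and again.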

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

      • K
        Kiweegie @Tom Elliott
        last edited by Jan 8, 2016, 1:51 PM

        @Tom-Elliott Thanks Tom - checked this morning when I came into work and while the bandwidth wasn’t spiking above the 500Kbps limit, it was still copying to the same storage node even though the image was there already. Server and disk appear ok so I’m running with the theory (courtesy of your suggestion) that the image on the node was corrupt. Binned that and am letting replication repeat a clean upload; will check how that goes.

        • W
          Wayne Workman @Kiweegie
          last edited by Wayne Workman Jan 8, 2016, 4:23 PM Jan 8, 2016, 10:23 PM

          I want to work on the replication stuff - I want it to be hash based.
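As a sketch of what a hash-based comparison could look like (an illustration of the idea only, not an existing or planned FOG implementation), the snippet below hashes each file in chunks with SHA-256 and compares the digests. The function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib


def file_digest(path: str, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 hex digest of a file, read in chunks so large images fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(master_file: str, node_file: str) -> bool:
    """True when both files hash to the same digest."""
    return file_digest(master_file) == file_digest(node_file)
```

Unlike a size comparison, this catches a copy that has the right length but damaged contents. The trade-off is that every check reads the whole image from disk, and in a real setup each node would need to compute its own digest locally so that only the digest crosses the link.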

          Daily Clean Installation Results:
          https://fogtesting.fogproject.us/
          FOG Reporting:
          https://fog-external-reporting-results.fogproject.us/

          • K
            Kiweegie @Tom Elliott
            last edited by Jan 8, 2016, 10:40 PM

            @Tom-Elliott @Wayne-Workman

            Well the server was still hogging a boat-load of bandwidth so I’m in the process of rebuilding it from scratch. Should know in about an hour or so if the bandwidth issue is sorted.

            regards Kiweegie.

            • K
              Kiweegie
              last edited by Jan 9, 2016, 12:07 AM

              FOG Storage node now rebuilt and iftop on the master node shows traffic to the storage node sitting at around 520Kb. I’ll check again in the morning to make sure that once the image has transferred in full the traffic dies off and doesn’t keep firing packets over.

              regards Kiweegie.

              • K
                Kiweegie
                last edited by Jan 9, 2016, 9:15 AM

                Image is still copying this morning but is being accurately limited to the 500kbps limit set on the storage node. Not sure what the issue was but suspect it may even have been erroneous reporting within our monitoring tool PRTG. Marking as resolved.

                cheers, Kiweegie.

                Copyright © 2012-2024 FOG Project