    vsftpd

    FOG Problems
      AndrewG78

      Hi,
      I recently added a master node (as a new storage group) to the FOG server and upgraded FOG from 1.5.4 to 1.5.7. CPU load is high about 80% of the time, even when there are no tasks to run. Two vsftpd daemons consume 20% of the CPU, plus a kworker from time to time. How can I debug this?
      9ff6dec2-bc6d-4d17-845a-e7aa8046a6f4-image.png

        Sebastian Roth Moderator

        @AndrewG78 Did you upgrade all your nodes? Please make sure you read and understand the important notice in the release notes for FOG 1.5.5 (and later): https://news.fogproject.org/fog-1-5-5-officially-released/

        Nodes being on different versions (1.5.4 vs. 1.5.5) will replicate images over and over again, as some of the hashing code needed to be changed. Therefore we advise you to update all nodes in one go! Please make sure you stop replication on the master first (systemctl stop FOGImageReplicator; systemctl stop FOGSnapinReplicator), then update the storage node(s), and update the master node as the last step.
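
The order described in the notice could be scripted roughly as follows. This is only a sketch: the two service names come from the notice itself, while the `run`, `stop_replication` and `start_replication` helpers and the `DRY_RUN` variable are made up here for illustration.

```shell
#!/bin/sh
# Sketch: stop FOG replication on the master before upgrading any node.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

stop_replication() {
    # Service names as given in the release notice.
    for svc in FOGImageReplicator FOGSnapinReplicator; do
        run systemctl stop "$svc"
    done
}

start_replication() {
    for svc in FOGImageReplicator FOGSnapinReplicator; do
        run systemctl start "$svc"
    done
}
```

On the master, `stop_replication` would go first, then the storage node(s) get updated, then the master itself, and `start_replication` is only needed if the services are not running again afterwards. `DRY_RUN=1 stop_replication` just prints what would be executed.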

        Possibly we need to add this notice to all new releases?!?

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

          AndrewG78

          @Sebastian-Roth
          Thanks for the update. Yes, I read this before I started.
          I updated my node to the FOG server version at the same time, but I did not stop replication.
          I’m not sure this scenario applies to my setup.
          I have two separate storage groups with only one master node in each.
          So there are no other nodes in either group for the master to replicate to.
          I will disable FOGImageReplicator and FOGSnapinReplicator on the server, but I’m not sure this is the right way to solve the issue.

            Sebastian Roth Moderator

            @AndrewG78 Hmm, maybe I was heading down the wrong track, but from the minimal information I had, I got the impression that some kind of replication was going on.

            kworker quite often indicates high disk IO, and that added up for me with PHP trying to calculate a checksum and FTP transferring… Just guessing here.

              AndrewG78

              @Sebastian-Roth
              So after disabling the replication services, the FOG UI became super responsive.
              No more kworker and vsftpd daemons.
              Perhaps an issue in the newest version?
              Does anyone have a similar setup and can confirm this bad behaviour?

                Sebastian Roth Moderator

                @AndrewG78 We need more information. Please check all the logs in /var/log/fog/... and upload log files here.
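
To decide which logs are worth uploading first, something like the sketch below might help. It assumes the logs sit directly in /var/log/fog; the `largest_logs` helper name and its defaults are invented for this example.

```shell
#!/bin/sh
# List the n largest entries in a log directory, biggest first.
# Defaults (assumptions): /var/log/fog and 10 entries.
largest_logs() {
    dir="${1:-/var/log/fog}"
    n="${2:-10}"
    # ls -S sorts by file size, largest first; -1 prints one name per line.
    ls -1S "$dir" | head -n "$n"
}
```

For example, `largest_logs /var/log/fog 5` prints the five biggest files, which can then be inspected with `tail` before uploading.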

                  AndrewG78

                  @Sebastian-Roth
                  I think I found the reason(s).
                  There are 3 things I would like to clarify.
                  1. Although the replication services are disabled, some replication still happens between storage groups.
                  In my case, I have two storage groups, and each group has one storage node.
                  Both nodes were master nodes.
                  The image from the new group (2) was replicated to the old default group (1).
                  I have unchecked the replicate checkbox on the image and also disabled Master Node for the old default group. So there is only one master node now; the old group has no master node at all.
                  After this, all seems to be fine.
                  a) The question is, was this proper behaviour? I thought replication is done only between the members (nodes) of a storage group.
                  b) Are there any other services that could do this replication?
                  4d509116-1acb-4405-8260-31f610da96f5-image.png
                  10e9d166-d360-4415-b3c1-878dbfb84c0c-image.png
                  2. The high CPU load (kworker and vsftpd) was related to replication and a lack of disk space. The replication processes did not stop even when there was 0% free space.
                  I think this is a bug.
                  3. I can see a bunch of multicast log files.
                  a) Should there be some smarter log rotation?
                  b) "No new tasks found" is logged every 10s. Can we change this interval somehow?
                  d558c270-722a-4d13-a700-e108817dbb33-image.png
                  483c19d0-e408-40e8-a34f-f06bdc136a78-image.png

                    Sebastian Roth Moderator

                    @AndrewG78 I’ll try to answer all the things you brought up. But first let me state that so far you haven’t been clear (from my point of view) about what happened on which FOG server. For replication there are at least two parties (servers) involved, and it’s important for me to understand which one showed the issue. I will come back to that point later.

                    Although replication services are disabled, there is still some replication done between storage groups.

                    Disabled on which server? All FOG servers?

                    1 a) The question is, was this a proper behaviour?
                    I thought replication is done only within the storage group members(nodes).

                    As I didn’t write the replication algorithm, I don’t know it as well as Tom does. But reading the docs I get the impression that this is expected to happen: https://wiki.fogproject.org/wiki/index.php?title=Replication
                    6. If the node currently checking is the "primary master group" for the data it's working, it will attempt replicating its data to the master of each of the other groups the data is assigned under.

                    1 b) Are there any other services that could do this replication?

                    You have two nodes and both have replication services running on them!

                    2. The high cpu load(kworker and vsftpd) was related to replication and lack of disk space. Replication processes did not stop even if there was 0% of free space.
                    I think this is a bug.

                    The vsftpd part is what I would call the receiving node in this setup. This might give you an idea which node was causing this. Disks can run out of space for many different reasons. I don’t see why our replication service should constantly check and stop replication just because space is low. Every server needs properly working disk space monitoring that warns the sysadmin to take care of it. See it from this side: if we add a check and simply stop replicating because of a lack of disk space, people who don’t monitor their disk space might not notice for months and might blame us for replication not working. Although it’s not nice to hit a full disk, this will eventually cause trouble and make the sleeping sysadmin aware.
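
A very small disk space check along these lines could look like the sketch below. It is not part of FOG: the helper names, the /images default path and the 90% limit are assumptions for illustration, and how the warning reaches the sysadmin (cron mail, monitoring system) is left open.

```shell
#!/bin/sh
# Print the used-space percentage (digits only) of the filesystem holding $1.
disk_usage_percent() {
    # df -P keeps each filesystem on a single line; column 5 is "Capacity".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 1 and warn on stderr when usage is at or above the limit.
# Return 2 when the usage could not be determined.
check_disk() {
    path="${1:-/images}"   # assumption: location of the FOG image store
    limit="${2:-90}"       # assumption: warn at 90% used
    used=$(disk_usage_percent "$path")
    [ -n "$used" ] || return 2
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full (limit ${limit}%)" >&2
        return 1
    fi
    return 0
}
```

Run periodically on the receiving node, for example as `check_disk /images 90`, this would flag a filling disk well before replication runs into 0% free space.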

                    3 a) Should there be some smarter log rotation ?

                    That is also something a sysadmin should be able to handle. Linux has logrotate and I don’t see why we should reinvent it.
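
As a rough starting point, a logrotate rule for the FOG logs could be generated like this. Everything specific here is an assumption: the `/var/log/fog/*.log` glob, the weekly schedule with four kept rotations, the target file name, and the use of `copytruncate` (chosen on the guess that the FOG services keep their log files open). The `write_fog_logrotate` helper is made up for this example.

```shell
#!/bin/sh
# Write a logrotate rule for FOG service logs to the file given as $1.
write_fog_logrotate() {
    target="${1:-/etc/logrotate.d/fog}"   # hypothetical file name
    cat > "$target" <<'EOF'
/var/log/fog/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
EOF
}
```

Writing the rule to a scratch file first, e.g. `write_fog_logrotate /tmp/fog.logrotate`, and then running `logrotate -d /tmp/fog.logrotate` shows what logrotate would do without touching any log.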

                    3 b) "No new tasks found "is logged every 10s - Can we change this time somehow ?

                    Yes, web UI -> FOG Configuration -> FOG Settings -> FOG Linux Service Sleep Times -> MULTICASTSLEEPTIME

                    Sorry if my answers sound a bit impolite. I don’t mean it that way! Just wanted to show you that things can be seen from the other side as well.

                    Copyright © 2012-2024 FOG Project