FOG Web GUI speed and default storage activity

  • Hello,

    I’m noticing that after the FOG server has been running for a while, it begins to have issues with the nodes. The dashboard shows a message that “a valid connection cannot be established” for all nodes. When you click on a storage node, it refreshes and works, but then other nodes show the same issue. I came across a post suggesting I increase the memory limit for PHP in the FOG settings; that seemed to help, but I’m not sure it will fix it. I have also noticed that the FOG server itself shows a lot of activity, at least on the graph, even though nothing currently needs to replicate.
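    As a point of reference, a PHP memory limit can also be raised in php.ini itself; this is only a sketch, with a path and value that are illustrative (both vary by PHP version and SAPI, and may differ from the FOG settings option mentioned above):

```ini
; e.g. /etc/php/7.0/apache2/php.ini on Ubuntu 16.04 (illustrative path)
; Per-request memory ceiling for PHP, including the FOG web UI
memory_limit = 256M
```

    The web server (or php-fpm, if that is in use) has to be restarted for a php.ini change to take effect.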

    My question is: are there areas in the FOG settings where I can improve web UI speed? I have read @george1421’s post about php-fpm making the web UI faster, but I’m not sure whether that is ready or still experimental. Is there anything I can do right now on the working branch to improve the speed? I have about 18 storage nodes in 6 storage groups on a network of roughly 2,500 PCs. The fog database currently only has about 60 computers, as I have had to start from scratch. I have disabled the host lookup option and increased the imagerepsleeptime so it’s less taxing on the web UI and on replication.

    Thank you.

  • @wayne-workman

    Very true, but making the change in the live DB is a great plan B until a long-term plan A is found. I checked the mysql mem calc link and compared the mysqld.cnf settings against those calculations, but many of those variables are not there. At this point it works, and I don’t foresee needing many more storage nodes for this fog server. I’m happy it’s working now. Thank you both for all your help.

  • @sebastian-roth said in FOG Web GUI speed and default storage activity:

    On the other hand setting the value up to 250 in the live DB worked like a charm.

    Sure it works, but it’s not a permanent solution. The change is undone when the machine reboots or the mysql service restarts.

  • Developer

    @JGallo @Wayne-Workman Just found this on stackexchange. Though it doesn’t say so anywhere in the mysql manual, it’s possible that mysql somehow caps max_connections when it is set higher than some internal memory calculation allows. Maybe give the mysql mem calc a try. On the other hand, setting the value to 250 in the live DB worked like a charm.
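    For context on the memory angle: mem-calc tools roughly estimate worst-case usage as the global buffers plus max_connections times the per-thread buffers. The relevant variables can be pulled in one query (a sketch; exact variable names can differ slightly between MySQL and MariaDB versions):

```sql
-- The connection cap plus the per-thread buffers that scale with it
SHOW VARIABLES WHERE Variable_name IN
  ('max_connections', 'read_buffer_size', 'read_rnd_buffer_size',
   'sort_buffer_size', 'join_buffer_size', 'thread_stack');
```

    If innodb_buffer_pool_size + key_buffer_size + max_connections times the per-thread sum exceeds physical RAM, a very high max_connections can cause trouble, which may be what any internal cap is guarding against.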

  • @jgallo I’ll have to look into it, but mysql and mariadb have configuration files where this setting can be used - and when you restart mysql/mariadb it should use the setting and work.

  • @wayne-workman

    I’m following the steps in the wiki, but mysql is on Ubuntu 16.04.3 LTS, and when I go to /etc/mysql/my.cnf and try to add the increased max connections, it errors out. Modifying mysqld.cnf in /etc/mysql/mysql.conf.d and uncommenting the max connections line works. However, it’s really odd that when you restart mysql and then check the max connections from within mysql, it shows a smaller value than the one defined in the mysqld.cnf file. Am I right to modify mysqld.cnf? I’m lost on this, but I know the value is at least increased, so it works as intended, just not with the exact value I defined.
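    For what it’s worth, on Ubuntu 16.04 /etc/mysql/my.cnf is mostly just !includedir directives, which is why editing it directly errors out; /etc/mysql/mysql.conf.d/mysqld.cnf is the right file. A sketch of the relevant stanza (250 is just the value used later in this thread):

```ini
# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
max_connections = 250
```

    After editing, restart with sudo service mysql restart and confirm from the mysql prompt with show variables like 'max_connections';.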

  • @jgallo said in FOG Web GUI speed and default storage activity:

    It would appear that mysql goes back to default max connections of 151 after the update.

    Because the setting was only changed live in MySQL. It has to be set in the mysql configuration file, using the process outlined here:

  • @sebastian-roth

    I did a git pull and updated FOG to address a Windows 10 issue, and upon the update the database connection issue returned. It would appear that mysql goes back to the default max connections of 151 after the update. Does the fog installer/updater touch the mysql settings? If it does, could I suggest that an increase to the max connection limit be made part of the installer/updater? Or is this something that needs to be entered manually in mysql based on the number of storage nodes and groups, which I have already done to address my issue? Thank you again for your help on this.

  • Developer

    Turns out mysql hit the max connection limit. Not sure what the default limit is in general, but on this system it was 151 concurrent connections. Check the connection counters (the numbers below are just an example; they were much higher on the actual system):

    mysql> show status like '%onn%';
    | Variable_name                    | Value |
    | Aborted_connects                 | 0     | 
    | Connections                      | 8     | 
    | Max_used_connections             | 4     | 
    | Connection_error_max_connections | 1421  |
    | Ssl_client_connects              | 0     | 
    | Ssl_connect_renegotiates         | 0     | 
    | Ssl_finished_connects            | 0     | 
    | Threads_connected                | 4     | 

    See that Connection_error_max_connections count. So we checked the max_connections setting and increased it:

    mysql> show variables like 'max_connections';
    | Variable_name   | Value |
    | max_connections | 151   |
    mysql> set global max_connections = 250;

    The change takes effect without restarting the server or mysql!

  • @Wayne-Workman

    @Sebastian-Roth remoted in, and it looks like mysql was limited to 150 connections. He raised it to 250, and so far it appears fixed. I’m keeping an eye on the max used connections; it hasn’t gone over 161, so I must have been hitting that 150 cap. Thanks for all your help here. Greatly appreciate it.
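    To keep an eye on this over time, the peak can be read straight from the server (a small sketch; note the counter resets whenever mysqld restarts):

```sql
-- Highest number of simultaneous connections since mysqld started;
-- if this creeps toward max_connections, the limit needs raising again
SHOW STATUS LIKE 'Max_used_connections';
```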

  • Another observation: after a sudo service mysql restart on the fog server, the DefaultMember error in the dashboard goes away. I tried the same thing on the other storage nodes, but the message is still in the dashboard. Maybe this has something to do with remote connections to the mysql database.

  • @wayne-workman

    So after several different attempts to resolve this, the issue continues to persist. The pattern I have observed is that it now takes about an hour for the issue to begin. I went into /etc/mysql/mysql.conf.d/mysqld.cnf and changed the bind address on each node, because a separate post recommended using the IP of the node instead of the loopback address. I don’t think this worked, because I restarted the server and the node after these changes and the problems persist. I also went into /var/www/fog/lib/fog/fogurlrequests.class.php and changed aconntimeout = 2000 to 10000, as well as counttimeout = 15; to 30. Would it be possible for @Tom-Elliott or someone to remote in and check this out? I know all the fogstorage account passwords are correct, because imaging works fine, registering computers at the sites works fine, and image replication to the storage groups works fine. I could start from scratch and add one node at a time, but I would not be able to put a test fog server in place, since this is a production FOG system. I have backups of the images, so starting from scratch is not a problem. I have a strange feeling this could be related to Ubuntu updates and PHP7. Thank you

  • @wayne-workman

    Yeah, these high schoolers get curious, and it wouldn’t be the first time we have dealt with rogue routers.

  • @jgallo That makes a lot of sense now actually - because nothing about this problem was making sense before. I guess there were IP conflicts on your network causing all the issues - or possibly a routing loop.

    Glad it seems to be fixed - let us know if it’s not.

  • @wayne-workman

    Sorry about the late reply here. We had a rogue router on a campus that we were hunting for and eventually found. The error message about the database connection, which appeared after host registration and just before imaging, has disappeared. Going to restart again and continue to monitor the issue.

  • @wayne-workman

    So after some time the message came back, even after clearing the cache and not rebooting. We had to image a computer, and I noticed that when it tried to check in after the iPXE files loaded, prior to imaging, it showed the same error as the storage nodes on the dashboard: no valid database connection could be made. I will image another computer and screenshot that error.

  • @wayne-workman

    Ahh, ok. I figured rebooting would restart all the services. Maybe the storage nodes needed a reboot this whole time, LOL. A few of them had been up non-stop for about 90 days. If the messages return, I will clear the cache without rebooting.

  • @jgallo So, rebooting after you clear the cache masks the effect of clearing the cache on its own, because rebooting clears it too. I’m trying to isolate whether clearing the cache by itself fixes the problems for you.
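    Assuming “clearing the cache” here refers to the Linux page cache, a minimal sketch of inspecting and dropping it without a reboot (the drop step needs root, and the kernel simply rebuilds the cache as work resumes):

```shell
# Show how much RAM the kernel is currently using as page cache
grep -E '^(MemTotal|MemAvailable|Cached):' /proc/meminfo

# Flush dirty pages to disk, then drop clean page cache plus
# dentries and inodes (3 = pagecache + reclaimable slab objects)
sync
echo 3 | sudo -n tee /proc/sys/vm/drop_caches >/dev/null || true
```

    This only evicts clean, reclaimable data, so nothing is lost, but the first requests afterwards will be slower while the cache warms back up.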

  • @wayne-workman

    So far it has not prompted me with any database connection errors. Earlier I cleared the cache on all the nodes and rebooted, and it seems to be working; I have been sitting on the dashboard since without seeing anything.

  • @jgallo Having cache in RAM isn’t a bad thing - Linux does a really amazing job of managing memory. It’s just that something about fog keeps the cache from being effective: what’s cached is not what’s actually used often. The cache grows as the Linux kernel does work; the less it does, the slower the cache grows. Eventually, and by design, the RAM fills with cache in order to make future requests faster. Again, in the case of fog, what’s cached is not what gets used over and over.