FOG Web GUI speed and default storage activity
-
I think it’s timing out. Tom does some advanced stuff in the JavaScript with timers and timeouts so that an actual problem (like a dead node) doesn’t block everything else.
-
Could it be something on my end with the network, where packets are being lost and the FOG server isn’t receiving responses to those check-ins? Or if it’s a timeout in a script, could we increase it manually? What things should I be looking for if it’s network related?
-
@jgallo All those things, possibly. Let me explain how this works a bit.
The web server issues a web call to all enabled nodes when you call the homepage (the main dashboard). So, when you click the home icon in FOG or log in, the home page calls this on each enabled node:
http://x.x.x.x/fog/service/getversion.php
Where x.x.x.x is the IP/name of each node. So say you had three nodes: 10.0.0.5, 10.0.0.6, 10.0.0.7
The homepage would call each one like:
http://10.0.0.5/fog/service/getversion.php
http://10.0.0.6/fog/service/getversion.php
http://10.0.0.7/fog/service/getversion.php
If they don’t respond in time, because they are too busy or the web server is just too busy to hear the responses, then it causes the error “A valid database connection could not be made”.
Further, sitting on the homepage increases load somewhat on the nodes and web server and database - because the homepage graphs continually poll all of them for transmission stats, space stats, continually polls the DB for job status, etc.
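The per-node version check described above can be reproduced by hand to see which node is slow or silent. This is only a rough sketch: the node addresses are the example ones from this post, the 5 second timeout is an assumption (the actual timeout the FOG web GUI uses isn't stated here), and `check_nodes` and `fetch_version` are made-up helper names, not part of FOG.

```python
import time
import urllib.request

# Example addresses from the post above; replace with your own storage nodes.
NODES = ["10.0.0.5", "10.0.0.6", "10.0.0.7"]


def fetch_version(url, timeout):
    """Return the body of a node's getversion.php response as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace").strip()


def check_nodes(nodes, timeout=5.0, fetch=fetch_version):
    """Make one version request per node and time it.

    The timeout is an assumed value, not the one the FOG GUI uses.
    Returns a list of dicts with keys: node, ok, seconds, detail.
    """
    results = []
    for node in nodes:
        url = f"http://{node}/fog/service/getversion.php"
        start = time.monotonic()
        try:
            detail = fetch(url, timeout)
            ok = True
        except (OSError, ValueError) as exc:
            # URLError and socket timeouts are both OSError subclasses.
            detail = repr(exc)
            ok = False
        elapsed = time.monotonic() - start
        results.append(
            {"node": node, "ok": ok, "seconds": elapsed, "detail": detail}
        )
    return results
```

Running `check_nodes(NODES)` from the main FOG server and printing the `seconds` value for each entry should show whether one particular node is consistently slower than the rest, or whether they all slow down together (which would point at the main server or the network instead).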
At one of my last jobs, the “a valid database blah blah” message was common when the FOG system was under load. If imaging is still working, then you don’t have any problems. If it bothers you a lot, try to dial back the max clients value in your node’s storage management area to reduce the load. Or decrease your client checkin time. Or build faster servers to accommodate your load.
-
Cool. Makes perfect sense. I must say I don’t think it’s the storage node servers, as they mostly consist of Dell R410s and R210s with 8GB to 16GB of memory. I will try to decrease the check-in time. Is the value in seconds or minutes for the check-in time in FOG?
-
@jgallo said in FOG Web GUI speed and default storage activity:
Is the value in seconds or minutes for the check in time in FOG?
Seconds.
-
OK, so I have tried decreasing the client checkin time. It seems to help at first, but the message eventually comes back. I even went to the extent of not sitting on the homepage just to make sure, and when I go back and check after an hour I still have those connection messages.
I will try one more thing, and if this doesn’t work I will just deal with it: could I raise the number of vCPUs on Hyper-V from 2 to 4? Will this help improve performance? That’s basically the only thing left to do to improve performance on the fog server. I did make the change from dynamic memory to static. Thanks.
-
Guys,
Is there any way to alter the choice of pages that the Web UI goes to upon sign in?
I’m seeing very long sign-in times, but the web UI is mostly very fast in my installations. I’ve got 9 remote sites, each with 1 storage node, and some of the sites are connected over 4G LTE.
Signing into FOG takes variously long times, depending on how the 9 connections are doing.
I could do without the dashboard page and the load it generates. I’d be happy if the default page were anything but the Dashboard.
Jim
-
@jim-graczyk There is not currently an option for that, but in my limited experience it might be trivial to add. The developers might include a new setting in FOG Settings->Login Settings to allow the FOG admin to change the login landing page. If you look at the URL, the only difference between the dashboard and any other page is the node reference.
But that raises the question: if not the dashboard, then what should be the default landing page? And might that landing page differ on a per-user basis? (Just thinking of reasons why we might not want this feature.) Either way, I would surely make a feature request out of this idea. I feel it does have a moderate amount of merit.
-
@jgallo said in FOG Web GUI speed and default storage activity:
could I raise the number of vCPU on Hyper-V from 2 to 4? Will this help improve performance?
That would only help if your host system isn’t overburdened. If you have too many VMs on it already with too many cores assigned, and not enough cores available, it’ll just make things worse. But if you have plenty of resources, then it would help.
Also, set your client checkin time to something like 300 seconds (five minutes) and see if that makes a difference. Keep in mind the change here isn’t immediate - the clients have to checkin once more to actually get the new setting.
-
I don’t have too many VMs. I do have 4, but each of those has either 1 or 2 vCPUs allocated to it. I have now set 4 vCPUs for the fog server VM and I still have that issue. I also set the client checkin time to 300, with the same issue.
Here is what I noticed. The FOG server disk I/O is about 15%, give or take, on a constant basis. I also noticed that all the disk activity is from apache2 running as user www-data, and mysql is using up to 5GB of memory at times just on idle. Could this be some programming bug, or does my database need to be cleared?
-
@jgallo How many hosts do you have in your environment that have the FOG Client installed?
-
I don’t have any. We use Group Policy to manage printers, settings, etc. Back in the day we used the FOG Client when the auto-join domain features were utilized. We don’t anymore, due to large numbers of Chromebooks replacing aging PCs.
-
@jgallo What are the link speeds between the main fog server and the other nodes? How many images do you have? What’s the FOG IMAGE REP SLEEP TIME in fog settings set to?
-
At the secondary schools, the connection speed to our district office is 1 Gb, and the primary schools are at 100 Mb. The FOG IMAGE REP SLEEP TIME is set to 10800.
-
@jgallo How consistent is the problem? The “a valid connection cannot be established” problem. Any rhyme or pattern? Is this when imaging is happening?
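One way to answer the "any pattern?" question with data rather than impressions is to record periodic node checks and then bucket the failures by hour. The sketch below assumes you have already collected samples as `(ISO timestamp, node, ok)` tuples by some means of your own; the function name `failures_by_hour` and the sample format are invented for illustration and are not anything FOG provides.

```python
from collections import defaultdict
from datetime import datetime


def failures_by_hour(samples):
    """Return the failure ratio per (node, hour) bucket.

    samples: iterable of (iso_timestamp, node, ok) tuples, where ok is
    True when the node answered in time. This format is an assumption.
    """
    # (node, hour label) -> [failure count, total checks]
    totals = defaultdict(lambda: [0, 0])
    for stamp, node, ok in samples:
        hour = datetime.fromisoformat(stamp).strftime("%Y-%m-%d %H:00")
        bucket = totals[(node, hour)]
        bucket[1] += 1
        if not ok:
            bucket[0] += 1
    return {
        key: fails / checks
        for key, (fails, checks) in sorted(totals.items())
    }
```

If the ratios climb for every node at the same hours, that suggests load on the main server; if only one node stands out, the link or the node itself is the more likely suspect.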
-
On working branch 57 it was very consistent, even with all the changes made to fog and the vCPUs. I have been updating all the storage nodes and fog to working branch 64 today, and the problem still persists. The only pattern I have observed is that upon rebooting the fog server, the valid connection messages do not appear for about an hour or so. Then the messages begin to appear for random nodes for which I have the graph enabled. At random times the messages tend to go away, but then they come back upon selecting another storage node on the dashboard.
-
@jgallo Do you know what version of fog this problem started with?
-
I know that I went from 1.4.4 to 1.5.0 RC1, if I recall. I know when I upgraded I made a huge leap. New interface and all. During that time there were replication issues, and I eventually updated to 1.5.0 RC7, which still had replication issues. I then upgraded to 1.5.0 RC9, in which replication had a major issue that was resolved in a working branch. So I had been on working branch 57 until about an hour ago, when I went to 64 with all nodes and the fog server. I have been observing this issue ever since I moved to 1.5.0 with the new interface. I never had this issue in 1.4.4.
-
@jgallo said in FOG Web GUI speed and default storage activity:
The only pattern I have observed is that upon rebooting the fog server, the valid connection messages do not appear for about an hour or so.
I’m thinking about this - I’d like you to try to restart only Apache and see if it has the same effect or not. On CentOS/Fedora/RHEL it’s
systemctl restart httpd
and on Ubuntu 16/Debian 8/Debian 9 it’s
systemctl restart apache2
and on Ubuntu 14 and earlier/Debian 7 and earlier it’s
service apache2 restart
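If you have a mix of distros across nodes, the choice above can be captured in a small helper. This is just a sketch: `apache_restart_command` is a made-up name, the distro IDs are assumed to be the lowercase `ID` values from `/etc/os-release`, and the non-systemd Red Hat case (`service httpd restart`) is my own extension, not one of the three commands listed above.

```python
import os

# Distros whose Apache service is named "httpd" rather than "apache2".
RED_HAT_FAMILY = {"centos", "fedora", "rhel"}


def has_systemd():
    """Return True when the running system was booted with systemd."""
    return os.path.isdir("/run/systemd/system")


def apache_restart_command(distro_id, systemd=True):
    """Return the Apache restart command as an argument list.

    distro_id is assumed to be the ID field from /etc/os-release.
    """
    service = "httpd" if distro_id.lower() in RED_HAT_FAMILY else "apache2"
    if systemd:
        return ["systemctl", "restart", service]
    # Older init systems (for example Ubuntu 14 or Debian 7).
    return ["service", service, "restart"]
```

The returned list can be passed to `subprocess.run(...)` as root, or simply printed to confirm which command applies to a given box.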
-
Tried that and still get the database connection message.