Login takes 5minutes+
-
@greg-plamondon Well, there are a couple of things I see in this setup that made me squint a bit (in no particular order).
- You have 8 vCPUs allocated to this VM (initial reaction: a bit overkill). But more to the point, your VM host server is using an Intel E5440 processor. That CPU has 4 cores without hyperthreading. Let's assume that the VM host server has 2 physical CPU chips, so that gives you 8 physical cores. That means that to run the FOG VM, all 8 cores have to be in a ready state. So there may be a hypervisor scheduling issue on the VM host. If it were me, I would drop the vCPU count to 4, maybe 2, to start and watch the utilization.
Ref: http://www.gabesvirtualworld.com/how-too-many-vcpus-can-negatively-affect-your-performance/
- Disconnect the CentOS 7 ISO image from the CD-ROM.
- You have a check-in time of 60 seconds, which means you will have 300-some clients hitting your master FOG server every minute. You may want to change that to 300 or 900 seconds. Understand the impact: when you schedule snapin deployment tasks, the clients won’t see the task until they check in. But in most cases a 5 or 15 minute check-in interval for snapin deployments is not going to hurt you.
- It looks like mysql is taking a beating from the clients. You can see this in the high utilization of the mysql and apache processes.
- Depending on how long mysql has been in service in this fog install, we may need to run some mysql clean up and reindexing commands (I don’t have them handy at the moment)
- We might be able to help with performance by switching over to a dedicated php engine (I have a document on that I can dig up in the forums).
- Physical memory doesn’t seem to be an issue with the VM, so you should be good there.
I would focus on dropping the vCPU count and extending the check in periods first. Then assess the next steps.
So why is FOG slow on the initial login? It’s probably related to FOG creating and caching your session from the sql database that is being beat up.
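To put rough numbers on the check-in interval suggestion, here is a minimal Python sketch. It assumes about 300 clients whose check-ins are spread evenly over the interval (real clients will likely bunch up more than that), and the function names are made up for illustration, not anything from FOG itself.

```python
def checkins_per_second(clients: int, interval_s: int) -> float:
    """Average check-in requests per second, assuming an even spread."""
    return clients / interval_s


def worst_case_task_delay_s(interval_s: int) -> int:
    """A client that just checked in waits one full interval to see a new task."""
    return interval_s


CLIENTS = 300  # approximate client count mentioned in the thread

for interval in (60, 300, 900):
    rate = checkins_per_second(CLIENTS, interval)
    delay_min = worst_case_task_delay_s(interval) // 60
    print(f"{interval:>3}s interval: {rate:.2f} check-ins/s, "
          f"up to {delay_min} min before a new task is seen")
```

The trade-off is the one described above: a longer interval cuts the request rate on the server roughly in proportion, at the cost of clients noticing new snapin tasks later.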
-
@george1421 I changed the CPU count from 8 to 4 like you suggested.
Initial load:
Then it calmed down after a bit:
The Centos CDROM is already disconnected:
I am going to change the check-in time to 300 as suggested. I ran these commands against the DB:
# No password:
mysql -D fog
# Password:
mysql --user=UsernameHere --password=PasswordHere -D fog

DELETE FROM `hosts` WHERE `hostID` = '0';
DELETE FROM `hostMAC` WHERE hmID = '0' OR `hmHostID` = '0';
DELETE FROM `groupMembers` WHERE `gmID` = '0' OR `gmHostID` = '0' OR `gmGroupID` = '0';
DELETE FROM `snapinGroupAssoc` WHERE `sgaID` = '0' OR `sgaSnapinID` = '0' OR `sgaStorageGroupID` = '0';
DELETE FROM `snapinAssoc` WHERE `saID` = '0' OR `saHostID` = '0' OR `saSnapinID` = '0';
DELETE FROM `hosts` WHERE `hostID` NOT IN (SELECT `hmHostID` FROM `hostMAC` WHERE `hmPrimary` = '1');
DELETE FROM `hosts` WHERE `hostID` NOT IN (SELECT `hmHostID` FROM `hostMAC`);
DELETE FROM `hostMAC` WHERE `hmhostID` NOT IN (SELECT `hostID` FROM `hosts`);
DELETE FROM `snapinAssoc` WHERE `saHostID` NOT IN (SELECT `hostID` FROM `hosts`);
DELETE FROM `groupMembers` WHERE `gmHostID` NOT IN (SELECT `hostID` FROM `hosts`);
DELETE FROM `tasks` WHERE `taskStateID` IN ("1","2","3");
DELETE FROM `snapinTasks` WHERE `stState` IN ("1","2","3");
quit
I will report back with results later.
Thanks for your help.
-
I forgot to mention that the FOG server is also handling DHCP for 4 subnets.
-
@greg-plamondon The data store file is what I should have said. The VM holds open a connection to that iso file even if it’s not mounted. You can see this by looking at the datastore: you will see a bunch of .lck files associated with that iso.
Regarding mysql, we will probably need to rebuild and reorg the indexes and tables. Let’s see how the other changes impacted the environment first.
-
@george1421
Ok, I removed the CDROM device. I don’t know how long it takes for the changes to speed things up but I haven’t seen any noticeable differences yet.
-
@greg-plamondon Well, the clients only pick up the new check-in interval at their next check-in, so it should take about 60 seconds for them to switch over to the 5 minute timing.
So is your sql process still at 21% with the matching http process?
If you don’t see any better performance, understand that dropping your vCPUs didn’t hurt the process, even if it didn’t help it much. There is a documented penalty for overcommitting vCPUs to VMs either way.
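For a rough feel of the vCPU sizing, here is a small Python sketch of how much of the host one VM claims at each vCPU count. It assumes 2 sockets of quad-core E5440s without hyperthreading (the socket count is only the guess made earlier in the thread), and `core_share` is a made-up helper name, not a hypervisor API.

```python
def core_share(vcpus: int, sockets: int, cores_per_socket: int) -> float:
    """Fraction of the host's physical cores matched by the VM's vCPU count."""
    return vcpus / (sockets * cores_per_socket)


SOCKETS = 2           # assumption: dual-socket host
CORES_PER_SOCKET = 4  # Intel E5440: 4 cores, no hyperthreading

for vcpus in (8, 4, 2):
    share = core_share(vcpus, SOCKETS, CORES_PER_SOCKET)
    print(f"{vcpus} vCPUs -> {share:.0%} of physical cores")
```

The closer that share gets to 100%, the less room the hypervisor has to schedule this VM alongside anything else on the host, which is the concern behind dropping to 4 or 2 vCPUs.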
-
Ok, I found the issue I was having with logging in. I had about 5 storage nodes enabled on the home screen’s graph. Once I disabled those I was able to log in immediately. Submitting a task still takes about 1 or 2 minutes, but it’s a lot better!
Thanks
-
@greg-plamondon OK it WAS the dashboard doing it to you.
I still think you can optimize your FOG install more. BUT if your issue is addressed then you can save that activity for another time. Switching over to php-fpm should lower the vCPU requirements too. BUT the biggest driver for performance was lowering the check in time to once every 5 minutes especially with 300+ nodes.
-
@george1421
I would love to optimize our FOG install if you have the time to tell me what to do. Thanks!
-
@greg-plamondon Here is a thread I was working on to see if moving to a dedicated php engine (php-fpm), instead of letting apache handle php itself, helped any. At the time I didn’t have the volume of check-ins needed to see if there was much of a change. I could tell from the responsiveness of the web gui that php-fpm was working correctly. Also, adding memcache would help by caching session data in memory instead of writing it to disk.
There are two parts here: php-fpm and memcache.
https://forums.fogproject.org/topic/10717/can-php-fpm-make-fog-web-gui-fast
Beyond that, I feel that the mysql database could also use some tuning to help with performance. But that is a personal feeling without a solid basis of evidence.
-
@greg-plamondon said in Login takes 5minutes+:
I forgot to mention that the FOG server is also handling DHCP for 4 subnets.
Ok, gotta say it… You should put MySQL onto a dedicated VM, and point the FOG Web Server to it instead of the local one. Separate out stuff like this to scale horizontally (more VMs) instead of vertically (bigger VMs).
-
@george1421 Well done! Great that you followed this so closely. Lots of good advice…
-
@lpetelik said in Login takes 5minutes+:
following for more solutions.
I wanted to tell you that you can disable the graph for all of the storage nodes, and this really does make login pretty snappy, but of course with this you no longer have graphs for all the nodes on the dashboard. With graphs disabled, you can still see tasks in Task Management.