
    How increase the FOG server performance?

      Fernando Gietz Developer last edited by Fernando Gietz

      Hi FOGers!

      I need help tuning the settings of my FOG server to improve its performance.

      Environment:

      7000 hosts in the IT rooms
      300 IT rooms
      9TB of images (increasing)
      60 technicians
      1 FOG server and 1 storage node

      Currently we use an old FOG version (0.30) and it works fine … very fine. But we need to migrate to the latest version.
      To do this I installed two FOG servers with version 1.5 RC x (dev and preproduction environments), but I have performance problems.

      1. The web UI works fine until you send a multicast task or want to see the membership of a group [more info here]
      2. I don’t know if this is normal, but the mysqld process uses 1.3 GB of RAM
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
      2073 mysql     20   0 3770600 1,372g   3920 S   0,3 11,8   3448:19 mysqld
      
      

      I use the mytop tool to watch MySQL performance:

      MySQL on localhost (5.5.56-MariaDB)     up 48+03:43:38 [17:36:15]
       Queries: 397.4M  qps:  100 Slow:   953.0         Se/In/Up/De(%):    87/01/01/00 
                   qps now:   84 Slow qps: 0.0  Threads:    8 (   1/   0) 86/00/00/00 
       Key Efficiency: 100.0%  Bps in/out: 31.1k/109.1k   Now in/out: 16.5k/144.6k
      

      84 queries per second: isn't that a lot?
      3. FOGImageReplicator and FOGSnapinReplicator: if I have only one node, are these two daemons necessary?
      4. Can I enable php-fpm to increase performance [https://forums.fogproject.org/topic/10717/can-php-fpm-make-fog-web-gui-fast]?

        Fernando Gietz Developer last edited by Fernando Gietz

        I have configured MySQL to log its queries, and it seems that some of them are pointless.

        180228 16:38:32	  364 Connect	root@localhost as anonymous on fog
        		  364 Query	USE `fog`
        		  364 Query	SET SESSION sql_mode=''
        		  365 Connect	root@localhost as anonymous on fog
        		  365 Query	USE `fog`
        		  364 Quit	
        		  365 Query	SET SESSION sql_mode=''
        		  366 Connect	root@localhost as anonymous on fog
        		  366 Query	USE `fog`
        		  365 Quit	
        		  366 Query	SET SESSION sql_mode=''
        		  366 Query	SELECT `vValue` FROM `fog`.`schemaVersion`
        		  366 Query	SELECT `pName` FROM `plugins`   WHERE `plugins`.`pInstalled`='1' AND `plugins`.`pState`='1'   ORDER BY LOWER(`plugins`.`pName`) ASC
        		  366 Query	SELECT `settingValue` FROM `globalSettings`   WHERE `globalSettings`.`settingKey` IN ('FOG_DEFAULT_LOCALE','FOG_HOST_LOOKUP','FOG_MEMORY_LIMIT','FOG_REAUTH_ON_DELETE','FOG_REAUTH_ON_EXPORT','FOG_TZ_INFO','FOG_VIEW_DEFAULT_SCREEN')   ORDER BY LOWER(`globalSettings`.`settingKey`) ASC
        		  366 Query	SELECT COUNT(`hosts`.`hostID`) AS `total` FROM `hosts` WHERE `hostPending` = '1' LIMIT 1
        		  366 Query	SELECT COUNT(`COLUMN_NAME`)AS`total`FROM`information_schema`.`COLUMNS`WHERE`TABLE_SCHEMA`='fog'AND`TABLE_NAME`='hostMAC'AND`COLUMN_NAME`='hmMAC'
        		  366 Query	SELECT COUNT(`hostMAC`.`hmID`) AS `total` FROM `hostMAC` WHERE `hmPending` = '1' LIMIT 1
        		  366 Query	SELECT `settingValue` FROM `globalSettings`   WHERE `globalSettings`.`settingKey` IN ('FOG_URL_AVAILABLE_TIMEOUT','FOG_URL_BASE_CONNECT_TIMEOUT','FOG_URL_BASE_TIMEOUT')   ORDER BY LOWER(`globalSettings`.`settingKey`) ASC
        		  366 Query	SELECT `globalSettings`.* FROM `globalSettings`  WHERE `settingKey`='FOG_QUICKREG_PENDING_MAC_FILTER'
        		  366 Query	SELECT COUNT(`hostMAC`.`hmID`) AS `total` FROM `hostMAC` WHERE `hmMAC` IN ('40:b0:34:39:57:ac') AND `hmPending` IN ('0','') LIMIT 1
        		  366 Query	SELECT `hmMAC` FROM `hostMAC`   WHERE `hostMAC`.`hmMAC` IN ('40:b0:34:39:57:ac') AND `hostMAC`.`hmPending` IN ('0','')   ORDER BY `hostMAC`.`hmID` ASC
        		  366 Query	SELECT `hmMAC` FROM `hostMAC`   WHERE `hostMAC`.`hmMAC` IN ('40:b0:34:39:57:ac') AND `hostMAC`.`hmIgnoreImaging`='1'   ORDER BY `hostMAC`.`hmID` ASC
        		  366 Query	SELECT `hostMAC`.* FROM `hostMAC`  WHERE `hmMAC`='40:b0:34:39:57:ac'
        		  366 Query	SELECT `hmHostID` FROM `hostMAC`   WHERE `hostMAC`.`hmPending` IN ('0','') AND `hostMAC`.`hmMAC` IN ('40:b0:34:39:57:ac')   ORDER BY `hostMAC`.`hmID` ASC
        		  366 Query	SELECT `hosts`.*,`hostMAC`.*,`images`.*,`os`.*,`imagePartitionTypes`.*,`imageTypes`.*,`hostScreenSettings`.*,`hostAutoLogOut`.*,`inventory`.* FROM `hosts`  LEFT OUTER JOIN `hostMAC` ON `hostMAC`.`hmHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `images` ON `images`.`imageID`=`hosts`.`hostImage`  LEFT OUTER JOIN `os` ON `os`.`osID`=`images`.`imageOSID`  LEFT OUTER JOIN `imagePartitionTypes` ON `imagePartitionTypes`.`imagePartitionTypeID`=`images`.`imagePartitionTypeID`  LEFT OUTER JOIN `imageTypes` ON `imageTypes`.`imageTypeID`=`images`.`imageTypeID`  LEFT OUTER JOIN `hostScreenSettings` ON `hostScreenSettings`.`hssHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `hostAutoLogOut` ON `hostAutoLogOut`.`haloHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `inventory` ON `inventory`.`iHostID`=`hosts`.`hostID`  WHERE `hostID`='7502'  AND `hostMAC`.`hmPrimary` = '1'
        		  366 Query	SELECT COUNT(`hookEvents`.`heName`) AS `total` FROM `hookEvents` WHERE `hookEvents`.`heName`='QUEUED_STATES' AND `hookEvents`.`heName` <> '0'
        		  366 Query	SELECT COUNT(`hookEvents`.`heName`) AS `total` FROM `hookEvents` WHERE `hookEvents`.`heName`='PROGRESS_STATE' AND `hookEvents`.`heName` <> '0'
        		  366 Query	SELECT `taskID` FROM `tasks`  LEFT OUTER JOIN `images` ON `images`.`imageID`=`tasks`.`taskImageID`  LEFT OUTER JOIN `os` ON `os`.`osID`=`images`.`imageOSID`  LEFT OUTER JOIN `imagePartitionTypes` ON `imagePartitionTypes`.`imagePartitionTypeID`=`images`.`imagePartitionTypeID`  LEFT OUTER JOIN `imageTypes` ON `imageTypes`.`imageTypeID`=`images`.`imageTypeID`  LEFT OUTER JOIN `hosts` ON `hosts`.`hostID`=`tasks`.`taskHostID`  LEFT OUTER JOIN `hostMAC` ON `hostMAC`.`hmHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `hostScreenSettings` ON `hostScreenSettings`.`hssHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `hostAutoLogOut` ON `hostAutoLogOut`.`haloHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `inventory` ON `inventory`.`iHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `taskTypes` ON `taskTypes`.`ttID`=`tasks`.`taskTypeID`  LEFT OUTER JOIN `taskStates` ON `taskStates`.`tsID`=`tasks`.`taskStateID`  LEFT OUTER JOIN `nfsGroupMembers` ON `nfsGroupMembers`.`ngmID`=`tasks`.`taskNFSMemberID`  LEFT OUTER JOIN `nfsGroups` ON `nfsGroups`.`ngID`=`nfsGroupMembers`.`ngmGroupID`   WHERE `tasks`.`taskHostID`='7502' AND `tasks`.`taskStateID` IN ('0','1','2','3') AND `hostMAC`.`hmPrimary` = '1'  ORDER BY LOWER(`tasks`.`taskName`) ASC
        		  366 Query	SELECT `hostMAC`.* FROM `hostMAC`  WHERE `hmMAC`='40:b0:34:39:57:ac'
        		  366 Quit	
        

        All of these queries ran within one second.
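A log like the one above can be summarized with a short script instead of being read by eye. The following is only a sketch: it assumes the general query log layout shown in this post (optional timestamp, connection id, command, argument), the function name `summarize_general_log` is made up for illustration, and the sample lines are a shortened copy of the log above.

```python
import re
from collections import Counter

# One general-log entry: optional "YYMMDD HH:MM:SS" timestamp,
# connection id, command (Connect/Query/Quit) and its argument.
LINE_RE = re.compile(
    r"^(?:\d{6}\s+\d{1,2}:\d{2}:\d{2})?\s*(\d+)\s+(Connect|Query|Quit)\s*(.*)$"
)


def summarize_general_log(lines):
    """Return (query count per connection id, statements seen more than once)."""
    per_connection = Counter()
    statements = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # lines that do not start a log entry are skipped
        conn_id, command, argument = match.groups()
        if command == "Query":
            per_connection[int(conn_id)] += 1
            # Collapse whitespace so identical statements compare equal.
            statements[" ".join(argument.split())] += 1
    repeated = {sql: count for sql, count in statements.items() if count > 1}
    return per_connection, repeated


# Shortened sample taken from the log in this post.
sample = [
    "180228 16:38:32\t  364 Connect\troot@localhost as anonymous on fog",
    "\t\t  364 Query\tUSE `fog`",
    "\t\t  364 Query\tSET SESSION sql_mode=''",
    "\t\t  364 Quit\t",
    "\t\t  366 Query\tSELECT `hostMAC`.* FROM `hostMAC`  WHERE `hmMAC`='40:b0:34:39:57:ac'",
    "\t\t  366 Query\tSELECT `hostMAC`.* FROM `hostMAC`  WHERE `hmMAC`='40:b0:34:39:57:ac'",
]

per_connection, repeated = summarize_general_log(sample)
print(per_connection)
print(repeated)
```

Run against the full log file, the `repeated` dictionary would list statements such as the `hostMAC` lookup that are issued more than once within the same request.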

          Fernando Gietz Developer last edited by

          The activity of the MySQL server is huge. I restarted the server, and after seven minutes:

          MySQL on localhost (5.5.56-MariaDB)     up 0+00:07:00 [16:13:04]
           Queries: 38.1k  qps:   93 Slow:     0.0         Se/In/Up/De(%):    94/00/00/00 
                       qps now:  102 Slow qps: 0.0  Threads:    5 (   1/   0) 85/01/00/00 
           Key Efficiency: 100.0%  Bps in/out: 13.5k/43.9k   Now in/out: 41.3k/190.2k
          
                Id      User         Host/IP         DB      Time    Cmd Query or State                                                       
                 --      ----         -------         --      ----    --- ----------                                                           
                664      root       localhost       test         0  Query show full processlist                                                
                782      root       localhost        fog         4  Sleep                                                                      
                768      root       localhost        fog        10  Sleep                                                                      
                746      root       localhost        fog        19  Sleep                                                                      
                 10      root       localhost        fog       414  Sleep
          

          38k queries??
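The mytop header above can be sanity-checked with simple arithmetic. This sketch uses only the two figures shown there (38.1k queries, 7 minutes of uptime); the helper name `average_qps` is made up for illustration.

```python
def average_qps(total_queries: float, uptime_seconds: float) -> float:
    """Average queries per second over the whole uptime."""
    return total_queries / uptime_seconds


uptime_seconds = 7 * 60   # "up 0+00:07:00"
total_queries = 38_100    # "Queries: 38.1k" (rounded by mytop)

qps = average_qps(total_queries, uptime_seconds)
print(f"average qps: {qps:.1f}")
print(f"projected queries per day at this rate: {qps * 86_400:,.0f}")
```

The result is roughly 91 queries per second, in line with the `qps: 93` that mytop reports, so 38k queries is what a steady rate of about 90 qps produces in seven minutes.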

            Fernando Gietz Developer last edited by Fernando Gietz

            I have restarted the MySQL server and its memory usage has dropped:

            8895 mysql     20   0 1300380  93492   9236 S   7,0  0,8   0:05.37 mysqld
            

            I have set the check-in time to 900 seconds.

              george1421 Moderator @Fernando Gietz last edited by george1421

              @fernando-gietz said in How increase the FOG server performance?:

              We are talking about the same check time 🙂 What does this check time mean?

              It tells the client: “Check back with the server every XX seconds to see if there is something for you to do.” So the clients query the FOG server every XX seconds to see whether there are snapins to deploy, system rename events, or whatever else you can schedule with the FOG server. I suspect the FOG server and MySQL are too busy servicing these client check-ins to do much of anything else. As I suggested, change the check-in time to 900 seconds (15 minutes) and see whether this resolves your problem, or at least makes things easier on the FOG server. If not, you can change it back.

              Normally, with that much RAM, swap is never used. 800 MB of swap does seem like a lot, and 1.3 GB of RAM for the mysqld process seems like a lot too. Again, lengthen your check-in interval and wait 30 minutes to see if the resources free up on your FOG server.

              Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

                Fernando Gietz Developer @george1421 last edited by

                @george1421 We are talking about the same check time 🙂 What does this check time mean?

                I am worried about the MySQL performance and its huge RAM usage, 1.3 GB.

                 2073 mysql     20   0 3770600 1,372g   3920 S   6,0 11,8   3452:06 mysqld
                

                And when I want to see the membership of a group, Apache uses 100% of a vCPU and it takes two minutes to show the list of members.

                Is the swap usage normal? It is close to 100%.

                  george1421 Moderator @Fernando Gietz last edited by

                  @fernando-gietz I think maybe we are not talking about the same check in time.
                  [image: client_checkin.png]

                  Also your CPU usage doesn’t look bad (according to top).

                    Fernando Gietz Developer last edited by Fernando Gietz

                    top command:

                    top - 18:41:55 up 48 days,  4:49,  2 users,  load average: 0,19, 0,23, 0,29
                    Tasks: 282 total,   1 running, 278 sleeping,   0 stopped,   3 zombie
                    %Cpu(s):  8,2 us,  2,2 sy,  0,0 ni, 89,6 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
                    KiB Mem : 12138956 total,   177100 free,  2809672 used,  9152184 buff/cache
                    KiB Swap:  1023996 total,   199544 free,   824452 used.  8521144 avail Mem 
                    
                      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                          
                    26061 apache    20   0  543340  45800   6768 S  11,3  0,4   6:29.34 httpd                                                            
                    13607 apache    20   0  700016  47256   8016 S   9,0  0,4  14:19.99 httpd                                                            
                    16160 apache    20   0  678892  27200   9160 S   7,3  0,2   1:32.28 httpd                                                            
                     2073 mysql     20   0 3770600 1,372g   3920 S   6,0 11,8   3452:06 mysqld
                    

                    atop command:

                    PRC | sys    0.13s  | user   0.20s  | #proc    285  | #trun	 3  | #tslpi   328  | #tslpu     0  | #zombie    3  | #exit      7  |
                    CPU | sys       3%  | user      4%  | irq	0%  | idle    593%  | wait	0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    cpu | sys	1%  | user      0%  | irq	0%  | idle     99%  | cpu003 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    cpu | sys	1%  | user      2%  | irq	0%  | idle     98%  | cpu005 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    cpu | sys	1%  | user	1%  | irq	0%  | idle     99%  | cpu004 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    cpu | sys	1%  | user	0%  | irq	0%  | idle     99%  | cpu000 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    cpu | sys	0%  | user	1%  | irq	0%  | idle     99%  | cpu001 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    cpu | sys	0%  | user	0%  | irq	0%  | idle    100%  | cpu002 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
                    CPL | avg1    0.08  | avg5    0.19  | avg15   0.27  |               | csw     5925  | intr    5744  |               | numcpu     6  |
                    MEM | tot    11.6G  | free  147.2M  | cache   8.5G  | buff    0.1M  | slab  221.8M  | shmem 428.8M  | vmbal   0.0M  | hptot   0.0M  |
                    SWP | tot     1.0G  | free  194.9M  |               |               |               |               | vmcom   2.9G  | vmlim   6.8G  |
                    LVM |   Datos-root  | busy	1%  | read	 5  | write	 4  | KiB/w	 8  | MBr/s   0.19  | MBw/s   0.01  | avio 4.56 ms  |
                    LVM |    Datos-tmp  | busy	0%  | read	 0  | write	 1  | KiB/w	 4  | MBr/s   0.00  | MBw/s   0.00  | avio 1.00 ms  |
                    DSK |          sda  | busy      1%  | read       5  | write      5  | KiB/w      7  | MBr/s   0.19  | MBw/s   0.01  | avio 4.20 ms  |
                    NET | transport     | tcpi	10  | tcpo	12  | udpi    1924  | udpo    1920  | tcpao      2  | tcppo      2  | tcprs      3  |
                    NET | network       | ipi     2102  | ipo     2088  | ipfrw      0  | deliv   2102  |               | icmpi      0  | icmpo      0  |
                    NET | ens192  ----  | pcki    2108  | pcko    2088  | si  220 Kbps  | so 1754 Kbps  | erri       0  | erro       0  | drpo       0  |
                    NET | ens224  ----  | pcki       1  | pcko       1  | si    0 Kbps  | so    0 Kbps  | erri	 0  | erro	 0  | drpo	 0  |
                    

                    What does the check-in time check? The computer state? 15 minutes is too long for us. Note that if you send a multicast task, the computers will shut down at very different moments and some of them will miss the task (if you have a multicast timeout of 5 minutes).
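The memory figures in the top output above can be turned into percentages with a few lines. This is a sketch using only the numbers shown there; it assumes that the `g` suffix in top's RES column means GiB.

```python
KIB_PER_GIB = 1024 * 1024

mem_total_kib = 12_138_956   # "KiB Mem : ... total"
swap_total_kib = 1_023_996   # "KiB Swap: ... total"
swap_used_kib = 824_452      # "KiB Swap: ... used"
mysqld_res_gib = 1.372       # RES column for mysqld ("1,372g"), assumed GiB

swap_used_pct = 100 * swap_used_kib / swap_total_kib
mysqld_mem_pct = 100 * mysqld_res_gib * KIB_PER_GIB / mem_total_kib

print(f"swap in use: {swap_used_pct:.1f}%")
print(f"mysqld share of RAM: {mysqld_mem_pct:.1f}%")
```

With these inputs, swap usage comes out at about 80%, and the mysqld share of RAM lands close to the 11,8 shown in the %MEM column. Most of the remaining memory is reported as buff/cache rather than as used by processes.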

                      george1421 Moderator @Fernando Gietz last edited by george1421

                      @fernando-gietz It would be interesting to see what top had to say. With 6 vCPUs, it would also be interesting to know how many physical cores your VM host has. If it has far more than 6, then 6 vCPUs is OK. Otherwise, assigning more vCPUs than necessary will slow down your VM.

                      My initial reaction is to raise your client check-in time to 15 minutes instead of 90 seconds. At 90 seconds, your 600 hosts hit the FOG server at an average (linearized) rate of 6 to 7 hosts per second. We all know hosts check in at random, so you might have 15 check-ins in one second and 2 in the next. So raise your check-in period to 10-15 minutes.

                      Second, I would surely enable php-fpm and memcache to see how much they improve your performance. I have only done this on a small scale, and it really helped with web server responsiveness.

                      Hopefully your VM host server uses more than one network interface to the building switches. For a university I would expect 10 - 40GbE networking. Also look at which virtual adapter your VM uses to connect to the VM host server. If your hypervisor is ESX (vSphere), ensure you are using the VMXNET3 network adapter. That should give you 10G to your vSwitch.

                      Lastly, you may be at a scale (number of users) where you might consider moving the SQL server off the FOG server and running an independent server configured specifically to run MySQL.

                      I would do the first 2 in the list and check on the 3rd one. Leave moving the MySQL server off the FOG server until last.
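The check-in arithmetic from the first suggestion can be written out as a small calculation. This is only a sketch: it assumes the 600 clients are spread evenly over the interval, which real clients are not, and the function name `checkins_per_second` is made up for illustration.

```python
def checkins_per_second(clients: int, interval_seconds: float) -> float:
    """Average check-in rate if clients are spread evenly over the interval."""
    return clients / interval_seconds


clients = 600  # hosts with the FOG client installed, per this thread

rate_90 = checkins_per_second(clients, 90)    # current setting
rate_900 = checkins_per_second(clients, 900)  # suggested 15 minutes

print(f" 90 s interval: {rate_90:.2f} check-ins/s")
print(f"900 s interval: {rate_900:.2f} check-ins/s")
```

Going from 90 to 900 seconds cuts the average check-in rate by a factor of ten, from under 7 per second to under 1 per second. Each check-in can trigger several SQL queries, so the drop in database load should be correspondingly large.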

                        Fernando Gietz Developer @george1421 last edited by Fernando Gietz

                        @george1421

                        How many vCPUs does your FOG server have?
                        6 vCPUs and 12 GB RAM.
                        Do you use the fog client? If so what is your check in interval?
                        Yes, but it is not installed on all of them. Currently the client is installed on 600 computers. CLIENT CHECKIN TIME = 90
                        How many network adapters do you have in this fog server?
                        Two adapters: one for clients and one for the storage.
                        Is this fog server virtual or physical?
                        It is virtual.
                        What kind of disk subsystem do you have? (raid, single disk, ssd,??)
                        I don't know 🙂 But it is not bad; we use the production environment of the university. I can run download tasks at 13 GB/min, so I suppose the disks are not the problem.

                        OS: RHEL 7 64 bits

                          george1421 Moderator @Fernando Gietz last edited by

                          @fernando-gietz Let's get a few more details here.

                          1. How many vCPUs does your FOG server have?
                          2. Do you use the fog client? If so what is your check in interval?
                          3. How many network adapters do you have in this fog server?
                          4. Is this fog server virtual or physical?
                          5. What kind of disk subsystem do you have? (raid, single disk, ssd,??)

                          Copyright © 2012-2023 FOG Project