How to increase FOG server performance?

  • F
    Fernando Gietz Developer
    last edited by Fernando Gietz Feb 27, 2018, 10:41 AM Feb 27, 2018, 4:40 PM

    Hi FOGers!

    I need help tuning the settings of my FOG server to improve its performance.

    Environment:

    7000 hosts in the IT rooms
    300 IT rooms
    9TB of images (increasing)
    60 technicians
    1 FOG server and 1 storage node

    Currently we use an old FOG version (0.30) and it works fine … very fine. But we need to migrate to the latest version.
    To do this I installed two FOG servers with version 1.5 RC x (dev and preproduction environments), but I have performance problems.

    1. The web UI works fine until you send a multicast task or try to view the membership of a group [more info here]
    2. I don’t know if this is normal, but the mysqld process uses 1.3 GB of RAM
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    2073 mysql     20   0 3770600 1,372g   3920 S   0,3 11,8   3448:19 mysqld
    
    

    I use the mytop tool to watch the MySQL performance

    MySQL on localhost (5.5.56-MariaDB)     up 48+03:43:38 [17:36:15]
     Queries: 397.4M  qps:  100 Slow:   953.0         Se/In/Up/De(%):    87/01/01/00 
                 qps now:   84 Slow qps: 0.0  Threads:    8 (   1/   0) 86/00/00/00 
     Key Efficiency: 100.0%  Bps in/out: 31.1k/109.1k   Now in/out: 16.5k/144.6k
    

    84 queries per second: isn’t that a lot?
    3. FOGImageReplicator and FOGSnapinReplicator: if I have only one node, are these two daemons necessary?
    4. Can I enable php-fpm to increase the performance [https://forums.fogproject.org/topic/10717/can-php-fpm-make-fog-web-gui-fast]?

    • G
      george1421 Moderator @Fernando Gietz
      last edited by Feb 27, 2018, 5:02 PM

      @fernando-gietz Let's get a few more details here.

      1. How many vCPUs does your FOG server have?
      2. Do you use the fog client? If so what is your check in interval?
      3. How many network adapters do you have in this fog server?
      4. Is this fog server virtual or physical?
      5. What kind of disk subsystem do you have? (raid, single disk, ssd,??)

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

      • F
        Fernando Gietz Developer @george1421
        last edited by Fernando Gietz Feb 27, 2018, 11:17 AM Feb 27, 2018, 5:15 PM

        @george1421

        How many vCPUs does your FOG server have?
        6 vCPU and 12 GB RAM
        Do you use the fog client? If so what is your check in interval?
        Yes, but it is not installed on all of them. Currently the client is installed on 600 computers. CLIENT CHECKIN TIME = 90
        How many network adapters do you have in this fog server?
        Two adapters. One for clients and one for the storage.
        Is this fog server virtual or physical?
        It is virtual.
        What kind of disk subsystem do you have? (raid, single disk, ssd,??)
        I don't know 🙂 But it is not bad; we use the production environment of the university. I can run download tasks at 13 GB/min, so I assume the disks are not the problem.

        OS: RHEL 7 64 bits

        • G
          george1421 Moderator @Fernando Gietz
          last edited by george1421 Feb 27, 2018, 11:26 AM Feb 27, 2018, 5:25 PM

          @fernando-gietz It would be interesting to see what top had to say. With 6 vCPUs, it would be interesting to know how many cores your server has. If it has way more than 6, then 6 vCPUs is OK. Otherwise adding more vCPUs than necessary will slow down your VM.

          My initial reaction is to raise your client check-in time to 15 minutes, instead of 90 seconds. At 90 seconds you have 600 hosts hitting your FOG server at an average (linearized) rate of 6 to 7 hosts per second. We all know hosts check in at random, so you might have 15 check-ins in one second and 2 in the next. So raise your check-in period to 10-15 minutes.
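
As a rough sketch of the check-in arithmetic above: this assumes every client polls exactly once per interval and that the polls are spread evenly, which real clients will not do. The function name is only illustrative and is not part of FOG.

```python
def checkins_per_second(hosts: int, interval_seconds: float) -> float:
    """Average check-in rate if every host polls once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return hosts / interval_seconds


if __name__ == "__main__":
    hosts = 600  # machines with the FOG client installed, as stated in the thread
    # 90 s is the current setting; 600 s and 900 s are the suggested 10-15 minutes.
    for interval in (90, 600, 900):
        rate = checkins_per_second(hosts, interval)
        print(f"{interval:>4} s interval: {rate:.2f} check-ins per second on average")
```

Going from 90 to 900 seconds cuts the average check-in rate by a factor of ten. Bursts will still happen, but they start from a much lower baseline.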

          Second I would surely enable php-fpm and memcache to see how well it improves your performance. I have only done this on a small scale and that really helped me with web server responsiveness.

          Hopefully your VM host server uses more than one network interface to the building switches. For a university I might expect that they use 10 - 40GbE networking. Also look at what interface your VM is using to connect to your VM host server. If your hypervisor is ESX (vSphere) then ensure you are using the VMXNET3 network interface. That should give you 10G to your vSwitch.

          Lastly, you may be at a scale (number of users) where you might consider removing the SQL server from FOG and running an independent SQL server specifically configured to run MySQL.

          I think I might do the first 2 in the list and check on the 3rd one. Leave extracting the MySQL server out of the FOG server until last.

          • F
            Fernando Gietz Developer
            last edited by Fernando Gietz Feb 27, 2018, 11:51 AM Feb 27, 2018, 5:45 PM

            top command:

            top - 18:41:55 up 48 days,  4:49,  2 users,  load average: 0,19, 0,23, 0,29
            Tasks: 282 total,   1 running, 278 sleeping,   0 stopped,   3 zombie
            %Cpu(s):  8,2 us,  2,2 sy,  0,0 ni, 89,6 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
            KiB Mem : 12138956 total,   177100 free,  2809672 used,  9152184 buff/cache
            KiB Swap:  1023996 total,   199544 free,   824452 used.  8521144 avail Mem 
            
              PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                          
            26061 apache    20   0  543340  45800   6768 S  11,3  0,4   6:29.34 httpd                                                            
            13607 apache    20   0  700016  47256   8016 S   9,0  0,4  14:19.99 httpd                                                            
            16160 apache    20   0  678892  27200   9160 S   7,3  0,2   1:32.28 httpd                                                            
             2073 mysql     20   0 3770600 1,372g   3920 S   6,0 11,8   3452:06 mysqld
            

            atop command:

            PRC | sys    0.13s  | user   0.20s  | #proc    285  | #trun      3  | #tslpi   328  | #tslpu     0  | #zombie    3  | #exit      7  |
            CPU | sys       3%  | user      4%  | irq       0%  | idle    593%  | wait      0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            cpu | sys       1%  | user      0%  | irq       0%  | idle     99%  | cpu003 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            cpu | sys       1%  | user      2%  | irq       0%  | idle     98%  | cpu005 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            cpu | sys       1%  | user      1%  | irq       0%  | idle     99%  | cpu004 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            cpu | sys       1%  | user      0%  | irq       0%  | idle     99%  | cpu000 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            cpu | sys       0%  | user      1%  | irq       0%  | idle     99%  | cpu001 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            cpu | sys       0%  | user      0%  | irq       0%  | idle    100%  | cpu002 w  0%  | guest     0%  | curf 2.67GHz  | curscal   ?%  |
            CPL | avg1    0.08  | avg5    0.19  | avg15   0.27  |               | csw     5925  | intr    5744  |               | numcpu     6  |
            MEM | tot    11.6G  | free  147.2M  | cache   8.5G  | buff    0.1M  | slab  221.8M  | shmem 428.8M  | vmbal   0.0M  | hptot   0.0M  |
            SWP | tot     1.0G  | free  194.9M  |               |               |               |               | vmcom   2.9G  | vmlim   6.8G  |
            LVM |   Datos-root  | busy      1%  | read       5  | write      4  | KiB/w      8  | MBr/s   0.19  | MBw/s   0.01  | avio 4.56 ms  |
            LVM |    Datos-tmp  | busy      0%  | read       0  | write      1  | KiB/w      4  | MBr/s   0.00  | MBw/s   0.00  | avio 1.00 ms  |
            DSK |          sda  | busy      1%  | read       5  | write      5  | KiB/w      7  | MBr/s   0.19  | MBw/s   0.01  | avio 4.20 ms  |
            NET | transport     | tcpi      10  | tcpo      12  | udpi    1924  | udpo    1920  | tcpao      2  | tcppo      2  | tcprs      3  |
            NET | network       | ipi     2102  | ipo     2088  | ipfrw      0  | deliv   2102  |               | icmpi      0  | icmpo      0  |
            NET | ens192  ----  | pcki    2108  | pcko    2088  | si  220 Kbps  | so 1754 Kbps  | erri       0  | erro       0  | drpo       0  |
            NET | ens224  ----  | pcki       1  | pcko       1  | si    0 Kbps  | so    0 Kbps  | erri       0  | erro       0  | drpo       0  |
            

            What does the check-in time check? The computer state? 15 minutes is a lot for us. Note that if you send a multicast task, the computers will shut down at very different moments and some of them will miss the task (if you have a multicast timeout of 5 minutes).

            • G
              george1421 Moderator @Fernando Gietz
              last edited by Feb 27, 2018, 6:22 PM

              @fernando-gietz I think maybe we are not talking about the same check in time.
              [Attached screenshot: client_checkin.png]

              Also your CPU usage doesn’t look bad (according to top).

              • F
                Fernando Gietz Developer @george1421
                last edited by Feb 28, 2018, 2:06 PM

                @george1421 We are talking about the same check-in time 🙂 What does this check-in time mean?

                I am worried about the MySQL performance and its huge RAM usage, 1.3 GB.

                 2073 mysql     20   0 3770600 1,372g   3920 S   6,0 11,8   3452:06 mysqld
                

                And when I want to see the membership of a group, Apache uses 100% of a vCPU and it takes two minutes to display the list.

                Is the swap usage normal? It is close to 100%.

                • G
                  george1421 Moderator @Fernando Gietz
                  last edited by george1421 Feb 28, 2018, 8:30 AM Feb 28, 2018, 2:27 PM

                  @fernando-gietz said:

                  We are talking about the same check-in time 🙂 What does this check-in time mean?

                  It tells the client “Check back with the server every XX seconds to see if there is something for you to do”. So the clients will query the FOG server every XX seconds to see if there are snapins to deploy, system rename events, or whatever else you can schedule with the FOG server. I feel the FOG server and MySQL are too busy servicing these client check-ins to do much of anything else. As I suggested, change the check-in time to 900 (15 min) and see if this resolves your problem, or makes it easier on the FOG server. If not, you can change it back.

                  Normally with that much RAM, swap is never used. 800MB does seem like a lot. 1.3GB of RAM for the mysqld process does seem to be a lot too. Again, raise your check-in time and wait 30 minutes to see if the resources free up on your FOG server.
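
To put the figures from the top output earlier in the thread in proportion, here is a small sketch of the arithmetic. The numbers are copied from that output; the variable names are only illustrative.

```python
KIB_PER_GIB = 1024 * 1024


def percent(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Values taken from the top output posted in the thread.
mem_total_kib = 12138956   # "KiB Mem : 12138956 total"
swap_total_kib = 1023996   # "KiB Swap: 1023996 total"
swap_used_kib = 824452     # "824452 used"
mysqld_res_gib = 1.372     # RES column for mysqld, "1,372g"

swap_used_pct = percent(swap_used_kib, swap_total_kib)
mysqld_pct = percent(mysqld_res_gib * KIB_PER_GIB, mem_total_kib)

print(f"swap in use:      {swap_used_pct:.1f}% of swap")
print(f"mysqld resident:  {mysqld_pct:.1f}% of RAM")
```

By these figures roughly four fifths of the swap is in use, and the mysqld share agrees with the %MEM column that top reports.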

                  • F
                    Fernando Gietz Developer
                    last edited by Fernando Gietz Feb 28, 2018, 9:11 AM Feb 28, 2018, 3:11 PM

                    I have restarted the MySQL server and its memory usage has gone down

                    8895 mysql     20   0 1300380  93492   9236 S   7,0  0,8   0:05.37 mysqld
                    

                    I have set the check-in time to 900 seconds

                    • F
                      Fernando Gietz Developer
                      last edited by Feb 28, 2018, 3:14 PM

                      The activity of the MySQL server is huge. I restarted the server, and within seven minutes:

                      MySQL on localhost (5.5.56-MariaDB)     up 0+00:07:00 [16:13:04]
                       Queries: 38.1k  qps:   93 Slow:     0.0         Se/In/Up/De(%):    94/00/00/00 
                                   qps now:  102 Slow qps: 0.0  Threads:    5 (   1/   0) 85/01/00/00 
                       Key Efficiency: 100.0%  Bps in/out: 13.5k/43.9k   Now in/out: 41.3k/190.2k
                      
                            Id      User         Host/IP         DB      Time    Cmd Query or State                                                       
                             --      ----         -------         --      ----    --- ----------                                                           
                            664      root       localhost       test         0  Query show full processlist                                                
                            782      root       localhost        fog         4  Sleep                                                                      
                            768      root       localhost        fog        10  Sleep                                                                      
                            746      root       localhost        fog        19  Sleep                                                                      
                             10      root       localhost        fog       414  Sleep
                      

                      38k queries??
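
For comparison, the average rate can be worked out from the mytop header directly. This is a minimal sketch; it assumes the uptime field has the form days+HH:MM:SS, as it appears in the two outputs posted in this thread, and the helper names are only illustrative.

```python
import re


def uptime_seconds(uptime: str) -> int:
    """Convert an uptime string such as '48+03:43:38' (days+HH:MM:SS) to seconds."""
    match = re.fullmatch(r"(\d+)\+(\d{2}):(\d{2}):(\d{2})", uptime)
    if match is None:
        raise ValueError(f"unexpected uptime format: {uptime!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def average_qps(total_queries: float, uptime: str) -> float:
    """Average queries per second over the whole uptime."""
    return total_queries / uptime_seconds(uptime)


if __name__ == "__main__":
    # Figures copied from the two mytop headers in the thread.
    print(f"before restart: {average_qps(397.4e6, '48+03:43:38'):.1f} qps")
    print(f"after restart:  {average_qps(38.1e3, '0+00:07:00'):.1f} qps")
```

Both averages land in the same range as the "qps" values mytop shows, so 38k queries in seven minutes is the same steady load as before the restart and not a new spike.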

                      • F
                        Fernando Gietz Developer
                        last edited by Fernando Gietz Feb 28, 2018, 9:57 AM Feb 28, 2018, 3:56 PM

                        I have configured MySQL to log the queries, and some of them seem pointless.

                        180228 16:38:32	  364 Connect	root@localhost as anonymous on fog
                        		  364 Query	USE `fog`
                        		  364 Query	SET SESSION sql_mode=''
                        		  365 Connect	root@localhost as anonymous on fog
                        		  365 Query	USE `fog`
                        		  364 Quit	
                        		  365 Query	SET SESSION sql_mode=''
                        		  366 Connect	root@localhost as anonymous on fog
                        		  366 Query	USE `fog`
                        		  365 Quit	
                        		  366 Query	SET SESSION sql_mode=''
                        		  366 Query	SELECT `vValue` FROM `fog`.`schemaVersion`
                        		  366 Query	SELECT `pName` FROM `plugins`   WHERE `plugins`.`pInstalled`='1' AND `plugins`.`pState`='1'   ORDER BY LOWER(`plugins`.`pName`) ASC
                        		  366 Query	SELECT `settingValue` FROM `globalSettings`   WHERE `globalSettings`.`settingKey` IN ('FOG_DEFAULT_LOCALE','FOG_HOST_LOOKUP','FOG_MEMORY_LIMIT','FOG_REAUTH_ON_DELETE','FOG_REAUTH_ON_EXPORT','FOG_TZ_INFO','FOG_VIEW_DEFAULT_SCREEN')   ORDER BY LOWER(`globalSettings`.`settingKey`) ASC
                        		  366 Query	SELECT COUNT(`hosts`.`hostID`) AS `total` FROM `hosts` WHERE `hostPending` = '1' LIMIT 1
                        		  366 Query	SELECT COUNT(`COLUMN_NAME`)AS`total`FROM`information_schema`.`COLUMNS`WHERE`TABLE_SCHEMA`='fog'AND`TABLE_NAME`='hostMAC'AND`COLUMN_NAME`='hmMAC'
                        		  366 Query	SELECT COUNT(`hostMAC`.`hmID`) AS `total` FROM `hostMAC` WHERE `hmPending` = '1' LIMIT 1
                        		  366 Query	SELECT `settingValue` FROM `globalSettings`   WHERE `globalSettings`.`settingKey` IN ('FOG_URL_AVAILABLE_TIMEOUT','FOG_URL_BASE_CONNECT_TIMEOUT','FOG_URL_BASE_TIMEOUT')   ORDER BY LOWER(`globalSettings`.`settingKey`) ASC
                        		  366 Query	SELECT `globalSettings`.* FROM `globalSettings`  WHERE `settingKey`='FOG_QUICKREG_PENDING_MAC_FILTER'
                        		  366 Query	SELECT COUNT(`hostMAC`.`hmID`) AS `total` FROM `hostMAC` WHERE `hmMAC` IN ('40:b0:34:39:57:ac') AND `hmPending` IN ('0','') LIMIT 1
                        		  366 Query	SELECT `hmMAC` FROM `hostMAC`   WHERE `hostMAC`.`hmMAC` IN ('40:b0:34:39:57:ac') AND `hostMAC`.`hmPending` IN ('0','')   ORDER BY `hostMAC`.`hmID` ASC
                        		  366 Query	SELECT `hmMAC` FROM `hostMAC`   WHERE `hostMAC`.`hmMAC` IN ('40:b0:34:39:57:ac') AND `hostMAC`.`hmIgnoreImaging`='1'   ORDER BY `hostMAC`.`hmID` ASC
                        		  366 Query	SELECT `hostMAC`.* FROM `hostMAC`  WHERE `hmMAC`='40:b0:34:39:57:ac'
                        		  366 Query	SELECT `hmHostID` FROM `hostMAC`   WHERE `hostMAC`.`hmPending` IN ('0','') AND `hostMAC`.`hmMAC` IN ('40:b0:34:39:57:ac')   ORDER BY `hostMAC`.`hmID` ASC
                        		  366 Query	SELECT `hosts`.*,`hostMAC`.*,`images`.*,`os`.*,`imagePartitionTypes`.*,`imageTypes`.*,`hostScreenSettings`.*,`hostAutoLogOut`.*,`inventory`.* FROM `hosts`  LEFT OUTER JOIN `hostMAC` ON `hostMAC`.`hmHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `images` ON `images`.`imageID`=`hosts`.`hostImage`  LEFT OUTER JOIN `os` ON `os`.`osID`=`images`.`imageOSID`  LEFT OUTER JOIN `imagePartitionTypes` ON `imagePartitionTypes`.`imagePartitionTypeID`=`images`.`imagePartitionTypeID`  LEFT OUTER JOIN `imageTypes` ON `imageTypes`.`imageTypeID`=`images`.`imageTypeID`  LEFT OUTER JOIN `hostScreenSettings` ON `hostScreenSettings`.`hssHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `hostAutoLogOut` ON `hostAutoLogOut`.`haloHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `inventory` ON `inventory`.`iHostID`=`hosts`.`hostID`  WHERE `hostID`='7502'  AND `hostMAC`.`hmPrimary` = '1'
                        		  366 Query	SELECT COUNT(`hookEvents`.`heName`) AS `total` FROM `hookEvents` WHERE `hookEvents`.`heName`='QUEUED_STATES' AND `hookEvents`.`heName` <> '0'
                        		  366 Query	SELECT COUNT(`hookEvents`.`heName`) AS `total` FROM `hookEvents` WHERE `hookEvents`.`heName`='PROGRESS_STATE' AND `hookEvents`.`heName` <> '0'
                        		  366 Query	SELECT `taskID` FROM `tasks`  LEFT OUTER JOIN `images` ON `images`.`imageID`=`tasks`.`taskImageID`  LEFT OUTER JOIN `os` ON `os`.`osID`=`images`.`imageOSID`  LEFT OUTER JOIN `imagePartitionTypes` ON `imagePartitionTypes`.`imagePartitionTypeID`=`images`.`imagePartitionTypeID`  LEFT OUTER JOIN `imageTypes` ON `imageTypes`.`imageTypeID`=`images`.`imageTypeID`  LEFT OUTER JOIN `hosts` ON `hosts`.`hostID`=`tasks`.`taskHostID`  LEFT OUTER JOIN `hostMAC` ON `hostMAC`.`hmHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `hostScreenSettings` ON `hostScreenSettings`.`hssHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `hostAutoLogOut` ON `hostAutoLogOut`.`haloHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `inventory` ON `inventory`.`iHostID`=`hosts`.`hostID`  LEFT OUTER JOIN `taskTypes` ON `taskTypes`.`ttID`=`tasks`.`taskTypeID`  LEFT OUTER JOIN `taskStates` ON `taskStates`.`tsID`=`tasks`.`taskStateID`  LEFT OUTER JOIN `nfsGroupMembers` ON `nfsGroupMembers`.`ngmID`=`tasks`.`taskNFSMemberID`  LEFT OUTER JOIN `nfsGroups` ON `nfsGroups`.`ngID`=`nfsGroupMembers`.`ngmGroupID`   WHERE `tasks`.`taskHostID`='7502' AND `tasks`.`taskStateID` IN ('0','1','2','3') AND `hostMAC`.`hmPrimary` = '1'  ORDER BY LOWER(`tasks`.`taskName`) ASC
                        		  366 Query	SELECT `hostMAC`.* FROM `hostMAC`  WHERE `hmMAC`='40:b0:34:39:57:ac'
                        		  366 Quit	
                        

                        All of these queries ran within one second.
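
To see how many statements each connection issues, a log like the one above can be tallied per connection id. This is only a sketch: it assumes the general query log layout shown in the excerpt (optional timestamp, connection id, command, argument), it ignores statements that span several lines, and the function name is illustrative.

```python
import re
from collections import Counter

# Optional "YYMMDD HH:MM:SS" timestamp, then the connection id and the command.
LOG_LINE = re.compile(
    r"^(?:\d{6}\s+\d{1,2}:\d{2}:\d{2})?\s*(\d+)\s+(Connect|Query|Quit)\b"
)


def queries_per_connection(log_text: str) -> Counter:
    """Count 'Query' entries per connection id in a general query log excerpt."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(2) == "Query":
            counts[int(match.group(1))] += 1
    return counts


# A shortened sample in the same layout as the excerpt above.
SAMPLE = "\n".join([
    "180228 16:38:32\t  364 Connect\troot@localhost as anonymous on fog",
    "\t\t  364 Query\tUSE `fog`",
    "\t\t  364 Query\tSET SESSION sql_mode=''",
    "\t\t  365 Connect\troot@localhost as anonymous on fog",
    "\t\t  365 Query\tUSE `fog`",
    "\t\t  364 Quit\t",
    "\t\t  366 Query\tSELECT `vValue` FROM `fog`.`schemaVersion`",
])

if __name__ == "__main__":
    for connection_id, count in sorted(queries_per_connection(SAMPLE).items()):
        print(f"connection {connection_id}: {count} queries")
```

In the full excerpt, connection 366 alone accounts for 20 statements, so a single request of that kind multiplied across hundreds of clients adds up quickly.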

                        Copyright © 2012-2024 FOG Project