Tons of httpd processes
-
Looking through the logs, this happened shortly after yesterday’s episode, roughly 11 minutes later.
-
I’ve set my client check-in time to two minutes.
This was right after a snapshot was taken in Hyper-V:
This was immediately after @Jbob finished tinkering with SELinux:
-
@Jbob does the new client do anything in the event of the host shutting down?
-
I’ve made some headway on this particular issue. I’ve realized that a few clients have an unusually large number of connections to the server.
I’ve modified my monitor.sh script to be the following:

netstat=/usr/bin/netstat

#Get the date.
dt="$(date +"%I:%M %p %m-%d-%Y")"

#Get number of running httpd instances.
x=$( /usr/bin/ps -ef | /usr/bin/grep httpd | /usr/bin/wc -l )

color=$x
# Seriousness multiplier.
let color*=5

#Don't let it go over the maximum.
if [[ $color -gt 255 ]]; then
    color=255
fi

# Convert to hex (zero-padded to two digits so the CSS color stays valid).
hexR=$(printf '%02x\n' $color)

# Top 4 source IPs by number of connections to port 80.
topOffenders=$($netstat -tn 2>/dev/null | grep :80 | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -n 4)
echo '<p>Top Offenders: '$topOffenders'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html

# Total number of http connections.
httpCount=$($netstat | grep http | wc -l)
echo '<p>Number of http connections: '$httpCount'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html

#Print the line to the file.
echo '<p style="color: #66ccff; background-color: #'$hexR'0000">'$x' httpd instances running. '$dt'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html

chown apache:apache /var/www/html/httpd.html
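For reference, this is a sketch of how the script could be scheduled; the /etc/cron.d path and the /root/monitor.sh location are assumptions on my part, the once-a-minute interval just matches the timestamps in the output further down:

# Hypothetical /etc/cron.d/httpd-monitor entry - run monitor.sh once a minute as root
* * * * * root /root/monitor.sh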
The output now looks like below.
The “Top Offenders” area is the number of connections from a host followed by that host’s IP, all space delimited. I limited it to 4 IPs.
Interestingly enough, the FOG server has around 50 connections to itself.
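For what it’s worth, this is roughly how those self-connections can be counted; 10.2.3.5 is just a placeholder for the FOG server’s own IP, which I’m not listing here:

# Count established port-80 connections whose remote end is the server itself
# (10.2.3.5 is a placeholder - substitute the FOG server's real IP)
netstat -tn | grep ':80 ' | awk '{print $5}' | grep -c '^10\.2\.3\.5:'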
-
This is after a server reboot at 8:33 this morning.
-
Fun fun. This was during a FOG update, I believe.
12 httpd instances running. 01:34 PM 02-02-2016 | Number of http connections: 2004 | Top Offenders: 40 10.2.3.11 25 10.2.3.108 20 10.2.4.9 20 10.2.4.36
34 httpd instances running. 01:33 PM 02-02-2016 | Number of http connections: 2245 | Top Offenders: 25 10.2.3.218 23 10.2.3.157 22 10.2.3.180 22 10.2.3.108
94 httpd instances running. 01:32 PM 02-02-2016 | Number of http connections: 1462 | Top Offenders: 73 10.2.3.11 23 10.2.3.178 22 10.2.3.156 20 10.2.4.47
155 httpd instances running. 01:31 PM 02-02-2016 | Number of http connections: 2196 | Top Offenders: 47 10.2.3.11 19 10.2.3.92 19 10.2.3.162 18 10.2.4.45
69 httpd instances running. 01:30 PM 02-02-2016 | Number of http connections: 386 | Top Offenders: 16 10.2.3.178 15 10.2.3.218 13 10.2.3.197 13 10.2.3.153
1 httpd instances running. 01:29 PM 02-02-2016 | Number of http connections: 930 | Top Offenders: 20 10.2.3.222 17 10.2.4.38 17 10.2.4.1 16 10.2.3.89
48 httpd instances running. 01:28 PM 02-02-2016 | Number of http connections: 1753 | Top Offenders: 20 10.2.3.109 18 10.2.4.47 18 10.2.3.242 18 10.2.3.182
32 httpd instances running. 01:27 PM 02-02-2016 | Number of http connections: 1472 | Top Offenders: 24 10.2.3.108 23 10.2.4.40 22 10.2.3.204 20 10.2.4.36
76 httpd instances running. 01:26 PM 02-02-2016 | Number of http connections: 1583 | Top Offenders: 20 10.2.3.13 19 10.2.3.92 19 10.2.3.14 18 10.2.3.163
112 httpd instances running. 01:25 PM 02-02-2016 | Number of http connections: 1478 | Top Offenders: 20 10.2.3.218 19 10.2.3.93 18 10.2.4.47 18 10.2.3.7
115 httpd instances running. 01:24 PM 02-02-2016 | Number of http connections: 1251 | Top Offenders: 18 10.2.3.14 17 10.2.3.194 17 10.2.3.184 16 10.2.3.19
57 httpd instances running. 01:23 PM 02-02-2016 | Number of http connections: 1762 | Top Offenders: 23 10.2.3.205 21 10.2.3.108 20 10.2.3.221 20 10.2.3.180
117 httpd instances running. 01:22 PM 02-02-2016 | Number of http connections: 1556 | Top Offenders: 23 10.2.3.105 22 10.2.3.93 21 10.2.3.87 21 10.2.3.216
177 httpd instances running. 01:21 PM 02-02-2016 | Number of http connections: 1779 | Top Offenders: 21 10.2.32.235 20 10.2.3.160 19 10.2.3.218 18 10.2.4.1
203 httpd instances running. 01:20 PM 02-02-2016 | Number of http connections: 1295 | Top Offenders: 13 10.2.3.108 12 10.2.3.19 12 10.2.3.185 11 10.2.4.94
15 httpd instances running. 01:19 PM 02-02-2016 | Number of http connections: 322 | Top Offenders: 18 10.2.3.143 16 10.2.3.27 16 10.2.3.165 16 10.2.3.110
17 httpd instances running. 01:18 PM 02-02-2016 | Number of http connections: 2010 | Top Offenders: 54 10.2.3.11 23 10.2.3.152 21 10.2.3.163 20 10.2.4.40
-
So I figured out that 10.2.3.119 consistently had high http connection numbers. Turns out, that’s my desktop. Which makes sense, because I’m usually logged into FOG’s web management interface while I’m looking at these numbers.
The spikes in httpd instances are for sure a buildup of clients wanting to connect to the FOG server, due to me updating, snapshotting, or rebooting the FOG server. It seems like if there is a failure in communication between FOG clients and the FOG server, the clients start spamming the FOG server. @Jbob
I’ll confirm or refute this with Wireshark. I’ll monitor average http traffic flow while the FOG server is on, and average http traffic flow when I turn the FOG server off. If the traffic goes wild while the server is off, that’s a problem.
More on this later.
-
So - just ran some basic tests.
Using Wireshark with a capture filter for just tcp port 80, I did two captures. Each capture lasted exactly 300 seconds.
Capture 1 - baseline - Packets sent to FOG Server on port 80: 61,933
Capture 2 - http blocked - Packets sent to FOG Server on port 80: 9,771
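If anyone wants to repeat this headless, roughly the same capture can be done with tshark; the interface name, server IP, and output file below are assumptions on my part, only the tcp port 80 filter and the 300-second duration come from the captures above:

# Capture 300 seconds of port-80 traffic to/from the FOG server
# (eth0, 10.2.3.5, and baseline.pcap are placeholders - adjust to your setup)
tshark -i eth0 -f "tcp port 80 and host 10.2.3.5" -a duration:300 -w baseline.pcap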
So my theory was wrong; the new client does not start spamming if the FOG server is offline. It appears that when the FOG client realizes the FOG server is back online, all of the encryption and communication traffic comes at once, and it’s just A LOT for my server to deal with.
-
@Wayne-Workman Do we need to look into this further?
-
@Sebastian-Roth I’m not having load issues anymore; the processor was under high utilization because Fedora 23 has issues inside of Hyper-V, and the polling was causing too much of a load.
But Tom, and I think jbob, have begun working on limiting the polling down to just one request per check-in instead of many.
-
I think I can make this tool into a plugin… just a thought…