Tons of httpd processes


  • Moderator

    Is this normal?

     ps -ef | grep httpd | wc -l
    127
    

    127 httpd processes… it lowers slowly if I open up the fog interface and log in and just let it sit… but if I don’t do that, it just climbs higher and higher over time.


  • Moderator

    I think I can make this tool into a plugin… just a thought…


  • Moderator

    @Sebastian-Roth I’m not having load issues anymore, the processor was under high utilization because Fedora 23 has issues inside of Hyper-V and the polling was causing too much of a load.

    But Tom and, I think, Jbob have begun working on limiting the polling to just one request per checkup instead of many.


  • Developer

    @Wayne-Workman Do we need to look into this further?


  • Moderator

    So - just ran some basic tests.

    Using Wireshark with a capture filter for just tcp port 80, I did two captures. Each capture lasted exactly 300 seconds.

    Capture 1 - baseline - Packets sent to FOG Server on port 80: 61,933

    Capture 2 - http blocked - Packets sent to FOG Server on port 80: 9,771

    So my theory was wrong: the new client does not start spamming when the FOG server is offline. It appears that when the FOG Client realizes the FOG server is back online, all of the encryption and communication requests arrive at once, and it’s just A LOT for my server to deal with.
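
    For scale, the two captures above work out to roughly 206 and 33 packets per second over the 300-second window. Here is a quick way to redo that arithmetic; the `pps` helper name is only for illustration and is not part of any tool:

    ```shell
    # pps PACKETS SECONDS: print the average packets per second, one decimal place.
    pps() {
        awk -v p="$1" -v s="$2" 'BEGIN { printf "%.1f\n", p / s }'
    }

    pps 61933 300   # capture 1, baseline
    pps 9771 300    # capture 2, http blocked
    ```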


  • Moderator

    So I figured out that 10.2.3.119 was consistently having high http connection numbers. Turns out, that’s my desktop. Which makes sense because I usually am logged into FOG’s web management interface while I’m looking at these numbers.

    The spikes in httpd instances are for sure a buildup of clients wanting to connect to the FOG server, caused by me updating, snapshotting, or rebooting the FOG server. It seems like if there is a failure in communication between FOG clients and the FOG server, the clients start spamming the server. @Jbob

    I’ll confirm or rule this out with Wireshark. I’ll monitor average HTTP traffic while the FOG server is on, and again while it is turned off. If the traffic goes wild while the server is off, that’s a problem.

    More on this later.

    0_1454699577481_upload-a47edb47-961f-49aa-8638-707b6e0fca8e


  • Moderator

    Fun fun. This was during a fog update I believe.

    12 httpd instances running. 01:34 PM 02-02-2016
    
    Number of http connections: 2004
    
    Top Offenders: 40 10.2.3.11 25 10.2.3.108 20 10.2.4.9 20 10.2.4.36
    
    34 httpd instances running. 01:33 PM 02-02-2016
    
    Number of http connections: 2245
    
    Top Offenders: 25 10.2.3.218 23 10.2.3.157 22 10.2.3.180 22 10.2.3.108
    
    94 httpd instances running. 01:32 PM 02-02-2016
    
    Number of http connections: 1462
    
    Top Offenders: 73 10.2.3.11 23 10.2.3.178 22 10.2.3.156 20 10.2.4.47
    
    155 httpd instances running. 01:31 PM 02-02-2016
    
    Number of http connections: 2196
    
    Top Offenders: 47 10.2.3.11 19 10.2.3.92 19 10.2.3.162 18 10.2.4.45
    
    69 httpd instances running. 01:30 PM 02-02-2016
    
    Number of http connections: 386
    
    Top Offenders: 16 10.2.3.178 15 10.2.3.218 13 10.2.3.197 13 10.2.3.153
    
    1 httpd instances running. 01:29 PM 02-02-2016
    
    Number of http connections: 930
    
    Top Offenders: 20 10.2.3.222 17 10.2.4.38 17 10.2.4.1 16 10.2.3.89
    
    48 httpd instances running. 01:28 PM 02-02-2016
    
    Number of http connections: 1753
    
    Top Offenders: 20 10.2.3.109 18 10.2.4.47 18 10.2.3.242 18 10.2.3.182
    
    32 httpd instances running. 01:27 PM 02-02-2016
    
    Number of http connections: 1472
    
    Top Offenders: 24 10.2.3.108 23 10.2.4.40 22 10.2.3.204 20 10.2.4.36
    
    76 httpd instances running. 01:26 PM 02-02-2016
    
    Number of http connections: 1583
    
    Top Offenders: 20 10.2.3.13 19 10.2.3.92 19 10.2.3.14 18 10.2.3.163
    
    112 httpd instances running. 01:25 PM 02-02-2016
    
    Number of http connections: 1478
    
    Top Offenders: 20 10.2.3.218 19 10.2.3.93 18 10.2.4.47 18 10.2.3.7
    
    115 httpd instances running. 01:24 PM 02-02-2016
    
    Number of http connections: 1251
    
    Top Offenders: 18 10.2.3.14 17 10.2.3.194 17 10.2.3.184 16 10.2.3.19
    
    57 httpd instances running. 01:23 PM 02-02-2016
    
    Number of http connections: 1762
    
    Top Offenders: 23 10.2.3.205 21 10.2.3.108 20 10.2.3.221 20 10.2.3.180
    
    117 httpd instances running. 01:22 PM 02-02-2016
    
    Number of http connections: 1556
    
    Top Offenders: 23 10.2.3.105 22 10.2.3.93 21 10.2.3.87 21 10.2.3.216
    
    177 httpd instances running. 01:21 PM 02-02-2016
    
    Number of http connections: 1779
    
    Top Offenders: 21 10.2.32.235 20 10.2.3.160 19 10.2.3.218 18 10.2.4.1
    
    203 httpd instances running. 01:20 PM 02-02-2016
    
    Number of http connections: 1295
    
    Top Offenders: 13 10.2.3.108 12 10.2.3.19 12 10.2.3.185 11 10.2.4.94
    
    15 httpd instances running. 01:19 PM 02-02-2016
    
    Number of http connections: 322
    
    Top Offenders: 18 10.2.3.143 16 10.2.3.27 16 10.2.3.165 16 10.2.3.110
    
    17 httpd instances running. 01:18 PM 02-02-2016
    
    Number of http connections: 2010
    
    Top Offenders: 54 10.2.3.11 23 10.2.3.152 21 10.2.3.163 20 10.2.4.40
    

  • Moderator

    This is after a server reboot at 8:33 this morning.

    0_1454078336963_upload-e229539c-f1ff-4ebd-8db5-452443182c9a


  • Moderator

    I’ve made some headway on this particular issue. I’ve realized that a few clients have an unusually large number of connections to the server.

    I’ve modified my monitor.sh script to be the following:

    #!/bin/bash
    netstat=/usr/bin/netstat
    #Get the date.
    dt="$(date  +"%I:%M %p %m-%d-%Y")"
    
    #Get number of running httpd instances.
    #The [h] keeps grep from counting its own process.
    x=$( /usr/bin/ps -ef | /usr/bin/grep -c '[h]ttpd' )
    color=$x
    
    
    # Seriousness multiplier
    color=$(( color * 5 ))
    
    #Don't let it go over the maximum.
    if [[ $color -gt 255 ]]; then
            color=255
    fi
    
    # Convert to two-digit hex so the CSS color always has six digits.
    hexR=$(printf '%02x' "$color")
    
    # Use the full path set above, since cron runs with a minimal PATH.
    topOffenders=$("$netstat" -tn 2>/dev/null | grep :80 | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -n 4)
    
    echo '<p>Top Offenders: '$topOffenders'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html
    
    
    httpCount=$("$netstat" | grep http | wc -l)
    
    echo '<p>Number of http connections: '$httpCount'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html
    
    
    #Print the line to the file.
    echo '<p style="color: #66ccff; background-color: #'$hexR'0000">'$x' httpd instances running. '$dt'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html
    
    chown apache:apache /var/www/html/httpd.html
    

    The output now looks like this:

    0_1454076180364_upload-806cc931-2ed8-4b77-864e-0814d549a6a9

    The “Top Offenders” area shows the number of connections from a host followed by that host’s IP, all space-delimited. I limited it to 4 IPs.

    Interestingly enough, the FOG server has around 50 connections to itself.
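
    A stricter variant of the top-offenders pipeline, as a minimal sketch: it assumes the usual `netstat -tn` column layout (local address in column 4, remote address in column 5) and only counts connections whose local port is exactly 80, so ports like 8080 and outbound connections to other web servers are left out. The `top_offenders` function name is hypothetical:

    ```shell
    # top_offenders [N]: read `netstat -tn` output on stdin and print the N
    # remote addresses (default 4) with the most connections to local port 80.
    top_offenders() {
        awk '$4 ~ /:80$/ { sub(/:[0-9]+$/, "", $5); print $5 }' |
            sort | uniq -c | sort -nr | head -n "${1:-4}" |
            awk '{ print $1, $2 }'
    }

    # Example (needs netstat installed):
    #   netstat -tn 2>/dev/null | top_offenders 4
    ```

    Because it reads from stdin, the same function can be run against a saved netstat dump when comparing results from different times.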


  • Moderator

    @Jbob does the new client do anything in the event of the host shutting down?


  • Moderator

    I’ve set my client check-in time to two minutes.

    This was right after a snapshot was taken in Hyper-V:

    0_1454015975734_upload-f240e1a5-f87a-4d32-a387-49a2b124c0e3

    This was immediately after @Jbob got done with tinkering with SELinux:

    0_1454016070273_upload-51589441-26de-4d1a-b109-4beee0200734


  • Moderator

    Looking through the logs, this happened shortly after yesterday’s episode, roughly 11 minutes later.

    0_1453911152705_upload-3afda6e4-faa3-41e0-9ce6-23dc555c57db


  • Moderator

    @Jbob I’m blaming this on the new fog client.

    Updating, rebooting, or snapshotting the FOG server takes time, and during those brief moments the Apache service goes down or the server becomes unresponsive.

    I think that when the FOG client cannot communicate with the server, something happens that causes it to retry rapidly, over and over. Can you please look into it?


  • Moderator

    It just happened again, right after I snapshotted the VM and updated FOG.

    Holy cow 220 httpd processes…

    0_1453836457361_upload-e2230ba2-678b-4b12-b50a-a85152f2943a


  • Moderator

    The phenomenon did not happen this morning… @Moderators @Developers Thoughts?


  • Moderator

    OK, so I have found an instance where the httpd processes get out of control.

    0_1453814329232_upload-5f84b1e6-9475-43e8-9ff4-2e1d56dbea0b

    Interestingly, right before that, this happened:

    0_1453814385020_upload-1e6e7add-c038-4563-b646-5d65e432a6d4

    Looking at the results… it’s possible that at almost 8:00 AM on Monday, a lot of computers were turned on all at once. But what doesn’t make sense to me is how I wouldn’t see this sort of spike every morning.


  • Moderator

    Yesterday was a snow day here. So no users, but some computers did WOL at 7:30. There was no effect on the number of httpd processes.

    I’ve noticed small spikes during times that I would presume large numbers of people are logging into computers all at generally the same time.

    But I’ve seen nothing even close to the 127 reported a week ago.

    So far, the highest I’ve seen since I’ve made this monitoring tool is 36, and that was when I was updating FOG this morning, oddly.

    0_1453399808946_upload-b194afe7-ed93-48ae-85f8-73797826dbf4


  • Moderator

    I made this script to monitor the httpd processes. The problem was that I couldn’t easily figure out where the issue is when going through thousands of lines of text that all look the same to me. I needed something more visual than plain text, so I modified my script to use color to represent the severity of the problem: it highlights each line based on the number of httpd processes returned.

    So… if it returned 100 processes, the text would be highlighted bright red.

    Updated script:

    #!/bin/bash
    #Get the date.
    dt="$(date  +"%I:%M %p %m-%d-%Y")"
    
    #Get number of running httpd instances.
    #The [h] keeps grep from counting its own process.
    x=$( /usr/bin/ps -ef | /usr/bin/grep -c '[h]ttpd' )
    color=$x
    
    
    # Seriousness multiplier
    color=$(( color * 5 ))
    
    #Don't let it go over the maximum.
    if [[ $color -gt 255 ]]; then
            color=255
    fi
    
    # Convert to two-digit hex so the CSS color always has six digits.
    hexR=$(printf '%02x' "$color")
    
    #Print the line to the file.
    echo '<p style="color: #66ccff; background-color: #'$hexR'0000">'$x' httpd instances running. '$dt'</p>' | cat - /var/www/html/httpd.html > /var/www/html/temp && mv /var/www/html/temp /var/www/html/httpd.html
    chown apache:apache /var/www/html/httpd.html
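
    The count-to-color mapping can be pulled out and checked on its own. This is a minimal sketch with a hypothetical `severity_hex` function, using the same multiplier of 5 and cap of 255 as the script. It pads to two hex digits with `%02x`; with plain `%x`, counts of 3 or fewer yield a single digit and the resulting five-digit CSS color is invalid:

    ```shell
    # severity_hex COUNT: map a process count to a two-digit hex red value.
    severity_hex() {
        local level=$(( $1 * 5 ))       # same multiplier as the script
        if [ "$level" -gt 255 ]; then   # clamp to the largest 8-bit value
            level=255
        fi
        printf '%02x\n' "$level"        # zero-pad so the CSS color keeps six digits
    }

    severity_hex 3     # 0f
    severity_hex 36    # b4
    severity_hex 127   # ff
    ```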
    

    Here’s my crontab -e entry for it, this makes it run every minute.

    * * * * * /root/monitor.sh
    

    Code to create the initial HTML file on Fedora/CentOS:

    rm -f /var/www/html/httpd.html;touch /var/www/html/httpd.html;chown apache:apache /var/www/html/httpd.html
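
    The script prepends each new entry so the newest line sits at the top of the page. That step can be wrapped in a small helper that also copes with the file not existing yet. This is only a sketch: `prepend_line` is a hypothetical name, and the `chown` to apache still has to be done separately as root:

    ```shell
    # prepend_line FILE TEXT: put TEXT on the first line of FILE,
    # creating FILE if it does not exist yet.
    prepend_line() {
        local file="$1" text="$2"
        local tmp="${file}.tmp"    # same directory, so mv is a plain rename
        {
            printf '%s\n' "$text"
            if [ -f "$file" ]; then
                cat "$file"
            fi
        } > "$tmp" && mv "$tmp" "$file"
    }

    # Example:
    #   prepend_line /var/www/html/httpd.html '<p>12 httpd instances running.</p>'
    ```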
    

    Sample output is below. I artificially manipulated the last return value in the script just to demonstrate the color difference.

    0_1453228495139_upload-2dba2040-34af-4337-8b78-2b21d1adceec


  • Moderator

    I’m going to set up a cron event to monitor this… data over time will help with solving the issue.

