SVN 6513 Suddenly having couple hundred (already known) hosts showing up under pending...
-
@Wayne-Workman This last instance happened last night while away from all machines.
-
@Hanz Can we set up some cron tasks to monitor your DB, so we have some idea of what’s happening at the moment the hosts are lost?
-
@Wayne-Workman that’s fine
-
This script is cron-ready. Set it up to run from root’s crontab every minute. I can help set this up, but basically:
Switch to root.
Ubuntu/Debian: sudo su
Fedora/CentOS/RHEL: su root
I’d recommend putting the script into /root, so:
cd /root
Make a new file called monitor.sh with:
vi monitor.sh
Copy/paste the script below into the file, then save and exit. Then make that file executable with:
chmod +x monitor.sh
Then open root’s crontab with:
crontab -e
and make an entry that looks like:
* * * * * /path/to/the/script/monitor.sh
Then save that.

#!/bin/bash
#----- MySQL Credentials -----#
snmysqluser=""
snmysqlpass=""
snmysqlhost=""
# If user and pass is blank, leave just a set of double quotes like ""
# if the db is local, set the host to just double quotes "" or "127.0.0.1" or "localhost"

#----- Test & Environment Specific Variables -----#
minimumExpectedApprovedHosts=1744
minimumExpectedMacs=3511
log=/root/monitor.log
logHighlights=/root/monitorHighlights.log
#Set this to your $PATH variable.
#You can get your $PATH variable by executing the following command:
# echo $PATH
myPATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin"

#----- Set paths of all used programs -----#
previousIFS=$IFS
IFS=:
for p in $myPATH; do
    #Get mysql path
    if [[ -f $p/mysql ]]; then
        mysql=$p/mysql
    fi
    #Get echo path
    if [[ -f $p/echo ]]; then
        echo=$p/echo
    fi
    #Get date path
    if [[ -f $p/date ]]; then
        date=$p/date
    fi
    #Get uptime path
    if [[ -f $p/uptime ]]; then
        uptime=$p/uptime
    fi
    #Get free path
    if [[ -f $p/free ]]; then
        free=$p/free
    fi
    #Get ss path
    if [[ -f $p/ss ]]; then
        ss=$p/ss
    fi
done
#Restore previous contents of IFS
IFS=$previousIFS

#----- Begin Program -----#
sqlCountApprovedHosts="SELECT COUNT(*) FROM hosts WHERE hostPending <> 1"
sqlCountTotalHosts="SELECT COUNT(*) FROM hosts"
sqlCountMacs="SELECT COUNT(*) FROM hostMAC"
sqlShowProcesses="SHOW FULL PROCESSLIST"
NOW=$($date '+%d/%m/%Y %H:%M:%S')

options="-sN"
if [[ $snmysqlhost != "" ]]; then
    options="$options -h$snmysqlhost"
fi
if [[ $snmysqluser != "" ]]; then
    options="$options -u$snmysqluser"
fi
if [[ $snmysqlpass != "" ]]; then
    options="$options -p$snmysqlpass"
fi
options="$options -D fog -e"

$echo "----------------------------------------------------------" >> $log
$echo "----------------------------------------------------------" >> $log
$echo "----------------------------------------------------------" >> $log
$echo "----------------------------------------------------------" >> $log
$echo "----------------------------------------------------------" >> $log
$echo "----------------------------------------------------------" >> $log
$echo $NOW >> $log
$echo " " >> $log

$echo "Number of Approved Hosts:" >> $log
NumberOfApprovedHosts=$($mysql $options "$sqlCountApprovedHosts")
$echo "$NumberOfApprovedHosts" >> $log
$echo " " >> $log

$echo "Number of Total Hosts:" >> $log
NumberOfTotalHosts=$($mysql $options "$sqlCountTotalHosts")
$echo "$NumberOfTotalHosts" >> $log
$echo " " >> $log

$echo "Number of Total MAC addresses:" >> $log
NumberOfTotalMacAddresses=$($mysql $options "$sqlCountMacs")
$echo "$NumberOfTotalMacAddresses" >> $log
$echo " " >> $log

$echo "uptime Output:" >> $log
UptimeOutput=$($uptime)
$echo "$UptimeOutput" >> $log
$echo " " >> $log

$echo "free Output:" >> $log
FreeOutput=$($free)
$echo "$FreeOutput" >> $log
$echo " " >> $log

$echo "Connected TCP Socket Connections:" >> $log
ConnectedTCPSocketConnections=$($ss --tcp)
$echo "$ConnectedTCPSocketConnections" >> $log
$echo " " >> $log

$echo "All current MySQL Processes:" >> $log
AllCurrentMysqlProcesses=$($mysql $options "$sqlShowProcesses")
$echo "$AllCurrentMysqlProcesses" >> $log
$echo " " >> $log

#Below is the check for separating the interesting logs.
#This can be changed as needed for other purposes.
if [[ $NumberOfApprovedHosts -le $minimumExpectedApprovedHosts ]]; then
    $echo "----------------------------------------------------------" >> $logHighlights
    $echo "----------------------------------------------------------" >> $logHighlights
    $echo "----------------------------------------------------------" >> $logHighlights
    $echo "----------------------------------------------------------" >> $logHighlights
    $echo "----------------------------------------------------------" >> $logHighlights
    $echo "----------------------------------------------------------" >> $logHighlights
    $echo $NOW >> $logHighlights
    $echo " " >> $logHighlights
    $echo "Number of Approved Hosts:" >> $logHighlights
    $echo "$NumberOfApprovedHosts" >> $logHighlights
    $echo " " >> $logHighlights
    $echo "Number of Total Hosts:" >> $logHighlights
    $echo "$NumberOfTotalHosts" >> $logHighlights
    $echo " " >> $logHighlights
    $echo "Number of Total MAC addresses:" >> $logHighlights
    $echo "$NumberOfTotalMacAddresses" >> $logHighlights
    $echo " " >> $logHighlights
    $echo "uptime Output:" >> $logHighlights
    $echo "$UptimeOutput" >> $logHighlights
    $echo " " >> $logHighlights
    $echo "free Output:" >> $logHighlights
    $echo "$FreeOutput" >> $logHighlights
    $echo " " >> $logHighlights
    $echo "Connected TCP Socket Connections:" >> $logHighlights
    $echo "$ConnectedTCPSocketConnections" >> $logHighlights
    $echo " " >> $logHighlights
    $echo "All current MySQL Processes:" >> $logHighlights
    $echo "$AllCurrentMysqlProcesses" >> $logHighlights
    $echo " " >> $logHighlights
fi
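For anyone who would rather not edit the crontab by hand, here is a minimal sketch of adding the entry non-interactively. The function name add_cron_entry is made up for this example and is not part of the monitoring script; it only transforms crontab text, so nothing is installed until its output is piped into crontab.

```shell
#!/bin/sh
# Hypothetical helper (not part of monitor.sh): read the current crontab
# text on stdin and print it back with a once-a-minute entry for the given
# script appended, unless some line already mentions that script.
add_cron_entry() {
    ace_script=$1
    ace_current=$(cat)
    # Re-print the existing entries, if there are any.
    if [ -n "$ace_current" ]; then
        printf '%s\n' "$ace_current"
    fi
    # Append the new entry only when the script path is not listed yet.
    if ! printf '%s\n' "$ace_current" | grep -qF -- "$ace_script"; then
        printf '* * * * * %s\n' "$ace_script"
    fi
}
```

Assuming the script was saved as /root/monitor.sh, it could be used as root like this:
crontab -l 2>/dev/null | add_cron_entry /root/monitor.sh | crontab -
Running it a second time leaves the crontab unchanged, because the entry is already present.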
-
Here’s the GitHub project for that:
https://github.com/wayneworkman/MonitorFOGHosts
-
Sample output:
----------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
02/03/2016 14:33:01

Pending Hosts:
837 R0219508346WDMB
838 r0219508713WDMB

Number of Approved Hosts:
450

Number of Total Hosts:
452

Number of Total MAC addresses:
493

uptime Output:
 14:33:01 up 20 days, 5:54, 1 user, load average: 0.77, 0.90, 1.00

free Output:
              total        used        free      shared  buff/cache   available
Mem:        4038628      807812      452320        2652     2778496     3145004
Swap:       4063228       15872     4047356

Connected TCP Socket Connections:
State       Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LAST-ACK    1       1       10.2.1.11:50359      54.164.24.149:http
ESTAB       0       0       10.2.1.11:50760      54.164.24.149:http
ESTAB       0       64      10.2.1.11:ssh        10.2.3.9:49455
CLOSE-WAIT  1       0       10.2.1.11:50372      54.164.24.149:http
ESTAB       0       0       10.2.1.11:43108      54.209.230.199:http
ESTAB       0       0       10.2.1.11:58418      54.209.230.199:http
SYN-SENT    0       1       10.2.1.11:41105      10.2.3.28:microsoft-ds
LAST-ACK    1       1       10.2.1.11:42702      54.209.230.199:http
CLOSE-WAIT  1       0       10.2.1.11:58319      52.6.165.90:http

All current MySQL Processes:
37 root localhost fog Sleep 7 NULL 0.000
38 root localhost fog Sleep 48 NULL 0.000
39 root localhost fog Sleep 51 NULL 0.000
40 root localhost fog Sleep 52 NULL 0.000
41 root localhost fog Sleep 0 NULL 0.000
7589 root localhost fog Sleep 2 NULL 0.000
8459 root localhost fog Sleep 3 NULL 0.000
9621 root localhost fog Sleep 2 NULL 0.000
11409 root localhost fog Sleep 3 NULL 0.000
12354 root localhost fog Sleep 2 NULL 0.000
12465 root localhost fog Sleep 2 NULL 0.000
12510 root localhost fog Sleep 2 NULL 0.000
12529 root localhost fog Sleep 2 NULL 0.000
12535 root localhost fog Sleep 2 NULL 0.000
12555 root localhost fog Sleep 2 NULL 0.000
12694 root localhost fog Query 0 init SHOW FULL PROCESSLIST 0.000
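Since the log grows by one record per minute, it can help to boil it down to one line per run. The following is a rough sketch, assuming the log keeps the layout shown in the sample (a dd/mm/YYYY HH:MM:SS timestamp line, and the count on the line after the "Number of Approved Hosts:" heading). The function name summarize_log is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "<timestamp> <approved host count>" for every
# record in a monitor.sh log file, so a sudden drop is easy to spot.
summarize_log() {
    awk '
        # Timestamp line, e.g. 02/03/2016 14:33:01
        /^[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            stamp = $0
            next
        }
        # The count is written on the line that follows this heading.
        /^Number of Approved Hosts:$/ {
            want = 1
            next
        }
        want {
            print stamp, $0
            want = 0
        }
    ' "$1"
}
```

For example, summarize_log /root/monitor.log prints one line per minute, and the moment the hosts drop into pending should show up as a sharp fall in the second column.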
-
@Hanz Any news on this yet? I find it very concerning to hear that hosts “drop” into pending state out of nowhere… Hope we can get some more details on this.
-
@Sebastian-Roth Sorry, it just hasn’t happened again… I actually thought my database kept failing or something, but @Wayne-Workman thought it reminded him of a bug where hosts were disappearing, hence the title. I’ve set up the monitoring script to watch the database, but nothing has occurred yet. I will post if/when something happens.
-
@Hanz Searching through the FOG code for ‘hostPending’ and ‘pending’ (minus the pending MAC stuff), I can’t find any place where FOG would set hostPending other than lib/client/registerclient.class.php, and that is only called when the new client auto-registers an unknown host. In that case the host’s description would read ‘Pending Registration created by FOG_CLIENT’. Can you please check what the descriptions of those hosts look like?
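One way to check those descriptions from the command line is sketched below. It assumes the hosts table has hostID, hostName and hostDesc columns (only hostPending appears in the monitoring script above, so the other column names are assumptions to verify against your schema), and the function name pending_hosts_query is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print a query that lists pending hosts together with
# their descriptions. hostID, hostName and hostDesc are assumed column names.
pending_hosts_query() {
    printf '%s\n' "SELECT hostID, hostName, hostDesc FROM hosts WHERE hostPending = 1 ORDER BY hostID"
}

# Usage (left commented out; add -h/-u/-p as needed for your setup):
# mysql -sN -D fog -e "$(pending_hosts_query)"
```

If the descriptions all read ‘Pending Registration created by FOG_CLIENT’, that would point at client auto-registration rather than at existing rows being changed.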
-
@Sebastian-Roth Honestly, I’m thinking the database crashed. It happened once before, but at that time I was trying to fix an issue with hosts that had been named incorrectly and was fighting with the FOG client (which I didn’t know was set to re-join the domain and rename hosts… a tough fight until I figured out what was happening). This time, though, it happened overnight with nothing touched. All groups went to zero hosts and I had no hosts, period. Reinstalling the database fixed it both times.
-
@Hanz Thanks a lot for your answer! I’ll mark this solved for now but please keep us posted if you see it happen again!!