Unable to Connect to Storage Node
-
We’ve been imaging fine, with everything working and snappy. We are using 1.1.2 at the moment, and random machines now say
“Unable to Connect to Fog Storage Node”
when you prep them for imaging. The fog log shows:
[CODE][08-04-14 3:40:16 pm] | StorageNode Not found on this system.
[08-04-14 3:40:26 pm] | StorageNode Not found on this system.
[08-04-14 3:40:36 pm] | StorageNode Not found on this system.
[08-04-14 3:40:46 pm] | StorageNode Not found on this system.
[08-04-14 3:40:56 pm] | StorageNode Not found on this system.
[08-04-14 3:41:06 pm] | StorageNode Not found on this system.
[08-04-14 3:41:16 pm] * No tasks found!
[08-04-14 3:41:26 pm] * No tasks found!
[08-04-14 3:41:36 pm] | StorageNode Not found on this system.
[08-04-14 3:41:46 pm] | StorageNode Not found on this system.
[08-04-14 3:41:56 pm] | StorageNode Not found on this system.
[08-04-14 3:42:06 pm] | StorageNode Not found on this system.
[08-04-14 3:42:16 pm] | StorageNode Not found on this system.
[08-04-14 3:42:26 pm] * No tasks found!
[08-04-14 3:42:36 pm] * No tasks found!
[08-04-14 3:42:46 pm] * No tasks found!
[08-04-14 3:42:56 pm] * No tasks found!
[08-04-14 3:43:06 pm] * No tasks found!
[08-04-14 3:43:16 pm] | StorageNode Not found on this system.
[08-04-14 3:43:26 pm] * No tasks found!
[08-04-14 3:43:36 pm] * No tasks found!
[08-04-14 3:43:46 pm] * No tasks found!
[08-04-14 3:43:56 pm] * No tasks found!
[08-04-14 3:44:06 pm] * No tasks found!
[08-04-14 3:44:16 pm] * No tasks found![/CODE]
But the strange thing is:
A) It’s working on about 1/2 to 3/4 of the machines.
B) If I reboot the fog server, it works perfectly fine for 10 minutes.
C) We haven’t changed anything.
This is what I see in the apache log:
[CODE][Mon Aug 04 13:05:14.045831 2014] [core:notice] [pid 1792] AH00051: child pid 13545 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Mon Aug 04 13:08:58.349566 2014] [core:notice] [pid 1792] AH00051: child pid 14350 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Mon Aug 04 13:08:58.349605 2014] [core:notice] [pid 1792] AH00051: child pid 14351 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Mon Aug 04 13:10:59.584328 2014] [core:notice] [pid 1792] AH00051: child pid 14519 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Mon Aug 04 13:11:24.010007 2014] [:error] [pid 14773] [client 10.34.100.54:49158] PHP Warning: require(…/commons/base.inc.php): failed to open stream: No such file or directory in /var/www/fog/service/servicemodule-active.php on line 2
[Mon Aug 04 13:11:24.010042 2014] [:error] [pid 14773] [client 10.34.100.54:49158] PHP Fatal error: require(): Failed opening required ‘…/commons/base.inc.php’ (include_path=‘.:/usr/share/php:/usr/share/pear’) in /var/www/fog/service/servicemodule-active.php on line 2
[Mon Aug 04 13:11:24.613239 2014] [core:notice] [pid 1792] AH00051: child pid 14667 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Mon Aug 04 13:11:36.631632 2014] [core:notice] [pid 1792] AH00051: child pid 14333 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Mon Aug 04 13:11:36.631711 2014] [core:notice] [pid 1792] AH00051: child pid 14654 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[/CODE]
Do we have a bad server? A bug in the fog code?
Any thoughts? Has anyone seen this before? Most of my searches have turned up information about Storage Nodes that fail completely, rather than intermittently.
Also, does anyone know why the fog log time zone is off? Everything else on the server is in sync with the system clock.
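In case it helps anyone compare notes, here is a minimal Python sketch for tallying the two logs, so the timing of the "StorageNode Not found" messages can be lined up against the Apache segfaults. The function names are made up for illustration, and the regexes only assume the line shapes shown in the excerpts above; nothing here is part of FOG or Apache.

```python
import re
from collections import Counter

# Assumed FOG log line shape, taken from the excerpt above, e.g.
# "[08-04-14 3:40:16 pm] | StorageNode Not found on this system."
FOG_LINE_RE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s*[|*]\s*(?P<msg>.+?)\s*$")

# Assumed Apache error log line shape, taken from the excerpt above, e.g.
# "[Mon Aug 04 13:05:14.045831 2014] [core:notice] ... Segmentation fault (11), ..."
SEGFAULT_RE = re.compile(
    r"^\[(?P<minute>\w{3} \w{3} \d{2} \d{2}:\d{2}):\d{2}\.\d+ \d{4}\].*Segmentation fault"
)


def tally_fog_messages(lines):
    """Count how often each message text appears in FOG log lines."""
    counts = Counter()
    for line in lines:
        match = FOG_LINE_RE.match(line.strip())
        if match:
            counts[match.group("msg")] += 1
    return counts


def segfaults_per_minute(lines):
    """Count Apache child segfault notices, grouped by minute."""
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.match(line.strip())
        if match:
            counts[match.group("minute")] += 1
    return counts


if __name__ == "__main__":
    fog_sample = [
        "[08-04-14 3:40:16 pm] | StorageNode Not found on this system.",
        "[08-04-14 3:40:26 pm] | StorageNode Not found on this system.",
        "[08-04-14 3:41:16 pm] * No tasks found!",
    ]
    apache_sample = [
        "[Mon Aug 04 13:08:58.349566 2014] [core:notice] [pid 1792] AH00051: "
        "child pid 14350 exit signal Segmentation fault (11), possible coredump in /etc/apache2",
        "[Mon Aug 04 13:08:58.349605 2014] [core:notice] [pid 1792] AH00051: "
        "child pid 14351 exit signal Segmentation fault (11), possible coredump in /etc/apache2",
    ]
    print(tally_fog_messages(fog_sample))
    print(segfaults_per_minute(apache_sample))
```

To run it against real files, pass an open file handle instead of the sample lists, for example `tally_fog_messages(open(path))`, where `path` is wherever your logs actually live.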
D
[edited to include apache log]