FOG 1.1.0 multicast sits at "Starting to restore image (-) to device (/dev/sda1)"
-
There’s a problem with mysql, similar to the tftpd-hpa issue. I believe this is what you’re encountering.
It’s trying to start mysql, tftpd-hpa, and the FOG services before the network device is available. What ends up happening is that the processes (FOGMulticastManager, FOGImageReplicator, FOGScheduler) are running but can’t communicate. A restart doesn’t work because restart simply kills the process and starts a new one. The previous process can’t stop, so it just starts another process using the original process as if it were a child process. Because that doesn’t exist, a horrible looping pattern can be seen. I don’t know how to fix the root cause yet. The proper way around it is to stop all three services and then start them again:
[code]sudo service FOGMulticastManager stop
sudo service FOGImageReplicator stop
sudo service FOGScheduler stop
sudo service FOGMulticastManager start
sudo service FOGImageReplicator start
sudo service FOGScheduler start[/code]
-
Thanks, Tom, I hadn’t come across the MySQL thing but I’ll make sure we’re doing the start-stop thing right from here on out.
-
[quote=“Tom Elliott, post: 34771, member: 7271”]There’s a problem with mysql, similar to the tftpd-hpa issue. I believe this is what you’re encountering.
It’s trying to start mysql, tftpd-hpa, and the FOG services before the network device is available. What ends up happening is that the processes (FOGMulticastManager, FOGImageReplicator, FOGScheduler) are running but can’t communicate. A restart doesn’t work because restart simply kills the process and starts a new one. The previous process can’t stop, so it just starts another process using the original process as if it were a child process. Because that doesn’t exist, a horrible looping pattern can be seen. I don’t know how to fix the root cause yet. The proper way around it is to stop all three services and then start them again:
[code]sudo service FOGMulticastManager stop
sudo service FOGImageReplicator stop
sudo service FOGScheduler stop
sudo service FOGMulticastManager start
sudo service FOGImageReplicator start
sudo service FOGScheduler start[/code][/quote]
So, after taking a 5-minute look at this, there’s a quick fix, albeit a semi-incorrect one.
Things to note here:
A) I use Debian.
B) The FOG Ubuntu/Debian installer uses sysv-rc-conf to register the init scripts, which on a Debian system is wrong, even though it works.
The solution I came up with to make this work in a hurry:
A) Edit /etc/init.d/FOG* and change line #4 from [code]# Required-Start: $local_fs $remote_fs $network $syslog $network $inetd[/code] to [code]# Required-Start: $local_fs $remote_fs $network $syslog $network $inetd $all[/code]. Adding the $all keyword here is frowned upon; however, I’m looking for a quick fix, not a correct fix. I may revert this at some point after changing the services themselves to start gracefully and wait for a network to be available, rather than bombing out. (If I do this, I’ll submit a patch.)
B) update-rc.d (or insserv) is the correct way to register init scripts on Debian. I ran
[code]
insserv -d /etc/init.d/FOGMulticastManager
insserv -d /etc/init.d/FOGScheduler
insserv -d /etc/init.d/FOGImageReplicator
[/code] to reset the registration of the init scripts to the correct dependencies/runlevels as requested in the LSB headers of the init scripts.
After a reboot, all services were running, and I could get multicast running correctly.
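The header edit in step A can also be scripted. The following is only a minimal sketch, not part of the FOG installer: `add_required_start` is a hypothetical helper name, and it assumes the init script has a single `# Required-Start:` line in the usual LSB comment form. It appends a name to that line unless it is already listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): append NAME to the
# "# Required-Start:" line of an LSB init script, unless already listed.
# Usage: add_required_start /path/to/initscript NAME
add_required_start() {
    script=$1
    token=$2
    # awk exits 0 here only if the token is already present as its own word.
    if awk -v t="$token" '
        /^# Required-Start:/ { for (i = 3; i <= NF; i++) if ($i == t) found = 1 }
        END { exit !found }' "$script"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Append the token to the Required-Start line; copy every other line as-is.
    # Writing back with cat keeps the script's permissions (executable bit).
    awk -v t="$token" '/^# Required-Start:/ { $0 = $0 " " t } { print }' \
        "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}
```

As root, something like `add_required_start /etc/init.d/FOGMulticastManager '$all'` followed by `insserv -d /etc/init.d/FOGMulticastManager` would then repeat the manual steps; the single quotes keep the shell from expanding `$all`. Running it twice leaves the line unchanged.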
-
So I’ve been playing around more with this.
I’m using a fresh install of FOG 1.2.0 on Debian 7.6.
First off, the init scripts don’t seem to be added to the startup correctly.
I fixed this via
[code]
insserv -d /etc/init.d/FOGMulticastManager
insserv -d /etc/init.d/FOGScheduler
insserv -d /etc/init.d/FOGImageReplicator
[/code]
Second, going by the old idea that the network wasn’t ready yet, I wrote a routine to wait for the network interface, saved as /var/www/fog/commons/nettest.php:
[code]
<?php
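// Clear the terminal attached to the given output device (ANSI escape codes).
function clear_screen($outputdevice) {
    $GLOBALS['FOGCore']->out(chr(27)."[2J".chr(27)."[;H", $outputdevice);
}
// Block until $interface shows up in the netstat interface table.
function wait_interface_ready($interface, $outputdevice) {
    while (true) {
        $retarr = array();
        exec('netstat -inN', $retarr);
        // Drop the two header lines of the netstat output.
        array_shift($retarr);
        array_shift($retarr);
        foreach ($retarr as $line) {
            // The interface name is the first space-delimited column.
            $t = substr($line, 0, strpos($line, ' '));
            if ($t === $interface) {
                $GLOBALS['FOGCore']->out("Interface now ready...", $outputdevice);
                break 2;
            }
        }
        $GLOBALS['FOGCore']->out("Interface not ready, waiting...", $outputdevice);
        sleep(10);
    }
}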
?>
[/code]
Third, I added this to /opt/fog/service/FOG*/FOG*, similar to below (with appropriate differences for the different files):
[code]
<?php
@error_reporting(0);
require_once( dirname(realpath(__FILE__)) . "/../etc/config.php" );
require_once( WEBROOT . "/commons/base.inc.php" );
// new code here
require_once( WEBROOT . "/commons/nettest.php" );
clear_screen(MULTICASTDEVICEOUTPUT);
$FOGCore->out($FOGCore->getBanner(), MULTICASTDEVICEOUTPUT);
wait_interface_ready($FOGCore->getSetting('FOG_UDPCAST_INTERFACE'), MULTICASTDEVICEOUTPUT);
$MM = new MulticastManager();
if( ! file_exists( UDPSENDERPATH ) ) {
    $MM->outall(sprintf(" * Unable to locate udp-sender!."));
    exit;
}
$MM->serviceStart();
$MM->serviceRun();
$MM->outall(sprintf(" * Service has ended."));
?>
[/code]
At this point I tested with the network interfaces up and down when the service is started, and it tests OK: the various /opt/fog/service/FOG*/FOG* scripts wait until a network device is available before continuing.
Then I rebooted, expecting the issue to be solved… but nope. While more robust, the services still failed to start at boot.
After fooling around a while, I figured out that it’s because the FOG* scripts are being called before mysql is started in the boot sequence. These scripts fail hard without a SQL connection. I’m not sure how to fix that in the scripts themselves, and I’m not sure that fix is needed at this point.
With the above routine still in place, I also modified the LSB headers in /etc/init.d/FOG*, removing the default run levels so the system can maintain them properly on its own, and adding mysql to the Required-Start line. (Yes, the lack of $ is important, as it’s a service provided by another init script, not a system facility.)
[code]### BEGIN INIT INFO
# Provides: FOGMulticastManager
# Required-Start: $local_fs $remote_fs $network $syslog $network $inetd mysql
# Required-Stop: $local_fs $remote_fs $network $syslog $network $inetd
# Default-Start:
# Default-Stop:
# X-Interactive: true
# Short-Description: Start/Stop FOGMulticastManager
# Long-Description: Created by Chuck Syperski
#                   Used to stop and start the FOGMulticastManager Service.
#                   FOGMulticastManager is used to distribute images through
#                   Multicast. Useful to image large amounts of systems simultaneously.
#                   It serves this ability only if it's the master node.
### END INIT INFO
[/code]
Ran insserv again to update the bootscripts…
[code]
insserv -d /etc/init.d/FOGMulticastManager
insserv -d /etc/init.d/FOGScheduler
insserv -d /etc/init.d/FOGImageReplicator
[/code]
And now after a reboot, all the services are starting correctly.
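One way to double-check the resulting boot order is to compare the sequence numbers of the start links that insserv creates. This is a rough sketch under assumptions: it expects sysvinit-style links such as /etc/rc2.d/S19mysql, and `start_seq` and `starts_before` are hypothetical helper names, not FOG or Debian tools.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting sysvinit start links (S##name).

# start_seq RCDIR NAME: print the two-digit start number of NAME in RCDIR,
# e.g. "19" for RCDIR/S19mysql. Returns 1 if there is no start link.
start_seq() {
    for link in "$1"/S[0-9][0-9]"$2"; do
        [ -e "$link" ] || [ -L "$link" ] || continue
        base=${link##*/}        # strip the directory part
        seq=${base#S}           # strip the leading "S"
        echo "${seq%"$2"}"      # strip the service name, leaving the digits
        return 0
    done
    return 1
}

# starts_before RCDIR FIRST SECOND: succeed if FIRST has a strictly lower
# start number than SECOND. Returns 2 if either start link is missing.
starts_before() {
    first=$(start_seq "$1" "$2") || return 2
    second=$(start_seq "$1" "$3") || return 2
    [ "$first" -lt "$second" ]
}
```

For example, `starts_before /etc/rc2.d mysql FOGMulticastManager && echo "order looks right"` checks runlevel 2, the Debian default. Equal numbers count as a failure here, since such services may be started in parallel.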
-
Mentaloid,
I’ve added most of your scripts (except nettest.php, as I added that directly into the FOGCore class), along with all the edits. Hopefully this will help us all out. Thank you for taking the time to test, troubleshoot, and post your findings.
-
No worries at all, Tom, it’s the least I can do to help the project!
I had figured you would add it directly to the FOGCore class.
I tested this against Ubuntu as well. Unfortunately, Ubuntu/upstart ignores LSB headers in the init scripts (bah… Ubuntu re-invents the wheel, and cripples it in the process again). Another solution will have to be found for this mysql issue for more compatibility.
Two options exist:
A) Write a routine that checks for mysql to be alive, similar to wait_interface_ready.
B) Create an upstart-style init script, and install that instead of the insserv/LSB init script when installing on upstart OSes.
I like option A better, as it wouldn’t require separating the installer into ubuntu/upstart and debian/insserv again (yuck). I’ll look into how/where you implemented wait_interface_ready in the FOGCore class, and see if I can create a similar routine for mysql this weekend (or soon thereafter). The problem I can see with this is that FOGCore itself is what bombs without mysql. I’ll toy with it and see what’s possible.
I also noticed a device output mistake in /var/www/fog/lib/fog/MulticastManager.class.php on line 227. This should be MULTICASTDEVICEOUTPUT. In SVN, it is using REPLICATORDEVICEOUTPUT.
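For option A, the general shape of such a "wait until mysql is alive" routine can be sketched outside of FOG in plain shell. This is only an illustration of the idea, not the routine that ended up in FOG; `wait_until` is a hypothetical helper name and the retry counts are arbitrary.

```shell
#!/bin/sh
# wait_until TRIES DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Only sleep if another attempt is still coming.
        if [ "$attempt" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

A wrapper could then run `wait_until 30 2 mysqladmin ping && service FOGMulticastManager start`. `mysqladmin ping` reports whether the server is reachable; whether it needs credentials on a given setup is something to check locally.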
-
Thanks for reporting and I’ll fix it right now.
-
Toying around complete… Conclusions:
A) Once base.inc.php is included, the full FOGCore/FOGBase is initialized. With the current structure, I can’t see any way to have the classes wait for the database when used as a daemon. Under normal circumstances (called via the CGI/Apache SAPI), an exception and exit is the desired behavior if the database isn’t available. If we create an exemption for daemons (CLI SAPI) during DB class construction, then we are stuck with having to determine which service is currently executing so that the waiting messages go to the correct output device for that daemon. I don’t see this as desirable inside the base DB classes, as this isn’t the job of the DB class or, for that matter, FOGCore/FOGBase.
B) Have a class specific to daemons that handles the wait_interface_ready and wait_db_ready routines. Realistically, the only use for these routines would be in the daemons anyway.
I’ve gone ahead and done option B.
Attached are the patched /opt/fog/service/FOG*/FOG* files, as well as a Daemon.class.php (/var/www/fog/lib/fog/Daemon.class.php).
$Daemon->wait_db_ready() and $Daemon->wait_interface_ready() are both implemented. wait_interface_ready() has some fixes to make it a little more compatible with servers that are using Network-Manager/Network-Manager-Gnome.
Tested against Debian 7.6 and Ubuntu 14.04, with and without NM purged.
[url=“/_imported_xf_attachments/1/1313_FOGServices.zip?:”]FOGServices.zip[/url]
-
I’ve added the Daemon class and the edited FOG Service files. I appreciate the assist.
-
Hi,
I have the same problem with FOG 1.2 and Ubuntu 10.04.
It stays at “Starting to restore image (-) to device (/dev/sda1)” on my 8 computers.
Any ideas? Thanks.
-
1.2.0 stock has the known issue above… if you don’t wish to use the SVN tree, then the above scripts in my previous post could be applied manually.
-
I have the same setup as phm2000 and the same issue. I tried the SVN tree following these instructions: [url]http://www.fogproject.org/wiki/index.php/SVN[/url]
Unfortunately, the issue is still there.
-
Hi Mentaloid
I tried your files but same issue.
I tried svn, same issue.
The multicast task disappears from the list after 1 minute, but the individual tasks stay on the active task list.
-
[quote=“phm2000, post: 36243, member: 24664”]Hi Mentaloid
I tried your files but same issue.
I tried svn, same issue.
The multicast task disappears from the list after 1 minute, but the individual tasks stay on the active task list.[/quote]
Have you tried making sure the FOGMulticastManager service is actually running properly?
[code]sudo service FOGMulticastManager stop && sleep 30 && sudo service FOGMulticastManager start[/code]
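The same stop, wait, start sequence can be applied to all three FOG services in one go. The following is a sketch only: `SERVICE_CMD` and `FOG_RESTART_PAUSE` are hypothetical knobs invented for this example so it can be dry-run, and by default it calls `service` with the 30-second pause used above.

```shell
#!/bin/sh
# Stop all three FOG services, pause, then start them again.
# SERVICE_CMD and FOG_RESTART_PAUSE are hypothetical overrides for this
# sketch; the defaults are "service" and a 30-second pause.
fog_services="FOGMulticastManager FOGImageReplicator FOGScheduler"

restart_fog_services() {
    svc_cmd=${SERVICE_CMD:-service}
    # Stop everything first so no old process lingers while new ones start.
    for svc in $fog_services; do
        $svc_cmd "$svc" stop
    done
    sleep "${FOG_RESTART_PAUSE:-30}"
    for svc in $fog_services; do
        $svc_cmd "$svc" start
    done
}
```

Run as root, `restart_fog_services` performs the real restart; `SERVICE_CMD=echo FOG_RESTART_PAUSE=0 restart_fog_services` only prints the stop and start requests in the order they would be issued.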
-
Hi,
I made a completely fresh installation with Ubuntu Server 12.04.5 (and also Debian 7.5) and the latest FOG SVN 2270 (from yesterday) in a NAT VM. I can’t do a multicast deploy of an image. I created a group and assigned a single client (also a NAT VM) to it, and the multicast deploy to that group does not work. The client boots up until it reaches the partclone screen; after that it hangs and nothing happens. With unicast everything works fine, and I can create and deploy an image.
In the log viewer I found several entries like this:
[CODE][09-10-14 7:18:25 am] | Task (1) mcgroup-with-one-client is already running PID 3999
[09-10-14 7:18:35 am] | Task (1) mcgroup-with-one-client is already running PID 3999
[09-10-14 7:18:45 am] | Task (1) mcgroup-with-one-client is already [/CODE]
I also noticed that the time is incorrect. I tried to correct it with the hints I found in the forum, but it still shows the wrong time. I also tried to apply the patches I found here in the thread, but it looks like they are already there.
With the new SVN I found the option Multicast-Image under “Image-Management”. When I create a multicast session with it, start the clients over PXE, and select “Join Multicast Session”, the multicast restore works. It is really confusing that it works this way but not through the group management.
Here is the install log:
[CODE]Script started on Di 09 Sep 2014 13:58:57 CEST
[FOG ASCII-art banner]
###########################################
FOG
Free Computer Imaging Solution
http://www.fogproject.org/
Developers:
Chuck Syperski
Jian Zhang
Peter Gilchrist
Tom Elliott
GNU GPL Version 3
###########################################
Version: 1.3.0 Installer/Updater
- Found FOG Settings from previous install at: /opt/fog/.fogsettings
- Performing upgrade using these settings…
Starting Debian / Ubuntu / Kubuntu / Edubuntu Installtion.
#####################################################################
FOG now has everything it needs to setup your server, but please
understand that this script will overwrite any setting you may
have setup for services like DHCP, apache, pxe, tftp, and NFS.

It is not recommended that you install this on a production system
as this script modifies many of your system settings.

This script should be run by the root user on Redhat or with sudo on Ubuntu.
** Notice ** Redhat users will need to disable SELinux and iptables in
order to use FOG
Please see our wiki for more information at http://www.fogproject.org/wiki

Here are the settings FOG will use:
Base Linux: Debian
Detected Linux Distribution: Debian
Installation Type: Normal Server
Server IP Address: 192.168.83.134
DHCP router Address: 192.168.83.2
DHCP DNS Address: 192.168.83.2
Interface: eth0
Using FOG DHCP: 1
Internationalization: 0
Image Storage Location: /images
Donate: 0

Are you sure you wish to continue (Y/N) y
Installation Started…
Installing required packages, if this fails
make sure you have an active internet connection.
- Preparing apt-get
- Installing package: apache2
- Installing package: php5
- Installing package: php5-json
- Installing package: php5-gd
- Installing package: php5-cli
- Installing package: php5-mysql
- Installing package: php5-curl
- Installing package: mysql-server
We are about to install MySQL Server on
this server, if MySQL isn’t installed already
you will be prompted for a root password.

Press enter to acknowledge this message.
Reading package lists…
Building dependency tree…
Reading state information…
mysql-server is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
- Installing package: mysql-client
- Installing package: isc-dhcp-server
- Installing package: tftpd-hpa
- Installing package: tftp-hpa
- Installing package: nfs-kernel-server
- Installing package: vsftpd
- Installing package: net-tools
- Installing package: wget
- Installing package: xinetd
- Installing package: sysv-rc-conf
- Installing package: tar
- Installing package: gzip
- Installing package: build-essential
- Installing package: cpp
- Installing package: gcc
- Installing package: g++
- Installing package: m4
- Installing package: htmldoc
- Installing package: lftp
- Installing package: openssh-server
- Installing package: php-gettext
Confirming package installation.
- Checking package: apache2…OK
- Checking package: php5…OK
- Checking package: php5-json…OK
- Checking package: php5-gd…OK
- Checking package: php5-cli…OK
- Checking package: php5-mysql…OK
- Checking package: php5-curl…OK
- Checking package: mysql-server…OK
- Checking package: mysql-client…OK
- Checking package: isc-dhcp-server…OK
- Checking package: tftpd-hpa…OK
- Checking package: tftp-hpa…OK
- Checking package: nfs-kernel-server…OK
- Checking package: vsftpd…OK
- Checking package: net-tools…OK
- Checking package: wget…OK
- Checking package: xinetd…OK
- Checking package: sysv-rc-conf…OK
- Checking package: tar…OK
- Checking package: gzip…OK
- Checking package: build-essential…OK
- Checking package: cpp…OK
- Checking package: gcc…OK
- Checking package: g++…OK
- Checking package: m4…OK
- Checking package: htmldoc…OK
- Checking package: lftp…OK
- Checking package: openssh-server…OK
- Checking package: php-gettext…OK
Configuring services.
- Setting up and starting MySql…OK
- Backing up user reports…OK
- Setting up and starting Apache Web Server…OK
You still need to install/update your database schema.
This can be done by opening a web browser and going to:
Press [Enter] key when database is updated/installed.
- Configuring Fresh Clam…OK
- Setting up storage…OK
- Setting up and starting NFS Server…OK
- Setting up and starting DHCP Server…OK
- Setting up and starting TFTP and PXE Servers…OK
- Setting up and starting VSFTP Server…OK
- Setting up sudo settings…OK
- Setting up FOG Snapins…OK
- Setting up and building UDPCast…OK
- Installing init scripts…OK
- Setting up FOG Services…OK
- Starting FOG Multicast Management Server…OK
- Starting FOG Image Replicator Server…OK
- Starting FOG Task Scheduler Server…OK
- Setting up FOG Utils…OK
Setup complete!
You can now login to the FOG Management Portal using
the information listed below. The login information
is only if this is the first install.

This can be done by opening a web browser and going to:
http://192.168.83.134/fog/management
Default User:
Username: fog
Password: password
Script done on Di 09 Sep 2014 13:59:59 CEST
[/CODE]
I hope you can help me.
-
I’m aware of a problem with multicast starting, and the only workaround I have found is to truncate your multicastSessions and multicastSessionsAssoc tables and, on the FOG server, kill all the current udp-sender processes.
[code]# add -p'PASSWORDHERE' after "root" only if you have a mysql password
mysql -u root fog
truncate table multicastSessions;
truncate table multicastSessionsAssoc;
delete from tasks WHERE taskTypeID='8';
exit;
sudo killall udp-sender; sudo killall udp-sender; sudo killall udp-sender
sudo service FOGMulticastManager restart;[/code]
Then recreate your multicast task (not the one to “join” session). All should work.
-
No, sorry, I had no luck. Multicast is still not working, while unicast create/deploy of an image works. It still hangs at the same point, at the partclone screen. I also tried debug mode: everything looks fine, with no errors on screen up to this point.
-
Has there been any news on this from a development point of view?
-
Hi,
I tried it but without success; it stays blocked at “starting to restore image”.
-
Not sure this is related to any of the previous issues in the thread, but I had several Ubuntu 14.04 FOG servers using 1.2.0 that would unicast perfectly but hang at “starting to restore image” when multicasting.
I finally found that the FOG_UDPCAST_INTERFACE value under Multicast Settings was wrong. It was set to eth0 while my adapter was eth1. Not sure how I managed that, but hope it helps someone.
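A quick way to catch this kind of mismatch is to check on the server whether the interface named in FOG_UDPCAST_INTERFACE actually exists. A minimal sketch, assuming a Linux server that exposes its interfaces under /sys/class/net; `iface_exists`, `list_ifaces`, and the `SYSFS_NET` override are hypothetical names used only for this example.

```shell
#!/bin/sh
# Hypothetical helpers: check interface names against /sys/class/net.
# SYSFS_NET is an override for trying the check against a scratch directory.

# iface_exists NAME: succeed if an interface called NAME is present.
iface_exists() {
    [ -n "$1" ] && [ -e "${SYSFS_NET:-/sys/class/net}/$1" ]
}

# list_ifaces: print the entries the kernel exposes, one per line.
list_ifaces() {
    ls "${SYSFS_NET:-/sys/class/net}"
}
```

For example, `iface_exists eth0 || list_ifaces` prints the available names when eth0 is missing; the value under Multicast Settings should match one of them.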