Cannot create deployment tasks
I’m currently utilizing Fog 1.1.1, and every time I try to create a multicast deployment task I get the following error.
[CENTER][SIZE=16px][FONT=Ubuntu][COLOR=#ff0000]Download task failed to create for[/COLOR][/FONT][/SIZE][/CENTER]
[CENTER][SIZE=16px][FONT=Ubuntu][COLOR=#ff0000]FOGFTP: Failed to connect. Host: 10.1.0.69, Error: Undefined index: conn[/COLOR][/FONT][/SIZE][/CENTER]
I’m running CentOS 6.5 with a default FOG installation as a Normal server. Everything appears to be running: I can PXE boot, register, and have the client pull an image, but I can’t create a deployment task for it.
As far as I can tell all settings and configuration are correct. I’m not sure what I’m missing here.
Any help would be appreciated, thank you.
I ran into a lot of the same problems. Since it’s only an imaging server, I literally opened up TCP/UDP ports 0:65535 and disabled SELinux. Everything works great now.
Looks like the FOGMulticastManager service didn’t start at all, either after the default install or after a reboot, on CentOS 6.5 at least.
FOGMulticastManager is stopped
Starting FOGMulticastManager: [ OK ]
Found the service and started it, multicast imaging is now working!
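For anyone landing here later, the check-and-start step can be sketched roughly like this. It is only a sketch: it assumes the CentOS 6 `service` wrapper and an init script whose `status` action returns non-zero when the service is stopped, and `ensure_service_running` is a made-up helper name, not something FOG ships.

```shell
# Hypothetical helper (not part of FOG): start a SysV service only if
# its status check reports that it is not running.
# Assumption: "service NAME status" exits non-zero when NAME is stopped.
ensure_service_running() {
    svc="$1"
    if service "$svc" status >/dev/null 2>&1; then
        echo "$svc already running"
    else
        echo "$svc not running, starting it"
        service "$svc" start
    fi
}

# Usage on the FOG server (as root):
#   ensure_service_running FOGMulticastManager
```

If the init script supports it, `chkconfig FOGMulticastManager on` is the usual CentOS 6 way to have a service start at boot, but I have not confirmed that against this install.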
EDIT: For the record, in case anyone else has this issue and comes across this thread, you WILL get a process for the UDP being sent out. You should get two, and one’s gonna be several lines long.
[root@fog]# ps aux | grep udp
root 19163 0.0 0.0 106096 1180 pts/0 S 09:58 0:00 sh -c exec cat "/home/images/TestImage/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd;cat "/home/images/TestImage/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd;cat "/home/images/TestImage/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd;cat "/home/images/TestImage/d1p4.img"|/usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd;cat "/home/images/TestImage/d1p5.img"|/usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd;cat "/home/images/TestImage/d1p6.img"|/usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd;
root 19277 4.8 0.1 34720 8280 pts/0 Sl 10:00 0:08 /usr/local/sbin/udp-sender --min-receivers 1 --portbase 54172 --interface eth0 --full-duplex --ttl 32 --nokbd
root 19505 0.0 0.0 103252 840 pts/0 S+ 10:03 0:00 grep udp
If the only result you get back is the “grep udp” line, then the multicast isn’t running.
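A quick way to make that check less error-prone is to count the udp-sender lines and leave the grep itself out. This is just a sketch; `count_udp_senders` is a hypothetical helper name, and it only looks at whatever `ps`-style text you pipe into it.

```shell
# Hypothetical helper: read `ps aux`-style text on stdin and print how
# many lines mention udp-sender, ignoring any grep process lines.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper's exit status clean while still printing the count.
count_udp_senders() {
    grep -v 'grep' | grep -c 'udp-sender' || true
}

# Usage on the FOG server while a multicast task is active:
#   ps aux | count_udp_senders
```

Going by the output above, a working multicast session should report 2 (the `sh -c` wrapper plus the running udp-sender), and 0 means nothing is being sent.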
When I create a multicast deployment task and the client is at the “Starting to restore image” screen, should I see a udp-sender process running? I’m working through the troubleshooting page I found on the wiki, and it seems to work. [url]http://www.fogproject.org/wiki/index.php/Troubleshooting_a_multicast[/url]
I can’t get the manual step at the bottom to force imaging to work (gunzip keeps ignoring the path because it’s a directory), but the first tests, sending the multicast.log file to 1 and 2 clients, work fine.
I’ll fiddle with it tomorrow and use tcpdump to verify whether the multicast traffic looks good or not.
I uploaded an image of my laptop and then tried to download the same image to it, and that has apparently borked it, so I’m now in repair mode on that.
Is there a specific reason to use single system multicast?
I still want to help fix it, but since you said you have multiple nodes, my suspicion is that’s why you’re seeing the waiting screen: the client isn’t looking at the correct node to get the image from, or to hear the udp-cast message.
That error is gone, and now I get to partclone and it hovers at “Starting to restore image” when sending via multicast (only one PC, but I want to test the functionality before I start testing mass devices). If I do a direct download task it works fine.
I found another thread where group multicast wasn’t working, but single was. ([url]http://fogproject.org/forum/threads/fog-1-1-0-multicast-sits-at-starting-to-restore-image-to-device-dev-sda1.10782/[/url]). Seems I’m experiencing the reverse.
Multicast is/should be configured correctly on this test network at least. The same switch is doing multicast for our IPTV so the traffic should be working fine I imagine.
The .mntcheck should be in /images/.mntcheck (or in your case, /home/images/.mntcheck) and in /images/dev/.mntcheck (or in your case, /home/images/dev/.mntcheck)
You can accomplish this with:
[code]chmod 777 -R /home/images[/code]
Hopefully this helps!
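To spell out the .mntcheck step, here is a minimal sketch. It assumes an empty marker file is all that is needed, and `make_mntcheck` is a hypothetical helper name rather than anything FOG provides; the permissions line mirrors the chmod from the post above.

```shell
# Hypothetical helper: create the .mntcheck marker files under an image
# root and open up permissions as suggested in the post above.
# Assumption: an empty .mntcheck file is sufficient as a marker.
make_mntcheck() {
    root="$1"
    mkdir -p "$root/dev" || return 1
    touch "$root/.mntcheck" "$root/dev/.mntcheck" || return 1
    chmod -R 777 "$root"
}

# Usage on the FOG server (as root), with the image path from this thread:
#   make_mntcheck /home/images
```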
Alright I’m not out of the woods yet. I can upload an image from my client to the server, but when I go to do a multicast deployment back down to the client, I get this error.
[QUOTE]Fatal Error: Failed to mount NFS Volume.
- If you believe the file system is mounted, make sure you have a file called .mntcheck in the directory you are mounting on the server
Computer will reboot in one minute[/QUOTE]
So in my folder (/home/images/Cgrosslaptop) I don’t see the .mntcheck but I do see it in /home/images/dev/.mntcheck. If I copy it over it fails as well.
I also do a symbolic link for /images to go to /home/images due to partitioning issues (CentOS installer kept getting super mad).
Looks like that worked, Tom. After changing the password I guess it took a bit, and then everything just “cleared up” on its own. I’ve done a re-install to re-partition the server and everything’s working immediately without touching the fog user password.
I believe I’m set and ready to go here on this problem, now just time to go read the wiki and forums on some other things. Thanks for all the help, I appreciate the quick responses and helpful advice from all.
10.1.0.69 is the storage node, correct. I did a Normal install when I set Fog up.
I clicked the eye to see the plaintext and it’s correctly typed in. I also had someone double-check it to verify I wasn’t misreading it.
EDIT: I just went to make the task again to get the error to show someone and it says that the host is already a member of an active task. The machine is off so I’m gonna boot it and see what it does.
Is 10.1.0.69 a storage node as well?
It looks like it’s trying to ftp for a download task, but that’s not right.
Download task deployments have NOTHING to do with FOGFTP, except to verify that the image actually exists where it’s trying to pull the image from.
[quote]10.1.0.69, Username:fog, Password: *******[/quote]
Is the password masked in the GUI as well, or does it display in plain text? I’m not asking you to give me the password, but I am asking whether the password it’s displaying (if it is) is the same password you used to FTP in.
Alright all 3 have the correct password but it’s still giving me that error from the Fog WebGUI. However I can now successfully FTP in from another machine.
Name (10.1.0.69:root): fog
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
Try resetting the FOG Server’s fog user password with
Then set that password in the Storage Node’s Management Password field AND in:
FOG Configuration->FOG Settings->TFTP Settings->FOG_TFTP_FTP_PASSWORD
The problem with 6.4 and 6.5 is:
/etc/sysconfig/selinux should be a link to /etc/selinux/config
However this didn’t always happen. So checking in both locations works best.
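Checking both locations can be sketched like this. `show_selinux_setting` is a hypothetical helper name, and it only reads the config files; it does not tell you what mode the running kernel is actually in.

```shell
# Hypothetical helper: print the SELINUX= setting found in each config
# file given as an argument, or note that the file is missing.
show_selinux_setting() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "$f: $(grep -E '^SELINUX=' "$f" || echo 'no SELINUX= line')"
        else
            echo "$f: missing"
        fi
    done
}

# Usage, with the two paths mentioned above:
#   show_selinux_setting /etc/selinux/config /etc/sysconfig/selinux
```

If the two lines disagree, the link was probably not created on that install. `getenforce` should show the mode currently in effect.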
Looks like SELinux was still enabled, so the wiki instructions for 6.4, which are said to be the same for 6.5, need a minor edit. Whoops.
Now I get a new error! Progress has been made.
Multi-Cast task failed to create for Test with image 1001PXD
FOGFTP: Login failed. Host: 10.1.0.69, Username:fog, Password: *******, Error: (ftp_login(): Please specify the password
As you’re running CentOS, have you disabled iptables and SELinux?
It appears it fails out. This is using the U/P for the Storage Management Default Member.
root@DevBox:~# ftp 10.1.0.69
Connected to 10.1.0.69.
220 (vsFTPd 2.2.2)
Name (10.1.0.69:root): fog
331 Please specify the password.
530 Login incorrect.
I should note I was able to create a task to tell a machine to upload its image to the server, but after a bit it started spitting the error out on the screen of the client.
What happens if you try ftping to:
It should present a similar screen to:
Enter the user fog
Then it will prompt for the password
Enter the FOG user password (the password should be in the storage node’s Management Password field)
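If you would rather not type through the interactive prompt each time, the same login test can be sketched non-interactively. This assumes `curl` is installed on the machine you test from; `check_ftp_login` is a hypothetical helper name, and the password shown in the usage line is a placeholder.

```shell
# Hypothetical helper: try an FTP login with curl and report the result.
# Assumption: curl is installed. The password is visible in the process
# list while curl runs, so this is for quick testing only.
check_ftp_login() {
    host="$1"
    user="$2"
    pass="$3"
    if curl --silent --list-only --connect-timeout 5 \
            --user "$user:$pass" "ftp://$host/" >/dev/null 2>&1; then
        echo "FTP login OK"
    else
        # $? here is still curl's exit status from the if condition
        echo "FTP check failed (curl exit code $?)"
    fi
}

# Usage, with the storage node address from this thread:
#   check_ftp_login 10.1.0.69 fog 'your-fog-password'
```

A failure here can also mean the server was unreachable rather than that the password was wrong, so check the exit code against the curl man page.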
Yes, I’m tailing the log.
The problem is that when I try to deploy a task for a machine to be imaged from the server, I get the error. I’m not sure what that has to do with storage replication (I’m pretty new to Fog).
Are you tailing the log from command line?
cd /opt/fog/log && watch tail fogreplicator.log
It should give you information about the replication state of the storage group.