Deploy starts then stops - can't deploy ubuntu image to netbook
-
Hi,
I’ve used FOG for a while, and it’s always been great. Lately, however, I can’t get any Linux images to deploy. I’m using Ubuntu Server 10.04.3, a Core 2 processor, 2 GB of RAM, and v .32 of FOG. Upload “seems” to work okay; it will occasionally stall, but will eventually complete. Deploy freezes at different points (even when trying to deploy the same image), sometimes 44%, 87%, even 93%, which was a heartbreaker…
Clients are Eee PC 1000HE netbooks, running Ubuntu 10.04.3. They have been successfully imaged using FOG in the past (though that was at another site, different server, and v .29). I don’t know why the deploy is freezing, but after it does, the client “thinks” it has been deployed, and goes on to process the rest of the partitions (just swap, so they fail, but that’s never been a problem before).
I have tried this on 3 fog server machines, and 2 hard drives.
Deployment speed (before it hangs) is right around 2.3 GB/min for the first 1/3, then drops to 1.7 GB/min for the rest of the imaging.
Can I increase the tolerance for slow deploys, so the client waits longer before reporting “image restored?”
Thanks in advance - we have nearly 200 of these netbooks, and FOG has been invaluable in the past.
Marc
-
FOG 0.29 used a very different boot image that some people have had better luck with. You might want to try reverting to the older boot image and see if that resolves your issue.
-
Yeah - I was thinking of trying that, except that I have been successful imaging these sorts of machines with v.32.
I’ll keep tinkering…
BTW - are you suggesting just using an older kernel, or actually converting down to v .29?
Thanks,
M
-
Have you been able to image the Eee PC’s with 0.32?
All you need to do is replace /tftpboot/fog/images/init.gz with the init.gz from a 0.29 installation.
[CODE]# keep the 0.32 boot image as a backup
mv /tftpboot/fog/images/init.gz /tftpboot/fog/images/init.gz.32
cd /opt
wget http://sourceforge.net/projects/freeghost/files/FOG/fog_0.29/fog_0.29.tar.gz
# extract into the current directory (/opt)
tar xzvf fog_0.29.tar.gz
# put the 0.29 boot image in place
cp fog_0.29/packages/tftp/fog/images/init.gz /tftpboot/fog/images/[/CODE]
Then try imaging one of the Eee PC’s
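If you want a quick sanity check that the swap actually took effect, something along these lines might help. It is only a sketch: check_init_swap is a made-up helper name, and it assumes init.gz is a plain gzip file (as the name suggests) and that the backup was named init.gz.32 as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): confirm the boot image was replaced.
# Usage: check_init_swap [image_dir]
# The default directory is the one used in the commands above.
check_init_swap() {
    dir="${1:-/tftpboot/fog/images}"
    new="$dir/init.gz"
    old="$dir/init.gz.32"

    # Both files should exist after the mv and cp steps.
    [ -f "$new" ] || { echo "missing $new"; return 1; }
    [ -f "$old" ] || { echo "missing $old"; return 1; }

    # gzip -t tests archive integrity without extracting anything.
    gzip -t "$new" 2>/dev/null || { echo "corrupt $new"; return 1; }

    # cmp -s prints nothing and exits 0 when the two files are identical.
    if cmp -s "$new" "$old"; then
        echo "unchanged"
        return 1
    fi
    echo "swapped"
}
```

Running check_init_swap with no arguments should print "swapped" if the new init.gz is intact and differs from the backup.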
-
Perfect - I’ll try that now.
Thanks,
M
-
I started that process, then realized I could probably just copy the init.gz from an existing install of v.29 to the /tftpboot/fog/images/ directory.
I’m trying a deploy now - let me know if there’s some other magic I need to do (I already renamed the previous init.gz to init.gz.32, as per your suggestion).
Thanks,
M
-
Rats - crapped out at about 87% of restoring the first partition.
Any other ideas? I can’t tell whether the server is stalling the push, or the client is choking on it.
Is there a way to increase the capacity of the push, or at least to see whether the server is having a hard time keeping up with I/O?
M
-
With these intermittent issues I’m wondering if it’s a hardware problem. Have you tried with different Eee PC’s? Does the server have any issues with collecting or deploying with other hosts? Have you tried different kernels?
FOG does not have any traffic shaping features at this point, only queuing.
The balancing act between hardware support maintained in the kernel versus modules in the boot image can be tricky, and it could be that you’re experiencing some sort of buffer overflow due to bad hardware support, thus the suggestion of using the 0.29 boot image. While trying various kernel and boot image combinations is not ideal, it may be the simplest way.
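As for seeing whether the server is struggling with I/O, one rough approach is to sample the aggregate cpu line in /proc/stat twice while a deploy is running and look at how much of the interval was spent in iowait. The sketch below assumes a Linux server with the usual /proc/stat field order (user, nice, system, idle, iowait, ...); iowait_pct is a made-up helper name, not a FOG tool.

```shell
#!/bin/sh
# Hypothetical helper: percentage of CPU time spent waiting on I/O
# between two samples of the aggregate "cpu" line from /proc/stat.
# Usage: iowait_pct "<first cpu line>" "<second cpu line>"
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2..9 are user nice system idle iowait irq softirq steal.
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            io = $6    # iowait is the 5th counter, i.e. the 6th field
        }
        NR == 1 { t1 = total; io1 = io }
        NR == 2 {
            dt = total - t1
            if (dt > 0) printf "%d\n", (io - io1) * 100 / dt
            else print 0
        }'
}

# Example use on the server during a deploy (5 second window):
#   a=$(grep "^cpu " /proc/stat); sleep 5; b=$(grep "^cpu " /proc/stat)
#   iowait_pct "$a" "$b"
```

A consistently high number while the progress bar is stuck would point toward the server's disk; a number near zero would suggest looking at the network or the client instead.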
-
I restarted my server after the first failure, thinking that perhaps the v.29 boot image wasn’t getting deployed. Now when I re-pushed it, it appears to have completed, and went to 100%. Awaiting reboot…
It worked! Wild! Okay - let me try it again with a few other machines…
-
Okay - well, it’s worked with two machines now, and that’s all I have to test with for now. Thanks. I’ll have to double check my notes on which netbooks have been successfully imaged with v.32, and which ones with .29. It must be that I imaged this set before I upgraded that fog server (we have 4 throughout the district) to v.32.
Thanks again for your suggestion.
M
-
Okay - well, this has stopped working. I’m running v.32, but am running the kernel from .29. The machine will appear to begin imaging, the progress bar will move along but will stop somewhere between 50% and 80%. I’ve run top on the server to see what’s going on, and it doesn’t look like it’s really working. The load levels are low.
Anyone have any ideas of things to try? FWIW - these same machines can upload an image fine, even though that takes quite a long time (20 minutes). Could it be I/O related?
Also - the deployment stops while processing the first partition, and even though it’s stopped w/out completing, the display on the client computer reads: image restored
Thanks,
M
-
We had the same problem. It turns out that the bzImage that ships with 0.32 does not have the correct NIC drivers for the netbooks. You need to downgrade the kernel in the server interface. Complete information can be found here: [url]http://fogproject.org/forum/threads/get-error-when-trying-to-download-image.453/#post-1961[/url]
-
Excellent news! I had tried various kernel and boot kernel combinations, but gave up and just used an older version of FOG. It’s nice to know we can avail ourselves of the new version and interface (and just switch boot kernels when needed). Thanks for the update!