Hanging on "Booting the kernel" with a "#"
-
Hi everyone
This is my first post here so apologies for my n00bness. I am quite new to FOG and Ubuntu. Also, I am sorry if this topic has already been covered in another thread that I have failed to notice - please just point me in the right direction.
I have recently upgraded my FOG server to Ubuntu 12.04 and the latest FOG version 0.32. I finally got everything working after a lot of troubleshooting but I keep getting a problem which seems to only occur randomly. I can register hosts fine, upload images fine etc, but every so often I will get a machine that it will just not deploy an image to.
I will PXE boot the machine and everything will seem to go fine, but after “booting the kernel” I will just get a “#” and then it hangs and does nothing. I have tried using multiple kernels to no avail. The strange thing is that it will work on an identical machine with the same hardware. For example, I am currently loading 10 x HP D530 SFF. All of them imaged fine until I got to number 5, which is just hanging there with a “#” just before the blue deploy screen. I have had this happen with a Dell 745, 755 and also a D600 laptop now and I just can’t figure out what the problem is. I have been using Google and can’t seem to find anyone with this exact problem, and whenever I find someone with a slightly similar problem I follow their instructions with no joy.
It is not a hard drive error, as I can pull the HDD out and image it on another HP D530 and it works fine - it just seems to be machine specific. It could well be some setting in the BIOS, which I have tried to troubleshoot myself, but I am now reaching out for help. Sorry if there is some information I have failed to supply. Please let me know and I will do my best.
Thanks in advance
-
Anybody have any ideas on this at all? I was thinking of running the latest Ubuntu updates, but can anyone confirm that they are all stable with the latest version of FOG? (I tried updating an earlier version of Ubuntu with an earlier version of FOG before and it stopped working :/)
Thanks
-
OK, so the lack of response on this thread leads me to believe there is nobody else out there having this problem and nobody who has a clue how to fix it. The problem still persists for me, but I have had a bit of a breakthrough today (!) and thought I would share it with the community on the off chance that it’ll be useful to somebody one day, and if not that then just for bragging rights!
So I had another biggish order to process today (10 machines) and they were all Dell GX620s, PDs, 2GB and Win 7 Pro. I set up the first machine, registered it on FOG and renamed the host to “Gems 1”. (I always rename the host from the MAC address to a reference of the customer.) I selected the image I wanted to deploy and clicked update. I decided to do a quick image, as I will probably just use this one machine to load all of the HDDs for the order, so I didn’t create a task for it. I selected Quick Image from the PXE boot menu and got an error saying “Quick Image Failed… Invalid Host Information”. So I went back into FOG Management and created a task for it instead. I get the DREADED HASH AGAIN! :mad:
Getting a bit fed up of this now, I went back to the host information in FOG Management and decided to tweak it a little. First thing I did was rename the host from “Gems 1” to “Gems0”. Then I deleted all of the “Host Description” (usually FOG will put in the date and time the host was created by default) so it was completely blank. Then in the Host Kernel, I pointed it to an old kernel that we used on an old FOG server that never had this problem (Kitchen Sink 2.6.31.1), and finally in Host Arguments I typed in “nomodeset”.
Now, the last two things I mentioned (alternative kernel and nomodeset) I have tried many times before and they got me nowhere, but I thought I’d try them with the other things I did just for good measure. Clicked update on the host, killed the previous task for it and then created another one. PXE booted the machine and it imaged fine!! YES!!!
Now I’m not completely sure which variable actually did the trick but it worked. It could have been the name change, deleting the Host Description, or it could have been a combination of the whole lot. The next time I get this problem with a machine though I will try one method at a time and try to figure out what is actually causing this.
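For anyone wanting to narrow this down the same way, here is a rough sketch in Python of that one-at-a-time test plan. It only prints the combinations to try by hand in FOG Management; it does not talk to FOG at all. The dictionary keys are just labels made up for this sketch (not FOG field names), and the bracketed baseline values are placeholders, since the original kernel and description text were not recorded.

```python
from itertools import combinations

# Settings on the host that hung. Keys are labels invented for this sketch,
# and the bracketed values are placeholders, not real FOG values.
BASELINE = {
    "host_name": "Gems 1",
    "host_description": "<default date/time text>",
    "host_kernel": "<default kernel>",
    "host_arguments": "",
}

# The four changes that, applied together, let the machine image.
CHANGED = {
    "host_name": "Gems0",
    "host_description": "",
    "host_kernel": "Kitchen Sink 2.6.31.1",
    "host_arguments": "nomodeset",
}


def trial_plan(baseline, changed, max_changes=1):
    """Yield (changed_keys, settings) for every combination of 1..max_changes changes."""
    keys = list(changed)
    for size in range(1, max_changes + 1):
        for combo in combinations(keys, size):
            settings = dict(baseline)  # start from the failing configuration
            for key in combo:
                settings[key] = changed[key]  # apply only the chosen changes
            yield combo, settings


if __name__ == "__main__":
    # One change per trial: four trials in total.
    for combo, settings in trial_plan(BASELINE, CHANGED):
        print(", ".join(combo), "->", settings)
```

If no single change fixes it on its own, calling `trial_plan(BASELINE, CHANGED, max_changes=2)` also lists the pairs, which would help spot a fix that needs two changes together.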
Will post as soon as I get answers!