Imaging Issue
-
I’ve got a weird issue going on. FOG SVN 3281, Ubuntu 12.04. Deploying an image works fine on most stations, but on a certain type of machine it never starts. I had to take a video of the screen and play it back to catch what’s going on because it happens too fast, but basically it gets to the screen where it’s checking things. It says Checking OS…Win7, checking cpu cores…2, Send method…NFS, Attempting to send Inventory… and after that point it does something that looks like you’re holding down the enter key. The screen is all black and there’s a white cursor in the bottom left-hand corner. If I press a letter, I see it flash by and scroll up off the top of the screen, just as if I were holding enter in a terminal.
It definitely appears to be an issue with this certain group of machines as imaging works fine elsewhere. The only thing that comes to mind is to change the boot loader, so I’ve tried undionly.kpxe and undionly.kkpxe but they both exhibit the same behavior. I’m not even sure if that’s a good place to start, but thought it might be worth a try.
Any ideas?
-
Can you boot the client into system information and run the compatibility check?
-
It looks good.
This computer appears to be compatible with FOG
Network…Pass
Disk…Pass
-
You can create an image task in debug mode so you can see exactly where things go wrong, then report your findings.
-
Something interesting I noticed while in the client system info section: I chose option 2 for IP Information. My network adapter is named enp3s0 and there is no IP associated with it. And when I choose option 5 to ping a host, that fails with sendto: network is unreachable.
I’ll give the debug a go and see what I get.
-
I don’t seem to be able to do much in debug mode either. The machine doesn’t get an IP and can’t talk to FOG at that point.
-
What is the model of this problem computer?
-
Is the NIC USB or onboard?
-
They are Lenovo M72e Tiny machines. The NIC is onboard.
I was on SVN 2961 before this and they worked fine there, if that gives you any reference point.
-
Can you try upgrading again? Maybe we’ll have some luck?
-
I’ll give it a try tomorrow when I’m back at work. Thanks!
-
Latest SVN still has the same issue
-
I still want to figure this out if possible. Can you enable kernel debug and set console level to 7 and then try booting to debug task?
-
Sure. I have turned that on and see a whole lot of info fly by before it gets to [root@fogclient /]#
What specifically are you looking for, or is there a way for me to copy all that from the client and send to you?
-
Take a video with a smartphone, upload it to YouTube, then paste the link here.
Or you can share the video file via DropBox or Mega.
-
Here they are:
Video of debug task: [media=youtube]wOMm6Lmj7XI[/media]
Video of imaging task: [media=youtube]TUhRb1vYP7U[/media]
-
Ok, new info - it might be/probably is a network issue on our end of things. The same style machine works in another building. So I took one machine from the problem lab to this other building and it works there as well.
Spanning tree portfast is enabled everywhere. But thinking this through now, I imaged that lab fine 3 weeks ago on SVN 2961 without issue. There haven’t been any changes on the network switches or anything as far as that goes. Is it possible that the timeout between link down and link up was shortened?
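One way to check the link-timing theory from the debug shell would be to time how long the NIC takes to report carrier. This is only a rough sketch: it assumes the debug environment exposes the usual Linux sysfs file /sys/class/net/<iface>/carrier and has cat and sleep available. wait_for_link is a made-up helper name, and enp3s0 is the interface name reported earlier in this thread.

```shell
# wait_for_link IFACE [TIMEOUT_SECONDS] [SYSFS_ROOT]
# Polls the carrier flag once per second until it reads "1" (link up)
# or the timeout expires. SYSFS_ROOT is only there so the function can
# be pointed at a test directory; normally leave it at the default.
wait_for_link() {
    iface=$1
    timeout=${2:-30}
    root=${3:-/sys/class/net}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Reading carrier fails while the interface is administratively
        # down; that case is treated the same as "no link yet".
        if [ "$(cat "$root/$iface/carrier" 2>/dev/null)" = "1" ]; then
            echo "link up on $iface after ${elapsed}s"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "no link on $iface after ${timeout}s" >&2
    return 1
}
```

Running wait_for_link enp3s0 60 in building A and again in building B would show whether the link takes noticeably longer to come up on the problem switch ports, which would suggest the port configuration rather than FOG itself.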
-
In the first video, the network info doesn’t show the interface to have a valid IP address.
This is sort of confirmed in the second video, where it says “link down”.
Obviously it’s getting an IP when it tries to network boot, because it loads the boot file.
Something is happening between the boot file and the rest of the process… You say it works fine on r2961, and these problem computers work fine in other buildings. What revision are the other buildings on? Anything newer than r2961?
-
Do you have a hub (not a switch)?
We can use a hub to capture all traffic that is going to one of these computers to see what it’s trying to do, using Wireshark.
If you don’t have a hub, you can run TCPDump on the FOG server and at least see every broadcast message and all messages to/from that computer and the FOG server.
Here’s a tutorial on TCPDump: [url]http://fogproject.org/wiki/index.php/TCPDump[/url]
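As a starting point, something like the following could be run on the FOG server. It is only a sketch: the interface name, MAC address and output path in the usage line are placeholders, and client_filter and capture_client are made-up helper names. It filters on the client's MAC rather than its IP, since the client never gets an address at the failing stage, and it also keeps all DHCP traffic (UDP ports 67 and 68).

```shell
# Build a capture filter for one client: every frame to or from its MAC,
# plus all DHCP/BOOTP traffic, which is mostly broadcast.
client_filter() {
    mac=$1
    printf 'ether host %s or udp port 67 or udp port 68\n' "$mac"
}

# Capture on the given server interface and write a pcap file that can
# be opened in Wireshark later. Needs root; stop it with Ctrl+C.
#   -n    do not resolve names
#   -s 0  capture whole packets instead of truncating them
#   -w    write raw packets to a file
capture_client() {
    iface=$1
    mac=$2
    outfile=${3:-fog-client.pcap}
    tcpdump -i "$iface" -n -s 0 -w "$outfile" "$(client_filter "$mac")"
}
```

For example: capture_client eth0 00:11:22:33:44:55 /tmp/m72e.pcap, started just before the client network boots. Keep in mind that on a switched network the server only sees broadcasts and traffic addressed to itself, so this would not show everything a hub capture would.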
Unless someone else has a better idea.
-
Right, that’s sort of what it looks like to me. It’s almost like it needs to grab an IP a second time before imaging and that’s where it fails.
All buildings share one common FOG server. That server has been updated to the latest revision, so I can’t go back and check 2961 very easily. The computer in question is what I was moving back and forth. In building A, I have the problem where it doesn’t get the IP in the debug task. I disconnect it, walk across the parking lot to building B, plug it back in, and it works perfectly.