Imaging Issue
-
You can create an image task with debug mode, so you can see exactly where things go wrong and report your findings.
-
One interesting note while I’m in the client system info section: I chose option 2 for IP Information. My network adapter is named enp3s0 and there is no IP associated with it. And when I choose option 5 to ping a host, that fails with “sendto: network is unreachable”.
I’ll give the debug a go and see what I get.
-
I don’t seem to be able to do much in debug mode either. The machine doesn’t get an IP and can’t talk to FOG at that point.
-
What is the model of this problem computer?
-
Is the NIC USB or onboard?
-
They are a Lenovo M72e Tiny. The NIC is onboard.
I was on SVN 2961 before this and they worked fine there, if that gives you any reference point.
-
Can you try upgrading again? Maybe we’ll have some luck?
-
I’ll give it a try tomorrow when I’m back at work. Thanks!
-
The latest SVN still has the same issue.
-
I still want to figure this out if possible. Can you enable kernel debug and set console level to 7 and then try booting to debug task?
-
Sure. I have turned that on and see a whole lot of info fly by before it gets to [root@fogclient /]#
What specifically are you looking for, or is there a way for me to copy all that from the client and send it to you?
-
Take a video with a smartphone, upload it to YouTube, then paste the link here.
Or you can share the video file via Dropbox or Mega.
-
Here they are:
Video of debug task: [media=youtube]wOMm6Lmj7XI[/media]
Video of imaging task: [media=youtube]TUhRb1vYP7U[/media]
-
Ok, new info - it might be/probably is a network issue on our end of things. The same style machine works in another building. So I took one machine from the problem lab to this other building and it works there as well.
Spanning tree portfast is enabled everywhere. But thinking this through now, I imaged that lab fine 3 weeks ago on SVN 2961 without issue. There haven’t been any changes on the network switches or anything as far as that goes. Is it possible that the timeout between link down and link up was shortened?
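To make the timing question concrete, here is a rough Python sketch of what a "wait for link, then give up" step looks like. It is only an illustration of the idea, not how the FOG init scripts actually do it, and the function name, timeout, and poll interval are made up for the example. It reads the standard Linux sysfs file /sys/class/net/<interface>/operstate.

```python
import time
from pathlib import Path


def wait_for_link(iface, timeout_s=30.0, poll_s=0.5, sysfs_root="/sys/class/net"):
    """Poll the interface's operstate until it reads 'up' or the timeout expires.

    Hypothetical helper for illustration; the timeout values are assumptions.
    Returns True if the link came up in time, False otherwise.
    """
    state_file = Path(sysfs_root) / iface / "operstate"
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if state_file.read_text().strip() == "up":
                return True
        except OSError:
            pass  # interface not present (yet); keep waiting
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    # enp3s0 is the adapter name from this thread.
    print(wait_for_link("enp3s0", timeout_s=10.0))
```

If the switch port takes longer to start forwarding than a wait like this allows, a DHCP request sent right afterwards would go unanswered, which would match the symptom of having no IP after the boot file loads.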
-
In the first video, the network info doesn’t show the interface to have a valid IP address.
This is sort of confirmed in the second video, where it says “link down”.
Obviously it’s getting an IP when it tries to network boot, because it loads the boot file.
Something is happening between the boot file and the rest of the process. You say it works fine on r2961, and these problem computers work fine in other buildings. What revision are the other buildings on? Anything newer than r2961?
-
Do you have a hub (not a switch)?
We can use a hub to capture all traffic that is going to one of these computers to see what it’s trying to do, using Wireshark.
If you don’t have a hub, you can run TCPDump on the FOG server and at least see every broadcast message and all messages to/from that computer and the FOG server.
Here’s a tutorial on TCPDump: [url]http://fogproject.org/wiki/index.php/TCPDump[/url]
Unless someone else has a better idea.
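As a rough starting point for the server-side capture, here is a small Python sketch that assembles a tcpdump command line. The MAC address, interface name, and output file are placeholders (assumptions, not values from this thread). The filter uses standard pcap filter syntax to keep frames to or from the client plus all Ethernet broadcasts, which is where DHCP discovers show up.

```python
import shlex


def tcpdump_command(client_mac, iface="eth0", outfile="fog-client.pcap"):
    """Build a tcpdump argument list for capturing one client's traffic.

    Hypothetical helper; iface and outfile defaults are placeholders.
    """
    # pcap filter: frames to/from the client's MAC, plus all broadcast frames
    capture_filter = f"ether host {client_mac} or ether broadcast"
    # -i: capture interface, -n: no name resolution, -w: write raw packets to a file
    return ["tcpdump", "-i", iface, "-n", "-w", outfile, capture_filter]


if __name__ == "__main__":
    # Placeholder MAC; substitute the problem machine's real address.
    print(shlex.join(tcpdump_command("00:11:22:33:44:55")))
```

Run the printed command as root on the FOG server while the client boots, then open the resulting file in Wireshark.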
-
Right, that’s sort of what it looks like to me. It’s almost like it needs to grab an IP a second time before imaging and that’s where it fails.
All buildings share one common FOG server. That server has been updated to the latest revision, so I can’t go back and check 2961 very easily. The computer in question is what I was moving back and forth. In building A, I have the problem where it doesn’t get the IP in the debug. I disconnect it, walk across the parking lot to building B, plug it back in, and it works perfectly.
-
Hmm that’s a thought. I don’t have a hub, but I can set up a monitor session on the switch. Give me about 10 minutes and I’ll upload a capture file.
-
How strange…
Maybe you’ve got a rogue DHCP in building A. Maybe it’s just a defunct patch cable?
-
No rogue DHCPs that I can see. I ran a Wireshark capture on a DHCP renewal and only got an offer from our known DHCP server.
The Wireshark capture in debug mode really doesn’t show much. It’s mostly just TCP segments from when it’s downloading the kernel.
I’m ruling out the patch cable because there are 60 machines with this same issue.
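For anyone repeating the rogue DHCP check across more captures, the comparison can be sketched in a few lines of Python. This assumes you have already pulled the source IP of each DHCP offer out of the capture by hand. The function name is made up, and the addresses in the usage example are documentation-range placeholders, not our real servers.

```python
from collections import Counter


def unexpected_dhcp_servers(offer_sources, known_servers):
    """Return {server_ip: offer_count} for offers from servers not in the known set."""
    known = set(known_servers)
    counts = Counter(offer_sources)
    return {ip: n for ip, n in counts.items() if ip not in known}


if __name__ == "__main__":
    # Placeholder data: three offers, one from a server we do not recognize.
    offers = ["192.0.2.10", "192.0.2.10", "192.0.2.99"]
    print(unexpected_dhcp_servers(offers, {"192.0.2.10"}))
```

An empty result means every offer came from a known server, which is what I saw here.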