Intel UNDI Stuck Initializing
-
[quote=“Uncle Frank, post: 45333, member: 28116”]So iPXE is initializing the device but we don’t know which one it is. I tried to find out which NIC is in that 11e and it took me at least 20 minutes. I really hate all those websites where you read “Tech specs… CPU … Gigabit-Ethernet …”!! No NIC brand or model name, arrgh!
Found it: Should be RTL8111/8168B, right?? This NIC should be supported by iPXE from what I can see in the source code. Seems like we need more debugging to get this working. The easier step (I think) is to capture a packet dump to see if iPXE is sending any packets after initializing the device. The first packet should be a broadcast (DHCP discover) so you should see it everywhere on your network. Please install wireshark on one of your PCs, connect it to the same network as the FOG server and client and capture while booting up the 11e Thinkpad. You should see a DHCP request when the NIC itself grabs an IP first, then iPXE comes up. Do you see another DHCP request after that??
Second, you need to build a custom iPXE binary with debug enabled. I just added a section to the existing wiki article about debugging, see here: [url]http://fogproject.org/wiki/index.php/Building_undionly.kpxe#Debugging[/url]
In your case I’d start with adding ‘DEBUG=undi,dhcp’. Hopefully you can see where it gets stuck. Please post a picture of it.
I really wonder why ipxe.kpxe is not working for you. To explain: undionly uses the general UNDI interface to communicate with the NIC. If you want to use real iPXE drivers you need to use one of the ipxe.xxx binaries. You can give that a try with debugging enabled (DEBUG=realtek) too!!
Edit: I just found a couple of threads about iPXE and realtek 8111… Most are saying that they got it working with that NIC.
[url]http://fogproject.org/forum/threads/realtek-8111-8168-undionly-kpxe-hangs-on-initialising-devices.10453/[/url]
[url]http://lists.ipxe.org/pipermail/ipxe-devel/2014-October/003837.html[/url][/quote]
Once I have created a debug kpxe, where does the debug file get dropped?
-
It gets placed in the /tftpboot directory as the replacement for the originally named file.
For example, if you’re building a debug version of undionly.kpxe, you would copy the freshly built undionly.kpxe to /tftpboot/undionly.kpxe.
Hopefully that helps.
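A minimal sketch of that copy step, assuming the build output sits in ipxe/src/bin/ and the TFTP root is /tftpboot (adjust both to your setup). The function name is made up for illustration; it keeps a backup of the old binary so you can roll back easily.

```shell
#!/bin/sh
# Sketch: install a freshly built iPXE binary into the TFTP root.
# Both arguments are placeholders for your own paths.
install_ipxe_binary() {
    src=$1          # e.g. ipxe/src/bin/undionly.kpxe (assumed build location)
    tftp_root=$2    # e.g. /tftpboot
    name=$(basename "$src")
    # Keep the previous binary so the change is easy to undo.
    if [ -f "$tftp_root/$name" ]; then
        cp -p "$tftp_root/$name" "$tftp_root/$name.bak" || return 1
    fi
    cp "$src" "$tftp_root/$name"
}

# Usage (as a user allowed to write to the TFTP root):
#   install_ipxe_binary ipxe/src/bin/undionly.kpxe /tftpboot
```

The client picks up the new file on its next PXE boot; nothing needs to be restarted on the server for that.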
-
[quote=“SeqSupport@Edkey, post: 45387, member: 27616”]Once I have created a debug kpxe where does the debug file get dropped?[/quote]
Not sure if I understand your question. As Tom already noted, place the compiled binary into your TFTP root directory. If you were asking where the debug output ends up: you will see it right on the screen when booting up the client… there is no file output.
-
[quote=“Uncle Frank, post: 45391, member: 28116”]Not sure if I understand your question. As Tom already noted, place the compiled binary into your TFTP root directory. If you were asking where the debug output ends up: you will see it right on the screen when booting up the client… there is no file output.[/quote]
So we have figured out that there is an option in the BIOS that protects the memory from malicious attacks, and that was blocking iPXE. But unfortunately it now gets to configuring net0 and errors out with 0x040ee119. I am seeing what debug info I can get from the undionly.kkpxe I built.
-
That means for some reason iPXE is not able to obtain an IP address for that NIC.
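For reference, a small sketch that turns the code shown on screen into the matching iPXE error page. As far as I know iPXE prints a link of this form next to the error text, and 040ee119 is the “No configuration methods succeeded” error, which fits a client that never gets a DHCP lease. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: build the iPXE error page URL from an error code.
ipxe_error_url() {
    code=${1#0x}                          # drop a leading "0x" if present
    printf 'http://ipxe.org/%s\n' "$code"
}

# Usage:
#   ipxe_error_url 0x040ee119    # prints http://ipxe.org/040ee119
```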
-
Also, do you have STP (Spanning Tree Protocol) on your network? If you do, is there any way you can disable it or use Rapid STP/PortFast?
-
[quote=“Tom Elliott, post: 45394, member: 7271”]Also, do you have STP (Spanning Tree Protocol) on your network? If you do, is there any way you can disable it or use Rapid STP/PortFast?[/quote]
I would have to speak to the boss about the network related issue. We started from scratch and have not updated to the newest SVN. Would this possibly make a difference? We did upgrade the tftpboot folder but that is all.
-
It could make a difference, but the client is not getting far enough for that to matter yet. It needs to get a DHCP lease first.
-
Is it possible to connect the DHCP server (Windows), FOG and the client to a small office switch just for testing?? If you don’t see a difference then we know it’s a real iPXE/FOG issue. But I am pretty sure you won’t run into a DHCP timeout in a small setup.
To see what’s really going on you can use wireshark or tcpdump to capture the packets on the network.
Either install wireshark on your Windows DHCP server (not sure if you are allowed to do this, but it might come in handy again), or put a hub in front of the client (your normal switch - hub - client), or configure a monitoring port on the switch the client is connected to.
Use a laptop to capture the traffic on that hub. You’ll see a lot of stuff, I am sure. Try display filters ‘bootp’ (DHCP) and ‘tftp’. You are welcome to upload the saved pcap file for us to inspect.
-
[quote=“Uncle Frank, post: 45403, member: 28116”]Is it possible to connect the DHCP server (Windows), FOG and the client to a small office switch just for testing?? If you don’t see a difference then we know it’s a real iPXE/FOG issue. But I am pretty sure you won’t run into a DHCP timeout in a small setup.
To see what’s really going on you can use wireshark or tcpdump to capture the packets on the network.
Either install wireshark on your Windows DHCP server (not sure if you are allowed to do this, but it might come in handy again), or put a hub in front of the client (your normal switch - hub - client), or configure a monitoring port on the switch the client is connected to.
Use a laptop to capture the traffic on that hub. You’ll see a lot of stuff, I am sure. Try display filters ‘bootp’ (DHCP) and ‘tftp’. You are welcome to upload the saved pcap file for us to inspect.[/quote]
We have a portable laptop we use as a FOG server for imaging networks which do not have a DHCP server. We installed a fresh copy of Debian and FOG 1.2.0 and updated to the latest SVN. Tested both undionly.kkpxe and .kpxe plus .efi and uefi and still the same result. I will run a wireshark capture on the FOG laptop and see what I get.
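A capture on the FOG laptop could be started roughly like this. It is only a sketch: the interface name eth0, the output file name and the helper name are assumptions. The filter keeps DHCP (UDP ports 67/68) and the initial TFTP request (UDP port 69); the TFTP data packets use other ports, so drop the filter if the whole transfer is needed.

```shell
#!/bin/sh
# Sketch: capture filter for the PXE handshake.
# DHCP uses UDP ports 67/68, the first TFTP request goes to UDP port 69.
pxe_capture_filter() {
    printf '%s\n' 'udp and (port 67 or port 68 or port 69)'
}

# Usage (eth0 and pxe-boot.pcap are placeholders):
#   sudo tcpdump -i eth0 -s 0 -w pxe-boot.pcap "$(pxe_capture_filter)"
# Stop with Ctrl+C once the client has failed, then open the file in wireshark.
```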
-
Here is the wireshark capture.
[url=“/_imported_xf_attachments/1/1868_Wireshark Capture.zip?:”]Wireshark Capture.zip[/url]
-
Unfortunately I am not able to see what’s going on by looking at the packet dump. I see a full DHCP conversation (discover, offer, request, ack). That’s probably when the NIC itself requests an IP (and PXE) on boot up. After that I can see another DHCP discover and offer. But the client does not seem to handle the offer coming from the DHCP server. Anyone else got an idea what might be wrong here?
Could you please build your own iPXE binary and add “DEBUG=dhcp” to the make call?? It might be interesting to see the output. Please take a picture!
-
[quote=“Uncle Frank, post: 45479, member: 28116”]Unfortunately I am not able to see what’s going on by looking at the packet dump. I see a full DHCP conversation (discover, offer, request, ack). That’s probably when the NIC itself requests an IP (and PXE) on boot up. After that I can see another DHCP discover and offer. But the client does not seem to handle the offer coming from the DHCP server. Anyone else got an idea what might be wrong here?
Could you please build your own iPXE binary and add “DEBUG=dhcp” to the make call?? Might be interesting to see the output. Please take a picture![/quote]
[ATTACH=full]1870[/ATTACH]
[url=“/_imported_xf_attachments/1/1870_Debug.jpg?:”]Debug.jpg[/url]
-
Hmmm, seems like you end up with an iPXE shell. Could you please try again and enter the following command on that shell:
[CODE]ifstat[/CODE]
For more information see here: [url]http://ipxe.org/cmd/ifstat[/url]
I am wondering if you see “RX:0” or “RX:4” …?? If you see zero it means that it was unable to receive the DHCP answers (although we see that they are on the network).
The next step could be to build iPXE with the realtek driver only:
[CODE]make bin/realtek.kpxe EMBED=… DEBUG=realtek:3[/CODE]
For further posts please specify exactly which iPXE binary you were using, so that we don’t all get confused when re-reading the thread…
[B]Edit: I just tried to look up your MAC address. [url=“http://www.macvendorlookup.com”]www.macvendorlookup.com[/url] is telling me that it is “QUANTA COMPUTER INC.”. Maybe I am wrong about the realtek NIC?? Could you please boot that machine with a live CD or something else and tell me which NIC is in that 11e. Best would be if you can find out the PCI ID!![/B]
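One way to get the PCI ID from a Linux live CD is lspci -nn, which prints the numeric vendor:device pair in square brackets at the end of each device line. The helper below is just an illustrative sketch (the function name is an assumption) that keeps only the IDs of Ethernet controllers:

```shell
#!/bin/sh
# Sketch: pull vendor:device IDs of Ethernet controllers out of `lspci -nn` output.
ethernet_pci_ids() {
    grep -i 'ethernet controller' \
        | grep -oE '\[[0-9a-fA-F]{4}:[0-9a-fA-F]{4}\]' \
        | tr -d '[]'
}

# Usage on the booted live system:
#   lspci -nn | ethernet_pci_ids
```

The four-digit class code that lspci -nn also prints (for example [0200]) has no colon inside the brackets, so it is not matched.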
-
[quote=“Uncle Frank, post: 45539, member: 28116”]Hmmm, seems like you end up with an iPXE shell. Could you please try again and enter the following command on that shell:
[CODE]ifstat[/CODE]
For more information see here: [url]http://ipxe.org/cmd/ifstat[/url]
I am wondering if you see “RX:0” or “RX:4” …?? If you see zero it means that it was unable to receive the DHCP answers (although we see that they are on the network).
The next step could be to build iPXE with the realtek driver only:
[CODE]make bin/realtek.kpxe EMBED=… DEBUG=realtek:3[/CODE]
For further posts please specify exactly which iPXE binary you were using, so that we don’t all get confused when re-reading the thread…
[B]Edit: I just tried to look up your MAC address. [URL=‘http://www.macvendorlookup.com’]www.macvendorlookup.com[/URL] is telling me that it is “QUANTA COMPUTER INC.”. Maybe I am wrong about the realtek NIC?? Could you please boot that machine with a live CD or something else and tell me which NIC is in that 11e. Best would be if you can find out the PCI ID!![/B][/quote]
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 10)
With a debug build it does not get past initializing the device. I was able to get DHCP to work with the device by using gPXE, which is odd because it is discontinued.
-
That leads me to think, then, that the wireless NIC on the system is trying to come up with an IP and failing because there’s nothing to connect to. Would you be willing to change the boot.php lines and swap mac0 and mac1 in the code?
-
[quote=“Tom Elliott, post: 45560, member: 7271”]That leads me to think, then, that the wireless NIC on the system is trying to come up with an IP and failing because there’s nothing to connect to. Would you be willing to change the boot.php lines and swap mac0 and mac1 in the code?[/quote]
And where is the file located? I cannot find it on the FOG machine nor in the iPXE source.
-
/var/www/fog/service/ipxe/boot.php
-
[quote=“Tom Elliott, post: 45568, member: 7271”]/var/www/fog/service/ipxe/boot.php[/quote]
Same thing: configuring net0 fails and it reboots. And the reason it shows as Quanta is that Realtek uses Quanta as their ODM manufacturer.
-
Ah, then please change the file back so the mac0 entries are correct again. We now know that this is not the issue.