Network Boot forgets Ethernet card exists after booting



  • Server
    • FOG Version: SVN 6063
    • OS: Debian Jessie
    Client
    • Service Version: N/A
    • OS: N/A
    Description

    PXE booting is going to be the death of me! I need to image some computers, and I’ve just learned that FOG is not working. I started troubleshooting, and here’s what I found:

    1. The default menu appears for SOME clients (this is new to me; I’ve never seen it before, and the only option is “fog”). After the zero-second timeout (where is it getting this setting from?), it moves on to try to network boot the Wi-Fi card, which fails.
    2. On the one I care about imaging right now, it starts the network boot, gets its IP address and info from the DHCP server, then starts to load the iPXE environment, which tries to boot the Wi-Fi card and does not seem to detect the Ethernet card that booted it.

    I have tried re-installing the latest version from git, and the issue is still there. It seems like the iPXE kernel does not recognize the basic NIC in two different test machines - a VirtualBox machine and an Asus EeePC.

    DHCP options:
    next-server 192.168.0.4;
    filename "undionly.kpxe";

    FOG server is located at 192.168.0.4, and the /tftpboot directory looks normal:

    /tftpboot# ls -lah
    total 7.1M
    drwxr-xr-x  6 fog  root 4.0K Jan 27 12:09 .
    drwxr-xr-x 25 root root 4.0K Jan 27 12:09 ..
    drwxr-xr-x  2 fog  root 4.0K Jan 27 12:09 10secdelay
    -rw-r-xr-x  1 fog  root  840 Jan 27 12:09 boot.txt
    -rw-r-xr-x  1 fog  root  426 Jan 27 12:09 default.ipxe
    drwxr-xr-x  2 fog  root 4.0K Aug 25 09:00 i386-7156-efi
    drwxr-xr-x  2 fog  root 4.0K May 31  2016 i386-efi
    -rw-r-xr-x  1 fog  root 195K Jan 27 12:09 intel7156.efi
    -rw-r-xr-x  1 fog  root 216K Jan 27 12:09 intel.efi
    -rw-r-xr-x  1 fog  root  92K Jan 27 12:09 intel.kkpxe
    -rw-r-xr-x  1 fog  root  92K Jan 27 12:09 intel.kpxe
    -rw-r-xr-x  1 fog  root  92K Jan 27 12:09 intel.pxe
    -rw-r-xr-x  1 fog  root 921K Jan 27 12:09 ipxe7156.efi
    -rw-r-xr-x  1 fog  root 959K Jan 27 12:09 ipxe.efi
    -rw-r-xr-x  1 fog  root 846K Jan 27 12:09 ipxe.iso
    -rw-r-xr-x  1 fog  root 337K Jan 27 12:09 ipxe.kkpxe
    -rw-r-xr-x  1 fog  root 337K Jan 27 12:09 ipxe.kpxe
    -rw-r-xr-x  1 fog  root 337K Jan 27 12:09 ipxe.krn
    -rw-r-xr-x  1 fog  root 337K Jan 27 12:09 ipxe.pxe
    -rw-r-xr-x  1 fog  root 121K Jan 27 12:09 ldlinux.c32
    -rw-r-xr-x  1 fog  root 184K Jan 27 12:09 libcom32.c32
    -rw-r-xr-x  1 fog  root  26K Jan 27 12:09 libutil.c32
    -rw-r-xr-x  1 fog  root  26K Jan 27 12:09 memdisk
    -rw-r-xr-x  1 fog  root  29K Jan 27 12:09 menu.c32
    -rw-r-xr-x  1 fog  root  43K Jan 27 12:09 pxelinux.0
    -rw-r-xr-x  1 fog  root  43K Jan 27 12:09 pxelinux.0.old
    drwxr-xr-x  2 fog  root 4.0K May 31  2016 pxelinux.cfg
    -rw-r-xr-x  1 fog  root 195K Jan 27 12:09 realtek7156.efi
    -rw-r-xr-x  1 fog  root 216K Jan 27 12:09 realtek.efi
    -rw-r-xr-x  1 fog  root  93K Jan 27 12:09 realtek.kkpxe
    -rw-r-xr-x  1 fog  root  93K Jan 27 12:09 realtek.kpxe
    -rw-r-xr-x  1 fog  root  93K Jan 27 12:09 realtek.pxe
    -rw-r-xr-x  1 fog  root 194K Jan 27 12:09 snp7156.efi
    -rw-r-xr-x  1 fog  root 215K Jan 27 12:09 snp.efi
    -rw-r-xr-x  1 fog  root 194K Jan 27 12:09 snponly7156.efi
    -rw-r-xr-x  1 fog  root 215K Jan 27 12:09 snponly.efi
    -rw-r-xr-x  1 fog  root  92K Jan 27 12:09 undionly.kkpxe
    -rw-r-xr-x  1 fog  root  92K Jan 27 12:09 undionly.kpxe
    -rw-r-xr-x  1 fog  root 374K May 31  2016 undionly.kpxe.INTEL
    -rw-r-xr-x  1 fog  root  92K Jan 27 12:09 undionly.pxe
    -rw-r-xr-x  1 fog  root  30K Jan 27 12:09 vesamenu.c32
    

    Suggestions of where to look? The DHCP server is also Debian Jessie, but on a different server.

    I ran a tcpdump on the interface and looked at it in Wireshark, as suggested. The only errors I see are an unknown error at the top and missing files, such as pxelinux.cfg/80833fa6-f091-4681-2969-485b39123be5, then other weird ID numbers, until it finds pxelinux.cfg/default and continues on. I can post the capture file if it would help further.


  • Moderator



  • @Sebastian-Roth Using isc-dhcp-server on a separate Debian Jessie server (192.168.0.1)

    @Wayne-Workman I’ll give that a try and report back. Because I never installed FOG’s DHCP server, I never knew about those lines.


    Edit: I got this working. The issue was with the switch and, I’m guessing, auto-negotiation. I set the port down to 100M full duplex, and it worked on the next boot! The extra lines added to DHCP made no difference, though.


  • Developer

    @lukebarone What kind of DHCP server/software are we talking about here? The config snippet you posted is partly DHCP and partly DNS (zone ...). I am not aware of any software that can handle this kind of config. dnsmasq can do DHCP and DNS but has a different config syntax, as far as I remember.


  • Moderator

    @lukebarone You’re missing all the pxe options in the configuration. Below is what the fog installer puts at the top of the dhcpd.conf file. Put this at the top of the file and then give dhcpd a restart.

    option space PXE;
    option PXE.mtftp-ip    code 1 = ip-address;
    option PXE.mtftp-cport code 2 = unsigned integer 16;
    option PXE.mtftp-sport code 3 = unsigned integer 16;
    option PXE.mtftp-tmout code 4 = unsigned integer 8;
    option PXE.mtftp-delay code 5 = unsigned integer 8;
    option arch code 93 = unsigned integer 16; # RFC4578
    

    Also towards the bottom of your configuration, the below lines are redundant, as you have it set globally already.

    option routers 192.168.31.254;
    option netbios-name-servers 192.168.0.3;
    


  • @Wayne-Workman Here it is:

    authoritative;
    option domain-name "sd57.lan";
    option domain-name-servers 192.168.0.1,199.175.16.2;
    option netbios-name-servers 192.168.0.3;
    option local-pac-server code 252 = text;
    option domain-search "sd57.bc.ca", "sd57.lan";
    option routers 192.168.31.254;
    
    ddns-updates           on;
    ddns-update-style      interim;
    ignore                 client-updates;
    update-static-leases   on;
    
    default-lease-time 3600;  #1 Hr
    max-lease-time 28800;
    log-facility local7;
    
    include "/etc/dhcp/ddns.key";
    
    zone sd57.lan. {
      primary 127.0.0.1;
      key DDNS_UPDATE;
    }
    
    zone 0.168.192.in-addr.arpa. {
      primary 127.0.0.1;
      key DDNS_UPDATE;
    }
    
    subnet 192.168.0.0 netmask 255.255.224.0 {
    	authoritative;
    	ddns-domainname "cla.sd57.bc.ca";
    	next-server 192.168.0.4;	# FOG Server
    #	filename "ipxe.pxe";
    #	filename "intel.pxe";
    #	filename "pxelinux.0";
    	filename "undionly.kpxe";	# The closest thing I have to something working
    #	filename "undionly.pxe";
    	range 192.168.1.1 192.168.8.254;
    	option subnet-mask 255.255.224.0;
    	option broadcast-address 192.168.31.255;
    	option routers 192.168.31.254;
    	option netbios-name-servers 192.168.0.3;
    	option netbios-node-type 8;
    }
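
    The subnet arithmetic in this config can be double-checked with a few lines of Python. This is only a sketch using the standard ipaddress module; the check_subnet helper is a made-up name for illustration, and the values are copied from the subnet block above.

```python
import ipaddress


def check_subnet(network, netmask, broadcast, addresses):
    """Return (network object, list of problems) for a dhcpd subnet block."""
    net = ipaddress.ip_network(f"{network}/{netmask}")
    problems = []
    # The configured broadcast address should match the computed one.
    if net.broadcast_address != ipaddress.ip_address(broadcast):
        problems.append(f"broadcast should be {net.broadcast_address}")
    # Every address the clients are told about should sit inside the subnet.
    for label, addr in addresses.items():
        if ipaddress.ip_address(addr) not in net:
            problems.append(f"{label} {addr} is outside {net}")
    return net, problems


net, problems = check_subnet(
    "192.168.0.0",
    "255.255.224.0",
    "192.168.31.255",
    {
        "next-server": "192.168.0.4",
        "range start": "192.168.1.1",
        "range end": "192.168.8.254",
        "routers": "192.168.31.254",
    },
)
print(net, problems)
```

    With the values above, the netmask works out to a /19 and nothing falls outside it, so the addressing itself does not look like the culprit.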
    

  • Moderator

    @lukebarone It’s dropping you to an ipxe shell.

    So, you said previously that you had the correct file configured in dhcpd.conf, but later found that this wasn’t the case. What did you find? What did you change? Do you have more than one DHCP server? What was next-server configured as (option 066 in Windows)?

    Can you post your dhcpd.conf file here so I can look through it for issues?



  • OK, I got it to use the correct filename. Now, the machines start the booting from the network, and when iPXE loads (1.0.0+ 26050), it goes to:

    Configuring (net0 <MAC Address>)...... No configuration methods succeeded
    Configuring (net0 <MAC Address>)...... OK
    iPXE>
    

    The advantage now is that the correct MAC address is attempting to boot; the unfortunate part is that the FOG menu is still not coming up :-(

    I also tried with a “dumb switch” just before the computers in question - no change. I know that STP is disabled on all my switches leading back to the FOG Server (when Googling other forums, this came up quite a bit to check for).



  • @Tom-Elliott Well, this is awkward…

    Captured a tcpdump from the DHCP server, and you’re right! It’s dishing out pxelinux.0, not what I specified in the /etc/dhcp/dhcpd.conf file! More investigating on my part now… Thanks for the direction to look in!
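
    For anyone wanting to check the same thing without Wireshark, here is a rough sketch of pulling the boot filename out of a captured DHCP reply in Python. It assumes you already have the raw UDP payload of the reply as bytes (getting it out of the capture file is not shown), and it only reads the fixed BOOTP header fields. Some servers send the filename in option 67 instead, which this does not handle. The parse_bootp_reply name and the sample packet are made up for illustration.

```python
import socket


def parse_bootp_reply(payload: bytes) -> dict:
    """Extract next-server, server name and boot filename from a BOOTP/DHCP payload."""
    if len(payload) < 236:
        raise ValueError("payload too short for a BOOTP header")
    # siaddr (bytes 20-23) is the "next-server" address.
    next_server = socket.inet_ntoa(payload[20:24])
    # sname (64 bytes) and file (128 bytes) are NUL-padded strings.
    server_name = payload[44:108].split(b"\x00", 1)[0].decode("ascii", "replace")
    filename = payload[108:236].split(b"\x00", 1)[0].decode("ascii", "replace")
    return {"next_server": next_server, "server_name": server_name, "filename": filename}


# Hand-built sample reply, only to show the field layout.
sample = bytearray(236)
sample[0] = 2  # op = BOOTREPLY
sample[20:24] = socket.inet_aton("192.168.0.4")
name = b"pxelinux.0"
sample[108:108 + len(name)] = name

print(parse_bootp_reply(bytes(sample)))
```

    If the filename printed here differs from the one in dhcpd.conf, the client is being answered by something other than the config you are editing.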


  • Senior Developer

    I think the problem is we need to see what the DHCP is actually handing out.

    From what I can see, it’s pointing at pxelinux.0.

    pxelinux.0 is passing to ipxe.krn. The ipxe.krn has an embedded script that is told to look at the default.ipxe file.

    pxelinux.0, when initially loaded, looks for the “uuid” first, and on down until there’s nothing found and then it tries default. Default is what’s handing out the data back to the client machine (telling it to load ipxe.krn).

    If your dhcp is handing out undionly.kpxe as your original post suggests, then you’re not looking at the right place because your clients are definitely NOT looking at what you think they’re looking at.
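
    The lookup order described above explains the “weird ID numbers” in the capture. As a rough sketch, the list of config files pxelinux.0 asks for can be reproduced in Python. The function name is made up, and the MAC and IP in the example are hypothetical; only the UUID comes from the capture mentioned in the original post.

```python
import ipaddress


def pxelinux_config_candidates(uuid, mac, ip):
    """Return the config paths pxelinux.0 requests over TFTP, in order."""
    paths = []
    # 1. Client UUID, if the PXE stack provides one (lowercase hex).
    if uuid:
        paths.append(f"pxelinux.cfg/{uuid.lower()}")
    # 2. Hardware type 01 (Ethernet) plus the MAC, lowercase, dash-separated.
    paths.append("pxelinux.cfg/01-" + mac.lower().replace(":", "-"))
    # 3. IPv4 address as 8 uppercase hex digits, dropping one digit per try.
    hex_ip = f"{int(ipaddress.IPv4Address(ip)):08X}"
    for length in range(8, 0, -1):
        paths.append(f"pxelinux.cfg/{hex_ip[:length]}")
    # 4. Finally the catch-all file.
    paths.append("pxelinux.cfg/default")
    return paths


candidates = pxelinux_config_candidates(
    "80833fa6-f091-4681-2969-485b39123be5",  # UUID seen in the capture
    "00:11:22:33:44:55",                     # hypothetical MAC
    "192.168.1.10",                          # hypothetical leased IP
)
for path in candidates:
    print(path)
```

    Every entry before pxelinux.cfg/default normally shows up as a “file not found” in the capture, so those misses are expected and not the actual fault.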



  • @Tom-Elliott OK, I can find the ipxe.krn file (not ipxe.lkrn) in my /tftpboot folder. Assuming this is correct, why will the FOG menu not appear? Is there more work I can trace back to help?


  • Senior Developer

    Based on your tcpdump your pxe server is being handed information by the pxelinux.0 file, not ipxe. This would be, from my understanding, passing to an ipxe.lkrn file.

    The pxelinux.cfg is the indicator to me for this.


