Some machines won't boot to FOG menu



  • Server
    • FOG Version: 1.3.5-RC-10
    • OS: CentOS 7
    Client
    • Service Version: 6066
    • OS:
    Description

    I have added a new FOG server on CentOS 7 using the latest version of FOG. Some of our machines boot fine, but others get stuck in a reboot loop during iPXE. I am very new to this program and am trying to copy the setup from a prior install done before I started working here. I have changed the boot file on our Linux DHCP server to undionly.kpxe but have no idea what else to do to get it working. We have a mix of hardware, running mostly Windows 7 and a few 8.1 and 10 machines. Thank you for any help you may be able to provide. I have included a pic of the error we get.
    0_1487773405652_2017-02-22 09.12.32.jpg



  • @SBodager That seems to have fixed the issue for the physical machines that were having problems, but not for the VMs.


  • Moderator

    @SBodager In theory (I only say that because I haven’t personally tested it), you can change your DHCP option 67 to 10secdelay/undionly.kpxe or whatever boot file you need.

    But to answer your question, the 10-second delay files are on the FOG server in /tftpboot/10secdelay
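
    A minimal sketch of what that change might look like in an ISC dhcpd.conf, reusing the PXEClient match and the next-server address from the dhcpd.conf posted in this thread. It assumes the TFTP root on the FOG server is /tftpboot, so the relative path resolves to /tftpboot/10secdelay/undionly.kpxe. Like the suggestion itself, it is untested here.

    ```
    # Sketch only: same match block as the posted config, pointing at the delayed build.
    # next-server is copied from the posted dhcpd.conf; adjust it to your FOG server.
    if substring (option vendor-class-identifier, 0, 9) = "PXEClient" {
        next-server 172.16.1.98;
        # Path is relative to the TFTP root (assumed to be /tftpboot).
        filename "10secdelay/undionly.kpxe";
    }
    ```

    dhcpd has to be restarted after the edit for the new filename to be handed out.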



  • @Tom-Elliott Thank you. Are the 10-second delay files the .kkpxe files? I’m new to all this. The guys who set it up are no longer with the company, so I’m being pressed into learning it on my own.



  • Thanks @Tom-Elliott I saw the folder. Sorry for being lazy. I’ll try it out and report back.



  • @Tom-Elliott Where can I get it?
    They are old systems. Some are as old as 5 years, others about 3 years.


  • Senior Developer

    @SBodager You can also try the 10-second delay files. These were created for exactly this kind of case.


  • Senior Developer

    @TaTa We have 10-second delay iPXE files that would likely fit the bill for your systems.

    While the network itself may not have changed, I’m willing to guess that the systems having problems are relatively new?



  • I’m having this same problem too. Client machines fail to boot to the FOG menu about 40% of the time, and if they reboot a second time the failure rate usually goes down to 10%. However, if I hit “s” and then type “autoboot” at the prompt, they are able to get to the FOG menu. I’ve spoken with our network admins, and they said that nothing has changed in their switch configuration in a few years. This only started recently, when I upgraded FOG to 1.3.3 and 1.3.4. Is there any way we can delay FOG or speed up iPXE?



  • @george1421

    Thank you for the info. I am looking into this, but the machine is in another location, so it is taking me a while to find the time to get out there.


  • Moderator

    @SBodager Try undionly.kkpxe first as the filename.

    I’ll give you a better configuration tonight to support more stuff.

    Also, do the tests that @george1421 posted. Based on the outcome, we will know for certain whether it is or isn’t a network issue. Don’t assume it’s not; test and know.
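
    For reference, trying undionly.kkpxe only changes the filename line in the PXEClient block of the dhcpd.conf posted in this thread. The fragment below is a sketch under that assumption; the next-server address is copied from that posted config, and it presumes undionly.kkpxe sits in the TFTP root of the FOG server.

    ```
    # Sketch only: the posted PXEClient block with the boot file swapped to the .kkpxe build.
    if substring (option vendor-class-identifier, 0, 9) = "PXEClient" {
        next-server 172.16.1.98;
        filename "undionly.kkpxe";
    }
    ```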



  • @Wayne-Workman

    authoritative;
    ddns-updates on;
    ddns-domainname "wooster.net";
    ddns-update-style interim;
    default-lease-time 28800;
    max-lease-time 43200;
    option domain-name-servers 192.168.10.14, 192.168.10.20;
    option domain-name "wooster.net";
    option ntp-servers 192.168.10.16;
    option netbios-name-servers 192.168.10.20;

    subnet 172.16.0.0 netmask 255.255.0.0 {
    option routers 172.16.1.200;
    # range 172.16.5.1 172.16.5.11;
    }

    subnet 172.17.0.0 netmask 255.255.0.0 {
    option routers 172.17.1.200;
    range 172.17.1.110 172.17.1.190;
    }

    subnet 192.168.101.0 netmask 255.255.255.0 {
    option routers 192.168.101.200;
    range 192.168.101.101 192.168.101.118;
    }

    subnet 192.168.102.0 netmask 255.255.255.0 {
    option routers 192.168.102.200;
    range 192.168.102.101 192.168.102.122;
    }

    subnet 192.168.103.0 netmask 255.255.255.0 {
    option routers 192.168.103.200;
    range 192.168.103.101 192.168.103.125;
    }

    subnet 192.168.104.0 netmask 255.255.255.0 {
    option routers 192.168.104.200;
    range 192.168.104.101 192.168.104.115;
    }

    subnet 192.168.105.0 netmask 255.255.255.0 {
    option routers 192.168.105.200;
    range 192.168.105.101 192.168.105.105;
    }

    subnet 192.168.106.0 netmask 255.255.255.0 {
    option routers 192.168.106.200;
    range 192.168.106.101 192.168.106.140;
    }

    subnet 192.168.107.0 netmask 255.255.255.0 {
    option routers 192.168.107.200;
    range 192.168.107.101 192.168.107.120;
    }

    subnet 192.168.108.0 netmask 255.255.255.0 {
    option routers 192.168.108.200;
    range 192.168.108.101 192.168.108.151;
    }

    subnet 192.168.109.0 netmask 255.255.255.0 {
    option routers 192.168.109.200;
    range 192.168.109.101 192.168.109.118;
    }

    subnet 192.168.110.0 netmask 255.255.255.0 {
    option routers 192.168.110.200;
    range 192.168.110.101 192.168.110.121;
    }

    subnet 192.168.111.0 netmask 255.255.255.0 {
    option routers 192.168.111.200;
    range 192.168.111.101 192.168.111.125;
    }

    subnet 192.168.112.0 netmask 255.255.255.0 {
    option routers 192.168.112.200;
    range 192.168.112.101 192.168.112.121;
    }

    # Addressing for computers being "FOG" imaged
    if substring (option vendor-class-identifier, 0, 9) = "PXEClient" {
    # next-server 172.16.1.99;
    # filename "pxelinux.0";
    next-server 172.16.1.98;
    filename "undionly.kpxe";
    }


  • Moderator

    This looks like it could be a spanning tree issue. If you have an unmanaged (dumb) switch, put it between the troubled computer (above) and the building switch, then test again.

    What I think is happening: if standard spanning tree protocol is enabled, it takes about 27 seconds from the time the link comes up until the port starts forwarding data. As the PXE ROM hands off control of the network adapter to the iPXE kernel, iPXE momentarily resets the network adapter, causing the link to drop very briefly. That starts the 27-second counter again. FOG boots so fast that by the time the 27 seconds have ticked by, the boot process has already given up.

    So try the unmanaged switch test first, then get with your infrastructure folks and make sure that, if you use spanning tree (a good thing), you use one of the fast spanning tree protocols so the port starts forwarding traffic right away.


  • Moderator

    @SBodager Can you share a copy of your /etc/dhcp/dhcpd.conf file? You can IM me via the forums for my email if you don’t want to post it here.



  • We have a separate DHCP server, and it is the only one.


  • Moderator

    Edit your post and try the pic again; you didn’t give it enough time to upload.

    Is the FOG server doing dhcp or a separate Linux server?

    Do you have more than one dhcp server?

