• I have a dedicated FOG server on a Proxmox box I’ve been setting up all day today. I get the following when trying to PXE boot a client.

    https://imgur.com/a/1t2osMZ

    I’ve read through all the other TFTP threads and tried all the troubleshooting steps I could find, clean re-installing multiple times along the way, but I still have the same issue. I’m able to successfully pull undionly.kpxe from a separate Windows machine, so I can see that TFTP is working.

    No clue what’s going on.

  • Senior Developer

    @choppaholic26 Oh well, I should have had a look at the topic line again to know it’s about a TFTP timeout.

    I think the only way to maybe find out is to capture the network traffic exactly where this is happening. Either you have a managed switch and can enable a monitoring port, or you use one of those really old network hubs, to pick up the traffic from/to the PXE booting client machine.


  • @sebastian-roth said in Another TFTP timeout issue:

    Do you still get the exact same error message “No DHCP or proxyDHCP offer received”? If not, post another picture.

    No I never got that error I always got the TFTP timeout error.

    alt text

  • Senior Developer

    @choppaholic26 said in Another TFTP timeout issue:

    Yes I work at a recycler there’s lots of different equipment I’ve checked.

    It’s really strange none of the physical machines would PXE boot if VirtualBox can! Do you still get the exact same error message “No DHCP or proxyDHCP offer received”? If not, post another picture.


  • @sebastian-roth I started from scratch with a new VM and used a different subnet to see if that was the issue but it’s all the same.

    Thank you for reminding me about those changes. I added them to dhcpd.conf again but still no dice. I’m back to square one.

    Yes I work at a recycler there’s lots of different equipment I’ve checked.

  • Senior Developer

    @choppaholic26 I am at a loss on what’s wrong in that setup. I wonder how you got rid of the “Either DHCP failed or we were unable to access …” issue on the other hand. Just doesn’t make sense to me.

    Looking at the new PCAPs I see a few things:

    1. Seems like you have changed the IP subnet. I hope you are aware that changing a FOG server’s IP needs manual adjustments in several places - see documentation.
    2. The DHCP options 66 and 67 are still not set as we see in the PCAP so I expect the manual adjustments to the dhcpd.conf were lost along the way.

    Have you tried other physical machines to see if it’s specific to this laptop you are trying to PXE boot?


  • @sebastian-roth It’s getting even weirder now. I thought I managed to fix it after trying another clean install.

    I was even able to capture a few Windows images using VirtualBox. But I just figured out that everything works fine ONLY in VirtualBox. Nothing has changed when trying to PXE boot a physical laptop or PC. I still get a TFTP timeout error.

    A dumb switch doesn’t make a difference for either the virtual or physical PXE attempt. (Virtual works every time, physical does not). I’m not using any fancy networking. The physical laptop uses the exact same adapter that is bridged to VirtualBox when it succeeds, so I have no idea what is going on. I have restarted both the VMs and the actual server but that hasn’t fixed it.

    Server -> VirtualBox - Success
    Server -> Dumb Switch -> VirtualBox - Success
    Server -> Physical Laptop - Fail
    Server -> Dumb Switch -> Physical Laptop - Fail

    I captured two pcap files. One from the successful VirtualBox and one from the unsuccessful TFTP timeout physical laptop.

    virtual.pcap

    physical.pcap

  • Senior Developer

    @choppaholic26 Ok. Those manual tests just work fine from what we see. I thought of a spanning tree issue kicking in late (not on the early PXE boot but only when the FOS kernel boots up) but that shouldn’t happen when you connect the notebook directly to the Proxmox host. Just to be very sure this is not something caused by a direct connection, can you connect a dumb mini switch in between those two to keep the ports up?

    Yes I made those changes but nothing changed with either the attempt to PXE boot VirtualBox or the physical laptop.

    Did you restart the dhcpd service on the FOG server or the whole VM to make sure the changes were applied? If so you might want to take another packet dump so we can check the DHCP options in the packets.

    It’s really strange you stumble into so many issues right from the beginning. Too bad this keeps you from diving into the real fun with imaging using FOG.


  • @sebastian-roth said in Another TFTP timeout issue:

    With debug enabled you should get to a command shell after a while. Please run the following commands and take a picture:

    Here you go:

    alt text
    alt text

    By the way, did you see my post on extending your dhcpd.conf to get option 66/67? https://forums.fogproject.org/post/141631

    Yes I made those changes but nothing changed with either the attempt to PXE boot VirtualBox or the physical laptop.

  • Senior Developer

    @choppaholic26 Ok, but after that we should also see the requests for http://192.168.70.11/fog//index.php in that access log! Ignore the double slash in the URL, it looks ugly but Apache can handle that.

    So as a next step I would suggest you manually create the VirtualBox VM as a host in the FOG web UI (using its MAC address) and schedule a debug deploy task for it. I know there is nothing to deploy but this way we can get to a debug command shell.

    In the FOG web UI click on the host you just manually created -> Basic Tasks tab -> Deploy -> enable checkbox for debug task and create the task. Now PXE boot the VirtualBox VM as usual. This time you should not get the menu but it should boot straight up to the last issue. With debug enabled you should get to a command shell after a while. Please run the following commands and take a picture:

    ip a s
    echo "${web}"
    curl -Ik "${web}"/index.php
    

    Note: The first curl parameter is an I as in India, not l or 1. If that call fails for a non-obvious reason you might add the verbose switch to get more information: curl -Ikv ...

    By the way, did you see my post on extending your dhcpd.conf to get option 66/67? https://forums.fogproject.org/post/141631


  • @sebastian-roth said in Another TFTP timeout issue:

    Sorry my bad! Newer Ubuntu versions seem to log those requests to /var/log/apache2/other_vhosts_access.log…

    Roger that. I tried it again and I do see the 3 requests for those files.

    alt text

  • Senior Developer

    @choppaholic26 said in Another TFTP timeout issue:

    I PXE booted to the failed registration attempt but unless I’m doing it wrong this file is completely empty for me.

    Sorry my bad! Newer Ubuntu versions seem to log those requests to /var/log/apache2/other_vhosts_access.log


  • @sebastian-roth said in Another TFTP timeout issue:

    Can you please run tail -f /var/log/apache2/access.log while PXE booting the VirtualBox VM? You should see requests for boot.php, then bzImage and init.xz and lastly another request as connection check which we see in the picture. Do you see all those?

    I PXE booted to the failed registration attempt but unless I’m doing it wrong this file is completely empty for me. See below:
    alt text


  • Senior Developer

    @choppaholic26 The new pictures you posted clearly show that it does get an IP from the DHCP. So from the message “Either DHCP failed or we were unable to access http://192.168.70.11/…” I would imagine the latter one to fail. Though the URL looks all fine.

    Can you please run tail -f /var/log/apache2/access.log while PXE booting the VirtualBox VM? You should see requests for boot.php, then bzImage and init.xz and lastly another request as connection check which we see in the picture. Do you see all those?
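
    If watching the log live is awkward, the same check can be done after the fact. The following is only a rough sketch: it assumes the log uses one of Apache's stock formats where the request line appears in double quotes followed by the status code, and the function name `boot_requests_seen` is made up for illustration. It reports which of the expected files (boot.php, bzImage, init.xz and the index.php connection check) were requested and with what HTTP status.

```python
import re

# File names a FOG PXE boot is expected to request, in order (taken from the post above).
EXPECTED = ("boot.php", "bzImage", "init.xz", "index.php")

# Matches the quoted request line and the status code that follows it,
# e.g. "GET /some/path HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+" (\d{3})')


def boot_requests_seen(lines, expected=EXPECTED):
    """Return {file_name: [status, ...]} for every expected file.

    An empty list means no request for that file was found in the log.
    """
    seen = {name: [] for name in expected}
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path, status = match.group(1), int(match.group(2))
        # Drop any query string, then keep only the last path component.
        basename = path.split("?", 1)[0].rsplit("/", 1)[-1]
        if basename in seen:
            seen[basename].append(status)
    return seen
```

    To use it, pass the lines of whichever access log your Apache actually writes to, for example `boot_requests_seen(open("/var/log/apache2/access.log"))`. A file with an empty list never reached the web server at all.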

  • Senior Developer

    @choppaholic26 said in Another TFTP timeout issue:

    However, if I wait a couple of seconds while in the FOG menu, the registration will start (I tested waiting a few seconds on memtest and it started running fine) but DHCP will fail.

    Well, that’s awkward. Maybe some rate limiting firewall on the Proxmox host?! Just a wild guess.

    As far as the setup, I have my physical proxmox server with the installed FOG VM setup to use its interfaces. I have a laptop with VirtualBox running that is plugged directly into the back of the server. To PXE boot I have a bridged adapter for VirtualBox to use to PXE in the VirtualBox settings.

    Ok, sounds reasonable. I do understand that using VirtualBox to test can make things easier and in this case it really is of help to see that TFTP actually works.

    Edit: Thought I should clarify though, that the TFTP timeout issue when PXE booting the actual laptop hasn’t changed.

    Ok. As mentioned before some PXE booting devices are happy to use whatever information they get while others are more picky and won’t boot unless you provide exactly what they are after. So from what we’ve seen so far it looks like the physical notebook doesn’t like the missing DHCP options (66/67).

    I have to admit that I haven’t looked at the DHCP offer/ack packets that isc-dhcp sends out in a long time. Seems like we actually miss the options 66 and 67 in our default config. Probably because 99.9% of machines properly PXE boot with the information given in the DHCP body as mentioned below.

    To get the extra options you need to adjust your config and add option tftp-server-name and option bootfile-name to make it look like this:

    ...
        next-server 192.168.70.11;
        option tftp-server-name "192.168.70.11";
        class "Legacy" {
            match if substring(option vendor-class-identifier, 0, 20) = "PXEClient:Arch:00000";
            filename "undionly.kkpxe";
            option bootfile-name "undionly.kkpxe";
        }
    ...
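
    To verify from a capture whether an offer really carries those options, the DHCP message can be decoded by hand. Below is a minimal sketch, not a full DHCP parser: the function names are hypothetical, the input is assumed to be the raw UDP payload of one DHCP offer or ack, and option overloading (option 52) is ignored. It reads next-server and filename from the fixed BOOTP body (the siaddr and file fields) and options 66 and 67 from the options area.

```python
BOOTP_HEADER_LEN = 236                  # fixed BOOTP part: op ... file
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the options area


def parse_dhcp_options(payload):
    """Return {option_code: raw_bytes} for a DHCP message (UDP payload)."""
    start = BOOTP_HEADER_LEN
    if payload[start:start + 4] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP message: magic cookie missing")
    options = {}
    i = start + 4
    while i < len(payload):
        code = payload[i]
        if code == 255:      # End option
            break
        if code == 0:        # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            raise ValueError("truncated option")
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def _text(raw):
    """Decode a NUL-terminated/padded ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", "replace")


def pxe_boot_info(payload):
    """Collect the boot server and boot file from both places a client may look."""
    opts = parse_dhcp_options(payload)
    return {
        "siaddr": ".".join(str(b) for b in payload[20:24]),   # next-server
        "file": _text(payload[108:236]),                      # filename
        "option66": _text(opts[66]) if 66 in opts else None,  # tftp-server-name
        "option67": _text(opts[67]) if 67 in opts else None,  # bootfile-name
    }
```

    A result where `siaddr` and `file` are filled in but `option66` and `option67` are `None` corresponds to the situation described above: the information is in the DHCP body only, which most clients accept and picky ones may not.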
    

  • @sebastian-roth Disabled the firewall but the issue persisted. I got something else though.

    Here’s the pcap file for the registration attempt I’ve outlined below: output.pcap

    I tried doing a registration like you suggested and I got this:

    alt text

    However, if I wait a couple of seconds while in the FOG menu, the registration will start (I tested waiting a few seconds on memtest and it started running fine) but DHCP will fail. See here:

    alt text
    alt text

    As far as the setup, I have my physical proxmox server with the installed FOG VM setup to use its interfaces. I have a laptop with VirtualBox running that is plugged directly into the back of the server. To PXE boot I have a bridged adapter for VirtualBox to use to PXE in the VirtualBox settings.

    I’ve also tried eliminating the complexity by just physically PXE booting the actual laptop. But I keep getting the same results anyway, so it’s just faster for me to use VirtualBox without having to restart over and over again.

    Edit: Thought I should clarify though, that the TFTP timeout issue when PXE booting the actual laptop hasn’t changed. Just thought that VirtualBox getting past that might say something about what the issue could be.

  • Senior Developer

    @choppaholic26 Wow, seems like an endeavor and I am not exactly sure where to jump in. Although George is right that the DHCP packets seem to be missing the actual DHCP options 66 and 67, I still think that some clients can handle this and PXE boot using the information seen in the DHCP base body.

    I would focus on capturing more traffic on the FOG server to see if the TFTP requests actually make it to the server. If you have used the command as suggested in the post by George (tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4011) we should actually see TFTP requests coming in (as from the initial picture you posted we know it tries to).
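
    While the capture is running, it can help to generate a known TFTP request from another machine on the same wire, so there is something definite to look for in the pcap. This is a small sketch under a few assumptions: the helper names are invented, it only sends a single read request (RRQ) to UDP port 69 and looks at the first reply, and it never acknowledges the data, so the server will retransmit a few times and then give up. It is a probe, not a TFTP client.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5  # TFTP opcodes used here


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def describe_reply(packet):
    """Summarise the first packet a TFTP server sends back."""
    opcode = struct.unpack("!H", packet[:2])[0]
    if opcode == OP_DATA:
        block = struct.unpack("!H", packet[2:4])[0]
        return f"DATA block {block}, {len(packet) - 4} bytes"
    if opcode == OP_ERROR:
        code = struct.unpack("!H", packet[2:4])[0]
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {code}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(server, filename, timeout=5.0):
    """Send one RRQ and report what comes back (or that nothing did)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            packet, _ = sock.recvfrom(65536)
        except socket.timeout:
            return "timeout: no reply from TFTP server"
    return describe_reply(packet)
```

    For example, `probe_tftp("192.168.70.11", "undionly.kpxe")` should report a first DATA block if the file is served. If the probe works but the PXE client's requests never show up in the capture, the problem sits between the client and the server rather than in the TFTP service.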

    You might check if the Ubuntu firewall is blocking it! sudo ufw disable and try again.

    By the way, where did you set up the VirtualBox stuff so it would PXE boot off your FOG server running in a VM on the Proxmox server? I am wondering if it’s more due to complexity in the setup that things don’t work. On the other hand, the new screenshots show it actually pulls the iPXE binary from the server (so the TFTP transfer must work). Besides booting into the memdisk task, have you tried doing a registration through the FOG menu yet?


  • @george1421 I think I might have something.

    1. I can:
      • Ping the fog server (192.168.70.11) successfully.
      • View the fog management web portal in my browser.
      • My workstation picks up an IP (192.168.70.33) so I can see that fog’s DHCP is working.
    2. Tried to PXE boot with several laptops, got the TFTP timeout:

    I tried something and I think I got something. I tried PXE booting using VirtualBox instead. IT DOES SUCCESSFULLY PXE BOOT, but take a look at the network interface. During the PXE boot you can see that it’s not up initially.

    alt text
    alt text

    After getting to the fog menu this is what happens when you try and pick an item (in this case memtest). You can see that the connection is dropping for whatever unknown reason.

    alt text

    The cables are good. The equipment and ports are good, everything lights up. DHCP is working. Fog management page is accessible.

    I’m out of ideas here.


  • @george1421 I’m using Proxmox 6.2 and Ubuntu Server 20.04 for fog.

    Does the networking inside the VM and in Proxmox look right to you? I’m sorry for bothering but I’m a little out of my element here.

    alt text
    alt text
    alt text
