Manually typing TFTP address after booting
-
@Oleg is there a specific case where it is requesting the info to be entered?
From what I can gather from this post, it sounds like it could be related to ip-helpers not being where they are needed. My guess, and that’s all I can do at this point, is that systems within the same scope as the FOG server are getting the next-server/filename properly, while the systems that are failing are not in the same scope. Maybe the non-working systems are behind another switch/VLAN?
-
I have been experiencing this as well since a few updates ago. No change in network architecture (yet), so I’m not sure why it’s suddenly occurring.
It clearly knows the tftp address, as it loads the ipxe.kpxe file from it, but then it asks for the tftp address anyway.
Also, this does not always happen, just almost always. (I’m talking about the same machines here. It happens network-wide: sometimes the machines don’t ask, but most of the time they do.)
-
On bootup a client requests DHCP information three times! First comes the PXE ROM itself. Once it has found the correct information it can load iPXE via TFTP. iPXE then sends the second DHCP request and should get the exact same info back from the DHCP server. iPXE does not care about the filename information (option 67) but it does need option 66 (next-server)! It uses this information to load default.ipxe via TFTP. So I am really wondering why iPXE wouldn’t receive the next-server info. Do you see the message ‘Received DHCP answer on interface net0’?
After the Linux kernel has been loaded via HTTP, the third DHCP request is sent… Here both PXE options are irrelevant/ignored. The kernel just needs an IP to communicate with the FOG server.
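A rough sketch of the three stages described above, in Python, just to make it easier to see which stage breaks when an answer comes back incomplete. The stage names, the `DhcpAnswer` class and the `missing_fields` helper are made up for illustration (none of this is FOG or iPXE code), and the requirements table simply restates the post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DhcpAnswer:
    """The handful of fields from a DHCP answer that matter for PXE booting."""
    ip: Optional[str] = None
    next_server: Optional[str] = None
    filename: Optional[str] = None


# What each boot stage needs from its DHCP answer, as described in the post.
STAGE_NEEDS = {
    "pxe_rom": ("ip", "next_server", "filename"),  # loads iPXE via TFTP
    "ipxe": ("ip", "next_server"),                 # loads default.ipxe via TFTP
    "linux_kernel": ("ip",),                       # only talks to the FOG server
}


def is_unset(value: Optional[str]) -> bool:
    """Treat a missing value and an all-zero address the same way."""
    return value in (None, "", "0.0.0.0")


def missing_fields(stage: str, answer: DhcpAnswer) -> list:
    """Return the fields this stage needs but the answer does not provide."""
    return [name for name in STAGE_NEEDS[stage] if is_unset(getattr(answer, name))]


if __name__ == "__main__":
    # Example answer that carries an address but no boot information.
    answer = DhcpAnswer(ip="192.168.1.54")
    for stage in STAGE_NEEDS:
        print(stage, "is missing:", missing_fields(stage, answer))
```

With an answer like the one in the demo, only the kernel stage is satisfied, which would match a client that loads iPXE fine on one attempt and then prompts for the TFTP address after a different answer arrives.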
-
I’ll check when I’m in at work tomorrow morning, but from what I recall it says something along the lines of “Configuring net0 (macaddress)…”
After which it will ask for tftp address.
-
Sorry for the glare, but the relevant info should be visible anyway.
I should note that I tested a Dell Latitude E6410 with the exact same cable and everything, and it did not have this issue no matter how many times I tried, which makes me think it has something to do with the NIC/drivers.
-
@Quazz Thanks for that! Interesting stuff. The checks we use in the embedded iPXE script are meant to proceed on the net0 device only if it got an answer from the DHCP server. So this seems to be the case. At least I hope so. Would you be able to capture DHCP traffic on the wire? For example using a hub just in front of the EliteBook, or on the DHCP server? Please let me know if you need assistance capturing the traffic. Please upload a pcap file of the whole bootup here and I’ll have a look!
-
I’m a bit busy today, but hopefully I’ll have some time for that later today.
I did notice some clients reboot right after loading the ipxe binary (and presumably before the point where I would or would not have to enter the tftp address).
I also noticed that the SQL backup is now 7 MB as opposed to the 1 MB I had before, not sure if that’s normal.
I have a feeling some of these are connected, but I could be way off.
-
I captured the DHCP traffic on the fog server when one of the clients was pxe booting (and had the tftp address prompt).
edit: The client that previously rebooted after ipxe initialization now claims it gets no filename (after server reboot), other clients working as before (tftp prompt and all)
-
@Quazz Thanks for the PCAP file. Three things strike me as a bit odd:
- Even the very first DHCP discovery in this capture file has DHCP option 175 set. This is usually not the case on a PXE booting NIC but only when iPXE/etherboot is being used. So where is the very first DHCP communication, or why is option 175 set (what kind of client is this?)?
- This first DHCP discovery is answered twice by the same DHCP server (192.168.1.156), offering two different IP addresses to the client! Seems like the client can handle this, but it is not a clean way of doing things and might be part of the issue.
- After ipxe.kpxe has been transferred via TFTP there is another round of DHCP communication (perfectly fine, as iPXE requests an IP) but this time I see two different DHCP servers answering (and offering different IPs): 192.168.1.156 and 192.168.1.1
From my point of view this DHCP setup might need a bit of a “cleanup” and hopefully your issues will go away.
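For anyone wanting to run the same sanity check on their own capture, here is a minimal Python sketch of the two things looked for above: more than one server answering, and one server offering more than one address. It assumes the offers have already been pulled out of the pcap by hand as (server, offered address) pairs; `summarize_offers` is a made-up helper, not part of any capture tool.

```python
from collections import defaultdict


def summarize_offers(offers):
    """Flag oddities in one round of DHCP offers.

    `offers` is a list of (server_ip, offered_ip) tuples for a single client.
    Returns a list of human-readable warnings (empty if nothing looks odd).
    """
    by_server = defaultdict(set)
    for server, offered in offers:
        by_server[server].add(offered)

    warnings = []
    if len(by_server) > 1:
        warnings.append(
            "multiple DHCP servers answered: " + ", ".join(sorted(by_server))
        )
    for server in sorted(by_server):
        count = len(by_server[server])
        if count > 1:
            warnings.append(f"{server} offered {count} different addresses")
    return warnings


if __name__ == "__main__":
    # Second round from the capture: two servers answered. The offered
    # addresses are placeholders, since only the server IPs are listed above.
    second_round = [("192.168.1.156", "<offer A>"), ("192.168.1.1", "<offer B>")]
    for line in summarize_offers(second_round):
        print(line)
```

An empty result does not prove the setup is clean, it only means this particular round showed a single server making a single offer.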
-
@Sebastian-Roth The client in question that I captured traffic from was a VM, might that have something to do with option 175? I had finished prepping it for capture, shut it down, then started capture and booted the client. The client is in bridged mode (otherwise it wouldn’t even work if I’m not mistaken), so it should all work properly.
I think what happened with the two IP addresses is that the lease expired so the DHCP server had to correct itself? (the lease was 6 hours on the dot for the first offered IP).
As for the last point, there are indeed two DHCP servers, but unfortunately I can’t do much about that. (We are unable to log in to the modem/router even with the login data provided by the ISP.) I did try to get proxydhcp to work, but some clients acknowledged the proxy and then continued to request the file from 192.168.1.1 anyway, even though it doesn’t even offer a next-server option. (link here)
Yet it seems to sometimes win the DHCP race, after which it offers an empty next-server, which would indeed force me to input the address manually. The strange part is that this has only been happening recently, although I suppose it’s possible the ISP pushed a firmware update to the modem/router that caused it.
Are there any solid alternatives I could look into? I’ve looked around a bit, but most attempts seem either kind of hacky or are in alpha stages of development.
EDIT: Talked to some people who know more about that ISP and their modems (I use a different ISP at home, which employs an entirely different system). I may have a way to log in to the modem/router and disable DHCP. I will test it out tomorrow and let you know if this improves the situation as well.
-
No luck on the logging in to modem/router thing. It’s going to continue to be a nuisance whether or not I use isc-dhcp-server or proxydhcp since it gives an empty next-server option to clients (from what I’ve read online this is because it gives a specific next-server and filename to the digital TV)
So in other words I have to choose between proxydhcp not working 10% of the time, or isc-dhcp-server nearly always prompting for the tftp address and not working 10% of the time, ugh.
I’d like to thank you guys for helping me figure this out though, keep up the good work.
-
@Quazz Thanks for explaining a bit more and linking back to the other thread. I somehow had the feeling that we had talked about this but couldn’t find it.
Yes, booting a VM (possibly VirtualBox) perfectly explains option 175. So that’s fine!
Not sure about the lease time coincidence. But on the other hand I don’t see why isc-dhcp should send two offers with two different IPs (to the same MAC) within just one second. Doesn’t make any sense to me. If the client does not respond fast enough I would think the DHCP server would just send the exact same offer again.
Ahhhh, there is something else I just noticed: The first two DHCP offers from 192.168.1.156 are sent to the unicast addresses (192.168.1.54 and 192.168.1.47). At first I was confused because I thought ISC-DHCP usually sends to broadcast. But I was wrong: https://lists.isc.org/pipermail/dhcp-users/2008-April/006219.html
The first request (your VM PXEing) has the bootp flags set to unicast, but our iPXE requests broadcast answers from the DHCP server. Learning something new every day.
If you can’t change your ISP modem (you are right, I don’t see next-server/filename options in those DHCP answers), then DHCP proxy should be your friend. It’s really strange that clients would still want to request the boot file from 192.168.1.1! I’ve looked through the packet dump files again (as well as the old one) and I can’t see next-server being sent by 192.168.1.1 at all. Maybe it does this just once in a while? Or iPXE is treating the “Relay agent IP address” field in the DHCP offer as next-server. I kind of doubt this but I am not absolutely sure.
If you get a chance to get to the iPXE console/shell at some point just type
config
and you can see all the variables being set. See here: http://ipxe.org/cmd/config
-
@Sebastian-Roth I have indeed gone with ProxyDHCP in the meantime. I will try the config thing tomorrow.
The clients now always say they get DHCP offers from both the DHCP server and the DHCP proxy (after configuring net0). Most solid clients are able to select the proper answer from the ProxyDHCP, but some are not (this usually comes down to professional vs. consumer grade).
As far as I can tell from the packet dump, it does not send out a DHCP option 66, as you say, but it does specify next-server as 0.0.0.0. (Not sure if that’s normal behavior or not. I kind of thought that if it didn’t send out the option, the answer shouldn’t contain that information at all.)
There’s definitely something going on, hopefully there’s still some way to improve the situation, I’ll let you know how it goes.
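On the 0.0.0.0 question: the fixed BOOTP header that every DHCP message starts with always has a 4-byte "next server" field (siaddr), so a server with nothing to say simply leaves it zeroed, while option 66 is a separate, optional entry in the options area. The Python sketch below reads that fixed field, plus the broadcast flag, out of a raw DHCP payload. The `build_header` helper only fabricates a test header for the demo, and whether iPXE fills its next-server setting from siaddr rather than from option 66 is an assumption here, not something confirmed in this thread.

```python
import socket
import struct

BOOTP_FIXED_LEN = 236  # fixed header size, before the magic cookie and options


def parse_bootp_header(payload: bytes) -> dict:
    """Extract the address fields and broadcast flag from a BOOTP/DHCP payload."""
    if len(payload) < BOOTP_FIXED_LEN:
        raise ValueError("payload too short for a BOOTP header")
    (flags,) = struct.unpack_from("!H", payload, 10)
    # ciaddr, yiaddr, siaddr and giaddr sit back to back starting at offset 12.
    ciaddr, yiaddr, siaddr, giaddr = struct.unpack_from("!4s4s4s4s", payload, 12)
    return {
        "broadcast": bool(flags & 0x8000),
        "ciaddr": socket.inet_ntoa(ciaddr),
        "yiaddr": socket.inet_ntoa(yiaddr),  # address offered to the client
        "siaddr": socket.inet_ntoa(siaddr),  # "next server" header field
        "giaddr": socket.inet_ntoa(giaddr),  # relay agent address
    }


def build_header(yiaddr: str, siaddr: str = "0.0.0.0", broadcast: bool = False) -> bytes:
    """Build a made-up BOOTP reply header, for demonstration only."""
    header = bytearray(BOOTP_FIXED_LEN)
    header[0] = 2  # op = BOOTREPLY
    struct.pack_into("!H", header, 10, 0x8000 if broadcast else 0)
    header[16:20] = socket.inet_aton(yiaddr)
    header[20:24] = socket.inet_aton(siaddr)
    return bytes(header)


if __name__ == "__main__":
    # A reply that offers an address but leaves the next-server field empty.
    print(parse_bootp_header(build_header("192.168.1.54")))
```

So seeing next-server as 0.0.0.0 in a packet dump is consistent with a server that sets no boot server at all: the field is there in every message whether or not it is filled in.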
-
@Sebastian-Roth Anything specific you want me to look for in the ipxe config menu?
As far as I understand, these options are all set as intended at the current stage of the boot process, is that correct?
-
@Quazz Yeah, that should give us an idea of what iPXE sees. From what I can see on those two screens I guess this is before iPXE has tried to request DHCP info. So please go back to the shell (reboot) and type
dhcp net0 && config
I would suspect you’d see gateway/netmask/ip being set (infos from 192.168.1.1) but not filename/next-server. If you select ‘net0/’ at the top you should see an entry ‘proxydhcp’… Check the values…
-
@Sebastian-Roth Alright, there is indeed no next-server or filename info coming from the DHCP. And they properly receive the info (next-server and filename) from the proxydhcp as well. (It’s the only info it hands out.) ProxyDHCP does refer to itself as to be the DHCP server but I’m guessing that’s normal.
All values are what we can expect, proper gateway, dns and so on.
-
@Quazz said:
ProxyDHCP does refer to itself as to be the DHCP server but I’m guessing that’s normal.
Could you post another screen of those settings? Just to make sure. Have you tried several times in a row? Sounds a bit like this is working fine sometimes but also having issues at times. See if you always get the correct values in the config dialog.
-
@Sebastian-Roth I think I may have accidentally misled you earlier.
By “not working 10% of the time” I meant 10% of clients (generally lower end consumer garbo). It seems to work fine on anything remotely professional or decent consumer stuff. I’m fairly certain the fault lies with the NICs at this point, but I’ll continue to monitor the situation and try a couple more times.
Created an imgur album, if for nothing else then for historic purposes.
-
@Quazz Thanks for the album! This is interesting. Clearly there is no next-server setting within ‘net0’ (so dhcp net0 didn’t get the next-server info) but next-server is still set on the main screen. If I remember correctly from my tests this was not the case for me. We added this check to see if both the DHCP server and the DHCP proxy send next-server, to inform users that their DHCP setup might be “screwed”. But it looks like this is causing issues (maybe only on low-end consumer NICs?).
@Tom-Elliott Do you think changing the check would make a difference?
Current check:
isset ${proxydhcp/next-server} && isset ${next-server} && echo Duplicate option 66 ...
New check:
isset ${proxydhcp/next-server} && isset ${net0/next-server} && echo Duplicate option 66 ...
But what about machines using net1 instead of net0? Maybe I added too many checks.
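To make the difference between the two checks easier to reason about, here is a small Python model of them. It is only a sketch: the `isset` function is a simplified stand-in for the iPXE command, and the key assumption (labeled in the code) is that an unscoped ${next-server} can be satisfied by any settings block, including proxydhcp, which would fit what the screenshots show. The settings values are placeholders.

```python
def isset(settings, name, scope=None):
    """Simplified stand-in for iPXE's `isset ${scope/name}`."""
    if scope is not None:
        return bool(settings.get(scope, {}).get(name))
    # ASSUMPTION: an unscoped lookup is satisfied by any settings block.
    return any(block.get(name) for block in settings.values())


def current_check(settings):
    """isset ${proxydhcp/next-server} && isset ${next-server}"""
    return isset(settings, "next-server", "proxydhcp") and isset(settings, "next-server")


def new_check(settings, iface="net0"):
    """isset ${proxydhcp/next-server} && isset ${<iface>/next-server}"""
    return isset(settings, "next-server", "proxydhcp") and isset(
        settings, "next-server", iface
    )


if __name__ == "__main__":
    # Only the proxy hands out next-server, as in the album.
    proxy_only = {
        "net0": {"ip": "192.168.1.54"},
        "proxydhcp": {"next-server": "<fog server>"},
    }
    print("current check fires:", current_check(proxy_only))
    print("new check fires:", new_check(proxy_only))

    # A real duplicate, but on net1: the net0-only check stays silent.
    dup_on_net1 = {
        "net1": {"next-server": "<server A>"},
        "proxydhcp": {"next-server": "<server B>"},
    }
    print("new check (net0) fires:", new_check(dup_on_net1))
    print("new check (net1) fires:", new_check(dup_on_net1, iface="net1"))
```

Under that assumption the current check fires even when only the proxy sends next-server, and the new check avoids that but misses a genuine duplicate on net1, which is the concern raised in the last line of the post.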