Manually typing TFTP address after booting
-
I’m a bit busy today, but hopefully I’ll have some time for that later today.
I did notice some clients reboot right after loading the iPXE binary (presumably before the point where I would or would not have to enter the TFTP address).
I also noticed that the SQL backup is now 7 MB as opposed to the 1 MB I had before; not sure if that’s normal.
I have a feeling some of these are connected, but I could be way off.
-
I captured the DHCP traffic on the fog server when one of the clients was pxe booting (and had the tftp address prompt).
Edit: After a server reboot, the client that previously rebooted after iPXE initialization now claims it gets no filename. The other clients work as before (TFTP prompt and all).
-
@Quazz Thanks for the PCAP file. Three things I find are a bit awkward:
- Even the very first DHCP discovery in this capture file has DHCP option 175 set. This is usually not the case on a PXE booting NIC, only when iPXE/etherboot is being used. So where is the very first DHCP communication, or why is option 175 set (what kind of client is this)?
- This first DHCP discovery is answered twice by the same DHCP server (192.168.1.156), offering two different IP addresses to the client! Seems like the client can handle this, but it is not clean and might be part of the issue.
- After ipxe.kpxe has been transferred via TFTP there is another round of DHCP communication (perfectly fine, as iPXE requests an IP), but this time I see two different DHCP servers answering (and offering different IPs): 192.168.1.156 and 192.168.1.1
From my point of view this DHCP setup might need a bit of a “cleanup” and hopefully your issues will go away.
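To make the two oddities above easier to spot in a capture, here is a minimal sketch that groups DHCP offers by transaction ID and flags conflicting answers. It assumes the offers have been copied out of the capture by hand as `(xid, server_ip, offered_ip)` tuples; the function name, the transaction IDs and the offered addresses in the second round are made up for illustration and are not taken from the PCAP.

```python
from collections import defaultdict


def summarize_offers(offers):
    """Group DHCP offers by transaction ID and report conflicting answers.

    `offers` is a list of (xid, server_ip, offered_ip) tuples.
    Returns a list of human-readable warning strings.
    """
    by_xid = defaultdict(list)
    for xid, server_ip, offered_ip in offers:
        by_xid[xid].append((server_ip, offered_ip))

    warnings = []
    for xid, answers in by_xid.items():
        servers = sorted({server for server, _ in answers})
        if len(servers) > 1:
            warnings.append(
                f"xid {xid:#x}: offers from several servers: {', '.join(servers)}"
            )
        for server in servers:
            ips = sorted({ip for s, ip in answers if s == server})
            if len(ips) > 1:
                warnings.append(
                    f"xid {xid:#x}: {server} offered several addresses: {', '.join(ips)}"
                )
    return warnings


# Hypothetical sample data: xids and the second-round addresses are invented.
sample_offers = [
    (0x1, "192.168.1.156", "192.168.1.54"),
    (0x1, "192.168.1.156", "192.168.1.47"),
    (0x2, "192.168.1.156", "192.168.1.60"),
    (0x2, "192.168.1.1", "192.168.1.61"),
]

for line in summarize_offers(sample_offers):
    print(line)
```

With the sample data this reports one server handing out two addresses in the first exchange and two servers answering in the second, which matches the pattern described above.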
-
@Sebastian-Roth The client in question that I captured traffic from was a VM, might that have something to do with option 175? I had finished prepping it for capture, shut it down, then started capture and booted the client. The client is in bridged mode (otherwise it wouldn’t even work if I’m not mistaken), so it should all work properly.
I think what happened with the two IP addresses is that the lease expired so the DHCP server had to correct itself? (the lease was 6 hours on the dot for the first offered IP).
As for the last point, there are indeed two DHCP servers, but unfortunately I can’t do much about that (we are unable to log in to the modem/router even with the login data provided by the ISP). I did try to get proxydhcp to work, but some clients acknowledged the proxy and then continued to request the file from 192.168.1.1 anyway, even though it doesn’t even offer a next-server option. (link here)
Yet it sometimes seems to win the DHCP race, after which it offers an empty next-server, which would indeed force me to input the address manually. The strange part is that this has only been happening recently, although I suppose it’s possible the ISP pushed a firmware update to the modem/router to make that happen.
Are there any solid alternatives I could look into? I’ve looked around a bit, but most attempts seem either kind of hacky or are in alpha stages of development.
EDIT: Talked to some people who know more about that ISP and their modems (I use a different ISP at home, which employs an entirely different system). I may have a way to log in to the modem/router and disable DHCP. I will test it out tomorrow and let you know if this improves the situation as well.
-
No luck on the logging in to modem/router thing. It’s going to continue to be a nuisance whether or not I use isc-dhcp-server or proxydhcp since it gives an empty next-server option to clients (from what I’ve read online this is because it gives a specific next-server and filename to the digital TV)
So in other words I have to choose between proxydhcp not working 10% of the time or isc-dhcp-server nearly always asking for tftp prompts and not working 10% of the time, ugh.
I’d like to thank you guys for helping me figure this out though, keep up the good work
-
@Quazz Thanks for explaining a bit more and linking back to the other thread. I somehow had the feeling that we had talked about this but couldn’t find it.
Yes, booting a VM (possibly VirtualBox) perfectly explains option 175. So that’s fine!
Not sure about the lease time coincidence. But on the other hand I don’t see why isc-dhcp should send two offers with two different IPs (to the same MAC) within just one second. Doesn’t make any sense to me. If the client does not respond fast enough, I would think the DHCP server would just send the exact same offer again.
Ahhhh, there is something else I just noticed: The first two DHCP offers from 192.168.1.156 were sent to the unicast addresses (192.168.1.54 and 192.168.1.47). At first I was confused because I thought ISC-DHCP usually sends to broadcast. But I was wrong: https://lists.isc.org/pipermail/dhcp-users/2008-April/006219.html
The first request (your VM PXE booting) has the BOOTP flags set to unicast, but our iPXE requests broadcast answers from the DHCP server. Learning something new every day.
If you can’t change your ISP modem (you are right, I don’t see next-server/filename options in those DHCP answers), then DHCP proxy should be your friend. It’s really strange that clients would still want to request the boot file from 192.168.1.1! I’ve looked through the packet dump files again (as well as the old one) and I can’t see next-server being sent by 192.168.1.1 at all. Maybe it does this just once in a while? Or iPXE is treating the “Relay agent IP address” field in the DHCP offer as next-server. I kind of doubt this, but I am not absolutely sure.
If you get a chance to get to the iPXE console/shell at some point just type
config
and you can see all the variables being set. See here: http://ipxe.org/cmd/config
-
@Sebastian-Roth I have indeed chosen ProxyDHCP in the meantime. I will try the config thing tomorrow.
The clients now always say they get DHCP offers from both the DHCP server and the DHCP proxy (after configuring net0). Most solid clients are able to select the proper answer from the ProxyDHCP, but some are not (the split is usually professional vs. consumer grade hardware).
As far as I can tell from the packet dump, it does not send out a dhcp option 66 as you say, but it does specify for next-server to be 0.0.0.0 (not sure if that’s normal behavior or not though, I kind of thought if it didn’t send out the option then it shouldn’t even contain that information at all)
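On the 0.0.0.0 question: as far as I understand RFC 951/2131, the next-server address lives in the fixed `siaddr` field of the BOOTP header, which is always present in every reply, whereas option 66 (TFTP server name) is a separate, optional entry in the options area. A server that sets neither would still put 0.0.0.0 on the wire. The following is a minimal sketch of reading those fixed fields from raw bytes; the function name and the sample packet are made up for illustration.

```python
import socket
import struct

BOOTP_FIXED_LEN = 236  # fixed BOOTP header, before the magic cookie and options


def parse_bootp_header(packet: bytes) -> dict:
    """Extract a few fixed-position fields from a BOOTP/DHCP payload."""
    if len(packet) < BOOTP_FIXED_LEN:
        raise ValueError("packet shorter than the fixed BOOTP header")
    # op, htype, hlen, hops (1 byte each), xid (4), secs (2), flags (2)
    op, _htype, _hlen, _hops, xid, _secs, flags = struct.unpack(
        "!BBBBIHH", packet[:12]
    )
    # ciaddr, yiaddr, siaddr, giaddr are four consecutive IPv4 addresses
    ciaddr, yiaddr, siaddr, giaddr = (
        socket.inet_ntoa(packet[i:i + 4]) for i in (12, 16, 20, 24)
    )
    # "file" is a 128-byte, NUL-padded field at offset 108
    boot_file = packet[108:236].split(b"\x00", 1)[0].decode("ascii", "replace")
    return {
        "op": op,
        "xid": xid,
        "broadcast": bool(flags & 0x8000),  # top bit of the flags field
        "yiaddr": yiaddr,   # address offered to the client
        "siaddr": siaddr,   # next-server; 0.0.0.0 when the server sets none
        "giaddr": giaddr,   # relay agent address
        "file": boot_file,  # boot filename; empty when the server sets none
    }


# Hypothetical reply from a server that sets neither next-server nor filename.
sample = (
    struct.pack("!BBBBIHH", 2, 1, 6, 0, 0x1234, 0, 0x8000)
    + socket.inet_aton("0.0.0.0")       # ciaddr
    + socket.inet_aton("192.168.1.54")  # yiaddr
    + socket.inet_aton("0.0.0.0")       # siaddr (next-server)
    + socket.inet_aton("0.0.0.0")       # giaddr
    + bytes(16 + 64 + 128)              # chaddr, sname, file left empty
)
print(parse_bootp_header(sample))
```

The same `flags` field carries the broadcast bit mentioned earlier in the thread, so the sketch reports that as well.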
There’s definitely something going on, hopefully there’s still some way to improve the situation, I’ll let you know how it goes.
-
@Sebastian-Roth Anything specific you want me to look for in the ipxe config menu?
As far as I understand, these options are all set as intended at the current stage of the boot process, is that correct?
-
@Quazz Yeah, that should give us an idea of what iPXE sees. From what I can see on those two screens, I guess this is before iPXE has tried to request DHCP info. So please go back to the shell (reboot) and type
dhcp net0 && config
I would suspect you’d see gateway/netmask/ip being set (info from 192.168.1.1) but not filename/next-server. If you select ‘net0/’ at the top you should see an entry ‘proxydhcp’. Check the values there.
-
@Sebastian-Roth Alright, there is indeed no next-server or filename info coming from the DHCP server. And the clients properly receive that info (next-server and filename) from the proxydhcp as well (it’s the only info it hands out). ProxyDHCP does refer to itself as the DHCP server, but I’m guessing that’s normal.
All values are what we can expect, proper gateway, dns and so on.
-
@Quazz said:
ProxyDHCP does refer to itself as the DHCP server, but I’m guessing that’s normal.
Could you post another screen of those settings? Just to make sure. Have you tried several times in a row? Sounds a bit like this is working fine sometimes but also having issues at times. See if you always get the correct values in the config dialog.
-
@Sebastian-Roth I think I may have accidentally misled you earlier.
By “not working 10% of the time” I meant that it fails on about 10% of clients (generally lower-end consumer garbo). It seems to work fine on anything remotely professional or on decent consumer stuff. I’m fairly certain the fault lies with the NICs at this point, but I’ll continue to monitor the situation and try a couple more times.
Created an imgur album, if for nothing else than historic purposes.
-
@Quazz Thanks for the album! This is interesting. Clearly there is no next-server setting within ‘net0’ (so ‘dhcp net0’ didn’t get the next-server info), but next-server is still set on the main screen. If I remember correctly from my tests, this was not the case for me. We added this check to see if both the DHCP server and the DHCP proxy send next-server, to inform users that their DHCP setup might be “screwed”. But it looks like this is causing issues (maybe only on low-end consumer NICs?).
@Tom-Elliott Do you think changing the check would make a difference?
Current check:
isset ${proxydhcp/next-server} && isset ${next-server} && echo Duplicate option 66 ...
New check:
isset ${proxydhcp/next-server} && isset ${net0/next-server} && echo Duplicate option 66 ...
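To show why the two checks can disagree, here is a rough model of the lookup. It rests on an assumption suggested by the screenshots, not on a reading of the iPXE source: an unscoped `${next-server}` falls back to whichever settings block has the value (including ‘proxydhcp’), while `${net0/next-server}` only consults ‘net0’. The function names and the sample addresses are hypothetical.

```python
def lookup(settings, name, scope=None):
    """Return the value of `name`, or None when it is not set.

    With a scope, only that settings block is consulted. Without one,
    every block is searched in order (the assumed fallback behaviour).
    """
    if scope is not None:
        return settings.get(scope, {}).get(name)
    for block in settings.values():
        if name in block:
            return block[name]
    return None


def old_check(settings):
    """Mirrors: isset ${proxydhcp/next-server} && isset ${next-server}"""
    return (
        lookup(settings, "next-server", "proxydhcp") is not None
        and lookup(settings, "next-server") is not None
    )


def new_check(settings):
    """Mirrors: isset ${proxydhcp/next-server} && isset ${net0/next-server}"""
    return (
        lookup(settings, "next-server", "proxydhcp") is not None
        and lookup(settings, "next-server", "net0") is not None
    )


# Hypothetical state: the router answers without next-server and only
# the proxy supplies it. All addresses below are placeholders.
settings = {
    "net0": {"ip": "192.168.1.54", "gateway": "192.168.1.1"},
    "proxydhcp": {"next-server": "192.168.1.156"},
}

print(old_check(settings))  # True: warns although only the proxy sent it
print(new_check(settings))  # False: no duplicate reported
```

Under that assumption the old check always fires once a proxy is present, because the unscoped lookup finds the proxy’s own value.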
But what about machines using net1 instead of net0? Maybe I added too many checks.
-
I have been having a similar issue. I am not using the fog server for DHCP. Options 66 and 67 are set correctly on my Windows server. 99% of the time I am asked for the IP of the TFTP server. Since I changed the boot order of 800+ machines to boot to the NIC first when I installed and successfully used FOG 0.32, it has become a real issue while testing 1.2 (latest Trunk).
Once I enter the TFTP (FOG server) IP, it seems to work. The firewall is off. Permissions on the TFTP folder are correct. Any help would be greatly appreciated.
-
@Tim-M I can totally understand that you are worried about this, having 800+ clients which don’t want to boot on their own. As a quick solution I can compile a custom iPXE binary for you that does not do the check. Please tell me which binary you are using: undionly.kpxe, undionly.kkpxe, ipxe.pxe, …?
On the other hand, it would be awesome if you could provide the same information as Quazz did: a PCAP dump file of a client booting (best to use a hub to connect the client and another PC to capture the packets) and possibly also pictures of the iPXE config menu.
Edit: @Tim-M and @Quazz Tom added the fix I suggested. So you can give it a try by upgrading to the very latest trunk and re-running the installer. Please let us know if this makes a difference with your machines.
-
@Sebastian-Roth I’ll switch back to ISC-DHCP-SERVER in a bit to check for real, but currently on proxydhcp it seems to no longer complain about getting information from both the DHCP and ProxyDHCP server, so that seems to have improved things greatly.
EDIT: ISC-DHCP-SERVER no longer asks for tftp address on my end, as far as I’ve tested, so far so good!
EDIT2: Encountered the TFTP prompt again, think I’ll stick to proxydhcp for now.
EDIT3: I noticed sloppy name lookup was off, but I am unable to enable it (the checkbox clears itself when you try to save the information). In fact, it seems I can’t change any settings, and there are some other glitches. I’ll post a picture.
The WebGUI is now also unresponsive after restarting the server.
The unresponsiveness seems to be related to the following Apache error, as far as I can tell:
[Fri Jan 22 11:27:43.497568 2016] [:error] [pid 1367] [client 192.168.1.29:53467] PHP Warning: Division by zero in /var/www/html/fog/lib/pages/dashboardpage.class.php on line 79
Manually navigating to other pages of the webgui (avoiding the dashboard) works
-
Dropped my database, purged mysql and apache and reran an older revision to fix the previously mentioned glitches. I know it’s extreme, but my database had quite a few issues and this will be cleaner and faster for me.
-
@Quazz I believe what you saw came from an update that I pushed last night.
My push last night (on jquery and on templates) was to ensure any unset template value was defaulted to an empty string.
However my approach forgot that you can include other templates within the page.
For example, in the code:
$this->templates = array(
    '${field}',
    '${input}',
);
$fields = array(
    _('This is a test to show how things can be added upon to the templates. The name of this new template will be ${name}') => _('This is the input param from above, also with an included template in directly defined: ${iamanugget}'),
);
foreach ($fields AS $field => &$input) {
    $this->data[] = array(
        'field' => $field,
        'input' => $input,
        // Now i still need to add the other templated items to have it print out
        'name' => 'Hello World',
        'iamanugget' => 'Sorry for being an idiot',
    );
}
The output should print something along the lines of:
This is a test to show how things can be added upon to the templates. The name of this new template will be Hello World This is the input param from above, also with an included template in directly defined: Sorry for being an idiot
This typically worked, but if a templated value was not set in the data, it would show issues similar to the ones you saw because of my mistake.
I stupidly didn’t iterate over the data as it’s passed in to replace the found entries, which is why you were seeing placeholders like ‘${service_value}’ there and on other pages.
This should now be fixed.
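For anyone curious, the idea can be sketched roughly as follows. This is not the FOG code, just an illustration of the two behaviours described above: unset placeholders default to an empty string, and replacement is repeated so that placeholders contained in substituted values are resolved too. The function name, the regular expression and the `max_passes` limit are assumptions made for the example.

```python
import re

# Matches ${name} placeholders such as ${field} or ${service_value}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render(template, data, max_passes=5):
    """Replace ${name} placeholders; names missing from `data` become ''.

    Substituted values may contain placeholders themselves, so the
    replacement is repeated until the text stops changing. `max_passes`
    guards against values that refer back to themselves.
    """
    text = template
    for _ in range(max_passes):
        replaced = PLACEHOLDER.sub(
            lambda match: str(data.get(match.group(1), "")), text
        )
        if replaced == text:
            break
        text = replaced
    return text


data = {
    "field": "The name of this new template will be ${name}",
    "name": "Hello World",
}

print(render("${field}", data))
print(repr(render("${service_value}", data)))
```

A single pass would leave `${name}` in the output; repeating the substitution is what resolves the nested placeholder.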
-
@Tom-Elliott Alright, good to hear, I’ll update to the newest revision in a bit after I’ve captured these images.
I believe everything is thus resolved on my end, thanks