Clients PXE booting from another subnet
-
Looks like your advice was very useful. In the end I found the following setting allowed the client (from the 192.168.2.0/24 subnet) to boot into the FOG menu (192.168.0.87):
next-server 192.168.0.87;
# Select the correct PXE boot file depending on whether Legacy or UEFI booting is requested
class "pxeclients" {
  match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
  if option architecture-type = 00:07 {
    filename "intel.efi";
  } else {
    filename "undionly.kkpxe";
  }
}
The only issue now is that I’m getting the following message when attempting to do a Quick Registration of the client:
udhcpc: sending discover
udhcpc: sending discover
udhcpc: sending discover
udhcpc: no lease, failing
Either DHCP failed or we were unable to access http://192.168.0.87/fog//index.php for connection testing
No DHCP responce on interface eth0, skipping it.
Failed to get an IP via DHCP! Tried on interfaces(s): eth0
Please check your network setup and try again!
Press enter to continue
Note the DHCP server is not running on the FOG server but the production server.
Perhaps I need to do some more tweaking ?
-
@jgiovann You need to push a gateway/router to those clients in the other subnet. Add
option routers 192.168.2.X;
to your DHCP config. There must be a router/gateway between those two networks and you need to specify its IP from the 192.168.2.0/24 side.
-
@Sebastian-Roth Thanks for the recommendation. Admittedly the option you’ve mentioned is already in the DHCP config
subnet 192.168.2.0 netmask 255.255.255.0 {
  option routers 192.168.2.1;
  option subnet-mask 255.255.255.0;
The target machine correctly obtains an IP address (statically assigned) when PXE booting into the FOG menu. A truncated snippet of the log files is shown here
Dec 06 16:40:01 DHCPOFFER on 192.168.2.89 to 48:4d:7e:d5:66:a5 via 192.168.2.1
Dec 06 16:40:02 DHCPDISCOVER from 48:4d:7e:d5:66:a5 via 192.168.2.1
Dec 06 16:40:02 DHCPOFFER on 192.168.2.89 to 48:4d:7e:d5:66:a5 via 192.168.2.1
Dec 06 16:40:04 DHCPREQUEST for 192.168.2.89 (192.168.0.20) from 48:4d:7e:d5:66:a5 via 192.168.2.1
Dec 06 16:40:04 DHCPACK on 192.168.2.89 to 48:4d:7e:d5:66:a5 via 192.168.2.1
The problem is that when I attempt to register the host, there is no response on interface eth0.
-
Perhaps I should step back and not run dnsmasq in the first place, i.e. simplify the setup ? In fact if I stop the dnsmasq service (on the FOG server), I can boot into the FOG menu
-
The log files on the FOG server don’t provide any details as to what is going on here. Perhaps I need to tweak the FOG server configuration to explicitly tell it where to find the DHCP server and other relevant details ?
-
-
@jgiovann said in Clients PXE booting from another subnet:
Perhaps I should step back and not run dnsmasq in the first place, i.e. simplify the setup ? In fact if I stop the dnsmasq service (on the FOG server), I can boot into the FOG menu
Yes, as you seem to have your other DHCP server set up correctly now, you don’t need dnsmasq anymore. I don’t think it causes the issue you have right now, but it is better to disable the service.
About the issue on registration. On boot the client checks if it can reach the FOG server via HTTP. You can manually do that. Boot up a Windows client in the 192.168.2.0 network and open the FOG web UI URL in the browser. Does it work?
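If no Windows client is at hand, the same reachability check can be scripted from any machine in the 192.168.2.0/24 network that has Python installed. This is only a minimal sketch: the function names `check_fog_http` and `describe` are made up for illustration and are not part of FOG, and the path assumes the default web root `/fog/`.

```python
import http.client


def check_fog_http(host, path="/fog/index.php", port=80, timeout=5.0):
    """Send an HTTP HEAD request and return the status code, or None on failure.

    Hypothetical helper, not part of FOG. Any status code (even a redirect)
    means the web server answered; None means the connection failed.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout, unreachable network or broken reply.
        return None
    finally:
        conn.close()


def describe(status):
    """Turn the result of check_fog_http() into a short human-readable verdict."""
    if status is None:
        return "unreachable"
    if 200 <= status < 400:
        return "reachable"
    return "reachable, but HTTP error %d" % status
```

Calling `describe(check_fog_http("192.168.0.87"))` from the client side should print `reachable` if routing and the web server are fine. Note that this only tests a machine that already has a working IP address; it says nothing about whether the PXE booted client gets a DHCP lease.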
-
Getting closer …
When I enter the URL http://192.168.0.87//fog//management/index.php from a client in the 192.168.2.0/24 subnet (as reported by the host registration step) I get the default FOG Project login screen (I was not able to attach a screenshot), i.e. I can connect to the URL.
Note:
- The URL is redirected from http://192.168.0.87//fog//management/index.php to http://192.168.0.87//fog//management/index.php
- Are the double slashes significant in the URL ?
- I also checked for TCP connections on the FOG server. There are no TCP connections on port 80 via IPv4. The primary DHCP server is configured for IPv4 (not IPv6).
# netstat -ant | grep -v 127.0.0.1 | head -15
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:60313 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:20048 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:21 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:38871 0.0.0.0:* LISTEN
tcp 0 0 192.168.0.87:22 192.168.2.12:44596 ESTABLISHED
tcp 0 0 192.168.0.87:50226 192.168.0.216:389 ESTABLISHED
tcp 0 0 192.168.0.87:22 192.168.2.12:39532 ESTABLISHED
tcp 0 0 192.168.0.87:22 192.168.2.12:39414 ESTABLISHED
tcp6 0 0 ::1:25 :::* LISTEN

# netstat -ant6 | grep -v 127.0.0.1 | head -15
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp6 0 0 ::1:25 :::* LISTEN
tcp6 0 0 :::443 :::* LISTEN
tcp6 0 0 :::56638 :::* LISTEN
tcp6 0 0 :::2049 :::* LISTEN
tcp6 0 0 :::39500 :::* LISTEN
tcp6 0 0 :::111 :::* LISTEN
tcp6 0 0 :::80 :::* LISTEN
tcp6 0 0 :::20048 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 192.168.0.87:80 192.168.2.12:34100 TIME_WAIT
tcp6 0 0 192.168.0.87:80 192.168.0.87:53770 TIME_WAIT
tcp6 0 0 192.168.0.87:80 192.168.0.87:53604 TIME_WAIT
tcp6 0 0 192.168.0.87:80 192.168.2.12:34116 TIME_WAIT
I can provide more details.
-
-
@jgiovann said in Clients PXE booting from another subnet:
The URL is redirected from http://192.168.0.87//fog//management/index.php to http://192.168.0.87//fog//management/index.php
From what I see the URLs are exactly the same. So how would there be a redirect? Please post again those URLs. We do redirecting in some cases but I can’t think of that causing the issue for you.
Are the double slashes significant in the URL ?
No, they do not necessarily need to be there, but we have had those in the scripts for a long time and they did not seem to cause trouble. I have just had a look at the scripts again and I wonder why it is showing the URL including that “/management/” part. From what I remember that should not be the case. I cannot remember off the top of my head if it’s in the storage node or general FOG settings. I think it’s the latter. Please check if you have messed with those. Web root is usually just
fog
…
-
@jgiovann Ah sorry, I just saw that I had misread one of your posts. Looking back at the older ones I see that your client seemed to try to connect to http://192.168.0.87/fog//index.php which is the right URL, not the one you posted last…
Now that I think of it, it’s probably best you test this URL again and watch the Apache access logs. On your FOG server run
tail -f /var/log/apache2/access.log
(debian/ubuntu) or
tail -f /var/log/httpd/access_log
(centos/fedora/rhel), hit ENTER twice to mark where the log currently ends and then open http://192.168.0.87/fog//index.php in your browser from the client in the 192.168.2.0/24 network. You will probably see the request coming in. Now quickly PXE boot another client that you want to register, choose quick register and keep an eye on the access logs while it boots up. Keep hitting ENTER on the access log and see if you can find the entry for that client. If you don’t see the client requesting on PXE boot we probably need to see if you have some weird firewall rules blocking only some of the 192.168.2.0/24 clients?!
-
@Sebastian-Roth I rebuilt the server from scratch using the latest stable version of FOG (1.5.5). Also checked the firewall between the 2 subnets - there are no rules blocking communication between the client and FOG server. I also disabled SELinux and stopped the firewall on the FOG server (for the time being).
- Continuous pings to the target client show that the interface is correctly assigned an IP address after it boots into the FOG menu. However as soon as the registration process is launched, the client loses connectivity and is no longer able to communicate with the FOG server
64 bytes from 192.168.2.89: icmp_seq=85 ttl=64 time=40.2 ms
64 bytes from 192.168.2.89: icmp_seq=86 ttl=64 time=28.0 ms
64 bytes from 192.168.2.89: icmp_seq=87 ttl=64 time=0.301 ms
... (at this point the registration process is launched)
From 192.168.2.12 icmp_seq=128 Destination Host Unreachable
From 192.168.2.12 icmp_seq=129 Destination Host Unreachable
From 192.168.2.12 icmp_seq=130 Destination Host Unreachable
- I’m attaching snippets of both the error log
[Wed Dec 12 10:36:40.973608 2018] [core:notice] [pid 5259] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Dec 12 16:12:01.720797 2018] [mpm_prefork:notice] [pid 5259] AH00170: caught SIGWINCH, shutting down gracefully
[Wed Dec 12 16:12:34.842582 2018] [core:notice] [pid 5266] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Wed Dec 12 16:12:34.852140 2018] [suexec:notice] [pid 5266] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Wed Dec 12 16:12:34.954624 2018] [auth_digest:notice] [pid 5266] AH01757: generating secret for digest authentication ...
[Wed Dec 12 16:12:34.956680 2018] [lbmethod_heartbeat:notice] [pid 5266] AH02282: No slotmem from mod_heartmonitor
[Wed Dec 12 16:12:35.121620 2018] [mpm_prefork:notice] [pid 5266] AH00163: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips mod_fcgid/2.3.9 PHP/5.6.39 configured -- resuming normal operations
[Wed Dec 12 16:12:35.121645 2018] [core:notice] [pid 5266] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
- … and the access_log (filtered by the target client machine)
192.168.2.89 - - [12/Dec/2018:16:08:04 +1100] "GET /fog/service/ipxe/init.xz HTTP/1.1" 200 19286132 "-" "iPXE/1.0.0+ (960d1)"
192.168.2.89 - - [12/Dec/2018:16:12:41 +1100] "POST /fog/service/ipxe/boot.php HTTP/1.1" 200 2701 "-" "iPXE/1.0.0+ (960d1)"
192.168.2.89 - - [12/Dec/2018:16:12:42 +1100] "GET /fog/service/ipxe/bg.png HTTP/1.1" 200 21280 "-" "iPXE/1.0.0+ (960d1)"
192.168.2.89 - - [12/Dec/2018:16:14:54 +1100] "GET /fog/service/ipxe/bzImage HTTP/1.1" 200 8430224 "-" "iPXE/1.0.0+ (960d1)"
192.168.2.89 - - [12/Dec/2018:16:14:54 +1100] "GET /fog/service/ipxe/init.xz HTTP/1.1" 200 19286132 "-" "iPXE/1.0.0+ (960d1)"
- I double checked the URL redirect. I meant to say that http://192.168.0.87//fog//index.php is redirected to http://192.168.0.87//fog//management/index.php. Is this the correct URL ? The redirected URL is the management login page.
As a final option, could I add a 2nd network interface on the FOG server which has an IP address in the 192.168.2.0/24 subnet ?
-
@jgiovann said in Clients PXE booting from another subnet:
Continuous pings to the target client show that the interface is correctly assigned an IP address after it boots into the FOG menu. However as soon as the registration process is launched, the client loses connectivity and is no longer able to communicate with the FOG server.
I have seen the client receiving a different IP address on different stages of the PXE boot process. In that whole process the client requests an address from the DHCP server three times. First the PXE ROM of your network card, second is iPXE and last the Linux Kernel. There should be no difference in the DHCP information the client gets for each of those three stages but you never know. Maybe there is another wild DHCP server in your network or a replicating DHCP server setup that is playing tricks.
Knowing what DHCP information is actually sent is key here, I suppose. Set up a mirror port to capture the client port traffic using Wireshark.
If that is asking too much of you we could maybe do a TeamViewer session. The other thing you could check is when exactly the ping stops. It should stop and pick up again several times if the IP does not change. Check your DHCP logs or leases to see which IP it receives. Also pay attention on boot up: the Linux part should show the IP it gets as well.
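Checking the DHCP log for a changing address can be automated. The following is a rough sketch that assumes log lines shaped like the ones posted earlier in this thread (`DHCPOFFER on <ip> to <mac> ...`); the function names are hypothetical and real dhcpd logs may carry extra prefixes, which the regular expression simply ignores.

```python
import re
from collections import defaultdict

# Matches the OFFER/ACK lines as they appear in the excerpt posted above.
LEASE_RE = re.compile(
    r"DHCP(?:OFFER|ACK) on (?P<ip>\d+\.\d+\.\d+\.\d+) "
    r"to (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)

# Lines copied from the log excerpt earlier in this thread.
SAMPLE = [
    "Dec 06 16:40:01 DHCPOFFER on 192.168.2.89 to 48:4d:7e:d5:66:a5 via 192.168.2.1",
    "Dec 06 16:40:02 DHCPDISCOVER from 48:4d:7e:d5:66:a5 via 192.168.2.1",
    "Dec 06 16:40:04 DHCPACK on 192.168.2.89 to 48:4d:7e:d5:66:a5 via 192.168.2.1",
]


def addresses_per_client(lines):
    """Map each MAC address to the set of IPs offered or acknowledged to it."""
    seen = defaultdict(set)
    for line in lines:
        match = LEASE_RE.search(line)
        if match:
            seen[match.group("mac").lower()].add(match.group("ip"))
    return dict(seen)


def clients_with_changing_ip(lines):
    """Return only the clients that were handed more than one IP address."""
    return {
        mac: sorted(ips)
        for mac, ips in addresses_per_client(lines).items()
        if len(ips) > 1
    }
```

For the excerpt above, `clients_with_changing_ip(SAMPLE)` returns an empty dict, i.e. the server handed out the same address each time. A non-empty result would point to the "different IP per boot stage" situation described here.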
Then see if you find that HTTP request done by the Linux FOS client after DHCP. It should be a so called HTTP HEAD request.
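To look for that request without scrolling through the whole file, the access log can be filtered by client IP and HTTP method. This is an illustrative sketch only: `requests_from` is a made-up name, and the regular expression assumes the common Apache log layout seen in the access_log lines posted above.

```python
import re

# client-ip ident user [timestamp] "METHOD path ..."
ACCESS_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)

# Lines copied from the access_log excerpt earlier in this thread.
SAMPLE = [
    '192.168.2.89 - - [12/Dec/2018:16:12:41 +1100] "POST /fog/service/ipxe/boot.php HTTP/1.1" 200 2701 "-" "iPXE/1.0.0+ (960d1)"',
    '192.168.2.89 - - [12/Dec/2018:16:14:54 +1100] "GET /fog/service/ipxe/bzImage HTTP/1.1" 200 8430224 "-" "iPXE/1.0.0+ (960d1)"',
    '192.168.2.89 - - [12/Dec/2018:16:14:54 +1100] "GET /fog/service/ipxe/init.xz HTTP/1.1" 200 19286132 "-" "iPXE/1.0.0+ (960d1)"',
]


def requests_from(lines, client_ip, method=None):
    """Return (time, method, path) tuples for one client, optionally one method."""
    hits = []
    for line in lines:
        match = ACCESS_RE.match(line)
        if not match or match.group("ip") != client_ip:
            continue
        if method is not None and match.group("method") != method:
            continue
        hits.append((match.group("time"), match.group("method"), match.group("path")))
    return hits
```

With the posted excerpt, `requests_from(SAMPLE, "192.168.2.89", "HEAD")` returns an empty list: iPXE fetched the kernel and init image, but no HEAD request from the Linux stage shows up afterwards, which fits a client that never obtained a lease once Linux had booted.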
-
@Sebastian-Roth It turns out I was getting the same issue even when the client and FOG server resided on the same subnet. However, it was not a problem in a virtual environment.
… after capturing port traffic with Wireshark and doing some investigation, it turns out that turning off the spanning-tree protocol on the provisioning port allowed the client to register with the FOG server.
In a real production network (where turning off spanning tree is not allowed), is it therefore possible to re-configure the FOG server to wait longer or retry more times before it gives up the registration retry loop ?
-
@jgiovann Great you figured out this is a spanning tree thing! Seems like I was too focused on the idea that this might be a routing problem, so I didn’t notice the message:
udhcpc: no lease, failing
In a real production network (where turning off spanning tree is not allowed), is it therefore possible to re-configure the FOG server to wait longer or retry more times before it gives up the registration retry loop ?
I think the problem cannot be solved in the registration retry loop; we’d need to tell the DHCP client to wait longer. But from my point of view you should be able to solve this by setting the client ports to “port fast”. There are different names for this, but what it essentially does is let a port skip the spanning tree listening and learning delay and start forwarding right away. You only enable it on ports where you know for sure there are no other switches connected but only clients. On those ports you never ever need to wait for spanning tree, because a single client connected to a port can never cause a loop (which is what spanning tree was invented to prevent)!
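A back-of-the-envelope calculation shows why the DHCP client gives up before the port forwards traffic. This is only a sketch under stated assumptions: classic 802.1D spanning tree with the default forward delay of 15 seconds (spent once in listening and once in learning), three discover packets as seen in the error output above, and an assumed pause of 3 seconds between them. The function names are made up for illustration, and switches running rapid spanning tree behave differently.

```python
def stp_blocking_seconds(forward_delay=15):
    """Classic 802.1D: the port waits forward_delay in listening, then again in learning."""
    return 2 * forward_delay


def dhcp_window_seconds(discovers, pause):
    """Rough time the DHCP client keeps trying: N discover packets, `pause` seconds apart."""
    return discovers * pause


def lease_possible(discovers, pause, forward_delay=15):
    """True if the client is still sending discovers when the port starts forwarding."""
    return dhcp_window_seconds(discovers, pause) > stp_blocking_seconds(forward_delay)


if __name__ == "__main__":
    # Assumed client behaviour: 3 discovers, 3 seconds apart.
    print(lease_possible(3, 3))                   # port with normal spanning tree
    print(lease_possible(3, 3, forward_delay=0))  # port fast modelled as no delay
```

Under these assumptions the client tries for about 9 seconds while the port stays silent for about 30, so either the client has to wait roughly four times longer or the port has to forward immediately, which is what port fast achieves.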