Connection timed out. Chainloading failed
-
SERVER
FOG Version: 1.4.3
OS: Ubuntu 16.04 Server LTS
CLIENT
Service Version:
OS: Windows 10
DESCRIPTION
My current setup is the following:
- We have a FOG server that acts as the imaging / Web UI interface and connects to the outside network.
- We have 2 Storage Nodes that act as DHCP / TFTP servers and also connect to the outside network.
- We have ~40 hosts that connect to these storage nodes via a switch.
- Storage Node #1 (the one that works) is the master node.
The first storage node works perfectly and images its hosts as expected, but hosts on the second storage node cannot get past the iPXE boot step. They fail with “Connection timed out, Chainloading failed”.
I have spent quite a bit of time trying to figure out what the difference is between these two nodes and I am at a bit of a loss. The default.ipxe files are the same between the two, as is /etc/network/interfaces.
I also compared the contents of /etc/dhcp/dhcpd.conf and /etc/default/isc-dhcp-server and found no differences there either. Also, I moved one of the hosts from node 2 and connected it to node 1. The host no longer had any issues, so this seems to be a problem with the storage node and not the clients.
I have attached two images: one is the error from a host on node 2, and the other is from a host on the working node.
Any ideas what I could be missing here?
-
@thebrennan47 said in Connection timed out. Chainloading failed:
Also, I moved one of the hosts from node 2 and connected it to node 1. The host no longer had any issues so this seems to be a problem with the storage node and not the clients.
Please explain exactly what it means to move a host from node 2 and connect it to node 1.
How do you have the network set up between these two nodes? Are all systems on the same VLAN (subnet)? The left image suggests that the target system doesn’t have a route to 10.8.88.42 from the 192.168.1.1 subnet, so there is some kind of routing going on here.
-
Moving the host means physically moving it and plugging it into the switch that is managed by the other storage node. Both storage nodes are acting as DHCP servers and assigning IPs, but they are connected to separate switches. I added a diagram that may help.
-
@thebrennan47 That helps out quite a bit. I thought there was a bit more going on here than I originally imagined.
You need to change your configuration. Your storage node is acting as a gateway between the FOG server and the target systems, so each storage node needs to serve its own unique subnet. On the surface you have a routing problem, not a FOG problem.
Consider what happens if you use subnet 192.168.1.x on each storage node. If a computer on SN1 talks to the FOG server, and a computer on SN2 talks to the FOG server, and both are in the same subnet, how will the FOG server know where/who to respond to?
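To make the ambiguity concrete, here is a minimal bash sketch. The `network_of` helper and the two client addresses are hypothetical, made up for illustration and not taken from the diagram. It computes the network address for an IPv4 address and prefix length, then compares one client behind each storage node.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the network address for an IPv4 address
# and a prefix length, e.g. "network_of 192.168.1.50 24".
network_of() {
    local ip=$1 prefix=$2
    local a b c d
    IFS=. read -r a b c d <<< "$ip"
    # Pack the four octets into one 32-bit integer.
    local addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Build the netmask from the prefix length and apply it.
    local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    local net=$(( addr & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 ))
}

# Assumed example addresses: one client behind each storage node,
# both handed out of the same 192.168.1.x range.
client_on_sn1=192.168.1.50
client_on_sn2=192.168.1.60

if [ "$(network_of "$client_on_sn1" 24)" = "$(network_of "$client_on_sn2" 24)" ]; then
    echo "same subnet: a single route cannot point at both storage nodes"
else
    echo "different subnets: one static route per storage node is unambiguous"
fi
```

With the assumed addresses both clients resolve to the same network, which is the situation described above. Changing the second client to a 192.168.2.x address makes the two networks differ.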
So how to fix it? Give the SN1 imaging network an IP subnet of 192.168.1.x/24. Give the SN2 imaging network an IP subnet of 192.168.2.x/24. Give the SN3 imaging network an IP subnet of 192.168.3.x/24.
Now go to your internet router (or you can do this on your FOG master node) and create static routes that describe the subnets beyond each storage node. For example:
ip route add 192.168.1.0/24 via {SN1 LAN interface IP}
ip route add 192.168.2.0/24 via {SN2 LAN interface IP}
ip route add 192.168.3.0/24 via {SN3 LAN interface IP}
You will also need to change the DHCP server settings on each SN to match their respective imaging network range.
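As a rough sketch of what that DHCP change could look like on SN2, here is a subnet declaration for /etc/dhcp/dhcpd.conf. The interface address 192.168.2.1, the lease range, and the boot file name are assumptions for illustration. Keep whatever next-server and filename values the node already uses.

```
# /etc/dhcp/dhcpd.conf on SN2 (sketch; all addresses below are assumed)
subnet 192.168.2.0 netmask 255.255.255.0 {
    # Lease range inside SN2's own imaging subnet.
    range 192.168.2.10 192.168.2.254;
    # Assumed: SN2's imaging-side interface is the gateway for its clients.
    option routers 192.168.2.1;
    # Assumed: SN2 also serves TFTP, as described earlier in the thread.
    next-server 192.168.2.1;
    # Assumed boot file; use the one already configured on the node.
    filename "undionly.kpxe";
}
```

SN1 would keep an equivalent block for 192.168.1.0, so the two nodes no longer hand out overlapping addresses.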
Once you have routing and unique IP addresses in play, FOG should work a bit better.