FOG With more than 1 subnet
-
@george1421 We have one pfSense box as the DHCP server, with options 66 and 67 configured on the DHCP server for each subnet: undionly.kkpxe for BIOS and snponly.efi for UEFI.
-
@igorpa2 OK, very good. pfSense works very well for PXE booting with FOG. It has fields for both UEFI and BIOS, as you noted.
So have you confirmed that pxe booting works correctly for both uefi and bios on each subnet? Are you sure you are not using the non-imaging interface for anything at the moment? Let’s just focus on the interface that was defined when FOG was installed. As you noted, you can review the /opt/fog/.fogsettings file to see how the questions were answered when FOG was installed.
-
@george1421 said in FOG With more than 1 subnet:
So have you confirmed that pxe booting works correctly for both uefi and bios on each subnet?
With our current setup, yes. (One master FOG server on the 200 subnet with one network interface, and another FOG installed as a storage node on the 172 subnet with one network interface.)
-
@igorpa2 OK, so as I understand it, we can rule out anything regarding TFTP from your previous post.
So to the imaging point. You have two storage nodes. One is the master node and one is a slave node. In your FOG configuration are they in the same storage group? If yes, did you install the Location plugin into the FOG server and assign each storage node to a location?
-
@george1421 said in FOG With more than 1 subnet:
So to the imaging point. You have two storage nodes. One is the master node and one is a slave node. In your configuration are they in the same storage group?
Yes
@george1421 said in FOG With more than 1 subnet:
If yes, did you install the Location plugin into the FOG server and assign each storage node to a location?
No, I don’t have the location plugin installed.
-
@igorpa2 said in FOG With more than 1 subnet:
No, I don’t have the location plugin installed.
Without the location plugin, the clients will typically image from the master node until its client count is reached; then the next target computer will roll over to the slave node.
So go and install the location plugin. Create your two locations. They can be called anything, as long as the names are different. Finally, assign each storage node to a location.
-
Okay done, installed, created and assigned.
-
@igorpa2 OK, the last bit of setup for the location plugin is to assign the target computers to a location; that way they know which storage node is their home server.
Once you do that, the PXE booting computer will contact the master node during PXE boot, load iPXE, then find out which storage node to pull the image from.
Just be aware that, the way FOG works, you can only capture images to the master node. Slave nodes are deploy-only nodes.
So the linkage is storage node to a location and target computer to a location, so they can both find each other.
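As a rough illustration of that linkage, here is a toy model in Python. It is not FOG's actual code or database schema; the host, location, and node names are made up, and the fallback for hosts with no location is a simplification.

```python
# Toy model of the Location plugin linkage: host -> location -> storage node.
# All names are hypothetical; this is not FOG's real schema or code.

MASTER = "master-node"

# Each storage node is assigned to a location.
node_by_location = {
    "site-200": MASTER,          # master node: capture and deploy
    "site-172": "storage-node",  # slave node: deploy only
}

# Each target computer is assigned to a location.
location_by_host = {
    "pc-200-01": "site-200",
    "pc-172-01": "site-172",
}


def node_for(host: str, task: str) -> str:
    """Return the node a host would use for a 'capture' or 'deploy' task."""
    if task == "capture":
        # Images can only be captured to the master node.
        return MASTER
    location = location_by_host.get(host)
    # Simplification: a host with no location just uses the master here.
    return node_by_location.get(location, MASTER)


print(node_for("pc-172-01", "deploy"))
print(node_for("pc-172-01", "capture"))
```

In this sketch a 172 host deploys from the node at its own location but still captures to the master, which is the behaviour described above.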
-
Done, I have configured the hosts. Thank you for the help, George; tomorrow I will test whether it works. Just to know, is it possible to use only the master FOG server, installed on the 200 network with one network interface, and have it serve all the other subnets? I will test that setup again to see whether I run into problems, and to remember what problems I had when I first installed FOG. Thank you!
-
@igorpa2 said in FOG With more than 1 subnet:
Just to know, is it possible to use only the master FOG server, installed on the 200 network with one network interface, and have it serve all the other subnets?
Yes, this is how I have it set up on my campus: just one FOG server with one interface images all 6 VLANs. Understand that imaging across your VLANs will put a network load on your VLAN router, which may impact your overall transfer rates. Imaging from a FOG server on a 1GbE network, with the target on the same VLAN as the FOG server, you should see transfer rates (according to partclone) in the 5.5 to 6.2GB/min range, using a contemporary target computer as a baseline. Across your subnets I would expect rates in the lower 5GB/min range. Now, my infrastructure uses 10GbE in the core with a 10GbE router, and I see 13-14GB/min to target computers attached to an access layer switch at 1GbE.
-
Yes, on our 1GbE network we see transfer rates of 5 to 7GB/min, like on your network. But when a client connects to a node on another subnet (a client on the 172 subnet connecting to the 200 node), the transfer doesn’t exceed 2GB/min, and in some cases it only reaches a whopping 20~50MB/min. I honestly don’t know why this happens, but I’ll check all the settings again, test with the setup you use, and see what problems we have. I will update this thread with any conclusions I reach.
Thank you.
-
@igorpa2 We do have some tools built into FOS Linux (the OS that runs on the target computer): we can put FOS Linux in debug mode and then test network throughput to see whether the network links are able to pass 1GbE or not, if you want to do some debugging. My bet is that your VLAN router can’t maintain the normal traffic flow plus the added imaging traffic. In some testing I’ve done, I can flood a 1GbE link on a server with just 3 unicast images running at the same time. 6.1GB/min equates to about 100MB/s, or roughly 0.8Gb/s (close to the usable bandwidth of a 1GbE link). Understand that the number in partclone actually covers the entire data path and not just the network, so it’s a bit misleading. But know that we have tools like iperf3 on FOS Linux, so we can test bandwidth back to the FOG server when debugging slow connections.
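For anyone who wants to redo that conversion with their own numbers, here is a small sketch. It assumes partclone's GB are decimal (1 GB = 1000 MB), which is an assumption on my part rather than something confirmed in this thread.

```python
def gb_per_min_to_rates(gb_per_min: float) -> tuple:
    """Convert GB/min to (MB/s, Gb/s), assuming decimal units (1 GB = 1000 MB)."""
    mb_per_s = gb_per_min * 1000 / 60   # gigabytes per minute -> megabytes per second
    gbit_per_s = mb_per_s * 8 / 1000    # megabytes per second -> gigabits per second
    return mb_per_s, gbit_per_s


# Rates mentioned in this thread: the slow cross-subnet case and the 1GbE baseline.
for rate in (2.0, 5.5, 6.1):
    mb_s, gbit_s = gb_per_min_to_rates(rate)
    print(f"{rate:4.1f} GB/min = {mb_s:6.1f} MB/s = {gbit_s:.2f} Gb/s")
```

Keep in mind that the partclone figure reflects the whole data path, so it is only a rough proxy for what is actually on the wire.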
-
@george1421 I have tested using only one FOG server for all subnets, and I have a problem: the boot file can’t be transferred via TFTP. My client on network 172 receives its IP from DHCP normally, but does not receive the file to boot.
I also did the following test: I entered my FOG from network 172, and used the TFTP get command for my main FOG server and got the same error as when I try to boot the 172 clients. The file does not arrive; it gives a timeout error. Do you have any idea what it could be? On clients on network 200, the boot works normally.
-
@igorpa2 said in FOG With more than 1 subnet:
entered my FOG from network 172, and used the TFTP get command for my main FOG server and got the same error
I have seen something similar, but not with subnets on the same campus. I have seen this with a WAN configuration, where the MTU of the link is below the block size of TFTP, so the packets get fragmented and then discarded by the WAN router.
Let’s rule out network connectivity.
- Can you ping the FOG server on the 200 vlan from the 172 vlan?
- Is there some type of screening router or firewall between the two vlans that might filter out tftp traffic?
-
- Yes, I can ping in both directions between the two subnets.
- Yes, pfSense itself. I set the rules to pass all ports and all protocols between the FOG server IP and the 172 network.
The MTU field is blank, which seems to mean the default of 1500.
-
@igorpa2 OK, then I guess you need to see if it’s an MTU issue.
Here is a good article on this: https://www.comparitech.com/net-admin/determine-mtu-size-using-ping/ (look at the section “Find the path MTU with a Ping command”).
I kind of don’t think this is the issue, but the test is pretty easy. From a Windows or Linux computer on the 172 subnet, run the ping command as outlined in that document. I think the magic number is having an MTU larger than 1468, which is the default TFTP block size. If your MTU is 1500, more or less, then this issue isn’t related to MTU. If your MTU is less than 1468, then we can adjust the block size on the FOG server to be less than your MTU.
-
@george1421 If it’s not an MTU issue, then let’s see if you can connect to the port. Microsoft has a tool called portqry that we can use from 172 to connect to the FOG server on 200. All this tool does is try to open a port at the defined IP address. It doesn’t know what the port does; it just tries to reach it.
In the case of TFTP, it’s UDP port 69.
The command might look like this:
portqry.exe -n 192.168.200.55 -p udp -e 69
FYI: 192.168.200.55 represents whatever the FOG server’s IP address is on the imaging network.
If you can’t connect to the port, then we need to look at the FOG server to see if some kind of firewall is enabled on it, which would explain why it only allows communication on the local subnet.
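If portqry isn’t handy, a similar check can be sketched in Python by sending a real TFTP read request and seeing whether anything comes back. This is only a minimal probe under a few assumptions: the function names are made up for illustration, and the file name and IP in the usage note are the placeholders used in this thread.

```python
import socket
import struct

OP_RRQ = 1  # TFTP read request opcode (RFC 1350)
REPLY_NAMES = {3: "DATA", 5: "ERROR", 6: "OACK"}


def build_rrq(filename: str, blksize=None) -> bytes:
    """Build a TFTP read request in octet mode, optionally asking for a block size."""
    packet = struct.pack("!H", OP_RRQ)
    packet += filename.encode("ascii") + b"\0" + b"octet\0"
    if blksize is not None:
        # blksize option as defined in RFC 2348
        packet += b"blksize\0" + str(blksize).encode("ascii") + b"\0"
    return packet


def probe_tftp(server: str, filename: str, port: int = 69,
               timeout: float = 5.0, blksize=None) -> str:
    """Send one read request and report what kind of reply arrives, if any."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename, blksize), (server, port))
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            return "timeout"
    if len(data) < 2:
        return "short reply"
    opcode = struct.unpack("!H", data[:2])[0]
    return REPLY_NAMES.get(opcode, f"opcode {opcode}")
```

Calling `probe_tftp("192.168.200.55", "undionly.kkpxe")` from a machine on the 172 subnet and again from one on the 200 subnet would show whether the request and its reply survive the trip. One thing worth knowing: a TFTP server answers from a new ephemeral UDP port rather than from port 69, so a firewall in between has to let that reply through as well.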
-
I discovered the MTU is 1472. Sending a ping larger than this with the “do not fragment” option returns the message “ping: local error: Message too long, mtu=1500”.
-
@igorpa2 OK, it’s not an MTU issue.
A little context here: the default MTU is 1500, and you measured 1472 for the ping payload size. The discrepancy is 28 bytes, which is the size of the IP header (20 bytes) plus the ICMP header (8 bytes). This link is normal.
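To make the arithmetic concrete, here is a small sketch of how a ping payload and a TFTP block relate to the MTU. It assumes plain IPv4 headers without options; the function names are just for illustration.

```python
IP_HEADER = 20    # IPv4 header without options, in bytes
ICMP_HEADER = 8   # ICMP echo header, in bytes
UDP_HEADER = 8    # UDP header, in bytes
TFTP_HEADER = 4   # TFTP DATA header: 2-byte opcode + 2-byte block number


def max_ping_payload(mtu: int) -> int:
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def tftp_block_fits(mtu: int, blksize: int) -> bool:
    """True if a TFTP DATA packet carrying blksize bytes fits within the MTU."""
    return blksize + TFTP_HEADER + UDP_HEADER + IP_HEADER <= mtu


print(max_ping_payload(1500))        # the 1472 measured above
print(tftp_block_fits(1500, 1468))   # a 1468-byte block on a 1500 MTU link
```

By the same arithmetic, a 1468-byte block plus its 32 bytes of headers adds up to exactly 1500 bytes, so on a link with a smaller MTU the block size would need to come down to avoid fragmentation.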
Do the next test with portqry to see if you can reach the tftp port on the FOG server since pings work.
FWIW you can also use portqry to see if the web server is reachable by changing the protocol -p to tcp and the port -e to 80 in the command I previously provided.
-
Here’s the command result:
(I hid the IP because it is a real ip)