FOG With more than 1 subnet
-
@igorpa2 said in FOG With more than 1 subnet:
No, I don’t have the location plugin installed.
Without the location plugin the clients will typically image from the master node until the client count is reached then the next target computer will roll over to the slave node.
So go and install the location plugin. Create your two locations. They can be called anything, as long as the names are different. Finally, assign a storage node to each location.
-
@george1421 said in FOG With more than 1 subnet:
@igorpa2 said in FOG With more than 1 subnet:
No, I don’t have the location plugin installed.
Without the location plugin the clients will typically image from the master node until the client count is reached then the next target computer will roll over to the slave node.
So go and install the location plugin. Create your two locations. They can be called anything, as long as the names are different. Finally, assign a storage node to each location.
Okay done, installed, created and assigned.
-
@igorpa2 Ok the last bit of setup for the location plugin is to assign target computers to the location that way they know which storage node is their home server.
Once you do that the pxe booting computer will contact the master node during pxe boot, load ipxe then find out which storage node to use to pull the image from.
Just be aware the way fog works, you can only capture images to the master node. Slave nodes are deploy only nodes.
So the linkage is storage node to a location and target computer to a location so they can both find each other.
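The linkage described above can be pictured as two lookups. This is only an illustrative sketch, not FOG's actual code or database schema: the location names, host names, and the 172.16.0.55 address are made up, 192.168.200.55 is the placeholder server address used elsewhere in this thread, and falling back to the master node for unassigned hosts is a simplifying assumption.

```python
# Hypothetical data: storage node per location, and location per target computer.
MASTER_NODE = "192.168.200.55"  # placeholder address for the master node

node_by_location = {
    "site-200": MASTER_NODE,
    "site-172": "172.16.0.55",  # made-up slave node address
}
location_by_host = {
    "pc-office-07": "site-200",
    "pc-lab-01": "site-172",
}


def storage_node_for(host):
    """Return the node a host deploys from, based on its assigned location."""
    location = location_by_host.get(host)
    # Assumption for this sketch: hosts without a location use the master node.
    return node_by_location.get(location, MASTER_NODE)


def node_for_task(host, task):
    """Captures always go to the master node; deploys use the host's location."""
    if task == "capture":
        return MASTER_NODE
    return storage_node_for(host)


if __name__ == "__main__":
    print(node_for_task("pc-lab-01", "deploy"))
    print(node_for_task("pc-lab-01", "capture"))
```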
-
@george1421 said in FOG With more than 1 subnet:
@igorpa2 Ok the last bit of setup for the location plugin is to assign target computers to the location that way they know which storage node is their home server.
Once you do that the pxe booting computer will contact the master node during pxe boot, load ipxe then find out which storage node to use to pull the image from.
Just be aware the way fog works, you can only capture images to the master node. Slave nodes are deploy only nodes.
So the linkage is storage node to a location and target computer to a location so they can both find each other.
Well done, I have configured the hosts. Thank you George for the help; tomorrow I will test whether it works. Just to know, is it possible to use only the Master FOG installed on the 200 network, with one network interface, and have it work for all the other subnets? I will test this environment again to see whether I have problems and/or to remember what problems I had when I installed FOG for the first time. Thank you!
-
@igorpa2 said in FOG With more than 1 subnet:
Just to know, is it possible to use only the Master FOG installed on the 200 network, with one network interface, and have it work for all the other subnets
Yes, this is how I have it set up on my campus: just one fog server with one interface images all 6 vlans. Understand that imaging across your vlans will put a network load on your vlan router, which may impact your overall transfer rates. Imaging a target computer on a 1GbE network on the same vlan as the fog server, you should see transfer rates (according to partclone) in the 5.5 to 6.2GB/min range, using a contemporary target computer as a baseline. Across your subnets I would expect the lower 5GB/min range. Now my infrastructure uses 10GbE in the core with a 10GbE router, and I see 13-14GB/min to target computers attached to an access layer switch at 1GbE.
-
@george1421 said in FOG With more than 1 subnet:
@igorpa2 said in FOG With more than 1 subnet:
Just to know, is it possible to use only the Master FOG installed on the 200 network, with one network interface, and have it work for all the other subnets
Yes, this is how I have it set up on my campus: just one fog server with one interface images all 6 vlans. Understand that imaging across your vlans will put a network load on your vlan router, which may impact your overall transfer rates. Imaging a target computer on a 1GbE network on the same vlan as the fog server, you should see transfer rates (according to partclone) in the 5.5 to 6.2GB/min range, using a contemporary target computer as a baseline. Across your subnets I would expect the lower 5GB/min range. Now my infrastructure uses 10GbE in the core with a 10GbE router, and I see 13-14GB/min to target computers attached to an access layer switch at 1GbE.
Yes, in our 1GbE network we see a 5 to 7GB/min transfer rate, like your network, but when a client connects to a node on another subnet (client on the 172 subnet connecting to the 200 node) the transfer doesn’t exceed 2GB/min, and in some cases it only reaches a whopping 20~50MB/min. I honestly don’t know why this happens, but I’ll check all the settings again, test using your setup, and see what problems we have. I will update this thread again with any conclusions I reach.
Thank you.
-
@igorpa2 We do have some tools built into FOS Linux (the OS that runs on the target computer): we can put FOS Linux in debug mode and then test network throughput to see whether the network links can pass 1GbE or not, if you want to do some debugging. My bet is that your vlan router can’t maintain the normal traffic flow plus the added imaging traffic. In some testing I’ve done, I can flood a 1GbE link on a server with just 3 unicast images running at the same time. 6.1GB/min equates to about 100MB/s, or roughly 0.8Gb/s (close to the usable bandwidth of a 1GbE link). Understand that the number in partclone reflects the entire data path and not just the network, so it’s a bit misleading. But know we have tools like iperf3 on FOS Linux, so we can test bandwidth back to the fog server when debugging slow connections.
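For anyone who wants to check the unit conversion, here is a small sketch of the arithmetic. It assumes decimal units (1 GB = 1000 MB = 8 Gb) and treats the partclone figure as if it were pure network traffic, which, as noted above, it is not. The function names are just for illustration.

```python
def gb_per_min_to_mb_per_s(gb_per_min):
    """Convert a partclone-style GB/min rate to megabytes per second."""
    return gb_per_min * 1000 / 60


def gb_per_min_to_gbit_per_s(gb_per_min):
    """Convert GB/min to gigabits per second (8 bits per byte)."""
    return gb_per_min * 8 / 60


if __name__ == "__main__":
    # Rates mentioned in this thread: same-vlan, cross-vlan, and the slow case.
    for rate in (6.1, 5.0, 2.0):
        mb_s = gb_per_min_to_mb_per_s(rate)
        gbit_s = gb_per_min_to_gbit_per_s(rate)
        print(f"{rate} GB/min = {mb_s:.1f} MB/s = {gbit_s:.2f} Gb/s")
```

Run this way, 6.1GB/min works out to a little over 100MB/s, which is why a handful of simultaneous unicast deployments is enough to saturate a single 1GbE server link.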
-
@george1421 I have tested using only one FOG server for all subnets, and I have a problem: the boot file can’t be transferred via TFTP. My client on network 172 receives its IP from DHCP normally, but does not receive the boot file.
I also did the following test: I entered my FOG from network 172, and used the TFTP get command for my main FOG server and got the same error as when I try to boot the 172 clients. The file does not come; it gives a timeout error. Do you have any idea what it could be? On clients from network 200, boot occurs normally.
-
@igorpa2 said in FOG With more than 1 subnet:
entered my FOG from network 172, and used the TFTP get command for my main FOG server and got the same error
I have seen something similar, but not with subnets on the same campus. I have seen this with a WAN configuration, where the MTU of the link is below the block size of tftp and the packets get fragmented and then discarded by the WAN router.
Let’s rule out network connectivity.
- Can you ping the FOG server on the 200 vlan from the 172 vlan?
- Is there some type of screening router or firewall between the two vlans that might filter out tftp traffic?
-
- Yes, I can ping in both directions between the two subnets.
- Yes, pfSense itself. I set the rules to pass all ports and all protocols between the FOG server IP and the 172 network.
The MTU field is blank, which seems to mean the default of 1500.
-
@igorpa2 OK then I guess you need to see if its an mtu issue then
Here is a good article on this: https://www.comparitech.com/net-admin/determine-mtu-size-using-ping/ look at the section “Find the path MTU with a Ping command”
I kind of don’t think this is the issue, but the test is pretty easy. From a windows or linux computer on the 172 subnet, run the ping command as outlined in that document. I think the magic number is having an MTU larger than 1468, which is the default tftp block size. If your MTU is 1500, more or less, then this issue isn’t related to MTU. If your MTU is less than 1468, then we can adjust the block size on the fog server to be less than your MTU.
-
@george1421 If it’s not an MTU issue then let’s see if you can connect to the port. Microsoft has a tool called portqry that we can use from 172 to connect to the fog server on 200. All this tool does is try to open a port at the defined IP address. It doesn’t know what the port does, it just tries to reach it.
In the case of tftp its udp port 69.
The command might look like this
portqry.exe -n 192.168.200.55 -p udp -e 69
FYI: 192.168.200.55 represents whatever the fog server’s IP address is for the imaging network.
If you can’t connect to the port then we need to look at the fog server to see if some kind of firewall is enabled on it, which would explain why it only allows communication on the local subnet.
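If portqry isn’t handy, a similar reachability check can be sketched in Python by sending a TFTP read request and waiting for any reply. This is a rough sketch, not a FOG tool: the file name `undionly.kpxe` and the function names are assumptions for illustration, and any answer at all (data, option ack, or an error packet) only shows that UDP traffic made it there and back.

```python
import socket


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode 1, then filename and mode, each NUL-terminated."""
    return b"\x00\x01" + filename.encode("ascii") + b"\x00" + mode.encode("ascii") + b"\x00"


def tftp_answers(host, filename="undionly.kpxe", port=69, timeout=3.0):
    """Return True if anything answers the read request before the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_rrq(filename), (host, port))
            # The server replies from a new ephemeral port, not from port 69,
            # so accept a datagram from any source port.
            data, _addr = sock.recvfrom(2048)
        except OSError:  # covers timeouts and ICMP "port unreachable" errors
            return False
        return len(data) >= 2


if __name__ == "__main__":
    # 192.168.200.55 stands in for the fog server's real IP address.
    print(tftp_answers("192.168.200.55"))
```

Because the reply comes from a different source port than the one the request was sent to, a firewall between the subnets has to track or proxy TFTP for the answer to get back, so a False result here does not by itself say which device dropped the traffic.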
-
@george1421 said in FOG With more than 1 subnet:
@igorpa2 OK then I guess you need to see if its an mtu issue then
Here is a good article on this: https://www.comparitech.com/net-admin/determine-mtu-size-using-ping/ look at the section “Find the path MTU with a Ping command”
I kind of don’t think this is the issue, but the test is pretty easy. From a windows or linux computer on the 172 subnet, run the ping command as outlined in that document. I think the magic number is having an MTU larger than 1468, which is the default tftp block size. If your MTU is 1500, more or less, then this issue isn’t related to MTU. If your MTU is less than 1468, then we can adjust the block size on the fog server to be less than your MTU.
I discovered the MTU is 1472. Sending a ping larger than this with the “do not fragment” option returns the message “ping: local error: Message too long, mtu=1500”
-
@igorpa2 OK, it’s not an MTU issue.
A little context here: the default MTU is 1500, and you measured 1472 for the ping payload. The discrepancy is 28 bytes, which is the IP header (20 bytes) plus the ICMP header (8 bytes). This link is normal.
Do the next test with portqry to see if you can reach the tftp port on the FOG server since pings work.
FWIW you can also use portqry to see if the web server is reachable by changing the protocol -p to tcp and the port -e to 80 in the command I previously provided.
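The MTU numbers in this test can be checked with simple arithmetic. The sketch below assumes IPv4 without options (20-byte header), an 8-byte ICMP header, an 8-byte UDP header, and the 4-byte TFTP data header; the 1468 block size is the value quoted earlier in this thread, not something verified against a particular server.

```python
IPV4_HEADER = 20       # bytes, IPv4 header without options
ICMP_HEADER = 8        # bytes, ICMP echo header
UDP_HEADER = 8         # bytes
TFTP_DATA_HEADER = 4   # bytes: 2-byte opcode + 2-byte block number


def max_ping_payload(mtu):
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tftp_packet_size(block_size):
    """Size of the IP packet carrying one full TFTP data block."""
    return block_size + TFTP_DATA_HEADER + UDP_HEADER + IPV4_HEADER


def block_fits(block_size, mtu):
    """True if a full TFTP data block crosses the link without fragmentation."""
    return tftp_packet_size(block_size) <= mtu


if __name__ == "__main__":
    print(max_ping_payload(1500))   # payload limit measured in the ping test
    print(tftp_packet_size(1468))   # on-the-wire size of a 1468-byte block
    print(block_fits(1468, 1500))
```

With these assumptions a 1472-byte ping payload corresponds to a 1500-byte MTU, and a 1468-byte TFTP block produces a packet of exactly 1500 bytes, so it just fits on a standard link.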
-
@george1421 said in FOG With more than 1 subnet:
@george1421 If it’s not an MTU issue then let’s see if you can connect to the port. Microsoft has a tool called portqry that we can use from 172 to connect to the fog server on 200. All this tool does is try to open a port at the defined IP address. It doesn’t know what the port does, it just tries to reach it.
In the case of tftp its udp port 69.
The command might look like this
portqry.exe -n 192.168.200.55 -p udp -e 69
FYI: 192.168.200.55 represents whatever the fog server’s IP address is for the imaging network.
If you can’t connect to the port then we need to look at the fog server to see if some kind of firewall is enabled on it, which would explain why it only allows communication on the local subnet.
Here’s the command result:
(I hid the IP because it is a real ip)
-
@igorpa2 OK, this is getting interesting. It’s showing that the port is being filtered or blocked, so something seems to be stopping the tftp communications.
Since the pfsense router is between the two, look at the firewall logs to see if pfsense for some reason is blocking that connection. Remember the key thing to look for is UDP and port :69 in the log. If you see it being blocked you can hit the plus next to the ip address to add it to the quick rule allow list.
-
@george1421 Ok, I will check pfSense again. It was the first thing I suspected, but since I was not successful, I came here to ask for help hehe. I don’t understand why it would be blocking, because, as I said, I opened all ports and protocols for everything that comes from the 172 network and goes to the main FOG address.
-
@igorpa2 While I know it’s probably a little annoying to hear “test this, try that”, you should be learning some debugging steps here and maybe some tools you have not used before. So I think it’s a good thing. Learning is always good.
If your pfSense looks like it’s working well, we could create a rule specifically for tftp between 172 and 200. In our case the rule would allow the traffic but, more importantly, turn on logging, so that when a packet matches we get a log entry and know the packet is flowing through the firewall.
We are tracing the logical path between the target computer and the fog server. Right now we are having an issue with communications; this doesn’t have anything to do with FOG just yet, it’s at a lower level of the OSI stack.
If it’s not pfSense then we need to look at the fog server itself. Since you can ping the fog server from 172, routing is working, so that should not be the problem. What host OS is the fog server running? Is it ubuntu? If yes issue the command
sudo ufw status
If it responds with “Status: active” then the firewall is enabled on ubuntu.
-
@george1421 said in FOG With more than 1 subnet:
What host OS is the fog server running?
I’m using debian.
After some searching on the internet, I found these steps (https://gist.github.com/troyfontaine/59ace875a951154f881bfe3d297d1a10) to configure pfSense with VLANs and a TFTP server. After applying step 3, I can now boot on the other vlan (172), but it receives the boot file more slowly than vlan 200. But it works!! I don’t really know yet if this was the problem, but it worked haha. Now I’ll run some tests and see if imaging is working correctly.
Thank you.