FOG working with virtual interface
-
TFTP HOST is set to 192.100.1.1.
Problem:
Booting and getting into the FOG menu works every time.
In most cases, after choosing fog registration, quick image or fog sysinfo, FOG runs into the timeout mentioned before. Sometimes it works: the client gets an IP address and completes the registration, or FOG deploys the image. I would say it works roughly every fourth time. I tried FOG 1.5.8 and 1.5.9-RC1. Both versions have problems with providing DHCP addresses after a FOG menu item is selected.
I thought it might have something to do with the kernel, so I tried to change the kernel in use and ran into the kernel update problem.
-
@Oleg Do you have a cheap unmanaged network switch you can place between the target computer and the building switch? About 90% of the time we see this condition, it is because the building switch is using spanning tree but not one of the fast spanning tree protocols like port-fast, RSTP, MSTP, fast-STP (or whatever your switch vendor calls it), and the FOG server itself is not the cause. A (dumb) unmanaged switch will typically mask the problem, and as a test it tells us whether spanning tree is involved.
There is another way to test if you don’t have an unmanaged switch, but placing an unmanaged switch between the target computer and the network is the quickest test.
-
@george1421
I tried to do a quick registration 10 times with my unmanaged TP-Link switch and in 6 of 19 cases the client PC didn’t get an IP address and ran into the timeout.
In those cases where the client got an ip address, the client waited at least 15 seconds until it got the dhcp address from fog.
-
@Sebastian-Roth
I noticed that the installation process (./installfog.sh) generates a strange subnet declaration in dhcpd.conf if a virtual interface is configured:
network settings of system:
auto eth0
iface eth0 inet static
    address 192.168.111.8
    netmask 255.255.255.0
    gateway 192.168.111.1
    dns-nameservers 192.168.111.1 192.168.100.100
virtual interface:
auto eth0:0
iface eth0:0 inet static
    address 192.100.1.1
    network 192.100.1.0
generated DHCP settings:
subnet 192.32.1.0 netmask 255.255.255.0{
    option subnet-mask 255.255.255.0;
    range dynamic-bootp 192.168.111.10 192.168.111.254;
    default-lease-time 21600;
    max-lease-time 43200;
    option routers 192.168.111.1;
    option domain-name-servers 192.168.111.1;
    next-server 192.168.111.8;
If I disable the virtual interface before running installfog.sh, everything is OK.
I manually change all network settings to 192.100.1.x after the script has finished.
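For anyone who wants to check the numbers, here is a minimal sketch using Python's standard ipaddress module. It computes the subnet each interface should belong to and compares it with the generated one. The /24 netmask for eth0:0 is an assumption (the post only gives its "network" line), and the function name is made up for illustration. The last lines show an arithmetic observation only: 192.32.1.0 happens to equal the bitwise AND of the two interface addresses. Whether installfog.sh actually does that is not confirmed anywhere in this thread.

```python
import ipaddress


def expected_subnet(address, netmask):
    """Return the network that an address/netmask pair belongs to."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


# Values from the post above.
eth0 = expected_subnet("192.168.111.8", "255.255.255.0")
# Assumption: eth0:0 uses a /24 mask, matching "network 192.100.1.0".
alias = expected_subnet("192.100.1.1", "255.255.255.0")
generated = ipaddress.ip_network("192.32.1.0/255.255.255.0")

print(eth0)                         # 192.168.111.0/24
print(alias)                        # 192.100.1.0/24
print(generated in (eth0, alias))   # False: matches neither interface

# The configured range does not even lie inside the generated subnet.
print(ipaddress.ip_address("192.168.111.10") in generated)  # False

# Observation only: AND of the two interface addresses gives 192.32.1.0.
mixed = ipaddress.IPv4Address(
    int(ipaddress.IPv4Address("192.168.111.8"))
    & int(ipaddress.IPv4Address("192.100.1.1"))
)
print(mixed)                        # 192.32.1.0
```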
-
@Oleg said in FOG working with virtual interface:
client waited at least 15 seconds until it got the dhcp address from fog
See, the issue is that if it IS spanning tree, standard spanning tree doesn’t start forwarding data until the 27 second timer counts down. So the switch will not forward data for 27 seconds. By then the FOS Linux engine has already given up. Typically those cheap unmanaged switches don’t support spanning tree, so they keep the building switch port active as PXE hands off to iPXE and then iPXE hands off to FOS Linux.
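To make the timing argument concrete, here is a rough sketch that models it. Only the 27 seconds comes from the post above. The per-attempt DHCP timeouts are hypothetical, since the actual FOS values are not stated in this thread, and the function name is made up. The model assumes an attempt succeeds if the port starts forwarding before that attempt ends.

```python
def first_successful_attempt(port_ready_after, attempt_timeouts):
    """Return the 1-based DHCP attempt that can succeed, or None.

    port_ready_after: seconds after link-up until the switch port forwards.
    attempt_timeouts: how long the client waits on each attempt (hypothetical).
    """
    elapsed = 0.0
    for index, timeout in enumerate(attempt_timeouts, start=1):
        elapsed += timeout
        # The attempt is still running when the port begins to forward.
        if elapsed > port_ready_after:
            return index
    return None  # client gave up before the port started forwarding


# Hypothetical client: three attempts of 5 seconds each.
print(first_successful_attempt(27, [5, 5, 5]))     # None
# Same client on a port that forwards immediately.
print(first_successful_attempt(0, [5, 5, 5]))      # 1
# A more patient hypothetical client would get through.
print(first_successful_attempt(27, [10, 10, 10]))  # 3
```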
-
@george1421
I disabled the virtual interface and changed all settings to the 192.168.111.8 IP address of the FOG server.
It seems to work a little better, but the clients still often run into timeouts after the FOG menu.
FOG and the client are both connected to the TP-Link switch and nothing else.
-
@Sebastian-Roth
Any ideas regarding the issue with the kernel update?
I disabled the virtual interface, ran the 1.5.8 installer again and tried the kernel update. I still get the error with ftp_rename() that I mentioned in my first post.
My system: Ubuntu Server 18.04.4
-
@Oleg For the kernel update issue you wanna go through this wiki article: https://wiki.fogproject.org/wiki/index.php/Troubleshoot_FTP
-
@Sebastian-Roth
I went through the wiki article and everything seems to be fine. I saw that FOG downloads the kernel twice when you do an update. The first time is when you expand the kernel you want to download and click on “download” (the kernel is downloaded as “bzImage”). The second time is after you enter the kernel name and click on “install”; this time the kernel is downloaded with the new name.
-
I don’t know if it has something to do with the timeout issue, but in /etc/default/grub I added
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"
because we want to use the eth0 name instead of enp1s0.
I manually changed the kernels in /var/www/html/fog/service/ipxe/ to the 4.15.2 kernel.
Now I get eth0 instead of enp1s0:
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Populating /dev using udev: done
Saving random seed: OK
Staring haveged: haveged: listening socket at 3 OK
Starting eth0 interface and waiting for the link to come u
With the older kernel the issue with the timeout is gone.
-
@Oleg said:
I don’t know if it has something to do with the timeout issue, but in /etc/default/grub I added
We don’t usually use GRUB for PXE booting hosts! It seems like you have customized your setup a fair bit. I have to ask you to tell us more about the customization so we are able to help you properly!
I manually changed the kernels in /var/www/html/fog/service/ipxe/ to the 4.15.2 kernel.
With the older kernel the issue with the timeout is gone.
That’s interesting. So from what we know so far, it looks like a Linux kernel network driver issue. Let’s start by finding out which driver is used. Please boot Windows on that machine, open Device Manager and get us the device ID of the network adapter from there. It is usually in the form 12c4:5f78.
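If it helps, Device Manager lists PCI hardware IDs in the form PCI\VEN_xxxx&DEV_xxxx&..., and the short vendor:device form is just those two hex values joined with a colon. A small sketch of the conversion follows. The function name is hypothetical, and the example string is built from the placeholder ID above, not taken from a real device.

```python
import re

# Matches the vendor and device fields of a Windows PCI hardware ID.
_HWID = re.compile(r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})")


def pci_id_from_hardware_id(hardware_id):
    """Convert 'PCI\\VEN_xxxx&DEV_yyyy...' to 'xxxx:yyyy', or return None."""
    match = _HWID.search(hardware_id)
    if match is None:
        return None
    vendor, device = match.groups()
    return f"{vendor}:{device}".lower()


# Placeholder string for illustration only.
example = r"PCI\VEN_12C4&DEV_5F78&SUBSYS_00000000&REV_00"
print(pci_id_from_hardware_id(example))  # 12c4:5f78
```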