Multicast stuck
-
Hello,
Some weeks ago I split my configuration into two VMs: one as a normal installation and the other as a storage node.
Since then my multicast has not been working. I really have no idea how this can be resolved. Is there anyone who can point me in the right direction?
Both VMs are running AlmaLinux 9 and FOG 1.5.10.16.
multicast configuration
storage node configuration
multicast.log.udpcast.5 on storage node
[09-17-24 11:21:16 am] Task started
Udp-sender 20200328
Using mcast address 234.54.68.102
UDP sender for /home/images/EEETopASUSPROA4110/d1p1.img at 10.54.68.102 on enp1s0
Broadcasting control to 224.0.0.1
New connection from 10.54.68.119 (#0) 00000009
New connection from 10.54.68.170 (#1) 00000009
Starting transfer: 00000009
Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
[the same Timeout line repeats until both clients are dropped]
Dropping client #0 because of timeout
Disconnecting #0 (10.54.68.119)
Dropping client #1 because of timeout
Disconnecting #1 (10.54.68.170)
bytes= re-xmits=0000000 ( 0.0%) slice=0112 - 0
Transfer complete.
Udp-sender 20200328
Using mcast address 234.54.68.102
UDP sender for /home/images/EEETopASUSPROA4110/d1p2.img at 10.54.68.102 on enp1s0
Broadcasting control to 224.0.0.1
multicast.log on storage node
[09-17-24 11:21:16 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 is new
[09-17-24 11:21:16 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 image file found, file: /home/images/EEETopASUSPROA4110
[09-17-24 11:21:16 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 2 clients found
[09-17-24 11:21:16 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 sending on base port 52186
[09-17-24 11:21:16 am] | Command: /usr/local/sbin/udp-sender --interface enp1s0 --min-receivers 2 --max-wait 600 --portbase 52186 --full-duplex --ttl 32 --nokbd --nopointopoint --file /home/images/EEETopASUSPROA4110/d1p1.img;/usr/local/sbin/udp-sender --interface enp1s0 --min-receivers 2 --max-wait 60 --portbase 5>
[09-17-24 11:21:16 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 has started
[09-17-24 11:21:26 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 is already running with pid: 130868
[09-17-24 11:21:36 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 is already running with pid: 130868
[09-17-24 11:21:46 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 is already running with pid: 130868
[09-17-24 11:21:56 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 is already running with pid: 130868
[09-17-24 11:22:06 am] | Task ID: 5 Name: Multi-Cast Task - Clone1 is already running with pid: 130868
.fogsettings on storage node
## Start of FOG Settings
## Created by the FOG Installer
## Find more information about this file in the FOG Project wiki:
## https://wiki.fogproject.org/wiki/index.php?title=.fogsettings
## Version: 1.5.10.29
## Install time: Mon Jun 24 10:11:34 2024
ipaddress='10.54.68.102'
copybackold='0'
interface='enp1s0'
submask='255.255.255.0'
hostname='server-08.eazis'
routeraddress=''
plainrouter=''
dnsaddress=''
username='fogproject'
password='aNjVWQhQeiW9KpXxeP*t'
osid='1'
osname='Redhat'
dodhcp=''
bldhcp=''
dhcpd='dhcpd'
blexports='1'
installtype='S'
snmysqluser='fogstorage'
snmysqlpass='BL-XFitFm0lNFkhh5UGI'
snmysqlhost='10.54.68.101'
mysqldbname='fog'
installlang='0'
storageLocation='/home/images'
fogupdateloaded=1
docroot='/var/www/html/'
webroot='/fog/'
caCreated='yes'
httpproto='http'
startrange=''
endrange=''
packages='bc curl gcc gcc-c++ genisoimage git gzip httpd lftp m4 make mariadb mariadb-server mod_ssl mtools net-tools nfs-utils openssl php php-bcmath php-cli php-common php-fpm php-gd php-ldap php-mbstring php-mysqlnd php-process syslinux tar tftp-server unzip util-linux-user vsftpd wget xz-devel'
noTftpBuild=''
tftpAdvOpts=''
sslpath='/opt/fog/snapins/ssl/'
backupPath='/home/'
armsupport=''
php_ver='8.0'
sslprivkey='/opt/fog/snapins/ssl//.srvprivate.key'
sendreports='Y'
## End of FOG Settings
-
@Eazis So you went from physical servers to VMs on the same subnet?
Are these VMs running under VMware? If so, is your vSwitch configured for promiscuous mode?
There isn’t a lot of useful info here. We need a bit more detail about your environment. What else changed other than physical to virtual?
Does the FOG server master node have more than one network interface in it?
FWIW only the master node will send out multicast images, so the storage node shouldn’t be in the picture here.
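The question about extra interfaces can be answered from the shell. A minimal sketch, assuming the iproute2 `ip` tool is installed on the server; `up_multicast_ifaces` is a hypothetical helper name, not part of FOG:

```shell
# Hypothetical helper (not part of FOG): reads `ip -o link show` output on
# stdin and prints every non-loopback interface whose flags include both
# UP and MULTICAST.
up_multicast_ifaces() {
    awk '{
        name = $2
        sub(/:$/, "", name)     # "enp1s0:" -> "enp1s0"
        sub(/@.*/, "", name)    # drop "@parent" on VLAN sub-interfaces
        # "<BROADCAST,MULTICAST,UP>" -> ",BROADCAST,MULTICAST,UP,"
        flags = "," substr($3, 2, length($3) - 2) ","
        if (name != "lo" && index(flags, ",UP,") && index(flags, ",MULTICAST,"))
            print name
    }'
}
```

Run it as `ip -o link show | up_multicast_ifaces` on the master. More than one line of output means there is more than one interface that multicast traffic could leave from.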
-
@george1421 Thanks for your answer.
OK, I will give you as much info as needed.
My first setup, about a year ago, was one physical server. That server was retired and needed to be replaced.
I replaced it with an HP DL360 Gen9 and put it into my virtual datacenter lab (which runs oVirt). Multicast worked there.
At that time I had only one VM running FOG.
Then I decided to change to a cluster of 3x HP DL360 Gen9 and split FOG into a master and a storage node, to have failover.
All 3 servers have a physical network link to the same switch (Netgear GS108PE), and these networks are only used inside the VMs.
Both VMs are on the same subnet. The other network ports on the HPs are used for the Gluster network and the management network, which oVirt needs; they are not available inside the two VMs and are connected to my Ubiquiti switches.
The storage node is the master node. Maybe this is wrong? Maybe I mixed some things up?
FOG installation -> 10.54.68.101
FOG storage node -> 10.54.68.102
My FOG installation multicast.log says this ->
[09-18-24 9:36:35 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:36:45 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:36:55 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:37:05 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:37:15 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:37:25 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:37:35 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:37:45 am] Interface not ready, waiting for it to come up: 10.54.68.101
network interface on master
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 56:6f:26:4b:00:09 brd ff:ff:ff:ff:ff:ff
    inet 10.54.68.101/24 brd 10.54.68.255 scope global noprefixroute enp1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::546f:26ff:fe4b:9/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
network interface on node
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 56:6f:26:4b:00:08 brd ff:ff:ff:ff:ff:ff
    inet 10.54.68.102/24 brd 10.54.68.255 scope global noprefixroute enp1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::546f:26ff:fe4b:8/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
If you need any more info from me, I can provide it.
-
@Eazis The storage node can be the master of a different storage group, if I recall correctly, but the default storage group should have the master FOG server (where the web server is) as its master. There are a lot of ways to configure it, but for simplicity in testing this out I would suggest one storage group with the master FOG server as the master node.
What version of FOG and kernel is on the master and the nodes? They should all be the same. Latest stable, latest dev-branch, or latest working-1.6-beta would be best, as we may have already fixed the issue if it’s not a configuration issue.
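One quick way to compare installer versions across machines is the `## Version:` comment at the top of `.fogsettings` (the storage node excerpt in the first post has one). A sketch only: `fog_settings_version` is a hypothetical helper, and the `/opt/fog/.fogsettings` path in the usage line is an assumption that may differ on your install.

```shell
# Hypothetical helper (not part of FOG): reads a .fogsettings file on stdin
# and prints the value of its first "## Version:" comment line.
fog_settings_version() {
    sed -n 's/^## Version:[[:space:]]*//p' | head -n 1
}
```

Run `fog_settings_version < /opt/fog/.fogsettings` on the master and on each node, then compare the results.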
-
@Eazis said in Multicast stuck:
[09-18-24 9:36:35 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:36:45 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:36:55 am] Interface not ready, waiting for it to come up: 10.54.68.101
I think this is the most interesting bit of info. Why would udp-sender wait for an interface to come up that is clearly up? In the global configuration, make sure the proper interface is defined for the imaging interface. There may also be an interface setting in the multicast section of the configs. I don’t have a FOG server near me at the moment, but it should be under FOG UI -> FOG Configuration -> FOG Settings. Hit the Expand All button, then search for
multi
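Whether the address in the "Interface not ready" message is actually bound to an interface can also be checked from the shell. A small sketch, assuming iproute2's `ip` is available; `iface_for_ip` is a hypothetical helper, not a FOG command:

```shell
# Hypothetical helper (not part of FOG): reads `ip -o -4 addr show` output on
# stdin and prints the name of the interface that carries the given IPv4
# address. Prints nothing if no interface has that address.
iface_for_ip() {
    awk -v ip="$1" '{
        split($4, addr, "/")    # "10.54.68.101/24" -> "10.54.68.101"
        if ($3 == "inet" && addr[1] == ip) print $2
    }'
}
```

Run it as `ip -o -4 addr show | iface_for_ip 10.54.68.101` on the master and compare the result with the interface name configured in FOG Settings. Empty output would mean no interface has that address.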
Also, when you schedule the multicast deployment, run this from the FOG server console
ps aux | grep udp-sender
and save the output. I think the udp-sender command parameters will also list the interface udp-sender is using.
-
@george1421 said in Multicast stuck:
[09-18-24 9:36:35 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:36:45 am] Interface not ready, waiting for it to come up: 10.54.68.101
[09-18-24 9:36:55 am] Interface not ready, waiting for it to come up: 10.54.68.101
If you disable the storage node, you get this message in the log.
I found out that it wasn’t a FOG problem: I had mixed up some VLANs.
Multicast works now. Problem solved.
-