Wiki errors - "Troubleshooting a multicast"
-
@3mu Looking at the code (I’m not a programmer), it looks like the multicast task service is what creates the udp-sender task and its command syntax. I still haven’t found which database record is used to source the interface name.
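If you want to see exactly what command line FOG generated, the multicast log is the quickest place to look. Something like this should show it (assuming the default log location, which may differ on your install):

```bash
# Show the most recent udp-sender / interface lines the multicast service logged.
# /opt/fog/log/multicast.log is the usual default location; adjust if yours differs.
grep -iE 'udp-sender|interface' /opt/fog/log/multicast.log | tail -n 5

# Or watch it live while creating a new multicast task:
tail -f /opt/fog/log/multicast.log
```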
-
@3mu @george1421 Those settings were used in the past, but nowadays FOG tries to determine the interface from the system no matter what setting you have. I know we should have removed those settings at some point, but I guess someone forgot. Code where the interface is being determined for multicast: 1, 2, 3
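Roughly speaking, the idea is to look up which interface actually carries the node’s IP rather than trusting the stored setting. In shell terms it boils down to something like this (just an illustration of the idea, not the actual PHP code linked above):

```bash
#!/bin/bash
# Illustration only: find the interface that carries a given local IP.
NODE_IP="10.0.0.203"   # example: the storage node IP from this thread

# "ip -o -4 addr show" prints one line per IPv4 address; field 2 is the
# device name and field 4 is the address in CIDR form (e.g. 10.0.0.203/24).
IFACE=$(ip -o -4 addr show | awk -v ip="$NODE_IP" '
    { split($4, a, "/"); if (a[1] == ip) { print $2; exit } }')

echo "udp-sender --interface ${IFACE:-<nothing found>}"
```

If a lookup like that picks up the wrong token (or nothing at all), you end up with a mangled --interface argument like the one in the log.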
@3mu Please run the following command as root and post output here:
/sbin/ip route
The IP address configured for the storage node might play a role here too. Please post the IP set for the storage node as well.
-
@george1421 - Current uptime was 1 day 14 hours. I have just now:
- Cleared all tasks, rebooted and created a new multicast task - the log file still says “--interface dev”.
- Cleared the tasks, changed the multicast interface to “eth0” (which doesn’t exist) and created a new multicast task - same result.
- Cleared the tasks, rebooted and created a new multicast task - same result.
The only things running on this server are FOG and dnsmasq. It is a VM under KVM and has 4GB RAM and a single processor. I have done a single in-place upgrade from 1.5.7 since my original install. I did have to recover the MySQL password (I was distracted and lost the new password before I could record it), but I recovered it without issue.
I have also updated the fogservice.class.php and FOGSnapinReplicator.service files as per the “Interface not ready, waiting for it to come up” post.
-
@3mu said in Wiki errors - "Troubleshooting a multicast":
I have also updated the fogservice.class.php and FOGSnapinReplicator.service files as per the “Interface not ready, waiting for it to come up” post.
Good point. I just remembered that too.
Please run the following command as root and post output here:
/sbin/ip route
The IP address configured for the storage node might play a role here too. Please post the IP set for the storage node as well.
-
@Sebastian-Roth said in Wiki errors - "Troubleshooting a multicast":
/sbin/ip route
default via 10.0.0.138 dev ens3 proto dhcp src 10.0.0.203 metric 100
10.0.0.0/24 dev ens3 proto kernel scope link src 10.0.0.203
10.0.0.138 dev ens3 proto dhcp scope link src 10.0.0.203 metric 100
(run under sudo)
Storage Node config:
-
@3mu said in Wiki errors - "Troubleshooting a multicast":
default via 10.0.0.138 dev ens3 proto dhcp src 10.0.0.203 metric 100
Here we’ve got it. You seem to have configured your network interface ens3 via DHCP - at least that’s what the output suggests. While our code should not fail to find the right interface, I still wonder why you would configure a server’s IP via DHCP!?
Just configure the same IP statically in the Linux network settings, restart your FOG server, and it will most probably work!
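If the VM uses NetworkManager, something along these lines would do it (connection name, prefix and DNS below are guesses based on your ip route output, so adjust to your setup; on other network stacks use the equivalent static config):

```bash
# Hypothetical example: pin the current DHCP lease as a static config
# via NetworkManager. "ens3" is assumed to also be the connection name.
nmcli connection modify ens3 \
    ipv4.method manual \
    ipv4.addresses 10.0.0.203/24 \
    ipv4.gateway 10.0.0.138 \
    ipv4.dns 10.0.0.138
nmcli connection up ens3
```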
-
@Sebastian-Roth - I look after a network with over 600 servers at work. The ones that cause me problems are the ones with static IP addresses - generally because somebody makes a typo when they are configuring an interface. We also do a lot of migrations, mergers and other changes, so reserved addresses in DHCP make that much easier, as well as enforcing consistency and a basic level of self-documentation. DHCP also makes it easier to deploy in environments where the user does not have control of the network.
I’m sorry for taking up your time, @george1421, and thank you both.
-
@3mu I will be looking into fixing this issue! Though I can’t promise you when the next release will be out. Would you want me to post the fix here so you could manually add it?
-
That would be great, thanks @Sebastian-Roth. I’m happy to test.
-
@3mu Sorry for the delay. I just had a play with this and I think it’s best to go for a simple fix: https://github.com/FOGProject/fogproject/commit/21460c1d8ba7dae1b2988b9287c188595ce01e9d
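If you want to try it before the next release, one way to pull that single commit in is to grab it as a patch from GitHub and apply it to your local source tree (this assumes you still have the fogproject checkout you installed from; the path below is hypothetical):

```bash
# Download the linked commit as a patch, apply it to the local
# fogproject tree, then re-run the installer to roll it out.
cd ~/fogproject   # hypothetical path to the source tree you installed from
curl -sL https://github.com/FOGProject/fogproject/commit/21460c1d8ba7dae1b2988b9287c188595ce01e9d.patch \
    | git apply
cd bin && sudo ./installfog.sh
```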
-
Thanks, @Sebastian-Roth. I’ll test as soon as I finish sorting out a few disasters - hopefully within the next day.
-
@Sebastian-Roth - Sorry for the delay - some other unrelated issues and then a pandemic to deal with! The fix works perfectly.