Wiki errors - "Troubleshooting a multicast"
-
@3mu said in Wiki errors - "Troubleshooting a multicast":
ps only gives:
[username] 18913 0.0 0.0 13136 1092 pts/0 S+ 01:39 0:00 grep --color=auto udp-send
At first sight I would guess that it tries to start udp-sender, fails because it cannot find the interface dev, and quits. -
@3mu Will you show us a screenshot of the fog configuration->multicast settings page? I’m confused about where it’s getting the dev name from. You don’t have the network interface set up as /dev/ens3 for some reason? -
There don’t appear to be any spaces before or after “ens3” - I deleted the contents and typed it in again. -
@3mu Well two things.
- So far no one else has reported this strange behavior. That doesn’t mean your install isn’t doing this, it just means there is something unique with your setup because if this was a systematic issue (programming) everyone would have the same issue.
- I guess we need to reverse engineer where that interface name comes from.
Just to confirm you have rebooted your fog server since setting that field?
-
@3mu Looking at the code (I’m not a programmer), it looks like the multicast task service is what creates the udp-send task and its command syntax. I still haven’t found which database record the interface name is sourced from.
-
@3mu @george1421 Those settings were used in the past but nowadays FOG tries to determine the interface no matter what setting you have from the system. I know we should have removed those settings at some point but I guess someone forgot. Code where interface is being determined for multicast: 1, 2, 3
@3mu Please run the following command as root and post output here:
/sbin/ip route
The IP address configured for the storage node might play a role here too. Please post the IP set for the storage node as well. -
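For readers following along, here is a minimal Python sketch of one way to map a storage node IP to an interface name from `ip route` output. It is not FOG's actual code: the function name `interface_for_ip` is made up for illustration, and the sample text is the route table posted further down in this thread.

```python
def interface_for_ip(route_text, ip):
    """Return the interface of the first route line carrying 'src <ip>', or None.

    Hypothetical helper for illustration only; not part of FOG.
    """
    for line in route_text.splitlines():
        tokens = line.split()
        if "src" not in tokens or "dev" not in tokens:
            continue
        src_idx = tokens.index("src")
        dev_idx = tokens.index("dev")
        # Both keywords must be followed by a value.
        if src_idx + 1 >= len(tokens) or dev_idx + 1 >= len(tokens):
            continue
        if tokens[src_idx + 1] == ip:
            # The interface name is the token right after the 'dev' keyword.
            return tokens[dev_idx + 1]
    return None


# Route table as posted in this thread (one route per line).
SAMPLE = """\
default via 10.0.0.138 dev ens3 proto dhcp src 10.0.0.203 metric 100
10.0.0.0/24 dev ens3 proto kernel scope link src 10.0.0.203
10.0.0.138 dev ens3 proto dhcp scope link src 10.0.0.203 metric 100
"""

print(interface_for_ip(SAMPLE, "10.0.0.203"))  # prints ens3
```

Looking up the token that follows the `dev` keyword, rather than a fixed column, keeps the result the same no matter how many fields precede it on the line.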
@george1421 - Current uptime was 1 day 14 hours. I have just now:
- Cleared all tasks, rebooted and created a new multicast task - the log file still says “--interface dev”.
- Cleared the tasks, changed the multicast interface to “eth0” (which doesn’t exist) and created a new multicast task - same result.
- Cleared the tasks, rebooted and created a new multicast task - same result.
The only things running on this server are FOG and dnsmasq. It is a VM under KVM and has 4GB RAM and a single processor. I have done a single in-place upgrade from 1.5.7 since my original install. I did have to recover the MySQL password (I was distracted and lost the new password before I could record it), but I recovered it without issue.
I have also updated the fogservice.class.php and FOGSnapinReplicator.service files as per the Interface not ready, waiting for it to come up post.
-
@3mu said in Wiki errors - "Troubleshooting a multicast":
I have also updated the fogservice.class.php and FOGSnapinReplicator.service files as per the Interface not ready, waiting for it to come up post.
Good point. I just remembered that too.
Please run the following command as root and post output here:
/sbin/ip route
The IP address configured for the storage node might play a role here too. Please post the IP set for the storage node as well. -
@Sebastian-Roth said in Wiki errors - "Troubleshooting a multicast":
/sbin/ip route
default via 10.0.0.138 dev ens3 proto dhcp src 10.0.0.203 metric 100
10.0.0.0/24 dev ens3 proto kernel scope link src 10.0.0.203
10.0.0.138 dev ens3 proto dhcp scope link src 10.0.0.203 metric 100
(run under sudo)
Storage Node config:
-
@3mu said in Wiki errors - "Troubleshooting a multicast":
default via 10.0.0.138 dev ens3 proto dhcp src 10.0.0.203 metric 100
Here we’ve got it. Judging by the output, you seem to have configured your network interface ens3 via DHCP. While our code should not fail to find the right interface, I still wonder why you would configure a server’s IP via DHCP!?
Just configure the same IP but static in Linux network settings, restart your FOG server and it will most probably work!
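To illustrate how a DHCP-configured route line could plausibly produce the literal name `dev`, here is a small Python sketch. It is a guess at the failure mode under stated assumptions, not taken from FOG's source: it assumes the interface is picked by field position after splitting on spaces and slashes. The helper names `fourth_field` and `name_after_dev` are hypothetical.

```python
import re


def fourth_field(line):
    """Positional parsing (assumed): split on runs of spaces or slashes, take field 4."""
    return re.split(r"[ /]+", line.strip())[3]


def name_after_dev(line):
    """Keyword parsing: return the token that follows 'dev'."""
    tokens = line.split()
    return tokens[tokens.index("dev") + 1]


# Two of the route lines posted above.
DHCP_DEFAULT = "default via 10.0.0.138 dev ens3 proto dhcp src 10.0.0.203 metric 100"
SUBNET = "10.0.0.0/24 dev ens3 proto kernel scope link src 10.0.0.203"

print(fourth_field(SUBNET))          # prints ens3
print(fourth_field(DHCP_DEFAULT))    # prints dev
print(name_after_dev(DHCP_DEFAULT))  # prints ens3
```

With DHCP, the default route line also carries `src 10.0.0.203`, so a search for the node IP can hit that line first, and a fixed column then lands on the keyword `dev` instead of the interface name.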
-
@Sebastian-Roth - I look after a network with over 600 servers at work. The ones that cause me problems are the ones with static IP addresses - generally because somebody makes a typo when they are configuring an interface. We also do a lot of migrations, mergers and other changes, so reserved addresses in DHCP make that much easier, as well as enforcing consistency and a basic level of self-documentation. DHCP also makes it easier to deploy in environments where the user does not have control of the network.
I’m sorry for taking up your time, @george1421, and thank you both.
-
@3mu I will be looking into fixing this issue! Though I can’t promise you when the next release will be out. Would you want me to post the fix here so you could manually add it?
-
That would be great, thanks @Sebastian-Roth. I’m happy to test.
-
@3mu Sorry for the delay. I just had a play with this and I think it’s best to go for a simple fix: https://github.com/FOGProject/fogproject/commit/21460c1d8ba7dae1b2988b9287c188595ce01e9d
-
Thanks, @Sebastian-Roth. I’ll test as soon as I finish sorting out a few disasters - hopefully in the next day.
-
@Sebastian-Roth - Sorry for the delay - some other unrelated issues and then a pandemic to deal with! The fix works perfectly.