FOG Trunk 5161 AutoNumbering
-
@StahnAileron For multicasting, you have to set the interface for the master storage node correctly.
CentOS and Fedora can use weird interface names. You can get the names of the interfaces like this:
ip link show
That will just show the names and link info. For addresses too, you would use:
ip addr
You’d put that interface name into the master storage node’s settings. I think it also helps to have a master storage node set… I believe in Trunk, multicasting only happens from a master storage node.
As far as the auto-numbering, I’m afraid I’ll need to leave that question to the more experienced users @Developers @Moderators , I’m not familiar with it. When they reply about it, I’ll be familiar though, because I read every post on here.
And Tom is right, we need to know an exact version.
-
@Tom-Elliott @Wayne-Workman My apologies. We started with a 1.2 install and then switched (upgraded) to a Trunk install. Web GUI says 5161 somewhere… So my primary issues are now in Trunk 5161.
@Wayne-Workman For the multicasting, could you define “master storage node set”? I only just got into FOG (and Linux). We currently have a single machine with CentOS 6.5 and just FOG Trunk 5161 on it. All the services required to run FOG are run from that single server. So the only storage node we have is technically the FOG server itself. I’m currently home, so I won’t be able to look at the machine again until Monday, but I’ll take a look at the interfaces like you mentioned. I do recall having a slight issue with interfaces at one point, though that seems to be corrected for now. (Unicast and Torrent-cast worked.)
@Tom-Elliott Actually, now that I think about it again, the multicast issue I had was similar to other threads I’ve seen here: The job is started, but the log says it stops and “completed” 10 seconds later. The hosts that are part of the multicast job just hang at the PartClone screen, waiting. This was in FOG 1.2 (so not quite relevant currently, I guess). Currently in Trunk 5161, I think the 3 multicast services were spamming the log, terminating with a 255 error repeatedly.
I haven’t looked into the current multicast issues as much as I would’ve liked. I got hung up on testing and troubleshooting the Auto-reg/Auto-numbering issue I came across. For now, I just want to focus on the AutoReg/-Number problem I have.
Thanks for the quick replies!
-
@StahnAileron I’d recommend checking the Storage node. This is especially true on Trunk.
1.2 and possibly prior always made the assumption that eth0 was the interface your NIC was on. You could change it, yes, but it wasn’t until about a month or two ago that I realized it was specifically the interface causing the problem as you describe (Task starts, and at the next check-in it “Completes and cleaned”).
This is because the UDP sender job is created, but it is most likely looking for an interface named eth0.
The UDP Sender command starts successfully, but then fails. The MulticastTask is tracking this, and because none of the tasks have checked in, and none of them have been cancelled, the only viable solution is that the task must’ve completed successfully (even if it was only 10 seconds…or so).
To fix the multicast issue, go to the Storage Management page, choose your relevant storage node (probably named DefaultMember), and look at the interface setting on it. It is most likely eth0 or blank. Open a terminal, or ssh, or whatever, on your FOG server and run the command:
ip addr show
or
ifconfig -a
Look for your relevant interface.
It will probably look something like:
[root@fogserver ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno16777728: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether hh:hh:hh:hh:hh:hh brd ff:ff:ff:ff:ff:ff
    inet 333.333.333.333/333 brd 333.333.333.355 scope global eno16777728
       valid_lft forever preferred_lft forever
    inet6 hhhh::hhh:hhhh:hhhh:hhhh/64 scope link
       valid_lft forever preferred_lft forever
[root@fogserver ~]# ifconfig -a
eno16777728: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 333.333.333.333 netmask 333.333.333.333 broadcast 333.333.333.355
    inet6 hhhh::hhhh:hhhh:hhhh:hhhh prefixlen 64 scopeid 0x20<link>
    ether hh:hh:hh:hh:hh:hh txqueuelen 1000 (Ethernet)
    RX packets 9437234 bytes 6288619143 (5.8 GiB)
    RX errors 0 dropped 0 overruns 0 frame 0
    TX packets 8709833 bytes 1822128317 (1.6 GiB)
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
    inet 127.0.0.1 netmask 255.0.0.0
    inet6 ::1 prefixlen 128 scopeid 0x10<host>
    loop txqueuelen 0 (Local Loopback)
    RX packets 625597 bytes 84696575 (80.7 MiB)
    RX errors 0 dropped 0 overruns 0 frame 0
    TX packets 625597 bytes 84696575 (80.7 MiB)
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Yes, I’m well aware my MAC addresses are impossible, as are my IP addresses and whatnot. That’s on purpose.
So if my IP were possible and I knew it to be 333.333.333.333, using either or both of the commands there I get the same interface name: eno16777728, which is the name I would set my Storage Node’s interface to.
Multicast should magically start working.
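If you would rather not pick the name out by eye, the lookup described above can be scripted. This is only a rough sketch: iface_for_ip is a made-up helper name, not part of FOG, and it assumes the one-line-per-address output format of ip -o -4 addr show (interface name in the second field, address/prefix in the fourth).

```shell
#!/bin/sh
# iface_for_ip IP
# Hypothetical helper (not part of FOG). Reads `ip -o -4 addr show`
# output on stdin and prints the name of the interface that carries
# the given IPv4 address. The prefix length (/24 etc.) is ignored.
iface_for_ip() {
    awk -v want="$1" '
        $3 == "inet" {
            split($4, parts, "/")          # "192.168.1.10/24" -> "192.168.1.10"
            if (parts[1] == want) {
                print $2                   # second field is the interface name
                exit
            }
        }'
}

# Usage on the FOG server (192.168.1.10 is a placeholder address):
#   ip -o -4 addr show | iface_for_ip 192.168.1.10
```

Whatever name it prints is what would go into the storage node’s interface field. If it prints nothing, no interface has that address.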
I will take a look at the code dealing with auto numbering and see if I can figure out what’s wrong and get a suitable fix. I’m a bit tired today, so I hope to have it solved maybe tomorrow.
Hopefully I’ve helped a little bit.
-
Nevermind what I said.
I found the auto-number bug and fixed it, along with a rather disastrous bug and a couple of other minor bugs. I even added some partial functionality so that the auto-number system populates itself automatically, incrementing the number until it finds one that no existing host already has.
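To illustrate the idea only (this is not the actual FOG code, and next_free_name is a made-up name), the "keep increasing until the number is free" behaviour could be sketched in shell like this, assuming the existing host names arrive one per line on stdin:

```shell
#!/bin/sh
# next_free_name PREFIX START
# Illustrative sketch only, not FOG's implementation.
# Reads existing host names (one per line) on stdin and prints the
# first PREFIX<number> at or after START that is not already taken.
next_free_name() {
    prefix=$1
    n=$2
    names=$(cat)                           # slurp the existing names once
    # -x: match the whole line, -F: treat the name as a fixed string
    while printf '%s\n' "$names" | grep -qxF -- "$prefix$n"; do
        n=$((n + 1))
    done
    echo "$prefix$n"
}

# Usage (hosts.txt is a hypothetical file with one host name per line):
#   next_free_name LAB 1 < hosts.txt
```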
-
@StahnAileron Look at the pictures below; they were taken from my home FOG server. On it, if I wanted, I could have many nodes, even though it’s just one “self contained” server. To multicast, I think you have to have a Master Node set, and the interface for that node must be correct.
-
@Tom-Elliott @Wayne-Workman Oh my… Thank you both!
I thought I already checked the interface. I was having minor issues with that when we switched from 1.2 to Trunk. Guess I didn’t check hard enough (though I never would’ve thought to check the interface name in relation to UDP Multicasting.) I’ll be double-checking once I’m in to look at the server.
@Tom-Elliott Thanks for the quick fixes to that (those) bug(s)! I’ll be heading in to my school in a few hours. I’m guessing I’ll just have to update the trunk copy I have (via SVN) and “re-install”/update FOG, correct?
Again, thank you for the help and support. I’ll follow up with a progress report once I get to work on the server once more.
-
@StahnAileron said:
I’m guessing I’ll just have to update the trunk copy I have (via SVN) and “re-install”/update FOG, correct?
Yup.
-
For future readers, I’ve further updated the troubleshooting multicast article based on things I’ve posted in here.
https://wiki.fogproject.org/wiki/index.php/Troubleshoot_Downloading_-_Multicast
-
Well, the auto-naming/-numbering now works. Registering hosts for imaging is now far easier. Thanks for the quick fix!
As for multicasting: still having problems. I did check the interface name for the relevant settings (for the master node and one under FOG Settings). Still stalls at the Partclone screen. However, the FOG log still states that the various multicasting services are perpetually crashing and restarting. I’m guessing this is the current issue I need to resolve to get multicasting to work. They each stop working with the log stating they exit with error 255. If the services keep crashing, I assume that would screw over the multicasting jobs I set, no? What can I look at and/or try to stabilize the services?
-
@StahnAileron Try to start the fog services 30 seconds after boot. There is an example in this article: https://wiki.fogproject.org/wiki/index.php/Fedora_21_Server
Also, try to clear out the two relevant tables in MySQL, there are steps for that here: https://wiki.fogproject.org/wiki/index.php/Troubleshoot_Downloading_-_Multicast
Also, to further simplify the problem, you might try to use a basic non-managed Layer 2 switch to multicast with till you get it working.
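For the delayed start, something along these lines might do. It is a sketch under assumptions, not the wiki’s exact example: delayed_restart is a made-up helper, it assumes a SysV-style service command (as on CentOS 6), and the service names in the usage comment are my guess at what the FOG services are called on your box, so check /etc/init.d first.

```shell
#!/bin/sh
# delayed_restart DELAY SERVICE...
# Hypothetical helper: waits DELAY seconds, then restarts each named
# service in order. With DRY_RUN=1 it only prints the commands it
# would run, which is handy for checking before touching anything.
delayed_restart() {
    delay=$1
    shift
    sleep "$delay"
    for svc in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "service $svc restart"
        else
            service "$svc" restart
        fi
    done
}

# Possible use from a boot script (service names are assumptions):
#   delayed_restart 30 FOGMulticastManager FOGImageReplicator FOGScheduler &
```

The idea is just to give the network and MySQL time to come up before the FOG services start.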
-
@Wayne-Workman I’ll try the procedures you posted. We do have a managed switch in the mix out of 3 total. The others are dumb switches. I’ll look into getting it arranged (physically) so the managed switch is out of the loop. It didn’t give us problems with the FOG 0.32 server we had originally, but (shrug) you never know, right?
-
Final Progress Report
So I actually got Multicasting to work. Apparently the switch from 1.2 to Trunk left some files behind and/or wasn’t truly complete. Some stuff I had tinkered with in 1.2 was held over in Trunk. So I essentially screwed myself over on that one.
In any event, trimming down the install (i.e. deleting almost everything from FOG), pulling a fresh copy of the trunk via SVN, and MAKING SURE the installation truly completed all the way fixed my problems. I hate self-induced problems like this, but live and learn.
Thanks for all the help! I now have a better feel for mucking around in FOG.
Some thoughts:
I noticed that if I had DHCP already running, the script would abort because setting up DHCP failed. This apparently was part of my problem. (I wasn’t paying enough attention until my attempts at re-installing the system pointed me to an incomplete re-installation process.) I had to comment out that line in the script to make it easier to re-install FOG. DHCP was always running. (I’d just restart it manually afterwards, just in case.) It seems that the script interprets DHCP already running as a failure. (It doesn’t seem to have this problem with any other running service.)
Also, would it be too much to ask for an option in the installation script to purge the current settings and start fresh? I had to delete the hidden .fogsettings file a couple of times during my endeavor.
-
@StahnAileron The installer script already has a switch to disable update and perform a full/fresh install. It can be done with:
./installfog.sh -U
or
./installfog.sh --no-upgrade
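If you want to hang on to the old settings before a fresh install, a small backup step first would not hurt. A rough sketch, where backup_settings is a made-up helper and the .fogsettings path in the usage comment is an assumption about where your install keeps it:

```shell
#!/bin/sh
# backup_settings FILE
# Hypothetical helper: copies FILE to FILE.bak (preserving mode and
# timestamps) and prints the backup path. Returns non-zero if FILE
# does not exist, so a missing settings file is easy to spot.
backup_settings() {
    src=$1
    [ -f "$src" ] || return 1
    cp -p "$src" "$src.bak" && echo "$src.bak"
}

# Possible use before a fresh install (path is an assumption):
#   backup_settings /opt/fog/.fogsettings
#   ./installfog.sh --no-upgrade
```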
-
@Tom-Elliott Oh, did not know that. I was using the Wiki as a reference and it never mentioned that switch. The Wiki simply states to delete the .fogsettings file. Nice to know that now. I’ll have to take note of that for our documentation.
-
@StahnAileron I’ve added a ton of functionality (I think) to the installer.
It even works with typical switches and prints output to help you learn more.
For example:
./installfog.sh -?
will print potential usage options.