RC19 - Server sending magic packets to all hosts
-
@Tom-Elliott said in RC19 - Server sending magic packets to all hosts:
We found some issues, now, with ondrej repository.
This is one of the many things that scares me about having a 1.3.0 release that requires internet access to install stuff.
-
@Wayne-Workman There’s not much of a way around it, though, unless you plan to compile every package for each OS during installation (and provide the binaries to compile from).
-
@Tom-Elliott How did 1.2.0 accomplish it?
-
@Wayne-Workman it didn’t.
-
Remoted in to help out. The install of working-RC-24 with the new “items” failed because the code I added to try to ensure things were cleaned up had some issues. Now we wait, I suppose, to see if WOL is still doing the “additionals”.
-
@Tom-Elliott said in RC19 - Server sending magic packets to all hosts:
@Wayne-Workman it didn’t.
I guess the major difference is the inits, kernels, and client. I think those three things should be packaged with the release, and then used if fogproject.org cannot be contacted instead of just flat failing. I guess that 1.2.0 could be installed in a closed-system because it came with these things, and because perhaps there was a red hat satellite server available or a repo server available or something, or people pre-installed the needed components.
-
@Tom-Elliott said in RC19 - Server sending magic packets to all hosts:
Remoted in to help out. The install of working-RC-24 with the new “items” failed because the code I added to try to ensure things were cleaned up had some issues. Now we wait, I suppose, to see if WOL is still doing the “additionals”.
Hopefully RC24 will be released soon. Scary
-
@x23piracy What’s scary?
You can look in the code base and try to see what’s causing it, and I cannot see it. I did add a “safety” if you will, but even then it’s still really strange. I am only guessing here as to the cause if FOG is indeed the one responsible – directly. The only thing I could think of that would make all systems keep “stacking up” is if there’s a “stuck” element in either powermanagement or scheduledtasks.
These two items are cycled on a timer and will be operated on the next time their “turn” comes around. If there’s something found in the tasking relationship, but the item is blank it could return ALL items.
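To make that guess concrete, here is a minimal Python sketch of how a blank filter can end up matching every host. This is not FOG's actual code (FOG is PHP); `find_hosts`, `find_hosts_guarded` and the host records are hypothetical names made up for illustration.

```python
# Hypothetical illustration only; none of these names come from the FOG code base.
HOSTS = [
    {"id": 1, "mac": "00:11:22:33:44:55"},
    {"id": 2, "mac": "66:77:88:99:aa:bb"},
    {"id": 3, "mac": "cc:dd:ee:ff:00:11"},
]


def find_hosts(host_ids=None):
    """Naive lookup: a blank filter is treated as 'no filter', so it matches ALL hosts."""
    if not host_ids:
        return list(HOSTS)
    return [h for h in HOSTS if h["id"] in host_ids]


def find_hosts_guarded(host_ids=None):
    """Safer lookup: a blank filter matches nothing."""
    if not host_ids:
        return []
    return [h for h in HOSTS if h["id"] in host_ids]


# A "stuck" item whose host list came back blank.
stuck_item_hosts = []
naive_macs = [h["mac"] for h in find_hosts(stuck_item_hosts)]
guarded_macs = [h["mac"] for h in find_hosts_guarded(stuck_item_hosts)]
print(len(naive_macs), len(guarded_macs))
```

If something like the naive version sits behind the WOL call, a single blank item would be enough to wake every registered host each time its turn comes around.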
That said, the way I’m seeing it described is rather a “massive” wol packet that has all the macs rolled into one. It sounds, to me, like the prior WOL requests are being stacked upon rather than each WOL item being treated separately. There’s only one point in the code that “could” do this but it’s a localized variable. This localized variable basically means that it’s only available at the initial call. Once it’s gone, the variable is gone.
Again these are just my guesses at this point. You could validate it relatively quickly though.
If you have two systems side by side.
Reboot your fog server.
Task one of them with something WOL related.
Once system turns on, turn it off.
Task the other system in the same way.
Does the first system ALSO turn on?
-
@Wayne-Workman I don’t understand the concern here. While I can publish static binaries when I release, having it download hasn’t been a problem for a VERY long time.
FOG 1.2.0 could not be installed in a closed-system unless it had already been installed with an “open-system”.
The issues that are occurring now are unrelated to static vs. dynamic binaries. They’re literally repository related.
The reason I decided to use specific repos? To ensure everybody is on the same page for the installation. How can we expect people to just know how to install PHP 5.5 or greater? I mean we have an installer, and we’re requiring this version. How can we “break” the installer simply because their system doesn’t have the version of software we’re requiring?
The major differences between 1.2.0 and 1.3.0-RC series are solely because we’re trying to make a better product with limited requirement on the administrator/user installing. 1.2.0 did not install any repositories or update any of the “required” packages. If it were a fresh install, it would most certainly require internet connection to perform the package installation. If it couldn’t get to the internet, it would fail in much the same way.
Ondrej repository literally changed things from how they were to a totally new system. Granted the new system they’ve adopted is a bit simpler, but it did mean a slight bit of a headache until it was all figured out.
I don’t know why you’re concerned with the inits, kernels, and client when they had nothing to do with the problems that were being seen.
-
@Tom-Elliott Just want to clarify one bit of the WoL issue I saw - it wasn’t one big packet trying to wake a ton of MACs. It was thousands of individual WoL packets, each with its own MAC to wake.
When I caught this yesterday, there was one task in the system to do a deploy. I couldn’t even find any WoL packets coming from FOG that were trying to wake this machine. Instead, of the few packets I checked, they were for random machines all over the district. And for many of them, there would be upwards of 50-100 WoL packets for just one MAC.
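For anyone who wants to tally this from a capture, here is a rough Python sketch that counts magic packets per target MAC. It relies only on the standard magic packet layout (6 bytes of 0xFF followed by the target MAC repeated 16 times). The function names are made up for this example, and getting the raw UDP payloads out of a capture is left to whatever tool you already use.

```python
from collections import Counter


def build_magic_packet(mac: str) -> bytes:
    """Standard WoL payload: 6 bytes of 0xFF, then the target MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError(f"not a 6-byte MAC: {mac!r}")
    return b"\xff" * 6 + mac_bytes * 16


def parse_magic_packet(payload: bytes):
    """Return the target MAC as 'aa:bb:cc:dd:ee:ff', or None if this is not a magic packet."""
    if len(payload) < 102 or payload[:6] != b"\xff" * 6:
        return None
    mac_bytes = payload[6:12]
    if payload[6:102] != mac_bytes * 16:
        return None
    return ":".join(f"{b:02x}" for b in mac_bytes)


def count_targets(payloads):
    """Count how many magic packets were seen for each target MAC."""
    macs = (parse_magic_packet(p) for p in payloads)
    return Counter(m for m in macs if m is not None)


# Made-up sample data standing in for payloads pulled from a capture.
sample = (
    [build_magic_packet("00:11:22:33:44:55")] * 3
    + [build_magic_packet("66:77:88:99:AA:BB")]
    + [b"not a magic packet"]
)
counts = count_targets(sample)
print(counts.most_common())
```

A MAC with a count in the 50-100 range and no matching task would point at the same behavior described above.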
Is there a system level way for me to disable WoL? Could that be added easily? We don’t use WoL at all currently and that isn’t going to change anytime soon.
-
@MRCUR There isn’t a “global” on/off switch for WOL. This is semi-intentional though. There are only a few cases where FOG performs WOL at all:
- Powermanagement -> Ondemand or scheduled.
- Host deploy tasking -> Every task now has a “WOL” checkbox with the exception being the “pure” WOL tasking.
- Group deploy tasking -> Every task now has a “WOL” checkbox with the exception being the “pure” WOL tasking.
FOG Task Scheduler (FOGScheduler service) checks every minute. If any “scheduled” Powermanagement task or any deployed tasking was scheduled (either by cron or delayed), its time is up, and WOL was enabled, it will send WOL packets to those systems. If an instant tasking has been created and the host hasn’t checked in yet, Task Scheduler will re-send WOL packets as well (to try to limit human involvement as much as possible).
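The per-minute check described above could be sketched roughly like this in Python. This is only an illustration of the rules as stated in this post, not the FOGScheduler source (which is PHP). The `Task` record, its fields, and `macs_to_wake` are hypothetical, and sending each MAC only once per pass is an assumption of the sketch rather than something FOG is known to do.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Task:
    """Hypothetical task record; field names are assumptions for this sketch."""
    mac: str
    wol: bool                           # was the "WOL" checkbox ticked?
    run_at: Optional[datetime] = None   # cron/delayed run time; None means instant tasking
    checked_in: bool = False            # has the host checked in for this task yet?


def macs_to_wake(tasks, now):
    """Return the MACs that one scheduler pass would send WOL packets to."""
    macs = []
    for t in tasks:
        if not t.wol:
            continue
        if t.run_at is not None:
            due = t.run_at <= now       # scheduled or delayed: wake once the time is up
        else:
            due = not t.checked_in      # instant: keep re-sending until the host checks in
        # Assumption: each MAC is listed at most once per pass.
        if due and t.mac not in macs:
            macs.append(t.mac)
    return macs


now = datetime(2016, 11, 1, 12, 0)
tasks = [
    Task("00:11:22:33:44:55", wol=True, run_at=datetime(2016, 11, 1, 11, 59)),
    Task("66:77:88:99:aa:bb", wol=True, run_at=datetime(2016, 11, 1, 13, 0)),
    Task("cc:dd:ee:ff:00:11", wol=True),
    Task("cc:dd:ee:ff:00:11", wol=True),
    Task("12:34:56:78:9a:bc", wol=False),
    Task("de:ad:be:ef:00:01", wol=True, checked_in=True),
]
print(macs_to_wake(tasks, now))
```

Written this way, a host with no WOL-enabled tasking never appears in the list, which is the behavior a site that doesn't use WOL would expect.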
-
Is this still occurring?
-
@Tom-Elliott Haven’t had it happen again yet.