Fog Replication and best config setup for multisite masters
-
Also my Indy site appears to be transmitting but has not received an image to transmit. Any thoughts on that one?
-
First, sorry for the long post, but I think you’re the kind of person that really wants to learn so… here goes.
Your questions
- Whatever you feel comfortable with. If you’re confident enough, it’s not hard to change a FOG server’s IP address. I just updated our article on changing a FOG server’s IP address ( @testers @Moderators @Developers heads up ), it’s here:
https://wiki.fogproject.org/wiki/index.php?title=Change_FOG_Server_IP_Address
- Yes. Anything else isn’t supported and would be a disaster (see below). You have two choices here: standard and supported, or non-standard and unsupported. In the standard, supported setup you have only one full FOG server, and all other servers are installed as “Storage Nodes” using the FOG installer; it’s one of the questions the installer asks when it runs. I would strongly recommend staying standard.
- There’s no way to clear the logs from the web interface. You can easily create a BASH script that runs every hour via crontab, checks the size of the logs, and archives and clears them (or just clears them) if they go over a certain size. Of course, you can also do it manually via the CLI.
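The hourly log check described above could look roughly like the sketch below. It is written in Python rather than BASH purely for illustration, and it is not part of FOG. The log directory `/opt/fog/log`, the 50 MB limit, and the function name `trim_logs` are all assumptions; check where your own installation writes its logs before using anything like this.

```python
import gzip
import shutil
import time
from pathlib import Path

# Assumed values for illustration only; adjust to your own server.
LOG_DIR = Path("/opt/fog/log")
MAX_BYTES = 50 * 1024 * 1024  # 50 MB


def trim_logs(log_dir, max_bytes, archive=True):
    """Archive (optionally) and empty every *.log file larger than max_bytes.

    Returns the list of log files that were cleared.
    """
    cleared = []
    for log_file in sorted(Path(log_dir).glob("*.log")):
        if log_file.stat().st_size <= max_bytes:
            continue
        if archive:
            # Keep a compressed, timestamped copy next to the original.
            stamp = time.strftime("%Y%m%d-%H%M%S")
            archive_path = log_file.with_name(f"{log_file.name}.{stamp}.gz")
            with open(log_file, "rb") as src, gzip.open(archive_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
        # Truncate in place rather than deleting, so the file keeps its
        # path and permissions.
        with open(log_file, "w"):
            pass
        cleared.append(log_file)
    return cleared


if __name__ == "__main__":
    for path in trim_logs(LOG_DIR, MAX_BYTES):
        print(f"cleared {path}")
```

Run from an hourly crontab entry, this would leave small logs alone and only touch the ones that have grown past the limit. Pass `archive=False` to just clear them.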
Other thoughts
I want to start this off by saying there are a whole lot of ways to configure FOG replication. The way FOG is built now allows a whole lot of freedom in this area, however there are still rules that cannot be broken - save putting together something not intended or officially supported, and save using 3rd party stuff to get replication done and a lot of custom back-end scripting. I’d urge you to stay standard, but it’s up to you.
This “multisite masters” wording you use is something @george1421 likes to use because he’s awesome like that. It takes care and ongoing maintenance, it’s not hands-free, and I wouldn’t recommend it to most people. It’s neither officially supported nor intended. This setup works by doing a “FULL” installation at each location and setting up all locations as nodes on the one “real” master. In this setup, uploads are disallowed (see 1. below) at all locations except the “real” master. The image entries must be groomed at all locations except the “real” master for each change (see 2. below), which means doing it manually or carefully automating it with BASH scripting. It also means that not all computers can be controlled from any one point, because no single server has authority over all hosts in this setup. The “real” master doesn’t even have control over the other servers, except for placing images.
There are two benefits to this setup, though. One is that a down WAN link doesn’t prevent imaging at any location in most situations. The other is mistake mitigation: techs at each site are limited to controlling only the hosts at their site. Still, the work involved in maintaining this, and the inability to control everything from one point, mean giving up major features of FOG, so again, I wouldn’t recommend this sort of setup. And I’d think that if the WAN link were down, focus would be 100% on that and not on imaging. At work, we keep boxes ready to go in our office, so we can do a physical swap-out of any end-user box at any moment should we need to. We aren’t reliant on imaging to fix a problem; we can buffer a few machines until we have FOG fixed.
1. FOG replication checks whether the files on the master (or primary group master) match the files on all “slave” nodes or “other group master” nodes. There can be only one “real” master, or primary group master, for a set of files (snapins or images). If the files on the real master don’t match the others, the master deletes the remote files and recopies them. If you had a “full” installation at each location, and had every location set up as a node at every other location, then any upload of an image at any location would create a disastrous replication loop in which remote deletion happens over and over, indefinitely. Each master would detect that its files (which may still be in the process of copying) aren’t the same as the files on the other servers, and it would delete the files on all the other servers. This, of course, would disrupt copying and destroy the newly uploaded image wherever it was, too.
2. Changes (Create/Delete/Edit) to any image definition on the “real” master then require that the same changes be made on all other servers. This can be done manually, which would be painful, or it can be skillfully automated with BASH. No such script exists yet, and it would take time to write, test, and adjust.
3. (other thoughts) There could probably be some modifications written to change how uploads and replication work in a multi-master setup. My thought is to simply change where the upload goes on a “non-real” master: upload to /images/dev as usual, but upon completion move the image to something like an /images/done directory. Back-end BASH scripts would then run an NFS copy of those files to the “real” master’s /images/dev/ directory and, when that finally finishes, issue an FTP command to move the image to the “real” master’s /images directory, just as a normal upload would. Replication would then trigger as normal, as if the image had been uploaded at the “real” master. This idea is pretty complex and would take a while to develop and test - and I’d never suggest that the developers try to implement this right now, since we are in Release Candidate mode and trying to get to release. This would just be a pet side-project. @george1421 thoughts?
One rule of FOG in a standard, supported setup is that all uploads always go to the master of the group, or to the primary master. What is the difference between a master and a primary master? A master is simply a storage node that is designated as the “Master” node in a group of storage nodes. Anything on this master will be replicated to all other nodes in the group. A “Primary Master” relates to a specific image or snapin. An individual image can be created on group ABC and shared with group DEF too. In this setup, the master of ABC copies the image to the master of DEF, and the masters of these groups then copy it to the members of their groups. If this particular image is re-uploaded from anywhere, it will always upload to the master of ABC, because that is the “Primary Master” for this image. From there, the process repeats: the master of ABC copies to the master of DEF, and then all the masters copy to their group members.
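To make the master versus primary master distinction concrete, here is a small toy model of the ABC/DEF example. It only illustrates the rules as described in this post and is not how FOG implements them; the class names, function names, and node names such as `abc-master` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StorageGroup:
    name: str
    master: str                                   # the group's master node
    members: list = field(default_factory=list)   # non-master nodes


@dataclass
class Image:
    name: str
    primary_group: str                            # group whose master is the "primary master"
    shared_groups: list = field(default_factory=list)


def upload_target(image, groups):
    """Uploads always land on the master of the image's primary group."""
    return groups[image.primary_group].master


def replication_plan(image, groups):
    """Return (source, destination) copies in the order described above."""
    primary = groups[image.primary_group]
    plan = []
    # Step 1: the primary master copies to the master of each shared group.
    for group_name in image.shared_groups:
        plan.append((primary.master, groups[group_name].master))
    # Step 2: every master copies to the members of its own group.
    for group_name in [image.primary_group, *image.shared_groups]:
        group = groups[group_name]
        for member in group.members:
            plan.append((group.master, member))
    return plan


groups = {
    "ABC": StorageGroup("ABC", "abc-master", ["abc-node1"]),
    "DEF": StorageGroup("DEF", "def-master", ["def-node1"]),
}
image = Image("win10", primary_group="ABC", shared_groups=["DEF"])
print(upload_target(image, groups))
print(replication_plan(image, groups))
```

No matter which site the upload starts from, `upload_target` returns the master of ABC, which is the point of the rule.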
So, sure, in a standard setup you may upload a new copy of an image from anywhere - but that new copy will always go to the master, or the primary master, as explained above. Uploading across a slow WAN link takes a very long time, but depending on the link speed and reliability it can be done. I wouldn’t even attempt it with a WAN link slower than 100Mbps.
Replication setups
I hope you’ve watched the video here:
https://wiki.fogproject.org/wiki/index.php?title=Location_Plugin
There is a video that explains the most basic setup for replication and preventing imaging over the WAN. This setup would allow you to upload from anywhere, but remember uploads go to the master (or primary master) always.
Another setup is to put every storage node into its very own group and set each node as a master. When an image is created and you want it shared with one or more other groups, click “Storage Groups” within the image definition, check the boxes for the groups you want to share it with, and save. The image will then be replicated to the selected groups. This area is also where you change the “primary master” of an image - but only do this after replication has happened.
-
-
@Wayne-Workman You are right I need to watch my words. A multi-master setup is not a supported configuration, but it worked for me. YMMV.
I can explain how I use this setup and a bit of why, but let me say this: FOG was originally developed and intended (IMO) to be a one-site, local image deployment tool. Moving it to an enterprise level was not (and still is not) the focus of the developers. (Understand this is my opinion only and not a reflection of the current state of FOG, its developers, my role as a Moderator, or FOG’s usefulness.) To make FOG really enterprise ready, the FOG Project needs a few more developers (i.e. the current core team can’t do everything) who are skilled in multisite organizations. The current structure of a single master node and multiple storage nodes is really geared towards a single (possibly large) site and not a dispersed organization. The storage nodes don’t have their own databases and cannot stand alone if needed; they need direct and immediate access to the master node (typically at HQ, connected over a low-bandwidth connection).
The second issue with FOG is that access control is very limited: you are either an administrator with full access to everything or a mobile user who can only deploy. In a multi-site situation you may need to limit certain administrators to specific deployment servers, or limit which systems they can deploy images to. I would hate to have an IT tech at site A accidentally deploy a new image to an unsuspecting computer/user at site B. The current level of access control would allow that.
To address those concerns I (myself, not as a Moderator) took the existing FOG system and twisted it a bit to work in my environment mostly how I needed it to work. Eventually this multi-master setup could be a real thing with a little coding, but today it is just something I concocted that is not supported by the FOG Project.
(Be aware that this is a totally made-up organization.) To keep things simple, let’s say I have 3 locations, NYC, ATL, and LA, and ATL is HQ. At each location I have a fully functional FOG server. It is the Master Node for its site. At LA (since it is a big campus) I also have a storage node. So on the LA FOG server (master node) I have a storage group called… “LA storage group”. Now at ATL (HQ) I have 2 fully functional FOG servers, FOG-ATL and FOG-DEV. On the FOG-ATL server (the FOG deployment server for the ATL site) there is a storage group called “Biz storage group” that includes FOG-ATL, FOG-LA, and FOG-NYC. And finally at HQ I have a storage group created on FOG-DEV, “Dev storage group”, that includes FOG-DEV and FOG-ATL. So now you see I basically have 3 storage group rings:
- FOG-DEV (master node) and FOG-ATL (marked as a storage node even though it’s a full FOG server)
- FOG-ATL (master node), FOG-NYC, FOG-LA (both marked as storage nodes even though they are full FOG servers)
- FOG-LA (master node) and STORAGE-LA (traditional storage node setup)
This setup functions just as FOG was intended to for replication. Replication always happens from the Master Node in a storage group to the storage nodes in that group. It only happens one way; this is not two-way replication. You must always use the top-down model.
So now, with this setup, if I drop an image on FOG-ATL, that image will be replicated to all FOG servers and storage nodes in my network with the exception of FOG-DEV. This is because in the “Dev storage group” FOG-ATL is a slave node and FOG-DEV is the Master node. As I said before, replication only happens top down: master to slave or storage nodes.
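The top-down rule can be checked against the three storage groups from this made-up organization with a short sketch. This is only a toy model of the one-way rule as described here, not FOG’s replicator; the function name `replication_targets` is hypothetical.

```python
from collections import deque

# Storage groups from the example: group name -> (master, nodes it replicates to).
GROUPS = {
    "Dev storage group": ("FOG-DEV", ["FOG-ATL"]),
    "Biz storage group": ("FOG-ATL", ["FOG-NYC", "FOG-LA"]),
    "LA storage group": ("FOG-LA", ["STORAGE-LA"]),
}


def replication_targets(start, groups):
    """Return the servers that end up with an image dropped on `start`.

    Copies only flow from a group's master to that group's nodes,
    never the other way around.
    """
    downstream = {}
    for master, nodes in groups.values():
        downstream.setdefault(master, []).extend(nodes)

    reached = []
    seen = {start}
    queue = deque([start])
    while queue:
        server = queue.popleft()
        for node in downstream.get(server, []):
            if node not in seen:
                seen.add(node)
                reached.append(node)
                queue.append(node)
    return reached


print(replication_targets("FOG-ATL", GROUPS))
print(replication_targets("FOG-DEV", GROUPS))
```

An image dropped on FOG-ATL reaches FOG-NYC, FOG-LA, and STORAGE-LA but never FOG-DEV, while an image dropped on FOG-DEV reaches every other server.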
I’ll try to wrap this up quickly because I see this is turning into more of a tutorial than something that belongs embedded in a post.
In my environment I create the master images on FOG-DEV. I configure the images (when I create them) to not replicate from FOG-DEV until they have been approved. We do all of the development and certification of the images on FOG-DEV (we also deploy to the test lab from FOG-DEV). Once an image has been approved, we update the image to allow replication. This triggers the image to be replicated first to the FOG-ATL server and then from the FOG-ATL server to FOG-NYC and FOG-LA.
Now here is the manual part. We need to synchronise the image databases between all of the FOG servers. Just because replication happens doesn’t mean the FOG database knows about the images. The images will be copied to all FOG servers in this setup, but we have to export the database configuration on the FOG-DEV server and then import it on all other FOG servers on our network via the web GUI. We can update images on our FOG-DEV server and these changes will be replicated, no problem. We only have to log into each FOG server when we add a new image, because that means we need a new record in that FOG server’s database.
This environment does work as long as you accept the caveats of the limited manual intervention with updating the image table. As I think about it, I should create a tutorial to explain this a bit better (as I did with the FOG-PI setup) because I think this could be a supported “thing” without much programming. I also realize that the developers are working hard on FOG 2.0 so taking time to create this multi-master configuration is not in their area of focus right now.
-
@george1421 I believe I could script something to keep the image definitions sync’d up with a designated “real” master. I think it can be an extra php file that lists in a standard format information about the images, perhaps in JSON, and a simple little bash script on all the other “non-real” masters that reads that web file, and makes changes locally according to it.
I think details about images wouldn’t be much of a security issue, and I think the keypair that FOG already has can be used to sign the produced output, the public key can be stored permanently during setup on all the nodes to check the signature and data against.
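As a rough illustration of the sync idea, the part that compares the “real” master’s list against the local definitions might look like the sketch below. Everything here is hypothetical: no such endpoint exists in FOG, the JSON field names are made up, and fetching the file, checking its signature, and writing to the local database are all left out.

```python
import json


def plan_image_sync(master_json, local_images):
    """Compare the master's image list with the local definitions.

    master_json  -- JSON text: a list of objects, each with a "name" key
    local_images -- dict mapping image name -> local definition dict
    Returns (to_create, to_update, to_delete).
    """
    master_images = {img["name"]: img for img in json.loads(master_json)}

    new_names = master_images.keys() - local_images.keys()
    shared_names = master_images.keys() & local_images.keys()
    stale_names = local_images.keys() - master_images.keys()

    to_create = [master_images[name] for name in sorted(new_names)]
    to_update = [
        master_images[name]
        for name in sorted(shared_names)
        if master_images[name] != local_images[name]
    ]
    to_delete = sorted(stale_names)
    return to_create, to_update, to_delete


# Hypothetical data: field names are for illustration only.
master_json = json.dumps([
    {"name": "win10", "path": "win10"},
    {"name": "win7", "path": "win7-v2"},
])
local_images = {
    "win7": {"name": "win7", "path": "win7"},
    "xp": {"name": "xp", "path": "xp"},
}
print(plan_image_sync(master_json, local_images))
```

Keeping the comparison separate from the code that applies the changes makes it easy to do a dry run on each “non-real” master before anything is touched.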
thoughts?
-
@Wayne-Workman I think this is a valid discussion (probably not for this thread, since we both can post walls of text). But my best guess is that with more than one full FOG server involved, each FOG server will be the owner of its own SSL key, so that may not allow the cross-site communications, since it will be two FOG servers talking and not a FOG server and a storage node (which uses the FOG server’s key).
-
Back on point to the OP’s questions.
Your goal is not obtainable with the current FOG design. Replication happens only one way: FOG Server -> Storage node, or FOG Server A -> FOG Server B. It is not an n-way replication model.
- Wayne has a script for that. You can also reinstall by updating the .fogsettings file and then running the installfog.sh script, but you may have to manually edit a few settings via the GUI, because I don’t think the installer will update all DB records if they already exist.
- If you use the traditional model you will have one master node and then a storage node at each site. That storage node can be the site tftp/pxe boot/imaging server. There will just be no local GUI to manage that storage node, all management happens on the Master Node at HQ.
- For the log files, I don’t have an answer.
Issue:
There is a replication/replicator log file on the master node. You need to ensure you have the storage nodes defined correctly on the Master Node with the proper LINUX fog user ID and password for each storage node.
-
George & Wayne,
Thank you so much for your valuable input. I appreciate the detailed responses, as they have been very helpful! I am definitely eager to learn, but I am also a novice, so I am not going to try to reinvent the wheel here either. Based upon what I learned here, I will plan on the following. I will try to cover as much of the information you gave me as I can, so forgive me if I miss something.
Our Needs:
Just to make sure that you understand what I want FOG to do, I will try to give more detail first. As I mentioned, we have about 10 sites or so (spread across several states) where I want to implement FOG. Each of these sites is connected by a 10Mb or faster MPLS link. We will not be using FOG heavily and may only image a couple of machines per month at each site, but when we do a rollout it could be more. The main thing is just to make sure that we have a standard image (we don’t do anything fancy) that will allow us to image machines as needed, to reduce the amount of time it takes to set them up. I would like to get it to join the PC to the domain and such in the future, but that is for later. The time to replicate is not bad; I believe it takes about 4 hours to replicate per image. Our images are pretty vanilla, we don’t plan on having more than a couple of images at most, and they do not change often, so right now I am okay with that.
The most important thing to me right now is just to make sure I have FOG set up in a stable way so that it will just work. I have been doing as much reading as I can on how to set it up for our needs but haven’t found any really clear documentation on that as of yet. I am not sure if it is just me or if it doesn’t exist, so maybe if I can get this figured out I can do a write-up on it if one is needed.
Multi-Master Setups
A Multi-Master setup might be possible but is not recommended, therefore I will have one site as the master and the rest as storage nodes. I believe it is as simple as installing them as storage nodes (as was recommended) and then making sure INITs and Kernels are selected for that site so that the systems will try to image from the local server and not over the WAN. Correct? If that is the case, then a tech from any site would only connect to the master interface to perform imaging regardless of what site they are at?
IP Address Change
Regarding the IP address change: I ended up just reinstalling the system on Saturday night, using the instructions found here (https://wiki.fogproject.org/wiki/?title=Uninstall_FOG ). It was really quite easy, considering this is a new setup anyway.
Regarding the Replication setups video, I did watch it and found it very helpful. Thanks
-
You will want to throttle replication back to maybe 5Mbps so that the link isn’t completely consumed for 4 hours straight. You can do this in
Web Interface -> Storage Management -> [click node] -> Replication Bandwidth (Kbps)
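Since the field is labelled in Kbps, a quick bit of arithmetic helps when picking the value and judging how much longer replication will take. The sketch below assumes decimal units (1 Mbps = 1000 Kbps) and a fully used link. The 18 GB image size is not a figure from this thread; it is simply what 4 hours at a sustained 10 Mbps works out to, used here for illustration.

```python
def mbps_to_kbps(mbps):
    """Convert megabits per second to kilobits per second (decimal units)."""
    return mbps * 1000


def transfer_hours(size_gb, mbps):
    """Hours needed to move size_gb (decimal gigabytes) at a sustained mbps."""
    bits = size_gb * 8 * 1000**3
    seconds = bits / (mbps * 1000**2)
    return seconds / 3600


print(mbps_to_kbps(5))          # value to enter for a 5 Mbps cap
print(transfer_hours(18, 10))   # unthrottled on a 10 Mbps link
print(transfer_hours(18, 5))    # throttled to 5 Mbps
```

Under those assumptions, halving the bandwidth doubles the replication time from about 4 hours to about 8, which is usually an acceptable trade for keeping half of the link free.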
It’s very easy to set up domain joining: just install the FOG client onto your reference image prior to capture, and plug the domain settings into
Web Interface -> FOG Configuration -> FOG Settings -> Active Directory Defaults
Whenever you register a host via the boot menu, you can specify that you want it to join the domain. You can also use the web interface to apply the defaults to hosts en masse, or to individual hosts.
The Location Plugin article I posted is the best documentation about replication and the location plugin right now. I’ve been intending to expand on it but just haven’t yet.
I will have one site as the master and the rest as storage nodes. I believe it is as simple as installing them as storage nodes (as was recommended) and then making sure INITs and Kernels are selected for that site so that the systems will try to image from the local server and not over the WAN. Correct?
Correct, use the location plugin as the video showed.
If that is the case, then a tech from any site would only connect to the master interface to perform imaging regardless of what site they are at?
Right, or from the FOG boot menu of the computer they might be standing in front of.
-
@Obsidian reported that replication worked fine after he set up his remote servers as Storage Nodes, as recommended.
Also, this topic has gone off topic, and has been forked to here:
https://forums.fogproject.org/topic/8747/when-restoring-image-disk-not-found