Create the concept of a ForeignMasterStorage (deployment) node


  • Moderator

    I’ve looked into the possibility of creating a slave deployment node by setting up a master node in the traditional manner, then creating the proposed slave node in the traditional way, but at the end of the process pointing the Slave node to the Master node’s database. This will work for most of the tables except for the FOG-server-specific tables like globalSettings. These settings are unique to the individual FOG server. I can see that if your FOG Slave server is located in a different subnet, or if there are conflicting settings between the Master node and Slave node, there will be a settings clash. If the globalSettings table had an additional field that represented the unique FOG installation ID, the (global) settings could be created for each individual FOG server. I didn’t check many other tables for FOG settings clashes, but it looks like the current FOG system could be extended to a Master-Slave configuration.
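
    A minimal sketch of the per-installation settings idea above, assuming an extra installation-ID column. It uses an in-memory SQLite table purely for illustration; the table layout, the column names, the `tftp_host` setting name and the installation IDs are all hypothetical and are not FOG's actual schema.

```python
import sqlite3

# Hypothetical marker: an empty install_id means "shared by every FOG server".
SHARED = ""


def open_settings_db():
    """Create an in-memory table keyed by (install_id, name)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE settings ("
        " install_id TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (install_id, name))"
    )
    return db


def set_setting(db, install_id, name, value):
    """Insert or overwrite one setting for one installation."""
    db.execute(
        "INSERT OR REPLACE INTO settings (install_id, name, value)"
        " VALUES (?, ?, ?)",
        (install_id, name, value),
    )


def get_setting(db, install_id, name):
    """Prefer this installation's row, fall back to the shared row."""
    row = db.execute(
        "SELECT value FROM settings"
        " WHERE name = ? AND install_id IN (?, ?)"
        " ORDER BY install_id = ? DESC LIMIT 1",
        (name, install_id, SHARED, install_id),
    ).fetchone()
    return row[0] if row else None
```

    With a layout like this, a slave in another subnet could override only the values that differ and inherit the rest from the shared rows.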

    The other way I thought about is to keep the FOG databases isolated and just send JSON or other types of IPC messages (they could be done as HTTP POST calls between the systems, for that matter) between the master and slave FOG servers. This would allow the FOG installations to run standalone if needed but also communicate with a master node. Personally, I like this approach a bit better from a scalability and robustness standpoint.


  • Moderator

    @Wayne-Workman said:

    I would suggest we pool our knowledge to just create some base-level scripts that will sync two DBs based on the exact same rules that the FOGImageReplicator follows. I’ve outlined these rules before in other threads.

    I’m not thinking of anything drastic. It’s more like how pfSense sends HTTP calls to a remote node to sync its configuration data, although that is a deeper discussion than we should have here. The idea would be for the FOGReplicator to move the files as it does today. When all of the files in the current image directory have been moved, it would then make an HTTP call to a PHP page on the remote node (it should already know everything it needs to know to do this, i.e. no new database fields), which adds the image information to the remote database.
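
    As a rough sketch of that notification step, assuming a JSON body sent by HTTP POST: the page name `imagesync.php`, the host name and the payload fields below are hypothetical, and no such endpoint exists in FOG today.

```python
import json
import urllib.request


def build_image_notification(remote_base_url, image):
    """Build (but do not send) the POST announcing a replicated image."""
    body = json.dumps(image, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        # Hypothetical receiving page on the remote node.
        url=remote_base_url.rstrip("/") + "/imagesync.php",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request; urllib raises URLError if the remote is unreachable."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

    The replicator would call something like `send(build_image_notification("http://fog-remote.example/fog", {"name": "win7-thin"}))` only after the last file of an image directory has been copied, so the remote database never lists an image that is still in transit.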


  • Moderator

    @george1421 said:

    The best solution is to have this done natively within the program and not use any external hacks.

    Linux is the largest collection of hacks in any one spot ever lol.

    So,

    I’m strongly against the @Developers making such drastic code-base changes. I want to see a 1.3.0 release soon and this will not only delay the release, but most likely create a slew of bugs that need worked out… again… But they can do as they please.

    The new FOG Client that is being developed by @Jbob runs on GUI or CLI-only Linux, on just about every single distribution you can think of. It is able to deploy snapins to Linux without issue; I’ve witnessed it (in his nightly builds, which are not stable or in FOG Trunk currently).

    I would suggest we pool our knowledge to just create some base-level scripts that will sync two DBs based on the exact same rules that the FOGImageReplicator follows. I’ve outlined these rules before in other threads.

    Once the script is developed, we can make it into a sourceforge project. People can deploy the script via Snapins to the remote storage nodes themselves to update the DBs using the web interface.


  • Moderator

    Here is what I know so far from testing.

    1. You can do this mostly with a storage node and the location plugin.
    2. The storage nodes don’t have the bits required for tftp to work

    Some caveats to what I just said.

    1. The storage nodes are storage nodes only. You can add the tftp service description for xinetd and the tftp files, but the storage node is not a full deployment node. There is no user interface; all of the techs must access the master node to deploy images to the remote locations.
    2. The storage nodes do not have a local mysql database (as far as I can see). They connect back to the master deployment node to access its database. I see this as being an issue with latency when crossing a WAN link.

    I have been thinking of ways to map this out to do what I need it to do, but it would be a one-off and fragile at best. The best solution is to have this done natively within the program and not use any external hacks.



  • Tom, I was linked to this thread by the OP and I am in the exact same position. Wayne is 100% correct in what I need and I believe that to be what George1421 needs too.

    Let’s break it down simply.

    1. Create image on “Master” server.
    2. Replicate image to all other storage nodes in the same group.
    3. Update the remote servers’ DB to reflect what the “Master” server just copied.

    Steps 1 and 2 work fine, but there doesn’t appear to be a way to do step 3 automatically. This would not be such a major issue if I were able to manually create the image definition at each site, but when I try I am presented with nothing but a white screen saying “add image definition” on the top left and absolutely nothing more on the screen.

    I don’t want to export/import MySQL DB files from the “Master” to the remote sites. I have been doing that for years with .32 and it’s not a very good practice. Simply updating the remote MySQL tables to reflect the images that were just copied should not be a huge task for your software to perform.

    Does that explain what I am, and I believe George1421 is, looking for?


  • Moderator

    At the risk of extending this feature request even more…

    Please understand I’m not trying to be difficult, I truly want to understand if what I want to do is possible. I think we have a communication misalignment. I’m not doing a very good job explaining the situation because I keep seeing the same results (maybe that is the only answer, I don’t know).

    But I’m assuming from your context that in my drawing below there is one full deployment server in that network with the rest storage nodes. Is that a correct assumption?

    I understand the function of the location plugin. It allows you to assign storage groups and storage devices to a location, and then you link a host to a location so it knows where to get an image from and (if necessary) where to put one. I get that. I’ve been using FOG for quite a while.

    The issue(s) I’m seeing here are this:

    1. The storage nodes are not fully functional deployment servers. They are missing the tftpboot directory. While they do have the PXE boot kernel and file system, they alone cannot provide PXE booting services for a remote site.
    2. The storage nodes do not appear to have a SQL server instance running, so I assume they are reaching out to the master node’s database for each transaction. Historically I’ve seen this being an issue with other products as they try to reach across WAN links for transactional data.
    3. There is no local web interface on the storage nodes, so all deployment techs from every site must interface with the HQ Master node. This shouldn’t be an issue since the web interface is very light as opposed to some other Flash- or Silverlight-based management consoles.
    4. While this is not a technical issue, it’s more of a people issue. Since you will have techs from every site interfacing with a single management node, it’s possible for one tech to mistakenly deploy to (i.e. mess up) hosts at another site, since there is no built-in location awareness in regards to their user accounts.
    5. On the deployed hosts, where does the fog service connect to? Is it the local storage node or the Master node?
    6. Storage nodes can only replicate with the master node, i.e. if there are two storage nodes at a remote site, one storage node cannot get its image files from the other storage node at that site. All images must be pulled across the WAN for each storage node.
    7. Multicasting is only functional from the Master node. So in the diagram below only the HQ could use multicasting to build its clients. (edit: added based on a current unrelated thread)

    The fog system is very versatile and you guys have put a LOT of effort into it since the 0.3x days. And you should be acknowledged for your efforts. Understand I’m not knocking the system that has been created or your time spent on the project.

    I worked through this post, I can see that having a single master node with the rest storage nodes would work if:

    1. The /tftpboot directory was included in the replication files from the master node and the tftp service was set up in xinetd. (Actually, this could be built in as part of a storage node deployment by default, by having the service and tftpboot folder set up, even if it isn’t used in every deployment. There is no downside IMO.)
    2. The user profile was location aware to keep them from making changes to hosts in other locations. The location awareness must have the ability to assign users who have global access for administration purposes.
    3. The storage nodes would have to be aware of latency issues with slow WAN links. And/or not break completely with momentary WAN outages.
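
    A minimal sketch of the location-awareness check described in item 2, assuming each user record carries a set of permitted locations plus a global-admin flag. The `User` type, its fields and the user names are hypothetical; FOG user accounts have no such fields today.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """Hypothetical user record with location awareness."""
    name: str
    locations: frozenset = field(default_factory=frozenset)
    is_global_admin: bool = False


def can_modify_host(user, host_location):
    """Global admins may touch any host; others only hosts in their locations."""
    return user.is_global_admin or host_location in user.locations


la_tech = User("la_tech", frozenset({"LA_BLD01", "LA_BLD02"}))
hq_admin = User("hq_admin", is_global_admin=True)
```

    Here `can_modify_host(la_tech, "ATL")` is false while `can_modify_host(hq_admin, "ATL")` is true, which is the behaviour needed to keep a tech at one site from deploying to hosts at another.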

  • Senior Developer

    The how is to enable the Location Plugin. (in the case of having fog automate the stuff for you)


  • Moderator

    @george1421 said:

    While it’s clear that the current FOG trunk can do this, right now the how is missing from this discussion.

    For the sneaker net or for the setup you illustrated below?


  • Moderator

    @Joseph-Hales said:

    If you are not updating images that often it might be more logical to sneaker-net images to the other site when you make changes.

    Good point, it just may be easier and quicker to throw the image on a flash drive and overnight it to the other sites if transfer speed is required. But then there are more hands-on steps at each site to import the image and create the DB entries.

    While it’s clear that the current FOG trunk can do this, right now the how is missing from this discussion.


  • Testers

    If you are not updating images that often it might be more logical to sneaker-net images to the other site when you make changes.


  • Moderator

    @Wayne-Workman said:

    But I wanted to point out that a typical 16GB (compressed size) image, pushing one copy of the image to one other node across a 1.5Mb/s link will take roughly 24 hours, and that’s if you have 100% of the 1.5Mb/s dedicated to the transfer.

    Have you thought about this? How big are your images?

    I selected a network connection specifically that was artificially low for the POC. I see network latency being a real issue with a distributed design.

    Our thin image (Win7 only + updates) is about 5GB in size and our fat image is over 15GB. At 1.5Mb/s I would suspect that we would have FTP transfer issues with file moves that were taking longer than 24hrs to complete. But that is only speculation.
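
    The arithmetic behind these estimates can be checked with a short calculation. This sketch assumes decimal units (1 GB = 10^9 bytes, 1 Mb/s = 10^6 bits per second) and by default a link fully dedicated to the transfer; the `efficiency` parameter is an added assumption for modelling protocol overhead or shared bandwidth.

```python
def transfer_hours(size_gb, link_mbps, efficiency=1.0):
    """Hours needed to move size_gb gigabytes over a link_mbps link."""
    bits = size_gb * 8 * 1000**3
    seconds = bits / (link_mbps * 1000**2 * efficiency)
    return seconds / 3600


for size in (5, 15, 16):
    print(f"{size} GB: {transfer_hours(size, 1.5):.1f} h")
# 5 GB: 7.4 h
# 15 GB: 22.2 h
# 16 GB: 23.7 h
```

    So the 16GB figure quoted above does come out at roughly 24 hours, and a 15GB image would pass the 24-hour mark as soon as less than about 93% of the link is actually available to the transfer.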

    It’s good to hear that FOG could do this without any changes.


  • Moderator

    @george1421 said:

    1.5Mb/s

    Tom is right, it will work.

    But I wanted to point out that a typical 16GB (compressed size) image, pushing one copy of the image to one other node across a 1.5Mb/s link will take roughly 24 hours, and that’s if you have 100% of the 1.5Mb/s dedicated to the transfer.

    Have you thought about this? How big are your images?


  • Moderator

    Excellent…


  • Senior Developer

    In as simple terms as I can muster, YES!


  • Moderator

    Knowing what you know about the new features built into the SVN trunk, can I do this without any new “stuff” being added to FOG?


  • Moderator

    @Tom-Elliott said:

    @george1421 I’m still confused.

    It’s highly possible that I’m ignorant of the features you have added to the trunk builds, plus I’m not doing a good job of explaining the current situation, where I think FOG is highly capable of accomplishing this with a few adjustments. I’ve looked through the wiki to see if there was something similar to what I need to do. The only thing that came close was https://wiki.fogproject.org/wiki/index.php/Managing_FOG#Storage_Management (the second graphic that shows the multiple storage groups). This is the POC concept I used to set up my test environment.

    I took that previous drawing and built this sample layout.
    storage_network.JPG

    In this scenario I have these requirements (almost sounds like a school project):

    1. Will be constructed with 3 or more sites
    2. Connection to each site will be via a 1.5Mb/s MPLS link
    3. Because of the slow link each site must have its own FOG Deployment server to provide PXE booting
    4. Each of the sites could have one or more VLANs each with their own subnets isolated by a router.
    5. Corporate images will be created at the HQ site and distributed to all sites. There is a potential that each site could have their own images for specific purposes. So each site must be able to capture images to their local deployment server.
    6. On a corporate deployed image there may be a reason to recall or block deployment of a specific image across the organization (such as a detected flaw in the image).
    7. The location plugin is installed on all FOG servers. The only location that will have more than one locally defined location is LA

    To clarify the above picture:
    In the HQ location there is only one deployment server, HQMasterNode.
    The LAMasterNode and ATLMasterNode are connected back to HQ via an MPLS link (right now this is all done in a single virtual environment).
    In the LA site there are 3 FOG servers: one FOG deployment server, one FOG storage server and one FOG storage server with PXE booting enabled (I think that is an option). The LA site also has two VLANs with about 700 nodes distributed across the VLANs. There are two defined locations for the LA site (LA_BLD01 and LA_BLD02).
    The ATL site only has one FOG deployment server and one storage node on a single subnet.

    This is how I have the test environment built in my test lab.

    As I posted before, I seeded the HQMasterNode with images from my production FOG server. No replication happened between the HQMasterNode, LAMasterNode or ATLMasterNode until I created the first image definition on the HQMasterNode. Once that first image definition was created, all images that were seeded on the HQMasterNode were replicated to the other two nodes in the HQ Storage Group. This worked great: now all images created on the HQMasterNode were located on each site’s FOG Deployment server. The images did not get distributed beyond each site’s MasterNode, though. On the ATLMasterNode I created a single image definition, and then the images were replicated to the ATLSlaveNode01.
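
    The layout just described can be modelled as a small data structure to show how files would fan out in two stages. This is an illustrative sketch only: the HQ and ATL node names come from the post, the two LA member names are hypothetical, and the model ignores the observed condition that a site master only replicates onward once it has an image definition of its own.

```python
# Each storage group has one master and a list of member nodes.
GROUPS = {
    "HQ": {"master": "HQMasterNode",
           "members": ["LAMasterNode", "ATLMasterNode"]},
    "LA": {"master": "LAMasterNode",
           "members": ["LAStorageNode01", "LAStorageNode02"]},  # hypothetical names
    "ATL": {"master": "ATLMasterNode",
            "members": ["ATLSlaveNode01"]},
}


def replication_hops(groups, source_master):
    """Return (from, to) hops in breadth-first order from source_master."""
    members_of = {g["master"]: g["members"] for g in groups.values()}
    hops, queue, seen = [], [source_master], {source_master}
    while queue:
        node = queue.pop(0)
        # A node forwards files only if it is the master of some group.
        for member in members_of.get(node, []):
            if member not in seen:
                seen.add(member)
                hops.append((node, member))
                queue.append(member)
    return hops
```

    Calling `replication_hops(GROUPS, "HQMasterNode")` lists the two WAN hops to the site masters first, followed by the local hops inside LA and ATL, so each image crosses each WAN link once.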

    The first issue I ran into was that even though I created all of the image definitions on the HQMasterNode, those definitions were not copied to the LAMasterNode or the ATLMasterNode. Somehow I need to get those definitions (I’ll assume the same for the snapins) from the HQ deployment server to each site’s deployment server. This could be accomplished with a mysqldump of the tables before the replication starts, which is then picked up at the remote end and loaded with a mysqlimport run, or by making URL calls to each site’s deployment server to update its database with the image information.
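
    A sketch of the dump half of the mysqldump option, building the command line without running it. The database name `fog`, the host name and the user name are assumptions; the table names `images` and `snapins` are taken from this thread. Credentials are deliberately left out and would normally come from a MySQL option file.

```python
def mysqldump_command(host, user, database="fog",
                      tables=("images", "snapins")):
    """Return the argv list for dumping only the listed tables."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        database,                # assumed database name
        *tables,
    ]


print(" ".join(mysqldump_command("hq-fog.example", "fogsync")))
```

    Two caveats apply. A dump in SQL format is loaded with the `mysql` client rather than `mysqlimport`, which expects delimited text files. Also, a default dump drops and recreates each table when loaded, so any image definitions created locally at a remote site would be lost unless the load is made more selective.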


  • Senior Developer

    @george1421 I’m still confused.

    In trunk you can setup multiple storage groups for both snapins and images. You also specify, now, which storage group is the primary/master group for the snapin or image.

    This will do the same thing you’re requiring. It will replicate to other storage groups from the primary storage group as assigned by the master.

    Doing this, you would not need to create the storage nodes under the primary group as you’ve described.

    Tie this with the location plugin and I believe you would have everything you’ve described.


  • Moderator

    @george1421 A scripting solution could keep just these two tables updated. I suppose you could create a plugin that does it?


  • Moderator

    @Wayne-Workman point well taken.

    I’m not really interested in creating a mishmash of scripts to do crazy things. I can see what needs to be done to make this work as FOG is currently designed.

    I’ve spent some time recreating my POC environment and have a mostly workable system using the current SVN. Based on the results of my testing I changed a word in the title of this feature request from slave to foreign master storage node, because it sounds much cooler and is a bit more accurate.

    All joking aside, I found that if I create 3 storage groups representing 3 different sites, each with its own master storage node, and then add the master storage nodes from the left and right storage groups to the center storage group as ordinary storage nodes (or, to use my made-up name, “Foreign Master Storage nodes”), I can send the images from the central master storage node to all of the storage nodes in the other storage groups. (It’s a bit hard to explain with just words, but it does work.) Eventually each storage group will be located at a different site, so I need a fully functional master node in each storage group.

    I did find an interesting fact: I seeded the center master storage node with images from my production server, but the replication did not start until I created the first image entry in the database. Then the files were replicated from the center Master Storage node to the other Foreign Master Storage nodes. The issue I’m facing right now is that I need to get the content of the images and snapins tables to both the left and right Foreign Master Storage nodes, or they won’t start replicating to their storage nodes.


  • Moderator

    @george1421

    There are a lot of complications to what you’re wanting to do… having DB independence means having a full fog server at each site. But then you run into the issues of syncing the DB.

    And syncing the DB is only required as far as creating/updating image definitions. This could probably be scripted with Cron, and would require remote access to all the MySQL instances on all the servers…
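
    To illustrate what such a cron-driven script would need to decide on each run, here is a minimal sketch that compares two sets of image definitions and plans the changes. The dictionary fields are hypothetical, and the step of reading rows from and writing rows to the MySQL instances is left out.

```python
def plan_sync(master_defs, remote_defs, key="name"):
    """Compare definition dicts and return (to_insert, to_update).

    Definitions that exist only on the remote side are left alone,
    so images captured locally at a site are not removed.
    """
    remote_by_key = {d[key]: d for d in remote_defs}
    to_insert, to_update = [], []
    for definition in master_defs:
        existing = remote_by_key.get(definition[key])
        if existing is None:
            to_insert.append(definition)
        elif existing != definition:
            to_update.append(definition)
    return to_insert, to_update


master = [{"name": "win7-thin", "path": "win7thin"},
          {"name": "win7-fat", "path": "win7fat_v2"}]
remote = [{"name": "win7-fat", "path": "win7fat"},
          {"name": "la-kiosk", "path": "lakiosk"}]
to_insert, to_update = plan_sync(master, remote)
```

    In this example `win7-thin` would be inserted, `win7-fat` would be updated, and the site-local `la-kiosk` definition would be left untouched.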

    Each fog server will be trying to perform replication among the masters/slaves… so you’d have to totally disable that service on all servers except for one.

    You’d still need the location plugin in order to define to clients where to pull images from, where to upload to, and so on. You’d need to define your storage nodes, groups, masters/slaves identically on all servers…

    As far as running reports, you can look at the SQL underneath the various FOG buttons (it is open source, after all). You can enable remote MySQL access from a list of specified IP addresses (for security) and create a script that will pull the reports you want. You could even have a little virtual machine running FOG, and just change the settings in /opt/fog/.fogsettings for each site you want to work with, for each site report you want to run.

    But,

    To be totally honest, Tom has a really strong point here… All of this craziness is not necessary. There are several multi-site organizations that use the standard setup with the location plugin just fine. They have WAN limitations too. Some go as far as a full server at each location but with the DB settings pointed to the main server. The provided setup does work, and what you’re wanting to do would create a massive amount of oversight and work; probably very few could follow in your footsteps and maintain it confidently.

    I mean, Linux and FOG are pretty foreign to most I.T. people already… Imagine the guy (or gal) that comes in behind you? They would absolutely hate FOG because of how complex they perceive it to be… how fast it would break due to their inaction, or simply following advice they see here on the forums or in the wiki… Advice that won’t work because this setup is so dramatically customized.

    I mean… if the WAN goes down… are you going to be worried about imaging computers? Nope… And do you actually know the bandwidth load that MySQL would create for 100 or 1,000 or 5,000 computers? It’s probably pretty low… after all, it’s just text.

    My vote is… don’t create a massive monster that nobody but you can tame.

