best master-master replication setup
-
Hello,
All the documentation and forum posts I am seeing on this topic are from 2017-2018, so I do apologize if some things have changed.
So here’s the scope. We have two full FOG servers: one at Site A and one at Site B.
Site A is our primary and is where all image capturing will be done. Simply put, I’d like to copy the one image that we have from Site A to Site B and be able to deploy it from Site B.
I believe this refers to a master-master setup, which at one point involved (and may still involve) some manual steps. Two of those steps were a manual export and import, and a manual copy of the image definitions from the root server to the remote server.
What I think we would need to do to accomplish this is add the slave box to the master box as a storage node and enable replication on that image. Is that right?
What I think needs done based on my research is below:
Make a storage group for each server, make each of them the master of its own storage group, then enable replication on the image I want to share (just our single image). I have never done any of this before; just looking for some assistance and guidance.
-
The master -> master configuration is not currently supported by the developers (though I keep hoping).
I think you’ve nearly got how it works.
On the HQ server, create a storage group or use the default storage group. In that storage group, create a new storage node that points to the remote FOG server. You can make that remote node’s configuration mirror the default node, except this one will be a plain storage node (leave “Is Master” unchecked). Correct the IP address and network interface name to reflect the remote FOG server. The last things that need setting are the “Management Username” (fogproject) and the “Management Password”: on the remote FOG server, look in the /opt/fog/.fogsettings text file. The password saved in that file is what goes into the management password field on the HQ server.
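If you’re not sure which value that is, the installer records it in that settings file on the remote FOG server (default install path assumed; the exact key name can vary by FOG version, so this grep is a loose match):

sudo grep -i 'password' /opt/fog/.fogsettings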
Once that is done, the FOG Replicator will start replicating the images to all storage nodes (real storage nodes or full FOG servers acting like storage nodes) in the storage group.
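You can watch it work from the HQ server; on a default install the replicator writes its progress to a log (path assumes the standard install location):

sudo tail -f /opt/fog/log/fogreplicator.log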
Now, the manual part of the setup is the export of the image definitions from the HQ FOG server via the web GUI and the import of those definitions on the remote FOG server’s web GUI.
Once that is done, as long as you don’t add new image definitions to the HQ server, any updates to the image files on the HQ server will automatically replicate to the remote FOG server. If you capture new images to the HQ FOG server, those files will still replicate to the remote FOG server, but the remote FOG admins will not be able to see them until you export and import the image definitions into the remote FOG server again.
It sounds complicated but it’s really not. The FOG Replicator will replicate any captured image file that is marked for replication on the HQ server. To be able to see those image files in the web UI of the remote FOG server, you need a matching image definition. You can create that by hand on the remote FOG server or go the export-and-import route (HQ -> Remote).
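If you ever want to script that metadata copy instead of clicking through the web UI, a rough, untested sketch is to dump just the image definitions table. This assumes the default database name fog, that both servers run the same FOG version, and it replaces the remote images table wholesale, so the storage group IDs need to match on both sides:

# On the HQ server (add -u/-p if your MySQL root needs credentials):
sudo mysqldump fog images > /tmp/fog-images.sql
scp /tmp/fog-images.sql root@remote-fog:/tmp/
# On the remote FOG server (this overwrites its images table):
sudo mysql fog < /tmp/fog-images.sql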
While I can’t say for sure, the developers have talked about rewriting the FOG Replicator for FOG 1.6; hopefully they can add this (master -> master) database sync at that time.
-
@george1421 by fog definitions do you mean under image management in the fog console?
The export images and import images options?
Are those just the definitions?
-
@amped said in best master-master replication setup:
@george1421 by fog definitions do you mean under image management in the fog console?
The export images and import images options?
Are those just the definitions?
Yes, export and then import via the web UI. Those are technically metadata that describe the raw data files; the FOG Replicator will copy over the raw data files themselves.
-
Hey @george1421, it looks like the images did replicate. I did what you confirmed above.
Now the issue resides with DHCP. I’m struggling to find where I can modify these settings. I have a feeling my VMs aren’t getting an address from our FW.
-
@amped OK, an easy(ish) way to tell is to install Wireshark on a witness computer (an extra computer not part of the booting process). In Wireshark, create a capture filter of
port 67 or port 68
. Now start your capture and PXE boot your VM. In Wireshark you should see the typical DORA DHCP process (DISCOVER, OFFER, REQUEST, ACK). You can ignore the INFORM packets; they have a different purpose unrelated to PXE booting.

How this sequence starts out is that the PXE booting client sends out a DISCOVER packet. One or more of your DHCP servers should send out an OFFER packet. Look in that OFFER packet: the DHCP header should list a {next-server} (should point to your FOG server) and a {boot-file} (should be undionly.kpxe or ipxe.efi). If those values are there, look down below to ensure DHCP options 66 and 67 match those values. I have seen many firewalls/routers have this bit messed up.
If you can’t find what is wrong, post the pcap to a file share site and post the link here and I will review it. There are a lot of fiddly things that can go wrong, too many to explain in a single post.
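If the witness computer is headless, roughly the same capture can be taken from a terminal with tshark and the resulting file shared (the interface name here is an example; substitute your own):

sudo tshark -i eth0 -f "port 67 or port 68" -w /tmp/pxe-boot.pcap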
-
@george1421 looks like we got DHCP figured out. The FW was giving us trouble so we moved to a Windows server and made sure that 66 and 67 were set up correctly.
Our test VMs get an IP and they pick up the NBP file okay; it sees the ipxe file (I believe it’s called) and says okay. Then the VM immediately reboots and lands at a rEFInd screen that tells me to shut down or reboot.
The guest is a Gen 2 Hyper-V VM and we do get to the FOG console. I was able to hit deploy, see our image, and try to deploy, but then the same crash I mentioned occurs right at boot after typing the FOG username and password.
Any ideas on a potential cause?
-
@amped said in best master-master replication setup:
The FW was giving us trouble so we moved to a Windows server and made sure that 66 and 67 were set up correctly.
If you had come back a bit sooner I would have recommended that you install dnsmasq on your FOG server to supply the PXE boot information to your computers instead of moving to the Windows DHCP server. dnsmasq can be set up in about 10 minutes without needing to change your network infrastructure. But now that you are on the Windows DHCP server there is no reason to move backwards. For anyone else reading along, a minimal config is sketched below.
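This is only a sketch (the file name, the FOG server IP 192.168.1.10, and the subnet 192.168.1.0 are assumptions); it relies on the FOG server’s existing TFTP service and a recent dnsmasq (2.76 or newer) for the UEFI tagging:

# /etc/dnsmasq.d/ltsp.conf -- proxyDHCP alongside the existing DHCP server
port=0                                      # no DNS, (proxy)DHCP/PXE only
log-dhcp                                    # verbose DHCP logging for debugging
dhcp-boot=undionly.kpxe,,192.168.1.10       # BIOS clients; IP = your FOG server
dhcp-match=set:efi64,option:client-arch,7   # tag x86-64 UEFI clients
dhcp-boot=tag:efi64,ipxe.efi,,192.168.1.10  # UEFI clients get ipxe.efi instead
pxe-prompt="Booting FOG client",1
pxe-service=X86PC,"Boot to FOG",undionly.kpxe
pxe-service=tag:efi64,X86-64_EFI,"Boot to FOG UEFI",ipxe.efi
dhcp-range=192.168.1.0,proxy                # proxy mode; real DHCP still hands out leases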
So the NBP is the Network Boot Program (NBP is a UEFI term), which means you should be sending ipxe.efi for DHCP option 67. Is this what you are sending? If you have Secure Boot disabled on the target computer you should get the FOG iPXE menu.
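You can confirm or set those two options from an elevated prompt on the Windows DHCP server itself (the scope and server IP below are made-up examples):

netsh dhcp server scope 192.168.1.0 set optionvalue 66 STRING "192.168.1.10"
netsh dhcp server scope 192.168.1.0 set optionvalue 67 STRING "ipxe.efi"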
Hyper-V can be a bit strange depending on the version of the host OS, so I would also try with a physical machine if we have to question whether it’s Hyper-V or not. I know that Hyper-V Gen 2 VMs do boot into FOG.
-
@amped said in best master-master replication setup:
The guest is a Gen 2 Hyper-V VM and we do get to the FOG console. I was able to hit deploy, see our image, and try to deploy, but then the same crash I mentioned occurs right at boot after typing the FOG username and password.
So this tells me that PXE booting is working because you get the FOG iPXE menu. On top of that, it sounds like you are running a “Deploy Image” from the iPXE menu. That is not a problem; I’m just tracing your steps through your workflow. So as soon as you enter the user ID/password and bzImage and init.xz are transferred, the target computer reboots. Where it is blowing up is the hand-off between iPXE and FOS Linux (bzImage/init.xz). This is usually the fault of the firmware of the target computer, in this case Hyper-V. If you do the same thing on a physical machine, my bet is that it will work.
-
@george1421 tested with a Hyper-V Gen 2 VM. Everything you said is correct and that is exactly what’s happening.
We are assuming this is a limitation of the host. Going to try to test with a physical machine.