FOG Server failover
-
Personally, I use Hyper-V for FOG, and replication is turned on for my virtualized FOG server.
If the FOG server (or the underlying server) becomes unresponsive, an exact replica of it (HDD, RAM contents and all) fires up immediately and automatically on another server.
-
VM Replication is your friend.
-
I am using dnsmasq so that is not an issue.
I dove in head first and made the switch to see if it was possible.
I did a MySQL dump of the database on the FOG server and saved it to the images folder. The images folder is replicated to the storage node, so I have the dump in both places. I then powered off the FOG server so that it would not answer DHCP requests through dnsmasq.
On the storage node I installed dnsmasq, edited the .fogsettings file so that the node would become a FOG server, and ran installfog.sh. I then restored the database from the original FOG server’s backup. I can access all the images and data. The only problem I am having is with PXE booting to the new FOG server. I have included an image of the screen. The original FOG server is 200.200.200.115, and the storage node that is now a FOG server is 200.200.200.65. I am sure there is a simple change needed somewhere, but I am not sure where yet.
VMs would be nice but are not possible for this project. I don’t need an automatic switchover, so scripting will not be necessary. I did think about writing a simple script to replace .fogsettings with .fogsettings.master or .fogsettings.storage, start or stop dnsmasq, and restore the database from a backup, but a manual switchover will work.
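For anyone who does want the simple script described above, here is a rough sketch of what it could look like. It only builds and prints the list of commands, so nothing runs until you paste the steps yourself. The /opt/fog/.fogsettings path, the dump file name, the database name "fog", and the use of `service` to control dnsmasq are all assumptions on my part, not details from this thread, and the mysql step will likely need your own credentials added.

```python
from typing import List

# Assumed locations, not taken from the thread: adjust to your install.
FOG_SETTINGS = "/opt/fog/.fogsettings"
BACKUP_SQL = "/images/fog_backup.sql"  # hypothetical dump file name
DB_NAME = "fog"                        # assumed database name


def switchover_plan(role: str) -> List[str]:
    """Return, in order, the shell commands for switching this box to `role`.

    Nothing is executed here; the list is meant to be read or pasted by hand.
    """
    if role not in ("master", "storage"):
        raise ValueError("role must be 'master' or 'storage'")

    plan = [
        # Put the saved answer file for the wanted role in place.
        f"cp {FOG_SETTINGS}.{role} {FOG_SETTINGS}",
        # Re-run the installer so it picks up the new answer file
        # (run this from the directory that holds installfog.sh).
        "./installfog.sh",
    ]
    if role == "master":
        # Only the acting FOG server restores the database and answers PXE.
        plan.append(f"mysql {DB_NAME} < {BACKUP_SQL}")
        plan.append("service dnsmasq start")
    else:
        # A plain storage node must stay quiet on the network boot side.
        plan.append("service dnsmasq stop")
    return plan


if __name__ == "__main__":
    for step in switchover_plan("master"):
        print(step)
```

Keeping it as a dry run fits a manual switchover: you still decide when each step happens, but the order is written down in one place.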
Thanks for the quick replies, I will continue testing in the morning.
[url=“/_imported_xf_attachments/1/1983_IMG_3840.JPG?:”]IMG_3840.JPG[/url]
-
Did you edit the dnsmasq configuration on the storage node? You can literally take a copy of your primary FOG server’s ltsp.conf file and use “Replace All…” in a text editor to replace 200.200.200.115 with 200.200.200.65. Using “Replace All…” instead of editing by hand will minimize your risk of typos and mistakes. You must also modify the answer file so that it has the correct IP addresses. If you’re using names inside your ltsp.conf file, I’d recommend changing them to IPs instead. An IP is more reliable than a name.
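If you would rather script the “Replace All…” step than do it in an editor, something like the following should work. This is just a sketch: the two sample lines are illustrative dnsmasq options and not your actual ltsp.conf, and the function name is made up for the example.

```python
import re


def replace_ip(config_text: str, old_ip: str, new_ip: str) -> str:
    """Replace every whole occurrence of old_ip with new_ip."""
    # The lookarounds keep the match from starting or ending in the middle
    # of a longer number, so 200.200.200.11 is not found inside 200.200.200.115.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?!\d)")
    return pattern.sub(lambda _match: new_ip, config_text)


if __name__ == "__main__":
    # Illustrative lines only; a real ltsp.conf will contain more options.
    sample = (
        "dhcp-boot=undionly.kpxe,,200.200.200.115\n"
        "dhcp-range=200.200.200.115,proxy\n"
    )
    print(replace_ip(sample, "200.200.200.115", "200.200.200.65"))
```

To use it on a real file, read the copied ltsp.conf into a string, pass it through `replace_ip`, and write the result out on the storage node. The same function works for the answer file.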
Did you make sure TFTP is running? Are permissions set right on the /tftpboot folder?
See the wiki article “Troubleshoot TFTP”
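A quick way to tell “TFTP is not running” apart from “TFTP is running but cannot serve the file” is to send a read request yourself and look at the reply. The sketch below does that with a raw UDP socket, following the packet layout in RFC 1350. It assumes default.ipxe sits in the TFTP root, which may not match your setup, so change the file name as needed.

```python
import socket
import struct
from typing import Optional


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode 1, file name, NUL, mode, NUL."""
    return (
        struct.pack("!H", 1)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def tftp_probe(host: str, filename: str, timeout: float = 3.0) -> Optional[int]:
    """Send one read request and return the opcode of the first reply.

    3 means DATA (the file is being served), 5 means ERROR (the daemon is up
    but refused, for example a missing file or bad permissions), and None
    means nothing answered before the timeout.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (host, 69))
        try:
            reply, _addr = sock.recvfrom(516)
        except OSError:
            # Covers the timeout as well as network errors.
            return None
    if len(reply) < 2:
        return None
    return struct.unpack("!H", reply[:2])[0]


if __name__ == "__main__":
    print(tftp_probe("200.200.200.65", "default.ipxe"))
```

The probe never acknowledges the data block, so the server simply retries a few times and gives up. That is harmless for a one-off check.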
Also, with your old FOG server being [B]OFF[/B], and assuming you’ve got the “fail over” configured right, [B]hosts booting to the network should have no knowledge of 200.200.200.115 at all.[/B] That is why I’m asking you to double-check your dnsmasq config. You probably forgot a line or something.
Are you sure DHCP isn’t handing out options 066 and 067 (the TFTP server name and boot file name)?
-
And to more finely tune a switch-over, you might want to just have your storage node configured as an actual FOG server, and then just disable dnsmasq… BUT, you still need the FOGImageReplicator running as if it were a storage node…
So, when you switch over, all you have to do is just start that, and then everything starts working…
But, you still need some way to keep the DB updated, too… That will be key.
-
You’ll also need to change the IPs stored in the FOG configuration page and in the default.ipxe file.
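To catch any place the old address is still hiding, a small scan like the one below can help. It is only a sketch: the /tftpboot/default.ipxe path in the example is my assumption about where the file lives, and the match is a plain substring test, so it can over-report if one address is a prefix of another.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def lines_mentioning(text: str, ip: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for every line containing ip."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if ip in line
    ]


def scan_files(paths: Iterable[str], ip: str) -> List[Tuple[str, int, str]]:
    """Scan each existing file and collect (path, line number, line) hits."""
    hits = []
    for name in paths:
        path = Path(name)
        if not path.is_file():
            continue  # Skip paths that do not exist on this machine.
        text = path.read_text(errors="replace")
        for number, line in lines_mentioning(text, ip):
            hits.append((name, number, line))
    return hits


if __name__ == "__main__":
    # Assumed location of the boot script; adjust to your install.
    for hit in scan_files(["/tftpboot/default.ipxe"], "200.200.200.115"):
        print(hit)
```

An empty result means none of the listed files still mention the old server. The IPs stored in the FOG configuration page live in the database, so those still have to be checked in the web interface.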
-
Thanks again for all the replies!!
It’s up now. I did a dump of the FOG database, and I just needed to change the IP address in FOG Settings for everything to work.
In FOG Settings:
General Settings> FOG_WOL_HOST: <ip address of server>
TFTP Server> FOG_TFTP_HOST: <ip address of server>
Web Server> FOG_WEB_HOST: <ip address of server>
This is for a large project that we are building. Rack 1: MasterAA, MasterAB, FOGServer; rack 2: MasterBA, MasterBB, FOGStorage. The Masters are running Red Hat 6.6, and the FOG servers are running CentOS. There are also 70 consoles (PCs running Win7). I will have several images on FOG and also daily database backups. They do a switchover every 3 months. The FOG servers will not be switched at that time, but I needed to make sure it can be done. I may have to use DHCP on my FOG servers because this will be a closed network. All the servers and consoles are using bonded NICs on separate networks. I need to prove to the project manager that FOG can handle the project with no problems. Fun Fun!!
-
Do share your FOG failover documentation; it’d make a great wiki article!
I always recommend donating unique documentation. If you (or your organization) can’t donate hard cash, documentation is the next best way to contribute, because you’re making documentation anyways, right?
-
[quote=“Wayne Workman, post: 47066, member: 28155”]because you’re making documentation anyways, right?[/quote]
Wait, I thought documentation was like commenting your code or backing up your files: something you expect other people to do but don’t do yourself.
-
I still have a lot of testing and configuration to do, but as soon as my documentation is done I will post it here. I am having issues with NIC bonding right now. It works great on the FOG servers (CentOS 6.6), but when I applied the same settings to the Red Hat 6.6 servers, the connection keeps dropping. The fun of IT!!