Possible Image and Snapin Replication Problem w/ Working Branch
-
@jim-graczyk Ok then. I think you’ve found an issue with replication in RC8/working. Without waiting for the developers, you have two options. Option 1 is to go back to RC7, where it worked. Option 2 is to manually scp the image and snapin changes to the nodes as appropriate and see if that works. Here’s how to go back to RC7:
git checkout 31a61db2c12ebc394ea167f9b37ba6ef4da7ea99
cd bin
./installfog.sh -y
Normally I don’t recommend downgrading, but it looks like no DB changes have happened between then and now, so it should work in this case. (Future readers: it will not work for you.)
When our developers come back from vacation, hopefully they can resolve the issue.
AND - I feel it’s time to build some quality-checking for replication - so I’ll be working on that in my free time in the coming weekends so that we can immediately know when this stuff isn’t working in one of the branches.
-
I prefer to move forward rather than backward, so I’ll choose Option 2 - manually replicate and test that it’s only a replication problem. We’re already working on that. This will work for a few days, but we’ll have problems as soon as we have to upload PCs before imaging.
I’ll post here if all images and snapins work to clients at each site.
Jim
-
@wayne-workman while quality assurance is always a good thing, the code I added would not have broken what is being reported here. I added a simple check to find out if it can reach the server on port 21, the ftp port. If it cannot communicate it will report it cannot. If it does communicate it will perform replication tasks.
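To illustrate the kind of gate described above, here is a minimal shell sketch. It is not the actual FOG code (FOG's replication service is not written in shell), and the function names `can_reach` and `replicate_if_reachable` are made up for this example. It only shows the logic: try a TCP connection to the FTP port, report if it fails, and continue otherwise.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FOG): succeed if a TCP connection
# to host:port can be opened within a few seconds.
can_reach() {
    local host="$1" port="${2:-21}" wait_secs="${3:-3}"
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the wait.
    timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Hypothetical wrapper: only "replicate" when the port check passes.
# The echo lines stand in for the real replication work.
replicate_if_reachable() {
    local host="$1" port="${2:-21}"
    if can_reach "$host" "$port"; then
        echo "replicate $host"
    else
        echo "skip $host: cannot reach port $port"
    fi
}
```

Calling `replicate_if_reachable node1.example.com` (a placeholder hostname) would check port 21 on that node before doing anything else.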
-
-
@Moderators @Testers Is anyone able to replicate this issue?
-
@sebastian-roth I have plans (in my head) to build automated functional testing for this, but I’m not set up to test replication at the moment.
-
I just wanted to let the FOG team know that I have just completed the build of a new FOG server using v1.5.0 RC9, Dev Branch, with associated storage nodes. We created storage groups and storage nodes, after uploading images. Installed the Locations Plugin and set locations. We moved our images and snapins to the new FOG installation…
And we’ve reproduced the same problem we had on the FOG system on which this original posting was based. We have no replication of images or snapins, even though the storage nodes are working as expected.
We’ve written a bash script to replicate both snapins and images from the main FOG server to the storage nodes, so our installation works, but based on my experience, it’s very easy to reproduce this replication problem - just do an installation on fresh servers and the problems ensue.
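The script itself was not posted. A minimal sketch of the idea might look like the following, assuming rsync over SSH as root, placeholder node hostnames, and the usual default FOG paths (/images and /opt/fog/snapins). All of these are assumptions and would need adjusting for a real installation. The `sync_all` name and the `DRY_RUN` switch are hypothetical.

```shell
#!/usr/bin/env bash
# Assumptions: placeholder hostnames, default FOG directories, root SSH access.
NODES="${NODES:-node1.example.com node2.example.com}"
DIRS="${DIRS:-/images /opt/fog/snapins}"
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 to actually copy.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: push each directory from this server to every node.
sync_all() {
    local node dir
    for node in $NODES; do
        for dir in $DIRS; do
            if [ "$DRY_RUN" = "1" ]; then
                echo "rsync -a $dir/ root@$node:$dir/"
            else
                rsync -a "$dir/" "root@$node:$dir/"
            fi
        done
    done
}
```

Running `sync_all` with the defaults prints the four rsync commands it would execute, which makes it easy to review them before setting `DRY_RUN=0`.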
Thanks,
Jim
-
@jim-graczyk Can you try on working branch please? There was an issue with implementation of finding “isAvailable” nodes which I’m pretty sure has been corrected for.
-
Tom,
I’m using the working branch on my lab set up. I’ll pull it there.
I’m also OK trying the working branch on my new installation, as long as I can switch back to the Dev branch after pulling the current Working version.
Is it just a matter of changing the git checkout back to dev branch?
FYI - My new installation is showing SVN 6079 while the lab is showing SVN 6080 - if that helps any.
thanks,
Jim
-
@jim-graczyk Switching to dev-branch is as easy as
git checkout dev-branch
(and re-run the installer too, of course). However, reinstalling would put it back into the state where it isn’t working properly again. I don’t expect the current working to remain working for too much longer though. Probably by the end of this week I’ll make it into RC-10.
-
Updated Lab to v52. Can no longer check replication logs.
Fog Configuration / Log Viewer gives me a Files: pulldown that is now empty.
Jim
-
-
@jim-graczyk I’ll fix that later. Sorry that’s a missed thing in jquery call. The logs are still ‘working’, you just can’t view them currently from the GUI.
-
@Jim-Graczyk Try
less /opt/fog/log/fogreplicator.log
on the FOG server command line.
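If the log is long, a small filter can help while the GUI viewer is unavailable. This is only a sketch: the `recent_problems` name is hypothetical, and the search words are a guess, since the exact wording of the replicator's messages is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print recent log lines that look like problems.
# The keywords are assumptions, not the replicator's documented messages.
recent_problems() {
    local log="${1:-/opt/fog/log/fogreplicator.log}" lines="${2:-200}"
    tail -n "$lines" "$log" | grep -iE 'error|fail|cannot'
}
```

With no arguments, `recent_problems` scans the last 200 lines of the log path mentioned above. It prints nothing if no line matches.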
Working should fix the GUI not displaying the logs for you now too.
-
Fantastic …
Updated the lab and replication is working to the Storage Nodes, Group to Group, as it should be AND the Replication Log UI is not only working but provides much more detail.
Please consider this issue Solved…
Just an FYI - since we have a script working in the new system, we may or may not move it to the Working branch. It’ll depend on how much pain our script causes us as we add Locations and Storage Nodes versus the fear of instability that is part and parcel of using the Working branch. Our lab will stay up to date on the Working branch. At this point, we think we can hold out for the Dev release of RC10.
Thanks very much.
Jim
-
I have been keeping an eye on this replication issue myself. I’m running v52 of the working branch. Should I wait until RC10 comes out to update? So far I think replication is functioning normally, but I figured I could wait until RC10.