Fog Replication Not working



  • Running Version 1.4.4
    SVN Revision: 6077
    Ubuntu 14.04 Server

    I currently have FOG up and working at our primary location, set up to replicate to 13 storage nodes. Replication has worked in the past, but it seems to have stopped. Right now I have about 100GB used on my master node and only 70GB used on the rest, with the exception of the newest node, which has not replicated anything. I have looked in the FOG log files and am not finding anything there. Where should I start troubleshooting this issue? All of the nodes appear to be reachable, and FOG is able to see their versions etc., but no replication seems to be happening.

    Thanks for your help on this.


  • Moderator

    @obsidian said in Fog Replication Not working:

    So I am confused as to why the Windows 10 image has not replicated to the ones that are syncing.

    I have no idea. Check the master node’s replication log. It should give a reason.



  • That is what I thought. So I am confused as to why the Windows 10 image has not replicated to the ones that are syncing.


  • Moderator

    @obsidian The ones that are correct stay correct.



  • I will do that today. Would that stop synchronization to the nodes that are correct, or will it just skip the one that is wrong and continue updating the ones that are correct?


  • Moderator

    @obsidian said in Fog Replication Not working:

    Message: ftp_login(): Login incorrect., Host: 10.7.1.16, Username: fog

    That’s the problem. Double-check your passwords. Make sure the password is set correctly inside each storage node’s /opt/fog/.fogsettings file. Re-run the installer on those nodes to correct the problem. The installer will update the node’s credentials in the database too (or should anyway).
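
    Before re-running the installer, one way to test a node's credentials is to attempt the same kind of FTP login the replicator makes. The following is a minimal sketch using Python's standard ftplib module. The function name check_ftp_login and the ftp_factory parameter are illustrative only and are not part of FOG; the password has to be supplied by you (for example, the value stored in that node's /opt/fog/.fogsettings).

```python
import ftplib


def check_ftp_login(host, username, password, timeout=10, ftp_factory=ftplib.FTP):
    """Try one FTP login and return (ok, detail).

    ftp_factory is a hypothetical hook so the function can be exercised
    without a real server; by default it is the standard ftplib.FTP class.
    """
    try:
        ftp = ftp_factory()
        ftp.connect(host, 21, timeout)
        try:
            ftp.login(username, password)
        finally:
            # Always release the socket, whether or not the login worked.
            ftp.close()
    except ftplib.error_perm as exc:
        # 5xx replies such as "530 Login incorrect." end up here.
        return False, "login rejected: %s" % exc
    except OSError as exc:
        # Unreachable host, refused connection, timeout, and similar.
        return False, "connection failed: %s" % exc
    return True, "login accepted"
```

    Run from the master, a call such as check_ftp_login("10.7.1.16", "fog", "<password>") should report a rejected login for as long as the stored password and the node's actual password disagree.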



  • This is from one of the storage nodes:

     vi /opt/fog/log/fogreplicator.log
    [01-15-18 2:39:05 am]  *  | This is not the master node
    [01-15-18 2:49:05 am]  *  | This is not the master node
    [01-15-18 2:59:05 am]  *  | This is not the master node
    [01-15-18 3:09:05 am]  *  | This is not the master node
    [01-15-18 3:19:05 am]  *  | This is not the master node
    [01-15-18 3:29:05 am]  *  | This is not the master node
    [01-15-18 3:39:05 am]  *  | This is not the master node
    [01-15-18 3:49:05 am]  *  | This is not the master node
    [01-15-18 3:59:05 am]  *  | This is not the master node
    [01-15-18 4:09:05 am]  *  | This is not the master node
    [01-15-18 4:19:05 am]  *  | This is not the master node
    [01-15-18 4:29:05 am]  *  | This is not the master node
    [01-15-18 4:39:05 am]  *  | This is not the master node
    [01-15-18 4:49:05 am]  *  | This is not the master node
    [01-15-18 4:59:05 am]  *  | This is not the master node
    [01-15-18 5:09:05 am]  *  | This is not the master node
    [01-15-18 5:19:05 am]  *  | This is not the master node
    [01-15-18 5:29:06 am]  *  | This is not the master node
    [01-15-18 5:39:06 am]  *  | This is not the master node
    [01-15-18 5:49:06 am]  *  | This is not the master node
    [01-15-18 5:59:06 am]  *  | This is not the master node
    [01-15-18 6:09:06 am]  *  | This is not the master node
    [01-15-18 6:19:06 am]  *  | This is not the master node
    [01-15-18 6:29:06 am]  *  | This is not the master node
    [01-15-18 6:39:06 am]  *  | This is not the master node
    [01-15-18 6:49:06 am]  *  | This is not the master node
    [01-15-18 6:59:06 am]  *  | This is not the master node
    [01-15-18 7:09:06 am]  *  | This is not the master node
    [01-15-18 7:19:06 am]  *  | This is not the master node
    [01-15-18 7:29:06 am]  *  | This is not the master node
    [01-15-18 7:39:06 am]  *  | This is not the master node
    [01-15-18 7:49:06 am]  *  | This is not the master node
    [01-15-18 7:59:06 am]  *  | This is not the master node
    [01-15-18 8:09:06 am]  *  | This is not the master node
    [01-15-18 8:19:07 am]  *  | This is not the master node
    [01-15-18 8:29:07 am]  *  | This is not the master node
    [01-15-18 8:39:07 am]  *  | This is not the master node
    [01-15-18 8:49:07 am]  *  | This is not the master node
    [01-15-18 8:59:07 am]  *  | This is not the master node
    [01-15-18 9:09:07 am]  *  | This is not the master node
    [01-15-18 9:19:07 am]  *  | This is not the master node
    [01-15-18 9:29:07 am]  *  | This is not the master node
    [01-15-18 9:39:07 am]  *  | This is not the master node
    [01-15-18 9:49:07 am]  *  | This is not the master node
    [01-15-18 9:59:07 am]  *  | This is not the master node
    [01-15-18 10:09:07 am]  *  | This is not the master node
    [01-15-18 10:19:07 am]  *  | This is not the master node
    [01-15-18 10:29:07 am]  *  | This is not the master node
    [01-15-18 10:39:07 am]  *  | This is not the master node
    [01-15-18 10:49:07 am]  *  | This is not the master node
    [01-15-18 10:59:07 am]  *  | This is not the master node
    [01-15-18 11:09:07 am]  *  | This is not the master node
    [01-15-18 11:19:07 am]  *  | This is not the master node
    [01-15-18 11:29:07 am]  *  | This is not the master node
    [01-15-18 11:39:07 am]  *  | This is not the master node
    [01-15-18 11:49:08 am]  *  | This is not the master node
    [01-15-18 11:59:08 am]  *  | This is not the master node
    [01-15-18 12:09:08 pm]  *  | This is not the master node
    [01-15-18 12:19:08 pm]  *  | This is not the master node
    [01-15-18 12:29:08 pm]  *  | This is not the master node
    [01-15-18 12:39:08 pm]  *  | This is not the master node
    [01-15-18 12:49:08 pm]  *  | This is not the master node
    [01-15-18 12:59:08 pm]  *  | This is not the master node
    

    Here is the log from the master:

    [01-15-18 7:26:43 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:26:44 pm]  | postdownloadscripts: No need to sync fog.postdownload file to Frankfort
    [01-15-18 7:26:44 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Frankfort.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.13.1.16
    [01-15-18 7:26:44 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:26:44 pm] | Replication already running with PID: 3885
    [01-15-18 7:26:48 pm]  * Type: 2, File: /var/www/html/fog/lib/fog/fogftp.class.php, Line: 463, Message: ftp_login(): Login incorrect., Host: 10.7.1.16, Username: fog
    [01-15-18 7:36:39 pm]  * Starting Image Replication.
    [01-15-18 7:36:39 pm]  * We are group ID: 1. We are group name: default
    [01-15-18 7:36:40 pm]  * We are node ID: 1. We are node name: Summit
    [01-15-18 7:36:40 pm]  * Attempting to perform Group -> Group image replication.
    [01-15-18 7:36:40 pm]  | Replicating postdownloadscripts
    [01-15-18 7:36:40 pm]  * Found Image to transfer to 11 nodes
    [01-15-18 7:36:40 pm]  | File Name: postdownloadscripts
    [01-15-18 7:36:40 pm]  | 236 0 /images/postdownloadscripts/fog.postdownload
    [01-15-18 7:36:40 pm]  | Files do not match.
    [01-15-18 7:36:40 pm]  * Deleting remote file: /images/postdownloadscripts/
    [01-15-18 7:36:41 pm]  * Starting Sync Actions
    [01-15-18 7:36:41 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Alsip.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.6.1.16
    [01-15-18 7:36:41 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:36:42 pm]  | postdownloadscripts: No need to sync fog.postdownload file to Byron Center
    [01-15-18 7:36:42 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Byron Center.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.1.1.16
    [01-15-18 7:36:42 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:36:44 pm]  | postdownloadscripts: No need to sync fog.postdownload file to Frankfort
    [01-15-18 7:36:44 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Frankfort.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.13.1.16
    [01-15-18 7:36:44 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:36:44 pm] | Replication already running with PID: 3885
    [01-15-18 7:36:47 pm]  * Type: 2, File: /var/www/html/fog/lib/fog/fogftp.class.php, Line: 463, Message: ftp_login(): Login incorrect., Host: 10.7.1.16, Username: fog
    [01-15-18 7:46:40 pm]  * Starting Image Replication.
    [01-15-18 7:46:40 pm]  * We are group ID: 1. We are group name: default
    [01-15-18 7:46:40 pm]  * We are node ID: 1. We are node name: Summit
    [01-15-18 7:46:40 pm]  * Attempting to perform Group -> Group image replication.
    [01-15-18 7:46:40 pm]  | Replicating postdownloadscripts
    [01-15-18 7:46:40 pm]  * Found Image to transfer to 11 nodes
    [01-15-18 7:46:40 pm]  | File Name: postdownloadscripts
    [01-15-18 7:46:41 pm]  | 236 0 /images/postdownloadscripts/fog.postdownload
    [01-15-18 7:46:41 pm]  | Files do not match.
    [01-15-18 7:46:41 pm]  * Deleting remote file: /images/postdownloadscripts/
    [01-15-18 7:46:41 pm]  * Starting Sync Actions
    [01-15-18 7:46:41 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Alsip.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.6.1.16
    [01-15-18 7:46:41 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:46:42 pm]  | postdownloadscripts: No need to sync fog.postdownload file to Byron Center
    [01-15-18 7:46:42 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Byron Center.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.1.1.16
    [01-15-18 7:46:42 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:46:44 pm]  | postdownloadscripts: No need to sync fog.postdownload file to Frankfort
    [01-15-18 7:46:44 pm]  | CMD:
                            lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.Frankfort.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:3840000;set net:limit-rate 0:512000; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.13.1.16
    [01-15-18 7:46:44 pm]  * Started sync for Image postdownloadscripts
    [01-15-18 7:46:44 pm] | Replication already running with PID: 3885
    [01-15-18 7:46:48 pm]  * Type: 2, File: /var/www/html/fog/lib/fog/fogftp.class.php, Line: 463, Message: ftp_login(): Login incorrect., Host: 10.7.1.16, Username: fog
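
    To see at a glance which nodes are rejecting the login, the error lines in a log like the one above can be tallied per host. This is a rough sketch that assumes the error lines keep the "Message: ..., Host: ..., Username: ..." layout shown in this log; summarize_ftp_errors is a made-up helper name, not a FOG tool.

```python
import re
from collections import Counter

# Matches the tail of replicator error lines such as:
#   ... Message: ftp_login(): Login incorrect., Host: 10.7.1.16, Username: fog
ERROR_RE = re.compile(
    r"Message: (?P<message>.+?), Host: (?P<host>[^,\s]+), Username: (?P<user>\S+)"
)


def summarize_ftp_errors(lines):
    """Count error lines per (host, username, message) combination."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            key = (match.group("host"), match.group("user"), match.group("message"))
            counts[key] += 1
    return counts
```

    Feeding it the lines of /opt/fog/log/fogreplicator.log from the master gives one entry per failing host, which makes it easier to tell a single bad password apart from a wider problem.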
    

  • Moderator

    /opt/fog/log/fogreplicator.log I think. Last 100 or so lines should be enough.
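
    If the log is large, grabbing just the end of it can be scripted. A small sketch, assuming the log is a plain text file at the path mentioned above; tail_lines is a hypothetical helper name:

```python
from collections import deque


def tail_lines(path, count=100):
    """Return the last `count` lines of a text file, without newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the most recent `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

    Calling tail_lines("/opt/fog/log/fogreplicator.log") on the master and pasting the result here would cover the "last 100 or so lines".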


  • Developer

    @Obsidian So do you still need help on this? If so, please provide more information like log files.



  • Okay, so from what I can tell, most sites (all but maybe one) didn’t replicate all or most of the Windows 7 images. The one site that didn’t replicate at all is, I think, a password issue, so I am not overly concerned about it at this time. However, I recaptured my Windows 10 image last week and it has still not replicated. I looked at the sync logs on the storage nodes, and the only thing showing in them is that it is not the master node. :-(


  • Developer

    @Obsidian Great to hear you got it all back together with the help of Wayne! :-)

    Please give us a quick update if it all works on Friday so we can mark this solved.



  • Okay, I replaced the plugin.class.php file as suggested and that fixed the issue. I have added all of the sites back in, and so far they are all talking and I believe all are replicating at this time. I will keep an eye on it this week and see how it goes.



  • Thanks, guys, for all of the hard work that you do on this great program, and especially to Wayne for taking time out to work on this with me last night!! I am going to try to finish what we started this week, as well as the solution in the link from Sebastian, and will let you know how it goes.


  • Developer

    Probably George’s post here: https://forums.fogproject.org/post/100004


  • Senior Developer

    @wayne-workman That is already known about, and there’s a workaround for it somewhere on the forums. It has to do with replacing the plugin.class.php file.


  • Moderator

    I helped a bit with this remotely. I couldn’t figure out what was wrong with the DB in the time I had, so we opted to drop the database and re-run the installer. We then re-created one of the storage nodes and replication is working, so it was something to do with the database.

    @Developers I did find an issue with creating a new location for the first time in the 1.4.4 release. It appears to be some sort of GUI bug that won’t let you create a new location. I’m going to confirm this when I get some time, and if it’s a thing, I’ll make a bug thread (unless it’s already known about?).


  • Senior Developer

    @obsidian Do you have a “master” node set up for 10.16.1.16?



  • Wayne,

    Were you able to find more on this? I am still getting the same message, and I have been unable to find any other settings where I might enable this. Here is a screenshot of the settings you suggested I check.

    0_1515541401142_d6fce813-756c-48be-a584-f62e9ac984f1-image.png


  • Moderator

    @obsidian That’s not the only one. Off the top of my head, I remember each storage node having one, and each image having one. There might be others too.



  • I found that setting via the following link. I checked, and all are checkmarked (none had ever been changed).

    https://github.com/FOGProject/fogproject/issues/160

    FOG Configuration Page->FOG Settings->FOG Linux Service Enabled

