RC8 - Snapin not replicating after edit
-
After editing a snapin by uploading a new snapin file with the same name as the old file, the new file does not get replicated to the other storage nodes. They keep the old file, which will no longer deploy since the hashes differ.
Updating a snapin with a file that has the same name as the previous one happens a lot, e.g. with altered cmd scripts or newer versions of msi installers.
-
What is the output of
ps -aux | grep FOGSnapinReplicator
and what is the setting below set to?
Web interface -> FOG Configuration -> FOG Settings -> FOG Service Sleep Times -> SNAPINREPSLEEPTIME
The Snapin Rep Sleep Time setting is the interval between snapin replication cycles.
Also, if the service is running and the snapin rep sleep time is set to something reasonable, this could point to incorrect FTP credentials, firewall issues, SELinux issues, or even a lack of free space on the nodes.
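A few of these checks can be scripted. The following is only a rough sketch, not an official FOG tool: the function names are made up for illustration, the snapin path is taken from the lftp commands in this thread, and `getenforce` is assumed to be present only on SELinux systems.

```shell
#!/bin/sh
# Rough checks for the possible causes listed above.
# Function names are hypothetical; the path comes from the lftp commands in this thread.
SNAPIN_DIR="${SNAPIN_DIR:-/opt/fog/snapins}"

# Is the replicator process running? The [F] bracket keeps grep from matching itself.
check_replicator() {
    if ps aux 2>/dev/null | grep '[F]OGSnapinReplicator' >/dev/null; then
        echo "replicator: running"
    else
        echo "replicator: NOT running"
    fi
}

# Free space on the filesystem holding the snapin directory, if it exists on this host.
check_space() {
    if [ -d "$SNAPIN_DIR" ]; then
        df -h "$SNAPIN_DIR"
    else
        echo "space: $SNAPIN_DIR not found on this host"
    fi
}

# SELinux mode, if the getenforce tool is installed.
check_selinux() {
    if command -v getenforce >/dev/null 2>&1; then
        echo "selinux: $(getenforce)"
    else
        echo "selinux: getenforce not installed"
    fi
}

check_replicator
check_space
check_selinux

# FTP credentials can be tested by hand (HOST and PASSWORD are placeholders):
#   lftp -u fog,PASSWORD -e 'ls; exit' HOST
```

Run it on the sending server and on each storage node; a failed FTP login or a full disk on a node would explain files that never arrive.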
-
I should have been more clear. Replication works fine once a new snapin file is uploaded (existing snapin with new filename, or new snapin), just not when replacing a file with another file that has the same name.
-
@bluenix Can you upload a modified snapin file, restart the FOGSnapinReplicator service, and then post the entries for this modified file from the snapin replication log? The log is here:
Web Interface -> FOG Configuration -> Log Viewer -> Snapin replication log
The log entries have timestamps, and you can sort them so the newest appear at the top. Once the file you changed has been processed, you can press pause so that you can copy and paste the entries.
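If you have shell access to the log file instead, a small filter can pull out the relevant entries. This is just a sketch: `snapin_log_lines` is a made-up helper, and the log path in the usage line is a placeholder, not a confirmed location.

```shell
# snapin_log_lines NAME FILE: read log text on stdin and print only the lines
# that mention the snapin name or its file name (fixed-string match, no regex).
snapin_log_lines() {
    grep -F -e "$1" -e "$2"
}

# Usage (path is a placeholder for wherever your replication log lives):
#   snapin_log_lines testsnapin test.cmd < /path/to/snapin-replication.log
```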
-
This is after updating a snapin with a new file that has the same name as the old one and restarting the FOGSnapinReplicator service:
[08-13-16 11:30:45 am] * Found Snapin to transfer to 3 node(s)
[08-13-16 11:30:45 am] | Snapin name: testsnapin
[08-13-16 11:30:46 am] * Starting Sync Actions
[08-13-16 11:30:46 am] | CMD: lftp -e 'set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-rate 0:1280000; mirror -c -R -i test.cmd --ignore-time -vvv --exclude 'dev/' --exclude 'ssl/' --exclude 'CA/' --delete-first /opt/fog/snapins /opt/fog/snapins; exit' -u fog,[Protected] 172.17.0.5
[08-13-16 11:30:46 am] * Started sync for Snapin testsnapin
[08-13-16 11:30:47 am] | CMD: lftp -e 'set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; mirror -c -R -i test.cmd --ignore-time -vvv --exclude 'dev/' --exclude 'ssl/' --exclude 'CA/' --delete-first /opt/fog/snapins /opt/fog/snapins; exit' -u fog,[Protected] 172.18.0.5
[08-13-16 11:30:47 am] * Started sync for Snapin testsnapin
-
@bluenix And if you run this command manually, what happens?
lftp -e 'set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; mirror -c -R -i test.cmd --ignore-time -vvv --exclude 'dev/' --exclude 'ssl/' --exclude 'CA/' --delete-first /opt/fog/snapins /opt/fog/snapins; exit' -u fog,[Protected] 172.18.0.5
Replace the [Protected] part with the storage node’s password (the FTP password for that node).
-
@Wayne-Workman said in RC8 - Snapin not replicating after edit:
lftp -e 'set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; mirror -c -R -i test.cmd --ignore-time -vvv --exclude 'dev/' --exclude 'ssl/' --exclude 'CA/' --delete-first /opt/fog/snapins /opt/fog/snapins; exit' -u fog,[Protected] 172.18.0.5
The only output is:
Total: 1 directory, 2 files, 0 symlinks
And the file does not get updated at 172.18.0.5
-
If they’re already the same file, they won’t be updated. Can you get a checksum of the file on both sides? The main server and the main node?
-
Main server sha512sum:
cb872de2b8d2509c54344435ce9cb43b4faa27f97d486ff4de35af03e4919fb4ec53267caf8def06ef177d69fe0abab3c12fbdc2f267d895fd07c36a62bff4bf
Storage node(s) sha512sum:
b16ed7d24b3ecbd4164dcdad374e08c0ab7518aa07f9d3683f34c2b3c67a15830268cb4a56c1ff6f54c8e54a795f5b87c08668b51f82d0093f7baee7d2981181
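For anyone wanting to repeat this comparison, here is a minimal sketch. `same_sha512` is a hypothetical helper, and it compares two files that are both readable locally; for a real storage node you would first fetch the file or run `sha512sum` on the node itself (e.g. over SSH, assuming you have that access).

```shell
# same_sha512 FILE_A FILE_B: succeed only if both files have the same SHA-512 hash.
# Returns 2 if either file cannot be hashed.
same_sha512() {
    a=$(sha512sum "$1") || return 2
    b=$(sha512sum "$2") || return 2
    # sha512sum prints "<hash>  <name>"; keep only the hash part.
    [ "${a%% *}" = "${b%% *}" ]
}

# Usage (file names are placeholders):
#   same_sha512 /opt/fog/snapins/test.cmd /tmp/test.cmd.from-node \
#       && echo "in sync" || echo "copies differ"
```

Differing hashes, as above, confirm that the node still holds the old file.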
-
@bluenix I found the issue. It wasn’t checking if the file existed. This by itself is fine, but if the files didn’t match, the file should’ve been deleted. LFTP has a bug in it that doesn’t allow the deletion of the file, so I have to check each file first and forcibly delete it.
I’ve fixed this, and the fix will be available in RC-9.
In the meantime, can you delete the remote node’s snapin file and restart the sender node’s file?
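To illustrate the check-then-delete idea, here is a minimal sketch. It is not the actual RC-9 change: `prune_stale` is a made-up name, and two local directories stand in for the sending server and a storage node (a real node would be reached over FTP or SSH).

```shell
# prune_stale SRC_DIR DST_DIR: delete files in DST_DIR whose content differs from
# the same-named file in SRC_DIR, so the next mirror run uploads them again.
# Hypothetical helper; both directories are local stand-ins for server and node.
prune_stale() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        [ -f "$dst/$name" ] || continue      # no old copy on the node, nothing to delete
        # Hash via stdin so both outputs have the same form ("<hash>  -").
        h_src=$(sha512sum < "$f") || continue
        h_dst=$(sha512sum < "$dst/$name") || continue
        if [ "$h_src" != "$h_dst" ]; then
            rm -f -- "$dst/$name"
            echo "removed stale copy: $name"
        fi
    done
}

# Usage (directories are placeholders):
#   prune_stale /opt/fog/snapins /mnt/node-snapins
```

Files that already match are left alone, so only changed snapins get transferred again.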
-
@Tom-Elliott Great!