Replication is not working
-
Server
- FOG Version: 1.3.0-RC-22
- OS: Ubuntu 16.04.1 LTS
Client
- Service Version: not in use
- OS:
Description
I have replication set up the way I thought it should work, but nothing is being synced.
In Storage Node Management, both sides have identical settings, except that the master has “Is Master Node” checked.
The Master node logs show
[12-08-16 8:46:32 pm] * Starting Image Replication.
[12-08-16 8:46:32 pm] * We are group ID: 1. We are group name: default
[12-08-16 8:46:32 pm] * We are node ID: 1. We are node name: SyncSRO
[12-08-16 8:46:32 pm] * Attempting to perform Group -> Group image replication.
[12-08-16 8:46:32 pm] * Not syncing Image between groups
[12-08-16 8:46:32 pm] | Image Name: W7E-HP600G2
[12-08-16 8:46:32 pm] | There are no other members to sync to.
[12-08-16 8:46:32 pm] * Not syncing Image between groups
[12-08-16 8:46:32 pm] | Image Name: W7E-OS-Install
[12-08-16 8:46:32 pm] | There are no other members to sync to.
...
and the slave node says
[12-08-16 8:23:38 pm] * | This is not the master node
[12-08-16 8:33:38 pm] * | This is not the master node
I have re-entered all of the settings, verified the passwords match, etc. I cannot seem to get this to work and there doesn’t seem to be a post/question with an example on how to set it up correctly.
Anyone mind showing me what they have setup? Or suggestions on how to fix mine. Thanks in advance.
-
First, as usual, I’d say please try installing the latest RC (currently 30).
Second, just knowing there’s a storage node isn’t enough. Are there multiple storage groups, or do both “nodes” exist in the same group?
-
@Tom-Elliott I think what you said made me realize how the Replication works.
Is it that the Master server has multiple nodes set up under the default group and pushes the updates to them?
I was thinking I had to set the slaves to pull data from the master, so both sides needed to be set up: the Master acting as a master and accepting connections, and the slaves needing the Master’s IP to pull data.
I had the Master set up with one node, and the slave set up with the Master’s IP as if it were going to pull data.
From what I gathered: the Master needs to have a node defined for each slave, and the slave servers need to have the master set up so the connection is allowed in?
-
Nodes are individual “things”.
The IP of a node refers to the server/node itself, not the master.
The Master is its own “thing” too.
Yes, replication works from the “top” down. Not the other way around.
-
In simpler terms:
Masters replicate their own contents to the other “subordinates”.
-
@Tom-Elliott Thank you for the clarification. I did run into an error though.
[12-08-16 11:30:22 pm] * Found Image to transfer to 1 node
[12-08-16 11:30:22 pm] | Image Name: W7P-HP6300
[12-08-16 11:30:23 pm] | W7P-HP6300: No need to sync d1.fixed_size_partitions file to SyncMHB
[12-08-16 11:30:23 pm] | W7P-HP6300: No need to sync d1.mbr file to SyncMHB
[12-08-16 11:30:23 pm] | W7P-HP6300: No need to sync d1.minimum.partitions file to SyncMHB
[12-08-16 11:30:23 pm] | W7P-HP6300: No need to sync d1.original.fstypes file to SyncMHB
[12-08-16 11:30:24 pm] | W7P-HP6300: No need to sync d1.original.swapuuids file to SyncMHB
[12-08-16 11:30:24 pm] | W7P-HP6300: No need to sync d1.partitions file to SyncMHB
[12-08-16 11:30:24 pm] | W7P-HP6300: No need to sync d1p1.img file to SyncMHB
[12-08-16 11:30:24 pm] | W7P-HP6300: No need to sync d1p2.img file to SyncMHB
[12-08-16 11:30:24 pm] * Found Image to transfer to 1 node
[12-08-16 11:30:24 pm] | Image Name: W7P-T460s
[12-08-16 11:30:26 pm] * Starting Sync Actions
[12-08-16 11:30:26 pm] | CMD: lftp -e 'set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; set net:limit-total-rate 0:256000;set net:limit-rate 0:256000; mirror -c -R --ignore-time -vvv --exclude "dev/" --exclude "ssl/" --exclude "CA" --delete-first '/images/W7PT460s' \''/images/W7PT460s'\'; exit' -u fog,[Protected] 10.63.xxx.xxx
[12-08-16 11:30:26 pm] * Started sync for Image W7P-T460s
mirror: Access failed: 550 Failed to change directory. (/images/W7PT460s)
Are there specific permissions the images folder needs? I have just left it at the default.
-
@sbenson said in Replication is not working:
mirror: Access failed: 550 Failed to change directory. (/images/W7PT460s)
Typically that means the fog user has no home folder. What is the output of
ls -la /home
Also, image permissions should be fog:root and 777. This fixes all of them:
chown -R fog:root /images
chmod -R 777 /images
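Before changing anything recursively, it can help to see what the current owner and mode actually are. This is only a minimal sketch, assuming GNU stat (as shipped on Ubuntu); the helper name show_perms is made up for illustration, and /home/fog is simply the usual home location for a user named fog.

```shell
#!/bin/sh
# show_perms is a hypothetical helper: prints "owner:group mode path" per argument.
# Assumes GNU coreutils stat (the -c format option), as found on Ubuntu.
show_perms() {
    for path in "$@"; do
        stat -c '%U:%G %a %n' "$path"
    done
}

# Inspect the image store and the fog user's home directory, if they exist.
for dir in /images /home/fog; do
    if [ -e "$dir" ]; then
        show_perms "$dir"
    fi
done
```

If the advice above has been applied, /images should be reported as fog:root with mode 777.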
-
@Wayne-Workman This worked. Just changing the parent /images folder to fog:root instead of root:root was enough. Thanks.
Is there a way to only enable replication between certain times, e.g. 11pm-6am?
The only thing I can think of is running a cron job that stops and starts the service at those times. But that seems a little messy.
-
@sbenson Maybe the FOG setting, FOG Linux Service Sleep times, can be helpful. You could set it to 24 hours. Then you only need a single cron job to restart the service at, for example, 11 PM.
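A rough sketch of what that single cron job could look like. It assumes the replication service is the systemd unit FOGImageReplicator (check the unit name on your own install), and cron_entry is a made-up helper that only prints the line; nothing is installed until you write it somewhere yourself.

```shell
#!/bin/sh
# cron_entry is a hypothetical helper: prints a cron.d-style line that restarts
# a service once a day at the given hour (0-23), run as root.
# The default service name FOGImageReplicator is an assumption.
cron_entry() {
    hour="$1"
    service="${2:-FOGImageReplicator}"
    printf '0 %s * * * root systemctl restart %s\n' "$hour" "$service"
}

# Print the entry for an 11 PM restart.
cron_entry 23
```

To use it, the printed line could be saved into a file under /etc/cron.d (for instance a file named fog-replication, which is just an example name).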
-
@Quazz Thanks, this will work. I am setting it to 87000 so it is just over 24 hrs; that way I can unthrottle it and have the service start at 2am. Each image takes about 30 mins, so I cannot imagine this being a problem at 7am when people start showing up. Thanks!
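For anyone checking the 87000 figure, here is a quick bit of shell arithmetic, assuming the sleep time setting is expressed in seconds, as the value above implies:

```shell
#!/bin/sh
# How far past 24 hours is a sleep time of 87000 seconds?
sleep_seconds=87000
day_seconds=$((24 * 60 * 60))            # 86400
extra_seconds=$((sleep_seconds - day_seconds))
echo "$((extra_seconds / 60)) minutes over 24 hours"
```

This prints "10 minutes over 24 hours", so the service's own timer should not fire again before the daily restart does.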