Replication oddity after moving Master node


  • Hi all, I logged an issue on this link the other day re imaging failing after migrating our master server to a new location/IP via VMware VtoV.

    Imaging is sorted now (thanks to @Sebastian-Roth for the assist on that), but I’m now facing a more peculiar issue.

    We have a multi office setup.

    Data Center with Master (DCMaster) and Storage (DCStorage) nodes.
    DCMaster is set to Storage group 1 to contain all images.
    DCStorage is set to Storage group 2 to contain only the images to be replicated to shops, which have less storage capacity.

    We have a main office set up with a storage node (MainStorage), which is set to Storage group 1.

    And several small shops, also with storage nodes (ShopStorage1, ShopStorage2 etc.). These all use Storage group 2.

    We have an odd issue where replication seems to be overwriting existing images somehow. We’ve made changes to an image and captured it (which should go to DCMaster and be replicated out from there), but something is overwriting that image.

    To test, I’ve removed all references to a particular pre-sysprep image, initiated a capture again, and can see the MAC address for the capture hitting /mnt/images/dev on DCMaster.

    Before that image has completed capturing, however, an image with that name appears in /mnt/images on MainStorage.

    I’ve double-checked that MainStorage is set as a storage node (it is) and that it points to DCMaster via its snmysqlhost entry in /opt/fog/.fogsettings.
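
For anyone wanting to repeat the snmysqlhost check on several nodes, here is a minimal sketch. It assumes the settings file holds shell-style `key='value'` lines; that layout is an assumption and is not shown in the thread. `read_fogsettings` is a hypothetical helper, not part of FOG.

```python
import shlex
from pathlib import Path


def read_fogsettings(path="/opt/fog/.fogsettings"):
    """Parse shell-style key='value' lines into a dict (assumed file layout)."""
    settings = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex.split removes surrounding quotes the way a shell would.
        parts = shlex.split(value)
        settings[key.strip()] = parts[0] if parts else ""
    return settings
```

Calling `read_fogsettings().get("snmysqlhost")` on each storage node and comparing the result with the master's current IP would show quickly whether any node still points at the old address.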

    In the fogreplicator.log on MainStorage I’m seeing the following:

    [03-29-21 9:11:33 am]  *  | This is not the master node
    [03-29-21 9:21:33 am]  *  | This is not the master node
    [03-29-21 9:31:32 am]  *  * Image replication is globally disabled
    [03-29-21 9:32:32 am] FOGService: ImageReplicator - Waiting for mysql to be available
    last line repeated....
    

    Is there anywhere else I can check to determine where this image is replicating from???

    regards Tom
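
When the replicator log gets long, it can help to count how often each message occurs. The sketch below parses lines shaped like the ones quoted above. The timestamp format is inferred from those few lines only, and `parse_line` and `summarise` are hypothetical helper names.

```python
import re
from collections import Counter
from datetime import datetime

# "[03-29-21 9:11:33 am]" followed by the message text.
LINE_RE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s*(?P<message>.*)$")


def parse_line(line):
    """Return (timestamp, message), or None for lines without a timestamp."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    # Assumed month-day-year order with a 12-hour clock, as in the sample.
    stamp = datetime.strptime(match.group("stamp"), "%m-%d-%y %I:%M:%S %p")
    # Drop the "*" and "|" decoration that precedes some messages.
    message = match.group("message").lstrip("*| ").strip()
    return stamp, message


def summarise(lines):
    """Count how often each distinct message occurs."""
    return Counter(parsed[1] for parsed in map(parse_line, lines) if parsed)


SAMPLE = [
    "[03-29-21 9:11:33 am]  *  | This is not the master node",
    "[03-29-21 9:21:33 am]  *  | This is not the master node",
    "[03-29-21 9:31:32 am]  *  * Image replication is globally disabled",
    "[03-29-21 9:32:32 am] FOGService: ImageReplicator - Waiting for mysql to be available",
]

for message, count in summarise(SAMPLE).most_common():
    print(count, message)
```

The point at which the message changes (here between 9:21 and 9:31) is often the interesting part, since it marks when the node's view of its own role or of the database changed.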

  • Senior Developer

    @sebastian-roth I’m not aware of anything special with this. This doesn’t make much sense to me. We don’t use SSH at all for replicating data; LFTP is used. So maybe SELinux was causing issues that rsync corrected for? I don’t know otherwise.

  • Senior Developer

    @kiweegie said in Replication oddity after moving Master node:

    Wondering if there needs to be some syncing of SSH keys or similar to permit the servers to talk to each other and that’s not being handled automatically? Just a theory…

    Not that I know of! Sounds strange that you need to manually sync one before it does the other ones for you.

    @Tom-Elliott Would you have an idea?


  • So I just added 2 more storage nodes to the equation. On both, replication didn’t kick in until I’d initiated a manual rsync from Master to the storage node. After doing that (again for one image only), the other 3 images all synced OK.

    Wondering if there needs to be some syncing of SSH keys or similar to permit the servers to talk to each other and that’s not being handled automatically? Just a theory…

    regards Tom


  • After rsyncing one of the images from DCMASTER to DCSTORAGE, both are now showing up on the storage node… and in turn seem to be in the process of replicating to the other storage nodes.

    I’ll need to double-check all of them once the replication process has finished to see if they have the same file sizes etc.

    regards Tom
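
The file-size check mentioned above could be scripted along these lines. This is a minimal sketch that assumes both image directories are reachable as local paths (for example with the storage node's image directory mounted on the master), which is not something the thread describes. `tree_sizes` and `compare_trees` are hypothetical helpers.

```python
import os


def tree_sizes(root):
    """Map each file's path relative to root to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def compare_trees(source_root, replica_root):
    """Return (files missing from the replica, files whose sizes differ)."""
    source = tree_sizes(source_root)
    replica = tree_sizes(replica_root)
    missing = sorted(set(source) - set(replica))
    mismatched = sorted(
        path for path in source.keys() & replica.keys()
        if source[path] != replica[path]
    )
    return missing, mismatched
```

Matching sizes do not prove the copies are identical, but a mismatch or a missing file is a cheap first signal before resorting to checksums.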


  • @sebastian-roth You’re quite right, I’ve just re-read that myself and it’s confusing… I was using non-real names for reasons of privacy.

    This image should hopefully help show the layout a little more clearly.

    [Image: 2021-04-01_10h26_34.png]

    DCMASTER is set as Master node for Storage Group 1
    DCSTORAGE is set as Master node for Storage Group 2

    MAINSTORAGE is a storage node on Storage Group 1
    All the SHOPSTORAGE nodes are storage nodes on Storage Group 2

    Image capture to DCMASTER is working fine. I can see the MAC address of the machine being imaged hitting /mnt/images/dev as the image itself is uploading, so that part seems fine.

    Images in question have been assigned to both Storage Group 1 and Storage Group 2 with Storage Group 2 set as primary. The goal is to have DCMASTER host all images and sync all of them to all storage nodes via membership to both Storage groups.

    2 issues being faced:

    1. MAINSTORAGE is showing the PreSysprep and PostSysprep image folder structure before the image has fully uploaded and appeared on DCMASTER.

    2. None of the SHOPSTORAGE nodes are getting the images replicated. I can try rsyncing the images over to DCSTORAGE manually so they sync in turn to the other storage nodes, but I was looking for this to happen automatically.

    Am I correct in stating that if MAINSTORAGE is replicating from DCMASTER then the folder names and sizes should be identical? Should the structure show up on MAINSTORAGE before it appears on DCMASTER?

    We’re using the location plugin in case that has any impact (plus LDAP, WOLBroadcast plugins).

    If you need more information than above let me know

    regards Tom
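
The layout in this post can be written down as a small model to make the mismatch explicit. This is a sketch only: it assumes an image should end up on every node of each storage group it is assigned to, which is the stated goal in the post rather than a description of how FOG schedules replication. SHOPSTORAGE1 and SHOPSTORAGE2 stand in for the shop nodes, and the function names are hypothetical.

```python
# Node -> storage group, as described in the post.
NODE_GROUPS = {
    "DCMASTER": 1,
    "MAINSTORAGE": 1,
    "DCSTORAGE": 2,
    "SHOPSTORAGE1": 2,
    "SHOPSTORAGE2": 2,
}


def expected_nodes(image_groups):
    """Nodes that should hold an image assigned to the given groups (assumed rule)."""
    return sorted(n for n, g in NODE_GROUPS.items() if g in image_groups)


def unexpected_copies(image_groups, observed_nodes):
    """Nodes where the image was seen but is not expected."""
    return sorted(set(observed_nodes) - set(expected_nodes(image_groups)))


def missing_copies(image_groups, observed_nodes):
    """Nodes where the image is expected but was not seen."""
    return sorted(set(expected_nodes(image_groups)) - set(observed_nodes))


# Image assigned to groups 1 and 2, currently seen on DCMASTER and MAINSTORAGE.
observed = ["DCMASTER", "MAINSTORAGE"]
print("missing:", missing_copies({1, 2}, observed))
print("unexpected:", unexpected_copies({1, 2}, observed))
```

Under that assumption, every Group 2 node is missing the image, which points at the hop into Group 2 (via DCSTORAGE as its master) rather than at the individual shop nodes.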

  • Senior Developer

    @Kiweegie You might draw a picture of your setup to make this clearer. At first sight I have no idea where the image “appearing out of the blue” might come from.


  • Adding to the above, I’ve just checked DCStorage and all the ShopStorage nodes, and that image is NOT replicating to those nodes.

    The image has only Storage Group 2 set under Image Management.

    DCStorage is set as the Master for Storage Group 2. I’m guessing I may need to add that storage group to the image so it has both storage groups assigned?

    That would perhaps explain why the replication is not happening to other nodes, but I’m still baffled as to why the image is suddenly appearing out of nowhere on MainStorage.

    regards Tom
