Server
- FOG Version: v1.5.0 RC-8 v24 w/ Location Plugin
- OS: CentOS 7
Description
We have 2 storage nodes running the same FOG version and OS as above, 3 storage groups, and 3 locations. Each location has 1 storage node in 1 storage group.
Snapin and image replication has been stopped since the Aug 26th working-branch update. We’ve removed and re-connected the storage nodes, and we’ve removed, reinstalled and reconfigured the Location Plugin, but the problem persists.
All images and all snapins are configured to replicate to all storage groups. The main FOG server is the primary for all images and snapins. On the dashboard, Storage nodes are shown to be online and Storage Groups report as expected.
The Image Replicator log from the main FOG server (the primary for all images) shows two kinds of issue in a single pass:
First, it shows images failing to replicate because the storage nodes are reported as offline (but they aren’t):
[09-07-17 11:23:23 am] * Starting Image Replication.
[09-07-17 11:23:23 am] * We are group ID: 1. We are group name: default
[09-07-17 11:23:23 am] * We are node ID: 1. We are node name: DefaultMember
[09-07-17 11:23:23 am] * Attempting to perform Group -> Group image replication.
[09-07-17 11:23:23 am] | Replicating postdownloadscripts
[09-07-17 11:23:23 am] * Not syncing Image between nodes
[09-07-17 11:23:23 am] | Image Name:
[09-07-17 11:23:23 am] | There are no other members to sync to.
[09-07-17 11:23:23 am] | Replicating postinitscripts
[09-07-17 11:23:23 am] * Not syncing Image between nodes
[09-07-17 11:23:23 am] | Image Name:
[09-07-17 11:23:23 am] | There are no other members to sync to.
[09-07-17 11:23:23 am] * Found Image to transfer to 3 groups
[09-07-17 11:23:23 am] | Image Name: W10Prox64BIOSSysprep
[09-07-17 11:23:23 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:23 am] sal2fogsnl01 Server does not appear to be online.
[09-07-17 11:23:23 am] * Found Image to transfer to 3 groups
[09-07-17 11:23:23 am] | Image Name: W7ProSp1x32ReamDrivers
[09-07-17 11:23:23 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:23 am] sal2fogsnl01 Server does not appear to be online.
[09-07-17 11:23:23 am] * Found Image to transfer to 3 groups
[09-07-17 11:23:23 am] | Image Name: W7ProSP1x64ReArmDrivers
[09-07-17 11:23:23 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:23 am] sal2fogsnl01 Server does not appear to be online.
.
.
.
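Since the dashboard and the replicator disagree about whether the nodes are online, one cross-check is a plain TCP probe from the main server to each node. The sketch below is only that: the hostnames come from the log above, but the port list (21 for FTP, 80 for HTTP) is our assumption about which services matter here, and `port_open` and `report` are illustrative helper names, not part of FOG.

```python
import socket

# Hostnames as they appear in the replicator log.
NODES = ["roa1fogsnl01", "sal2fogsnl01"]
# Assumption: FTP (21) and HTTP (80) are the services worth probing.
PORTS = [21, 80]


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def report(nodes, ports, timeout=3.0):
    """Return one human-readable line per host/port pair."""
    lines = []
    for host in nodes:
        for port in ports:
            state = "reachable" if port_open(host, port, timeout) else "NOT reachable"
            lines.append(f"{host}:{port} {state}")
    return lines
```

Running `print("\n".join(report(NODES, PORTS)))` on the main FOG server would show whether the nodes are reachable at the network level. It says nothing about how FOG itself decides that a node is online.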
Second, it reports that there are no other members to sync to, as if some images were not configured to replicate:
[09-07-17 11:23:23 am] | Image Name: Win7ProSP1x64DriversRearm
[09-07-17 11:23:23 am] | There are no other members to sync to.
[09-07-17 11:23:23 am] * Attempting to perform Group -> Nodes image replication.
[09-07-17 11:23:23 am] * Not syncing Image between nodes
[09-07-17 11:23:23 am] | Image Name: W10Prox64BIOSSysprep
[09-07-17 11:23:23 am] | There are no other members to sync to.
[09-07-17 11:23:23 am] * Not syncing Image between nodes
[09-07-17 11:23:23 am] | Image Name: W7ProSp1x32ReamDrivers
[09-07-17 11:23:23 am] | There are no other members to sync to.
[09-07-17 11:23:23 am] * Not syncing Image between nodes
[09-07-17 11:23:23 am] | Image Name: W7ProSP1x64ReArmDrivers
[09-07-17 11:23:23 am] | There are no other members to sync to.
[09-07-17 11:23:23 am] * Not syncing Image between nodes
.
.
.
Note that some of the images are listed twice in one replication pass.
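In the image excerpts above, each repeated name shows up once under the "Group -> Group" header and once under the "Group -> Nodes" header, so the double listing may just reflect two phases of one pass rather than a fault. To check that against the full log instead of by eye, a small parser can count names per phase. This is a rough sketch that assumes the line formats shown above; `names_by_phase` is a hypothetical helper, not a FOG tool.

```python
import re
from collections import Counter

# Phase headers, e.g.
# "* Attempting to perform Group -> Nodes image replication."
PHASE_RE = re.compile(
    r"Attempting to perform (Group -> \w+) (?:image|snapin) replication"
)
# Item lines, e.g. "| Image Name: W10Prox64BIOSSysprep".
# Lines with an empty name do not match and are skipped.
NAME_RE = re.compile(r"\|\s*(?:Image|Snapin) Name:\s*(\S.*?)\s*$")


def names_by_phase(log_text):
    """Return {phase: Counter(name -> occurrences)} for one log excerpt."""
    phases = {}
    current = None
    for line in log_text.splitlines():
        m = PHASE_RE.search(line)
        if m:
            current = m.group(1)
            phases.setdefault(current, Counter())
            continue
        m = NAME_RE.search(line)
        if m and current is not None:
            phases[current][m.group(1)] += 1
    return phases
```

Feeding it the text of one replication pass and looking for any count above 1 within a single phase would separate genuine duplicates from names that merely appear once in each phase.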
Similarly, the Snapin Replication Log from the Main FOG Server (Primary for all Snapins) shows the same two issues:
First, that the storage nodes are offline:
[09-07-17 11:23:26 am] * Starting Snapin Replication.
[09-07-17 11:23:26 am] * We are group ID: 1. We are group name: default
[09-07-17 11:23:26 am] * We are node ID: 1. We are node name: DefaultMember
[09-07-17 11:23:26 am] * Attempting to perform Group -> Group snapin replication.
[09-07-17 11:23:26 am] | Replicating ssl less private key
[09-07-17 11:23:26 am] * Not syncing Snapin between nodes
[09-07-17 11:23:26 am] | Snapin Name:
[09-07-17 11:23:26 am] | There are no other members to sync to.
[09-07-17 11:23:26 am] * Not syncing Snapin between nodes
[09-07-17 11:23:26 am] | Snapin Name:
[09-07-17 11:23:26 am] | There are no other members to sync to.
[09-07-17 11:23:26 am] * Found Snapin to transfer to 3 groups
[09-07-17 11:23:26 am] | Snapin Name: -DeliverFogExe
[09-07-17 11:23:26 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] sal2fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] * Found Snapin to transfer to 3 groups
[09-07-17 11:23:26 am] | Snapin Name: -ExtendDisk
[09-07-17 11:23:26 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] sal2fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] * Found Snapin to transfer to 3 groups
[09-07-17 11:23:26 am] | Snapin Name: -Timeout
[09-07-17 11:23:26 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] sal2fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] * Found Snapin to transfer to 3 groups
[09-07-17 11:23:26 am] | Snapin Name: 0-AdminSet
[09-07-17 11:23:26 am] roa1fogsnl01 Server does not appear to be online.
[09-07-17 11:23:26 am] sal2fogsnl01 Server does not appear to be online.
.
.
.
Second, that the snapins aren’t configured for replication:
[09-07-17 11:23:26 am] * Attempting to perform Group -> Nodes snapin replication.
[09-07-17 11:23:26 am] * Not syncing Snapin between nodes
[09-07-17 11:23:26 am] | Snapin Name: -DeliverFogExe
[09-07-17 11:23:26 am] | There are no other members to sync to.
[09-07-17 11:23:26 am] * Not syncing Snapin between nodes
[09-07-17 11:23:26 am] | Snapin Name: -ExtendDisk
[09-07-17 11:23:26 am] | There are no other members to sync to.
[09-07-17 11:23:26 am] * Not syncing Snapin between nodes
[09-07-17 11:23:26 am] | Snapin Name: -Timeout
[09-07-17 11:23:26 am] | There are no other members to sync to.
[09-07-17 11:23:26 am] * Not syncing Snapin between nodes
[09-07-17 11:23:26 am] | Snapin Name: 0-AdminSet
[09-07-17 11:23:26 am] | There are no other members to sync to.
And again, some if not all snapins are listed twice in a single log pass.
This all worked in previous versions of the working branch of v1.5.0 at the end of August.
In our current setup, images and snapins fail from the storage nodes but work from the main FOG server. It appears the only problem is replication. Our next step is to manually copy the files to the storage nodes and test deployment, to verify that the problem is limited to replication.
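For the manual copy, it would help to confirm that the copied files really match the primary before testing deployment. The sketch below hashes every file under a directory and lists the relative paths that are missing or differ between two such listings. The helper names are ours, and which directories to compare is left open since that depends on how each node is set up.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash one file in chunks so large image files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differences(a, b):
    """Return relative paths that are missing or differ between two digest maps."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

`tree_digests` could be run once on the main server and once on each storage node, with the two results passed to `differences`; an empty list would mean the copies are identical.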
Any ideas or suggestions on how to proceed would be appreciated.
Thanks,
Jim