Fog 1.5.7 Locations Plugin pulling image from wrong storage node
-
@Sebastian-Roth Hi Sebastian & @Tom-Elliott, more than happy to help gents, you're doing me the favour being so quick to respond to requests and with really good/helpful suggestions. I only wish paid support was this good…
I'll snapshot the VMs for rollback, no problem. I presume every one of the nodes needs to be updated this way, or can I get away with (initially at least) the 2 master nodes in my head office? I only ask as there are 12 nodes to snapshot/update over 10 different vCenters/ESXi hosts. If needs be I can do that though.
Let me know on the above and I shall proceed accordingly. I'm technically on leave tomorrow, but the imaging solution here is getting a lot of focus up the food chain as it were, so I'm keen to get working on it as soon as I can. I can dial in from home tomorrow to focus on this without other distractions.
regards Tom
-
@Kiweegie Great to hear you can do some testing!
I presume every one of the nodes needs to be updated this way, or can I get away with (initially at least) the 2 master nodes in my head office?
Yeah, definitely a good question. In the location settings, did you set PXE boot from location? If yes, then updating the storage node in question could be enough. If all clients PXE boot from the master server then I'd update that one/two. I am trying to get my head around all the major changes we have had since 1.5.7 and whether one of them could cause an issue if you leave some of the nodes on 1.5.7. Well, we added some database security and I am sure all your storage nodes will fail to connect to the master(s) DB as soon as the master is updated to dev-branch. Not exactly sure about the other way round, but I think only updating a storage node might work.
My suggestion: if possible, set PXE booting for the Colorado location to yes (location setting "Use inits and kernels from this node"). Make sure clients boot properly. Update that one node to dev-branch and see if deployment of a single client pulls the image from the Colorado server.
Update to dev-branch:
su -
git clone https://github.com/FOGProject/fogproject
cd fogproject
git checkout dev-branch
cd bin
./installfog.sh
EDIT: Most certainly you will need to adjust a setting because otherwise clients booting the updated init files from your storage node will fail! FOG web UI -> FOG Configuration -> FOG Settings -> TFTP Server -> KERNEL RAMDISK SIZE is probably set to 127000 and you need to set it to 275000. This doesn't hurt even if the other nodes are still on 1.5.7. It just means that machines with less than 512 MB of RAM will fail to run a FOG task.
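If you'd rather check or adjust that from the command line than the web UI, the setting should live in the globalSettings table; the table and key names here are from memory, so treat this as a sketch and verify with the SELECT before running the UPDATE:
# Check, and if needed bump, the ramdisk size directly in the database
mysql -u root -p fog -e \
  "SELECT settingKey, settingValue FROM globalSettings WHERE settingKey LIKE '%RAMDISK%';"
mysql -u root -p fog -e \
  "UPDATE globalSettings SET settingValue = '275000' WHERE settingKey = 'FOG_KERNEL_RAMDISK_SIZE';"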
EDIT2: Just figured that this change came in even before 1.5.7 was out. So you might check, but it should be set to 275000 already.
-
@Sebastian-Roth said in Fog 1.5.7 Locations Plugin pulling image from wrong storage node:
git checkout dev-branch
I've snapshotted both VMs for the 2 Georgia servers, GEO01VFOG01 and GEO01VFOG02, and uplifted both to the latest dev version, 1.5.7.109.
KERNEL RAMDISK SIZE was already showing as 275000
I can see on this version it forces a root MySQL password to be added.
Also that the fogstorage DB password used previously was deemed not secure enough, so a new one has been generated. From memory the storage node setup calls this user account, so do we need to update all the storage nodes to use this password? If so, how? Or is it simply easiest to uplift all the nodes to the dev branch?
regards Tom
-
@Kiweegie From the GUI you should be able to get the fogstorage password from FOG Configuration Page -> FOG Settings -> Storage Password. I forget the exact string to look for, but it should be pretty close.
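Failing that, this should surface it straight from the settings table (the key pattern here is a guess on my part; adjust it if the key is named differently on your version):
# Pull anything that looks like the storage node DB password from the settings table
mysql -u root -p fog -e \
  "SELECT settingKey, settingValue FROM globalSettings WHERE settingKey LIKE '%MYSQLPASS%';"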
-
Thanks Tom, I can see that in the GUI OK and it matches what was displayed on screen during installation.
FOG configuration page > FOG Settings > FOG Storage Nodes > STORAGENODE MYSQLPASS
My question was around whether the storage nodes need to be updated to reflect this change? Per this section in the installer, unless I'm mistaken:
What is the username to access the database?
This information is stored in the management portal under 'FOG Configuration' -> 'FOG Settings' -> 'FOG Storage Nodes' -> 'FOG_STORAGENODE_MYSQLUSER'.
Username [fogstorage]:
What is the password to access the database?
This information is stored in the management portal under 'FOG Configuration' -> 'FOG Settings' -> 'FOG Storage Nodes' -> 'FOG_STORAGENODE_MYSQLPASS'.
Password:
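For reference, these look to be the relevant lines in /opt/fog/.fogsettings on a storage node; the key names match my install but the values below are placeholders:
# Entries in /opt/fog/.fogsettings that the installer reads for the DB connection;
# values shown are placeholders, not my real settings
snmysqluser='fogstorage'
snmysqlpass='<new value of FOG_STORAGENODE_MYSQLPASS>'
snmysqlhost='<IP of the master FOG server>'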
I've updated the 2 master nodes (normal and storage) in Georgia only at this stage and kicked off an image deployment task to a machine in Wisconsin. It's still picking up the storage node from Alabama.
To which end I'm going to kill that task, uplift all nodes onto the 1.5.7.109 dev branch and try again.
regards Tom
-
All storage nodes have now been uplifted to the latest dev version, 1.5.7.109. I edited /opt/fog/.fogsettings on each to amend to the new fogstorage creds first.
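For anyone following along, the edit itself can be scripted across the nodes before re-running the installer. A rough sketch, with the node IPs and password as placeholders, assuming root SSH access to each node:
# Push the new fogstorage password into /opt/fog/.fogsettings on each storage node
NODES="192.168.1.1 192.168.2.1"   # ...and the remaining storage node IPs
NEWPASS='new-fogstorage-password'
for n in $NODES; do
  ssh root@"$n" "sed -i \"s|^snmysqlpass=.*|snmysqlpass='$NEWPASS'|\" /opt/fog/.fogsettings"
done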
I've deployed an image task to a Wisconsin desktop now and it's no longer picking up the Alabama storage node; this time it's selected California.
Going back over my setup, each location is set with the Use inits and kernels from this node option checked.
There is a location for each site, with 2 in Georgia (head office), the 2nd of which I had set up as a location for Georgia but linked to the storage node. I don't think this is actually required, so I have removed this location entry.
There are only 2 storage groups, Toys 'R Us and Mattel, with the latter being replicated out to all the storage nodes.
All storage nodes therefore are pointed to the Mattel storage group, with the exception of the Georgia head office (normal FOG) server, which points to Toys 'R Us.
I “think” everything is set as it should be. Is there a log file which shows which storage node is being selected and why?
cheers Tom
-
@Kiweegie I am wondering why you updated the Georgia nodes first, but possibly my description was a bit confusing. As long as things are up and running now with all nodes being on dev-branch, that's fine.
I have gone through the code a few more times now but can't see any obvious problems with it. Though it's very hard doing this kind of debugging based solely on assumptions. Could you please connect to the database on your master node, run a query as follows and post the full output here:
shell> mysql -u root -p
Password: ...
mysql> use fog;
...
mysql> SELECT * FROM location;
...
-
@Sebastian-Roth said in Fog 1.5.7 Locations Plugin pulling image from wrong storage node:
SELECT * FROM location;
+-----+-----------------+-------+-----------------+----------------+------------+---------------------+--------------+
| lID | lName           | lDesc | lStorageGroupID | lStorageNodeID | lCreatedBy | lCreatedTime        | lTftpEnabled |
+-----+-----------------+-------+-----------------+----------------+------------+---------------------+--------------+
|   2 | Alabama         |       |               2 |              4 | fog        | 2020-01-31 23:37:10 |            1 |
|   3 | Connecticut     |       |               2 |              3 | fog        | 2020-01-31 23:37:28 |            1 |
|   4 | Louisiana       |       |               2 |              7 | fog        | 2020-01-31 23:39:08 |            1 |
|   5 | Arizona         |       |               2 |              6 | fog        | 2020-01-31 23:39:32 |            1 |
|   6 | California      |       |               2 |              8 | fog        | 2020-01-31 23:41:10 |            1 |
|   7 | Arkansas        |       |               2 |              5 | fog        | 2020-01-31 23:41:35 |            1 |
|   8 | Colorado Shop   |       |               2 |              9 | fog        | 2020-01-31 23:42:51 |            1 |
|   9 | Kentucky        |       |               2 |             12 | fog        | 2020-02-01 01:22:42 |            1 |
|  10 | South Dakota    |       |               2 |             11 | fog        | 2020-02-01 01:23:37 |            1 |
|  11 | Wisconsin       |       |               2 |             10 | fog        | 2020-02-01 01:23:53 |            1 |
|  12 | Georgia         |       |               1 |              1 | fog        | 2020-02-01 01:24:45 |            1 |
|  14 | Colorado Office |       |               1 |             13 | kiweegie   | 2020-02-01 15:34:10 |            1 |
+-----+-----------------+-------+-----------------+----------------+------------+---------------------+--------------+
-
@Kiweegie Sorry, I’d need info from another query as well:
SELECT ngmID,ngmMemberName,ngmIsMasterNode,ngmGroupID,ngmIsEnabled,ngmHostname,ngmMaxClients FROM nfsGroupMembers;
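If it's easier, a single query joining the two tables would also show the location-to-node mapping at a glance (just a convenience sketch on top of the queries above):
# Optional: join location against nfsGroupMembers to list, per location,
# which storage node the plugin should resolve to
mysql -u root -p fog -e "
  SELECT l.lName, l.lStorageGroupID, n.ngmMemberName, n.ngmHostname
  FROM location AS l
  JOIN nfsGroupMembers AS n ON n.ngmID = l.lStorageNodeID
  ORDER BY l.lName;"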
-
@Sebastian-Roth said in Fog 1.5.7 Locations Plugin pulling image from wrong storage node:
SELECT ngmID,ngmMemberName,ngmIsMasterNode,ngmGroupID,ngmIsEnabled,ngmHostname,ngmMaxClients FROM nfsGroupMembers;
Here you go
+-------+---------------+-----------------+------------+--------------+----------------+---------------+
| ngmID | ngmMemberName | ngmIsMasterNode | ngmGroupID | ngmIsEnabled | ngmHostname    | ngmMaxClients |
+-------+---------------+-----------------+------------+--------------+----------------+---------------+
|     1 | GEO01VFOG01   |               1 |          1 |            1 | 10.166.136.199 |            10 |
|     2 | GEO01VFOG02   |               1 |          2 |            1 | 10.166.136.198 |            10 |
|     3 | CON01VFOG01   |               0 |          2 |            1 | 192.168.7.1    |            10 |
|     4 | ALA02VFOG01   |               0 |          2 |            1 | 192.168.1.1    |            10 |
|     5 | ARK01VFOG01   |               0 |          2 |            1 | 192.168.9.1    |            10 |
|     6 | ARI01VFOG01   |               0 |          2 |            1 | 192.168.11.1   |            10 |
|     7 | LOU01VFOG01   |               0 |          2 |            1 | 192.168.6.1    |            10 |
|     8 | CAL01VFOG01   |               0 |          2 |            1 | 192.168.2.1    |            10 |
|     9 | COL02VFOG01   |               0 |          2 |            1 | 192.168.4.1    |            10 |
|    10 | WIS02VFOG01   |               0 |          2 |            1 | 192.168.10.1   |            10 |
|    11 | SOU01VFOG01   |               0 |          2 |            1 | 192.168.8.1    |            10 |
|    12 | KEN02VFOG01   |               0 |          2 |            1 | 192.168.5.1    |            10 |
|    13 | COL01VFOG01   |               0 |          1 |            1 | 192.168.3.1    |            10 |
+-------+---------------+-----------------+------------+--------------+----------------+---------------+
-
@Kiweegie Sorry for the delay! I have some time tomorrow to work on this. I’ll try to set things up as close to what you have and try to find what’s wrong.
-
@Sebastian-Roth Appreciate the time and effort Sebastian. Let me know if you need anything tested my side please.
regards Tom
-
In case it leads to any pointers or brainwaves.
I thought initially the main server was imaging fine, as images sent from GEO01VFOG01 to devices within the Georgia site pick up OK. However, I can see an example this morning where a machine is trying to image from COL01VFOG01 (Colorado Office), which is the only storage node in the Toys 'R Us storage group. (I've updated my spreadsheet below, which didn't accurately reflect this.)
In the greater number of cases, imaging from the Georgia office to Georgia machines is working, but imaging to remote sites is pulling from the wrong storage node.
Looking at the dashboard, I can see the display for the Mattel storage group shows 99.9999999, instead of the split shown by the Toys 'R Us group.
This could be purely cosmetic, with the display not meant to handle more than 99 clients. I presume this shows the amount of client activity per storage group?
I can't seem to find any logs for the plugins; are there any?
regards Tom
-
@Kiweegie Not sure if you saw the chat bubble in the top right corner yet. Trying to contact you via PM.
-
@Sebastian-Roth I did now, and have replied
-
Thanks to a support session from @Sebastian-Roth this evening we’ve confirmed that this is actually working as designed.
There are 2 things at play.
One: Location plugin
Although the location plugin is installed, the Active Tasks list initially shows the task as linked to the wrong storage node. This is only until the client boots and checks in, at which point it updates to show the correct storage node actually being used.
Two: BIOS boot order on clients
In our case, due to changes in the BIOS boot order on the remote-site computers, they were never seeing the FOG task on boot, so the task still showed on the Active Tasks page with the wrong storage node.
I will resolve the BIOS issue on the remote sites and then test again and confirm back that all is working as designed.
regards Tom
-
This post is deleted!
-
@AlexPDX Please open a new topic, as your request has nothing to do with the initial post as far as I can see. Just copy and paste your text from this one.