Storage Node Issues
-
Hi All,
I am having real trouble with storage nodes.
I have a single storage group with two servers in it: my main FOG server and a storage-only install server. I have, I think, added the storage node to the storage group. I have set the main FOG server as master and the node as secondary.
Replication isn’t happening and I think it’s because of FTP credentials and setup.
Which password should be in which location on the two boxes?
Thanks! -
On each fog server (normal or storage node) there is a linux account called fog. This is not the default web gui admin called fog but a linux account by the same name. Your storage node configuration needs to match the user ID and password correctly for the linux account. If these were complete fog installs the linux user fog’s password will be in a hidden file in /opt/fog directory called .fogsettings. Review this file on each fog server and be sure the storage node configuration is consistent with these files. -
Is that the user called ‘fog’ with the long alphanumeric password created at install or the snmysql user?
-
Here is what I am getting…
[11-07-17 12:06:57 pm] * Type: 2, File: /var/www/fog/lib/fog/fogftp.class.php, Line: 463, Message: ftp_login(): Login incorrect., Host: 10.2.0.55, Username: fog
[11-07-17 12:06:53 pm] | File Name: postdownloadscripts
[11-07-17 12:06:53 pm] * Found Image to transfer to 1 node
[11-07-17 12:06:53 pm] | Replicating postdownloadscripts
[11-07-17 12:06:53 pm] * Attempting to perform Group -> Group image replication.
[11-07-17 12:06:53 pm] * We are node ID: 1. We are node name: DefaultMember
[11-07-17 12:06:53 pm] * We are group ID: 1. We are group name: Lakeside
[11-07-17 12:06:53 pm] * Starting Image Replication.
[11-07-17 12:06:53 pm] * Starting service loop
[11-07-17 12:06:53 pm] * Checking for new items every 600 seconds
[11-07-17 12:06:53 pm] * Starting ImageReplicator Service
[11-07-17 12:06:53 pm] Interface Ready with IP Address: port-34.marketmakers.co.uk
[11-07-17 12:06:53 pm] Interface Ready with IP Address: 213.131.182.34
[11-07-17 12:06:53 pm] Interface Ready with IP Address: 127.0.1.1
[11-07-17 12:06:53 pm] Interface Ready with IP Address: 127.0.0.1
[11-07-17 12:06:53 pm] Interface Ready with IP Address: 10.2.0.60
-
I am also finding that when I go to 10.2.0.55 (the IP of my storage node) in a browser I am not seeing a management console, the page just times out. Does that mean the node itself isn’t right or would a storage node not have a web UI?
Thanks again! -
Progress!!! Still not working tho ;-(
mirror: d1p2.img: Fatal error: max-retries exceeded
[11-07-17 12:24:56 pm] * Started sync for Image Win10-Desktop
lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator.Win10-Desktop.transfer.POR-FOG-SN1.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/Win10-Desktop" "/images/Win10-Desktop"; exit' -u fog,[Protected] 10.2.0.55
[11-07-17 12:24:56 pm] | CMD:
[11-07-17 12:24:56 pm] * Starting Sync Actions
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 16024639479 0 /images/Win10-Desktop/d1p2.img "A valid database connection could not be made"
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 355111921 0 /images/Win10-Desktop/d1p1.img "A valid database connection could not be made"
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 190 0 /images/Win10-Desktop/d1.partitions "A valid database connection could not be made"
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 0 0 /images/Win10-Desktop/d1.original.swapuuids "A valid database connection could not be made"
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 30 0 /images/Win10-Desktop/d1.original.fstypes "A valid database connection could not be made"
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 190 0 /images/Win10-Desktop/d1.minimum.partitions "A valid database connection could not be made"
[11-07-17 12:24:56 pm] * Deleting remote file:
[11-07-17 12:24:56 pm] | Files do not match.
[11-07-17 12:24:56 pm] | 1048576 0 /images/Win10-Desktop/d1.mbr "A valid database connection could not be made"
[11-07-17 12:24:55 pm] * Deleting remote file:
[11-07-17 12:24:55 pm] | Files do not match.
[11-07-17 12:24:55 pm] | 1 0 /images/Win10-Desktop/d1.fixed_size_partitions "A valid database connection could not be made"
[11-07-17 12:24:55 pm] | Image Name: Win10-Desktop
[11-07-17 12:24:55 pm] * Found Image to transfer to 1 node
[11-07-17 12:24:55 pm] * Attempting to perform Group -> Nodes image replication.
[11-07-17 12:24:55 pm] | There are no other members to sync to.
[11-07-17 12:24:55 pm] | Image Name: Win10-Desktop
[11-07-17 12:24:55 pm] * Not syncing Image between groups
[11-07-17 12:24:55 pm] * Started sync for Image dev/postinitscripts
lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.POR-FOG-SN1.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/dev/postinitscripts" "/images/dev/postinitscripts"; exit' -u fog,[Protected] 10.2.0.55
[11-07-17 12:24:55 pm] | CMD:
[11-07-17 12:24:55 pm] * Starting Sync Actions
[11-07-17 12:24:55 pm] * Deleting remote file: /images/dev/postinitscripts/
[11-07-17 12:24:55 pm] | Files do not match.
[11-07-17 12:24:55 pm] | 249 0 /images/dev/postinitscripts/fog.postinit "A valid database connection could not be made"
[11-07-17 12:24:54 pm] | File Name: dev/postinitscripts
[11-07-17 12:24:54 pm] * Found Image to transfer to 1 node
[11-07-17 12:24:54 pm] | Replicating postinitscripts
[11-07-17 12:24:54 pm] * Started sync for Image postdownloadscripts
lftp -e 'set xfer:log 1; set xfer:log-file "/opt/fog/log/fogreplicator..transfer.POR-FOG-SN1.log";set ftp:list-options -a;set net:max-retries 10;set net:timeout 30; mirror -c -r -R --ignore-time -vvv --exclude ".srvprivate" "/images/postdownloadscripts" "/images/postdownloadscripts"; exit' -u fog,[Protected] 10.2.0.55
[11-07-17 12:24:54 pm] | CMD:
[11-07-17 12:24:54 pm] * Starting Sync Actions
[11-07-17 12:24:54 pm] * Deleting remote file: /images/postdownloadscripts/
[11-07-17 12:24:54 pm] | Files do not match.
[11-07-17 12:24:54 pm] | 235 0 /images/postdownloadscripts/fog.postdownload "A valid database connection could not be made"
[11-07-17 12:24:54 pm] | File Name: postdownloadscripts
[11-07-17 12:24:54 pm] * Found Image to transfer to 1 node
[11-07-17 12:24:54 pm] | Replicating postdownloadscripts
[11-07-17 12:24:54 pm] * Attempting to perform Group -> Group image replication.
[11-07-17 12:24:54 pm] * We are node ID: 1. We are node name: DefaultMember
[11-07-17 12:24:54 pm] * We are group ID: 1. We are group name: Lakeside
[11-07-17 12:24:54 pm] * Starting Image Replication.
[11-07-17 12:24:54 pm] * Starting service loop
[11-07-17 12:24:54 pm] * Checking for new items every 600 seconds
[11-07-17 12:24:54 pm] * Starting ImageReplicator Service
-
@coxm It will be the entry just called password, I believe (as said from memory).
You can also test by using a windows client and attempting to ftp to the remote storage node with the user id of fog and the value from the password field in the .fogsettings file. -
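The check described above can also be done from a Linux shell. This is a minimal sketch, assuming .fogsettings stores entries as key='value' lines (the quoting may differ between FOG versions); fog_setting is a hypothetical helper name, not part of FOG.

```shell
# Hypothetical helper: print the value of one key from a FOG settings file.
# Assumes lines look like  key='value'  or  key="value"  (format may vary).
fog_setting() {
    key="$1"
    file="${2:-/opt/fog/.fogsettings}"   # path named in this thread
    # Take the first line starting with "key=", then strip one pair of
    # surrounding quotes if present.
    sed -n "s/^${key}=//p" "$file" | head -n 1 \
        | sed -e "s/^[\"']//" -e "s/[\"']\$//"
}
```

Run on the storage node, `fog_setting password` prints the value to compare against the storage node's entry in the master's web UI. To test the login the same way the replicator does, something like `lftp -e 'ls; exit' -u "fog,$(fog_setting password)" 10.2.0.55` should list the remote directory rather than report "Login incorrect".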
@coxm said in Storage Node Issues:
I am also finding that when I go to 10.2.0.55 (the IP of my storage node) in a browser I am not seeing a management console, the page just times out. Does that mean the node itself isn’t right or would a storage node not have a web UI?
FYI: Storage nodes don’t have a web ui, they are managed from the master node.
-
@coxm said in Storage Node Issues:
16024639479 0 /images/Win10-Desktop/d1p2.img “A valid database connection could not be made”
Well this one is a bit different… but may lead to what happened here. Let’s assume you manually created the entry in the master node for the storage node? If yes, that may explain what is going on. When you install the storage node, it is supposed to reach out to the master node and connect to its database. The storage node doesn’t have its own mysql database; it uses the root FOG server’s db. If that db connection fails then the remote storage node won’t create its own record in the database, which kind of leads us down this broken path.
What host OS is on your fog master node?
-
@george1421 It looks like the issue now is a Db connection failure? Any thoughts?
-
@coxm I have lots of thoughts, it would help to narrow down to just a few…
What host OS is the master fog server using?
-
@george1421 They are both running on Ubuntu.
The other thing I am not sure is correct at the moment is the image and ftp path set in the storage management section. I want to store the images on the storage node on a 2nd VHD, not the system drive, but I’m not sure how to reference it in the FOG management UI.
-
@coxm OK let’s work on one issue at a time (sorry about the delay, it’s early in the AM here and I’m trying to get my morning stuff done).
OK let’s assume that Ubuntu is “helping us” again. Can you confirm that mysqld.cnf (in either /etc/mysql or /etc/mysql/mysql.conf.d) on your master fog server contains this line:
bind-address = 0.0.0.0
It may be the default of:
bind-address = 127.0.0.1
Which really doesn’t help us connect remotely.
If you need to change the setting then run this command to restart mysql
systemctl restart mysql.service
Once mysql comes up run the following command.
lsof -i -P | grep :3306
You should get something that looks like this:
# lsof -i -P | grep :3306
mysqld 1574 mysql 64u IPv4 20354 0t0 TCP 192.168.1.15:3306 (LISTEN)
Where 192.168.1.15 should be the IP address of your fog server or the fog server’s name -
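The bind-address check above can be scripted. This is a minimal sketch; mysql_bind_address is a hypothetical helper name, and the cnf path passed to it is whichever of the two locations mentioned above exists on your master. It prints the last uncommented bind-address value in the file and prints nothing if the line is commented out.

```shell
# Hypothetical helper: print the last uncommented bind-address value
# found in a MySQL config file. Commented lines (# ...) are ignored.
mysql_bind_address() {
    sed -n 's/^[[:space:]]*bind-address[[:space:]]*=[[:space:]]*//p' "$1" \
        | tail -n 1
}
```

For example, `mysql_bind_address /etc/mysql/mysql.conf.d/mysqld.cnf` on the master should print 0.0.0.0 (or the master's own IP) after the change. From the storage node, a quick reachability test such as `nc -z -w 5 10.2.0.60 3306` (assuming netcat is installed) should then succeed, where 10.2.0.60 is the master's address from this thread.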
@george1421 It was commented out and set to 127…
I have uncommented it and set it to 10.2.0.60, the IP of my main fog server. I now get this…
[11-07-17 1:29:55 pm] * * Image replication is globally disabled
mirror: Access failed: 550 Failed to change directory. (/media/ubuntuadmin/Data/images/Win10-Desktop)
-
@coxm said in Storage Node Issues:
/media/ubuntuadmin/Data/images/Win10-Desktop
How do you have things set up there? This is a non-standard location.
Are your images stored on an external (to the master fog server) disk?
Also, setting bind-address = 0.0.0.0 tells the mysql server to listen on all interfaces. It should work ok if you set it to the single interface on your fog server, but generally you would just set it to bind-address = 0.0.0.0 and not worry about it. -
I would like to have the images on the main FOG server stored in the default local location.
On the storage nodes I want to have a smaller drive for the system and fog install and a second disk for image storage; that way images can’t fill up the system drive and cause me problems down the line.
-
@coxm Let’s switch over to IM since it’s quicker to get Q&A turnaround. We’ll switch back to the forum once the Q&A session is over. Look for the talk bubble on the fog forum tool tray for additional questions.
But the first question is around: /media/ubuntuadmin/Data/images/Win10-Desktop is that on the master node or the storage node?
-
@CoxM Did you solve the issue together with the help of @george1421 ?