Host/Group replication between FOG Servers
-
@Tom-Elliott I just upgraded the Main and reinstalled the storage node yesterday afternoon
-
@adukes40 You found the right page in that second post. The page should have a lot more. For instance, it should tell if the host is registered or not, and give a lot more information…
Now, swap the IP with the main server’s IP and try the URL again. There should be a lot more displayed.
Also, I think this issue is being caused by mysql credentials somehow being bad.
I have an idea… try to manually re-set the fogstorage mysql password using this:
ALTER USER 'fogstorage'@'localhost' IDENTIFIED WITH mysql_native_password BY '';
Then ensure remote access is enabled for the fogstorage account with this:
GRANT ALL PRIVILEGES ON fog.* TO 'fogstorage'@'%' IDENTIFIED WITH mysql_native_password BY '' WITH GRANT OPTION;
Then, set the mysql password in the node’s
/opt/fog/.fogsettings
file to just blank, like empty quotes or whatever the wrapper is. Then re-run the installer you have on the node again and see what happens. Also re-try manually connecting to the DB on the main server FROM the storage node, using a blank password. Of course, this is temporary - you may set a password later. I just want to get it working for troubleshooting purposes.
-
I will try the other things in a bit. Do I simply copy and paste those commands into PuTTY, and which server do I run them on? When I use the main server’s IP, I get this:
#!ipxe
set fog-ip 10.103.72.49
set fog-webroot fog
set boot-url http://${fog-ip}/${fog-webroot}
cpuid --ext 29 && set arch x86_64 || set arch i386
goto get_console
:console_set
colour --rgb 0x00567a 1 ||
colour --rgb 0x00567a 2 ||
colour --rgb 0x00567a 4 ||
cpair --foreground 7 --background 2 2 ||
goto MENU
:alt_console
cpair --background 0 1 ||
cpair --background 1 2 ||
goto MENU
:get_console
console --picture http://10.103.72.49/fog/service/ipxe/bg.png --left 100 --right 80 && goto console_set || goto alt_console
:MENU
menu
colour --rgb 0xff0000 0 ||
cpair --foreground 1 1 ||
cpair --foreground 0 3 ||
cpair --foreground 4 4 ||
item --gap Host is NOT registered!
item --gap -- -------------------------------------
item fog.local Boot from hard disk
item fog.memtest Run Memtest86+
item fog.reginput Perform Full Host Registration and Inventory
item fog.reg Quick Registration and Inventory
item fog.quickimage Quick Image
item fog.multijoin Join Multicast Session
item fog.sysinfo Client System Information (Compatibility)
choose --default fog.local --timeout 3000 target && goto ${target}
:fog.local
sanboot --no-describe --drive 0x80 || goto MENU
:fog.memtest
kernel memdisk iso raw
initrd memtest.bin
boot || goto MENU
:fog.reginput
kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 keymap= web=10.103.72.49/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=manreg
imgfetch init_32.xz
boot || goto MENU
:fog.reg
kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 keymap= web=10.103.72.49/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=autoreg
imgfetch init_32.xz
boot || goto MENU
:fog.quickimage
login
params
param mac0 ${net0/mac}
param arch ${arch}
param username ${username}
param password ${password}
param qihost 1
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
:fog.multijoin
login
params
param mac0 ${net0/mac}
param arch ${arch}
param username ${username}
param password ${password}
param sessionJoin 1
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
:fog.sysinfo
kernel bzImage32 loglevel=4 initrd=init_32.xz root=/dev/ram0 rw ramdisk_size=127000 keymap= web=10.103.72.49/fog/ consoleblank=0 rootfstype=ext4 loglevel=4 mode=sysinfo
imgfetch init_32.xz
boot || goto MENU
:bootme
chain -ar http://10.103.72.49/fog/service/ipxe/boot.php##params ||
goto MENU
autoboot
-
@adukes40 The script from the main server looks good - that’s how it should look from the node.
Run them on the node. Yes, just copy paste.
-
@Wayne-Workman this is what I get:
Sorry, I am new with Linux, so it is all a learning curve for me.
-
@adukes40 Try it with a simple password. It goes between the empty quotes.
-
All I get is invalid syntax. No clue what I am doing wrong.
-
@adukes40 I sent you a message. Top right talk bubble.
-
I remoted in to help.
The bind address was set to 127.0.0.1 in my.cnf. I commented that out, and then we were able to connect to mysql from the remote node using the fogstorage username and password.
However, when visiting the storage node’s boot.php file with a registered MAC address appended like this:
http://10.106.2.149/fog/service/ipxe/boot.php?mac=b8:ac:6f:3d:6e:a4
It spits just this out:
set fog-ip
set fog-webroot
set boot-url http://${fog-ip}/${fog-webroot}
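For anyone repeating this check from a terminal instead of a browser, here is a rough sketch that tests whether a boot script actually has a value for fog-ip. The function name is made up for illustration, and it assumes fog-ip is an IPv4 address, as it is in the main server's script earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads an iPXE boot script on stdin and succeeds
# only if "set fog-ip" is followed by an IPv4 address.
# A healthy script has a line like "set fog-ip 10.103.72.49";
# the broken node output has "set fog-ip" with nothing after it.
boot_script_has_fog_ip() {
    grep -Eq '^[[:space:]]*set fog-ip[[:space:]]+[0-9]+(\.[0-9]+){3}'
}

# Usage (needs curl and a reachable node, so it is left commented out):
# curl -s 'http://10.106.2.149/fog/service/ipxe/boot.php?mac=b8:ac:6f:3d:6e:a4' \
#     | boot_script_has_fog_ip && echo "fog-ip is set" || echo "fog-ip is empty"
```

Run against the main server it should report that fog-ip is set. Against the node in its current state it should report that it is empty.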
Also, an apache2 error pops up in the node’s logs:
[Tue May 24 16:05:32.244071 2016] [:error] [pid 20780] [client 10.106.10.5:12079] PHP Fatal error: Call to a member function lastInsertId() on null in /var/www/html/fog/lib/db/pdodb.class.php on line 124
FOG version 7829
Both main and node are Ubuntu 14.04 LTS.
-
I just spoke briefly with the @Senior-Developers and they said that the storage node installation actually communicates with the main server’s DB to do certain things.
This means that, because of the earlier bind-address setting, the things the installer needed to do didn’t get done.
So, now that we’ve sorted out the connectivity issues, all you should need to do is re-run the installer on the storage node (no rebuild is required at all!), and then things should be working for you.
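If someone else lands here with the same symptom, a quick way to see whether MySQL on the main server is still listening on localhost only is to look for an active bind-address line. This is just a sketch: the function name is made up, and the /etc/mysql/my.cnf path in the usage comment is an assumption about a stock Ubuntu install, so yours may differ.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given MySQL config file has an
# uncommented bind-address pointing at localhost only.
mysql_is_local_only() {
    grep -Eq '^[[:space:]]*bind[-_]address[[:space:]]*=[[:space:]]*(127\.0\.0\.1|localhost)[[:space:]]*$' "$1"
}

# Usage (path is an assumption, adjust for your distro):
# if mysql_is_local_only /etc/mysql/my.cnf; then
#     echo "MySQL only accepts local connections; storage nodes cannot reach it"
# fi
```

After commenting the line out, MySQL has to be restarted before the change takes effect.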
-
After re-running the installer on the storage node and rebooting the main server, this is what’s in the replication logs, and it just hangs and never replicates. @Tom-Elliott
[05-24-16 9:37:59 pm] | Image name: testing
[05-24-16 9:37:59 pm] * Found Image to transfer to 2 node(s)
[05-24-16 9:37:59 pm] | I am the only member
[05-24-16 9:37:59 pm] | Image Name: testing
[05-24-16 9:37:59 pm] * Not syncing Image between group(s)
[05-24-16 9:37:59 pm] | We are node name: CA - MasterNode
[05-24-16 9:37:59 pm] * We have node ID: #1
[05-24-16 9:37:59 pm] | We are group name: default
[05-24-16 9:37:59 pm] * We are group ID: #1
[05-24-16 9:37:59 pm] * Starting Image Replication.
[05-24-16 9:37:59 pm] * Starting service loop
[05-24-16 9:37:59 pm] * Checking for new items every 600 seconds
[05-24-16 9:37:59 pm] * Starting ImageReplicator Service
[05-24-16 9:37:59 pm] Interface Ready with IP Address: 167.21.42.13
[05-24-16 9:37:59 pm] Interface Ready with IP Address: 10.103.72.49
-
I’ve verified both the main and node’s FTP credentials.
I’ve toggled the master node for both, just to reset it.
I’ve made test images and test snapins but both hang at exactly where lftp should start.
I’ve also uninstalled and reinstalled lftp.
-
Found this interesting message in /var/log/syslog:
May 24 18:20:30 MSDCATS09 kernel: [ 2560.753652] init: vsftpd main process (4005) killed by TERM signal
-
Snapins attempt to replicate, it seems, but images do not. Here’s the FTP transfer log (xferlog):
root@MSDCATS09:/var/log# cat /var/log/xferlog
Tue May 24 17:44:36 2016 1 10.103.72.49 12 /opt/fog/snapins/test.bat b _ i r fog ftp 0 * c
Tue May 24 18:14:20 2016 1 10.103.72.49 12 /opt/fog/snapins/test.bat b _ i r fog ftp 0 * c
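To make that log easier to scan, something like the following pulls out just the direction, completion status, and path of each transfer. It is a sketch with a made-up function name. It relies on the standard xferlog column order (field 9 is the path, 12 the direction, 18 the completion status), and a path containing spaces would shift the columns.

```shell
#!/bin/sh
# Hypothetical helper: print "direction status path" for each xferlog entry.
# direction: i = incoming, o = outgoing; status: c = complete, i = incomplete.
# Reads the files given as arguments, or stdin when none are given.
xferlog_summary() {
    awk '{ print $12, $18, $9 }' "$@"
}

# Usage:
# xferlog_summary /var/log/xferlog
```

For the two entries above this gives "i c /opt/fog/snapins/test.bat" twice, meaning completed incoming snapin transfers. No image paths show up at all, which matches images never starting to transfer.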
-
These log entries are also interesting:
cat /var/log/auth.log | grep vsftpd
May 24 17:36:01 MSDCATS09 vsftpd[21829]: pam_unix(vsftpd:auth): check pass; user unknown
May 24 17:36:01 MSDCATS09 vsftpd[21829]: pam_unix(vsftpd:auth): authentication failure; logname= uid=0 euid=0 tty=ftp ruser=apc rhost=10.103.67.62
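A small sketch for boiling those entries down to who tried to log in and from where. The function name is made up, and it assumes the pam_unix line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "ruser rhost" for each failed vsftpd login.
# Reads the files given as arguments, or stdin when none are given.
vsftpd_auth_failures() {
    grep 'vsftpd' "$@" | grep 'authentication failure' \
        | sed -n 's/.*ruser=\([^ ]*\) rhost=\([^ ]*\).*/\1 \2/p'
}

# Usage:
# vsftpd_auth_failures /var/log/auth.log
```

For the entry above it prints "apc 10.103.67.62". That address is neither of the two FOG server IPs mentioned in this thread, so these failures may be unrelated to replication.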
-
@Tom-Elliott helped out. It was a bug in replication, and it should be fixed in the current version.