Install hangs at "Ensuring node username and passwords match"
-
@george1421 I would like a central pxe server and storage nodes at the different locations.
-
@mpmackenna OK great,
So your main fog server, is that the one with the slow web interface? If so, what version of FOG are you running? There are some post install “fixes” that need to be made to make fog happy again (even before we talk storage nodes).
-
@george1421 1.5.4, yes, that is the one with the unresponsive web interface. Thank you for your help!
-
@mpmackenna There are a few steps that you need to do (these will be addressed when fog 1.5.5 is released).
For these changes you will need console access to the master fog server linux command prompt:
- Change to the /etc directory from the fog server linux command prompt.
- Search for the www.conf file. It can be in a number of locations depending on which version of php is installed. Use this command:
find /etc -name www.conf
(hopefully you will only find one)
- Edit that file and ensure these settings are accurate. Don’t just add them, since all should already be there except
php_admin_value[memory_limit] = 256M
which you will need to add.
php_admin_value[memory_limit] = 256M
pm.max_requests = 2000
pm.max_children = 35
pm.min_spare_servers = 5
pm.start_servers = 5
- Save and exit your text editor.
- Reboot the fog server.
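Before rebooting, it can help to confirm which of those settings actually ended up in the file. The following is only a sketch: `check_fpm_settings` is a hypothetical helper name (not part of FOG), it reads the file without changing it, and it ignores lines commented out with `;`.

```shell
#!/bin/sh
# Sketch: report which of the php-fpm pool settings listed above are
# present in a given www.conf. Nothing is modified.
# check_fpm_settings is a hypothetical helper, not something FOG ships.
check_fpm_settings() {
    conf="$1"
    for key in 'php_admin_value[memory_limit]' pm.max_requests \
               pm.max_children pm.min_spare_servers pm.start_servers; do
        # Drop ';' comment lines, then match the key as a fixed string
        # (-F keeps the square brackets literal).
        line=$(grep -v '^[[:space:]]*;' "$conf" | grep -F -- "$key" | head -n 1)
        if [ -n "$line" ]; then
            echo "found:   $line"
        else
            echo "MISSING: $key"
        fi
    done
}

# Usage, with the path discovered by find as described above:
#   for f in $(find /etc -name www.conf); do check_fpm_settings "$f"; done
```

Any key reported as MISSING is either absent or still commented out, so that is the line to look at in your editor.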
Now you need to roll back the version of the FOG kernel installed on the master node. The version deployed with FOG 1.5.4 (kernel 4.17.0) has something not right with it. Under certain circumstances it takes between 3 and 5 minutes to create the disk structure. The rest of imaging is fine; it’s just related to creating the disk structure. You can roll back the kernel from the FOG Settings->FOG Kernel page.
Also while you are in the fog settings area, install the Location plugin. You will need that a bit later in the setup.
-
@mpmackenna said in Install hangs at "Ensuring node username and passwords match":
@george1421 1.5.4, yes, that is the one with the unresponsive web interface.
If the last post doesn’t solve the problem, then we need to ensure that php-fpm is running correctly. The FOG developers switched from the native php engine in apache to php-fpm in version 1.5.3 to address the slow response they were seeing from the new UI in the 1.5.x versions. The settings in the last post were discovered after 1.5.4 was released; they help tune php-fpm under heavy load. We have also seen that, depending on the linux distro, php-fpm is sometimes not hooked in correctly with apache, which still gives the user an unhappy experience with the UI.
To see if php-fpm is hooked into apache correctly, launch
top
from the linux console. Then sort by CPU usage by keying in P. The top 2 or 3 processes should be php-fpm. If it is apache then something isn’t hooked correctly.
-
@george1421 Got this when trying to roll back the kernel: “Type: 2, File: /var/www/html/fog/lib/fog/fogftp.class.php, Line: 463, Message: ftp_login(): Login incorrect., Host: 10.10.2.8, Username: fog”. I know I’ve changed the FOG user password when trying to set up the storage node. Now I am unable to ssh to the box with the fog account using the passwords that I thought it was set to a while back. I also tried every other password I could think it might be, including what used to be the default of “password”. I still have ssh access with other accounts and root access. Should I set the FOG account password to something? Thank you!
Also, the changes made to www.conf made all the difference with the web interface. It is working well at this point.
-
@mpmackenna Tell me if you fiddled with the linux user called fog (such as changing or resetting the password). That’s not the default web UI admin user fog; this is the linux user fog.
If you did, shame on you. You will need to go through the process of resyncing the linux user (this may also be part of your issue with the storage node): https://forums.fogproject.org/topic/11203/resyncing-fog-s-service-account-password
-
@george1421 Oh I fiddled with it. Charlie Daniels has got nothing on me. I am completing the steps you outlined to repair. Thank you!
Update: I ran your repair and the installer finished properly. I then was able to roll back the kernel without issue.
-
@mpmackenna Yeah, that account is a fog service account used by the fog application and should only be managed by the fog installer script. That point should be better documented on the wiki. You are not the first, or the last, to have this issue.
-
@george1421 Things seem to be running well now. Thank you so much for your help! I am going to go back and read the docs for adding a storage node and attempt to attach a clean install of a storage node to my working primary installation. Should I mark this thread resolved and start a new one if I have an issue with the storage node? Or perhaps you or a moderator can mark it as resolved? I don’t see that option. Thanks again!
-
@mpmackenna Let’s keep working through it on this one.
Now when/if you have a storage node: if you are going to change modes on that node (Normal Node -> Storage Node), you need to delete the /opt/fog/.fogsettings file. Understand that when you do that you may have issues with the local linux fog user account. The fog installer script should keep everything sane, but if things go wrong then you have a place to look.
Also you will need to make the edits to the www.conf file on every storage node too. Also you will need to downgrade the FOS kernel. I know it’s a pain, but it is what we have until 1.5.5 is released.
You can copy the kernel files from your master node to the storage nodes. The files are in /var/www/html/fog/service/ipxe. You can use scp to copy them; the files you need are bzImage and bzImage32, and they go in the same location on each of your storage nodes.
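As a rough sketch of that copy step: `copy_kernels` is a hypothetical helper, and the `root` login and the node name in the usage line are assumptions, so substitute whatever account and hostnames apply on your side. With `DRY_RUN=1` it only prints the scp commands so you can check them first.

```shell
#!/bin/sh
# Sketch: copy the two kernel files to the same path on each storage node.
# copy_kernels, the root login, and the node names are assumptions.
IPXE_DIR=/var/www/html/fog/service/ipxe

copy_kernels() {
    for node in "$@"; do
        for f in bzImage bzImage32; do
            if [ "${DRY_RUN:-0}" = 1 ]; then
                # Dry run: only show what would be executed.
                echo "scp $IPXE_DIR/$f root@$node:$IPXE_DIR/$f"
            else
                scp "$IPXE_DIR/$f" "root@$node:$IPXE_DIR/$f"
            fi
        done
    done
}

# Usage (storage1 is a placeholder hostname):
#   DRY_RUN=1; copy_kernels storage1
```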
-
@george1421 said in Install hangs at "Ensuring node username and passwords match":
To see if php-fpm is hooked into apache correctly, launch top from the linux console. Then sort by CPU usage by keying in P. The top 2 or 3 processes should be php-fpm. If it is apache then something isn’t hooked correctly.
I checked this and it seems to be correct. Top processes are php-fpm7.1. Thanks!
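For anyone who would rather not use interactive top, the same check can be approximated from a script. This is only a sketch: `classify_top` is a hypothetical helper, the apache process names (`apache2`, `httpd`) are the usual ones but may differ on your distro, and the `ps` options in the usage line assume the procps version of ps found on most Linux systems.

```shell
#!/bin/sh
# Sketch: read process names (busiest first) on stdin and report whether
# the busiest one is php-fpm or apache. classify_top is a hypothetical name.
classify_top() {
    first=$(head -n 1)
    case "$first" in
        php-fpm*)
            echo "ok: $first is the busiest process" ;;
        apache2*|httpd*)
            echo "warning: $first is the busiest process; php-fpm may not be hooked in" ;;
        *)
            echo "inconclusive: busiest process is $first" ;;
    esac
}

# Usage while the web UI is under load (procps ps assumed):
#   ps -eo comm= --sort=-%cpu | classify_top
```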
-
@george1421 said in Install hangs at "Ensuring node username and passwords match":
@mpmackenna Let’s keep working through it on this one.
Now when/if you have a storage node: if you are going to change modes on that node (Normal Node -> Storage Node), you need to delete the /opt/fog/.fogsettings file. Understand that when you do that you may have issues with the local linux fog user account. The fog installer script should keep everything sane, but if things go wrong then you have a place to look.
Also you will need to make the edits to the www.conf file on every storage node too. Also you will need to downgrade the FOS kernel. I know it’s a pain, but it is what we have until 1.5.5 is released.
You can copy the kernel files from your master node to the storage nodes. The files are in /var/www/html/fog/service/ipxe. You can use scp to copy them; the files you need are bzImage and bzImage32, and they go in the same location on each of your storage nodes.
That sounds great! I am going to run the installer on my Storage Node. This server will be the Normal/Master install. I will make the edits listed and let you know how it goes. Thank you!
-
@george1421 I followed your instructions and the new Storage Node seems to be working. I put them in the same storage group and I am replicating the images to the new node. Thank you for your help!
-
@george1421 I checked my storage node this morning and it appears some of my images did not sync. A number of them did, but there are definitely some that are missing. Syncing has stopped, so it’s not a matter of not enough time to sync. Can you point me to a document or offer assistance as to how I can troubleshoot this issue? My first thought was to just use rsync, but I am concerned that perhaps there are files in the folder that are supposed to differ between a master node in a storage group and a slave node in the same group? Thank you!
-
@mpmackenna If there is an image definition for the image and its set to replicate, then it should replicate for you. You CAN use rsync if you want to seed the remote repository.
There are replication log files in /opt/fog/log that might give you a clue to why certain images were skipped.
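One way to skim those logs, as a sketch only: the search words (error, fail, skip) are guesses at what an interesting line might contain, not the replicator's documented message format, and `scan_replication_logs` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: show the most recent suspicious-looking lines from the logs in
# /opt/fog/log (or another directory given as $1).
# The keywords are assumptions about the log wording, not a FOG-defined list.
scan_replication_logs() {
    log_dir="${1:-/opt/fog/log}"
    # -H prefixes each hit with its file name, -i ignores case.
    grep -HiE 'error|fail|skip' "$log_dir"/*.log 2>/dev/null | tail -n "${2:-20}"
}

# Usage:
#   scan_replication_logs              # last 20 hits from /opt/fog/log
#   scan_replication_logs /opt/fog/log 50
```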
IMO rsync might be a better tool for FOG to use in the future for moving files than its current replication method. But that’s not up to me to decide.
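If you do go the rsync route, it may help to first list which images are missing on the storage node. A minimal sketch, assuming each image is a directory under `/images` (the usual FOG default, but check your storage node definition) and that you can ssh to the node as root; `missing_on_node` and the host name are hypothetical.

```shell
#!/bin/sh
# Sketch: print the names that appear in the master's listing but not in
# the storage node's listing. missing_on_node is a hypothetical helper.
missing_on_node() {
    a=$(mktemp)
    b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    # comm -23 keeps only the lines unique to the first (master) file.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (paths and host are assumptions):
#   ls /images > master.txt
#   ssh root@storage1 ls /images > node.txt
#   missing_on_node master.txt node.txt
#   rsync -a --dry-run /images/ root@storage1:/images/   # preview the seed
```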
-
@george1421 With previous versions of FOG, all the FOG services used to be listed by the “service --status-all” command, but now I don’t see them. If I want to stop the replication service while I run rsync, how would I do that? Thank you!
-
@mpmackenna If you have a systemd based system then it’s:
sudo systemctl stop FOGImageReplicator
If you have a SysV init based system it’s:
sudo service FOGImageReplicator stop
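If you are not sure which kind of system you are on, the choice can be scripted. A sketch: it relies on the `/run/systemd/system` directory existing only when systemd is the running init, and `stop_command` is a hypothetical helper that just prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: print the right stop command for the running init system.
# stop_command is a hypothetical helper; the optional second argument
# overrides the systemd marker directory.
stop_command() {
    svc="$1"
    marker="${2:-/run/systemd/system}"
    if [ -d "$marker" ]; then
        echo "systemctl stop $svc"
    else
        echo "service $svc stop"
    fi
}

# Usage:
#   sudo $(stop_command FOGImageReplicator)
```

Swap `stop` for `start` by hand afterwards to bring replication back once rsync has finished.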
-
@george1421 Got it. I am on a systemd based system. I am still getting used to systemd. Why would services that are listed in systemd not be listed under the output of the services command (not really asking, I can Google that on my own)? Just venting as I continue to relearn all that I thought I knew about Linux. Thanks again! Don’t think I ever would have got this sorted out without your help.
-
@george1421 I replicated my images using rsync and verified that all images are in both locations. I also have a backup of my images off of both of the nodes. I went in and I created a new Storage Group. I associated the Storage Node that is not the Master Server with the new Storage Group. I then associated a number of my images with that Storage Group. My goal is to have some images only reside on the node in the new Storage Group and some only reside on the other Storage Group. I set both servers to be the Master Node for their Storage Group. Based on the warning I read about setting a server to the Master Node I half-way expected for any images that were not associated with that Storage Group to be purged from the Master Node but that didn’t seem to happen, which is fine. Is it now safe to go back and purge images from servers via cli if they reside in an image store on a node that is not in the Storage Group that is associated with that image? Thank you!