Storage nodes not deploying images



  • Server
    • FOG Version: 1.4.0-RC-4
    • OS: Ubuntu 16.04 LTS
    Description

    I installed Fog 1.3.5 on my normal server, added a couple of images and then deployed them without problems. This all went incredibly easily. I then created a new server and added it as a storage node, sorted out getting the images to replicate, but was unable to deploy from this new storage node; everything still deployed from the main (normal) server. I made certain that max clients was set to 3 on each, yet when I deployed 6 machines at the same time all stayed on the main server. I was frustrated, but after a while found that the new storage node had installed with version 1.4.0, whereas the main server was running 1.3.5. I upgraded the main server and rebooted them both, and happily found that it now pushed 3 deployments to each server (although the ones deploying from the storage node were quite a bit slower than those deploying from the main server).

    I then built another server, added Fog and configured it as a storage node, set it up on the management site, and the images copied over. I changed all the storage nodes to ‘max clients: 2’ and rebooted all three servers. It has now reverted to all 6 deployments pushing from the main server and nothing from the storage nodes.

    All three servers are running on Windows Hyper-V, all are running Ubuntu 16.04 LTS and all running Fog 1.4.0. Only the main server is set to ‘Master node’ right now.

    Any ideas why my two storage node servers aren’t picking up deployments and why the main server is accepting 6 clients when it is set to ‘Max clients: 2’?

    Thanks in advance

    Peter


  • Moderator

    @george1421 said in Storage nodes not deploying images:

    @EuroEnglish AFAIK, multicast sessions are only hosted by the master server. Unless something has changed in the multicast bits only the master server sends out the image.

    This has changed; the non-master storage nodes are now able to multicast too. It’s something Tom worked on maybe 6 months ago. It was done so that multicast wouldn’t go across the WAN link and would conform to the settings set in Location Management.



  • @george1421
    I would imagine that the most limiting factor would be the 1Gb limitation on the master server NIC itself. I looked into bonding a couple of NICs on my master server, but most reviews seem to suggest that this actually causes problems and can slow things down. The NICs would share a common IP address in Ubuntu/Fog but have two different IP addresses on the Hyper-V host computer it is running on, which leads to confusion with packets being sent and received out of order.

    Ultimately the rest of the network is running 1Gb fiber as well, so even increasing the server to 2 bonded 1Gb NICs would still bottleneck.

    I guess at this stage I am running about as fast as possible with Fog, and it is much faster than my old WDS system. I can remove all the storage nodes, as we have a single campus and deploy without registering clients, so everything will go to the master server anyway. All I need to do is make sure that I make a good backup of my Hyper-V virtual machine, then make checkpoints before doing any updates to Ubuntu and/or the Fog server; that way I am covered in the event of a crash.

    This imaging solution is far superior to WDS and Ghost: creating and capturing images is easier and faster, and deploying is much faster. It will save weeks of work over this summer alone, and moving forwards it will certainly get better and better. I only wish I had known about this a few years ago, I would have far fewer grey hairs ;-)

    I am going to mark this as resolved, given that I was trying to do something it simply wasn’t built to do that way. However I really want to thank both you and Wayne for your help; it has helped find and resolve some other issues I didn’t even know I had. Plus it has given me a better understanding of Ubuntu and Fog, which is just as important for a newbie like myself to both systems.

    Regards

    Peter


  • Moderator

    @EuroEnglish said in Storage nodes not deploying images:

    however both my 2 storage nodes seemed to be receiving the same amount of data

    Understand that a multicast will be received by all hosts on your network unless you are running IGMP snooping or multicast routing in sparse mode.

    Your bandwidth will fluctuate as your clients consume the image. The faster systems will have to wait for the slower systems to consume the image. This is because there is a single data stream consumed by all.

    create a larger number of storage nodes to increase the combined available bandwidth.

    Remember with a multicast there is 1 data stream with only 1 talker and X number of listeners. Adding storage nodes will not make anything work better, faster, brighter. You can multicast to 1 node or 1000 nodes, it consumes the same bandwidth (well that is stretching it a bit [because there are ACK packets sent from the clients back to the FOG server], but I hope you understand).

    If you want to go faster get a fast FOG server running on SSD disks. Switch the data compression from gzip to zstd and have fast clients (to consume the image faster).
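To put rough numbers on the "1 talker, X listeners" point, here is a back-of-the-envelope sketch. It is only a model under stated assumptions: the 20 GB image size and the link speeds are hypothetical figures, and it ignores the ACK traffic, disk speed, compression and protocol overhead mentioned above.

```python
def unicast_seconds(image_gb, clients, link_gbps):
    """Unicast: the server NIC carries one full copy of the image per client."""
    total_gigabits = image_gb * 8 * clients
    return total_gigabits / link_gbps


def multicast_seconds(image_gb, link_gbps, slowest_client_gbps):
    """Multicast: a single stream, paced by the slower of the link and the slowest client."""
    effective_gbps = min(link_gbps, slowest_client_gbps)
    return image_gb * 8 / effective_gbps


# Hypothetical example: 20 GB image, 6 clients, 1 Gbps server NIC.
unicast_total = unicast_seconds(20, 6, 1.0)
multicast_total = multicast_seconds(20, 1.0, 1.0)
multicast_slow_client = multicast_seconds(20, 1.0, 0.5)
```

In this model the multicast time does not depend on the number of listeners at all, which is why adding storage nodes does not help it, while one slow client stretches the session for everyone.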


  • Moderator

    @EuroEnglish AFAIK, multicast sessions are only hosted by the master server. Unless something has changed in the multicast bits only the master server sends out the image.



  • @Wayne-Workman

    Hi Wayne,
    I just created a simple multicast session, a great suggestion from George, for 6 hosts as a test. It ran really well and seems much faster than 6 individual unicast sessions. I am wondering how Fog processes multicast sessions. When the multicast was running I monitored the Fog Management home page and noticed that my main server was the only thing showing transmit data, however both my 2 storage nodes seemed to be receiving the same amount of data. Does multicast push from the main server to the hosts? Or does the main server push to the storage nodes and they distribute the session? My bandwidth was fluctuating between 500 Mbps and 3000 Mbps, which tells me that all three servers must be pooling the multicast session, but maybe the bandwidth monitor on the dashboard isn’t accurate?

    I am trying to work out, once I start deploying 20 or more hosts at a time, what the best method would be. If multicasting utilizes the normal server and all storage nodes I can create a larger number of storage nodes to increase the combined available bandwidth. If multicasting only uses the normal server though, I may as well not have more than one storage node (a good way to make sure I have backups of images in case of normal server failure).

    I am not finding any information online about how the multicasting distributes workload and this could define whether smaller unicast deploys or larger multicast would work better for me.

    Thanks again for your help, really helped on the issues the Wiki didn’t cover.

    Regards

    Peter

    Thanks again for all your help with this.

    Peter


  • Moderator

    @EuroEnglish said in Storage nodes not deploying images:

    I am not registering my hosts and simply selecting deploy, therefore it seems to be ignoring settings as the hosts aren’t running as clients. Could that be right?

    This is why random storage nodes are being selected for imaging despite location settings: the location plugin only works with registered hosts. FOG, for all intents and purposes, doesn’t really care a lot about non-registered hosts. All of FOG’s features come to life when hosts are registered.



  • @george1421

    Hi George,
    I just created a simple multicast session for 6 hosts as a test. It ran really well and seems faster than 6 individual unicast sessions. I am wondering how Fog processes multicast sessions. When the multicast was running I monitored the Fog Management home page and noticed that my main server was the only thing showing transmit data, however both my 2 storage nodes seemed to be receiving the same amount of data. Does multicast push from the main server to the hosts? Or does the main server push to the storage nodes and they distribute the session? My bandwidth was fluctuating between 500 Mbps and 3000 Mbps, which tells me that all three servers must be pooling the multicast session, but maybe the bandwidth monitor on the dashboard isn’t accurate?

    Thanks again for all your help with this.

    Peter



  • @Wayne-Workman

    Hi Wayne,
    After a few small issues sorting out user names and passwords, I reran the Fog Installer and then set the ownership/permissions. That all looks good now. Thanks for that help.

    I ran into a problem with my second storage node and deleted it from Storage Management, then added it back in, which put it at the top of the list in “All storage nodes”. Now when I deploy to the 6 computers all of them deploy from this storage node, previously all pulled from my main server (which was at the top of the storage management list before).

    I think that my problem is that I am misunderstanding the ‘Max Clients’ option. I presumed that it would send an image to the host from one node until it hit the max client number, then start sending from the next node automatically, and so on. I am not registering my hosts and simply selecting deploy, therefore it seems to be ignoring settings as the hosts aren’t running as clients. Could that be right?

    Thanks again

    Peter


  • Moderator

    @EuroEnglish The fog installer along with a couple extra commands can fix these things.

    SSH into each box, become root with sudo -i
    Make sure that inside /opt/fog/.fogsettings (as root) the username field is set to fog. Then re-run the fog installer (again, as root). This will fix the passwords & user names both on the local system & in the DB. Then on each box, reset ownership & permissions of the images with chown -R fog:root /images;chmod -R 777 /images
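If you want to see what is off before (or after) running the chown/chmod, a read-only check along these lines can list the mismatches. This is a minimal sketch, not part of FOG: the function name is made up for illustration, it only checks the owning user and the permission bits (not the group), and it changes nothing on disk.

```python
import os
import stat


def find_mismatches(root, expected_uid, expected_mode=0o777):
    """Return (path, uid, mode) for every directory and file under root
    whose owner or permission bits differ from the expected values."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in paths:
            info = os.lstat(path)  # lstat: report on links themselves, do not follow them
            mode = stat.S_IMODE(info.st_mode)  # keep only the permission bits
            if info.st_uid != expected_uid or mode != expected_mode:
                mismatches.append((path, info.st_uid, oct(mode)))
    return mismatches
```

On a FOG box this could be called as find_mismatches("/images", pwd.getpwnam("fog").pw_uid) after an import pwd; an empty list means every entry is owned by fog with mode 777.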



  • @Wayne-Workman

    I have deleted the ‘Default’ storage group, it will never be used and so I guess there is no point leaving it there. I will add another group, ‘Teacher’, to separate the different images, but that can wait until I have everything running correctly. That should clear up those log file entries I would imagine.



  • @george1421

    Well spotted on the owner difference, I didn’t even notice that one. Strange, as it is pretty obvious now you mention it. All the image replication was done by Fog itself, however it could be that I messed something up when I was setting up the different users on the different servers. Again, that could be part of the problem.


  • Moderator

    @EuroEnglish The other thing I see is file ownership is different on each system too. That’s not right.

    Fog will use the user account listed for each storage node to copy the files from the master node to the remote nodes. This is done over ftp from the master server in the storage group to all slave nodes in the same storage group. Systems defined in FOG but in a different storage group will be left out of the replication cycle.

    Did you seed the files on each storage node or did FOG do this?
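The replication rule described above (master to the other nodes in the same storage group, other groups left out) can be modelled in a few lines. This is only an illustration of that description, not FOG's actual code; the dictionary keys and the node and group names are hypothetical.

```python
def replication_targets(nodes, group):
    """Return (master_name, [target names]) for one storage group.

    nodes is a list of dicts with the hypothetical keys
    'name', 'group' and 'is_master'.
    """
    members = [n for n in nodes if n["group"] == group]
    masters = [n["name"] for n in members if n["is_master"]]
    if len(masters) != 1:
        # No master (or an ambiguous setup): nothing to replicate from in this model.
        return None, []
    targets = [n["name"] for n in members if not n["is_master"]]
    return masters[0], targets


# Hypothetical layout: two nodes in one group, a third node in another group.
example_nodes = [
    {"name": "main", "group": "Student", "is_master": True},
    {"name": "node1", "group": "Student", "is_master": False},
    {"name": "node2", "group": "Teacher", "is_master": False},
]
```

With this layout, replication_targets(example_nodes, "Student") gives main as the source and only node1 as a target; node2 sits in a different group and is skipped.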



  • @george1421

    Based on your thoughts about the Image replication log files I used Webmin to check the Images folder on each server. It seems that you might be correct, it shows that there is a small file size difference between the main server and the two storage servers. It also seems that rights didn’t transfer the same, both storage servers have different rights to the files.

    I marked each file manager with the server for reference:

    0_1491955601541_Fog File manager.jpg


  • Moderator

    @george1421 He has another group called default that doesn’t have any nodes in it. I know this from the screenshot he posted of the image’s storage groups; default is listed there. I’m guessing that’s what the logs are referring to.


  • Moderator

    @george1421 Something else to consider is that your image push time will be less than 5 min per system per unicast. Can you stage these systems (next system) that fast? It will take longer for Windows to install than it will to push out 5 systems in series.



  • @george1421

    Thanks for the idea regarding multicasting, I will look into setting that up in the Wiki - Seems almost everything you would ever need about Fog is there somewhere. We were running WDS on a Windows server for the last few years, but something went wrong and it will not capture new images. Still deploys, but not a good idea to have old images that take longer to update than it did to deploy the image ;-)

    I was researching how to fix the WDS server and came across a forum for Fog. This solution is a life-saver: no more sysprep, and capturing images is much easier. I also find that deployment is much faster with Fog. We work in a high school and image the machines over the summer break, so bandwidth isn’t much of a problem: there is individual 1Gb Ethernet to each server and fiber point to point between the server room switches and my imaging location switches. However, I agree that multicasting might still speed things up.


  • Moderator

    @EuroEnglish Sorry to keep asking you for screen shots and more screen shots. The one I see missing is the storage group association. All the other ones look good.

    Just be aware that the storage nodes and location plugin were intended for a multi-site or multi-building (subnet) setup where you would dedicate certain workstations to specific locations. It’s not really set up to do what I’m calling overflow imaging (you deploy to server 1 until max clients is reached, then the next client is assigned to another storage node until its max client level is reached, and so on). The storage node concept doesn’t work that way.
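For clarity, the "overflow imaging" behaviour described above would look something like the sketch below. To be explicit: this models what was expected in this thread, not how FOG picks a storage node. The function and node names are hypothetical.

```python
def overflow_assign(nodes, client_count):
    """Assign clients to nodes in list order, moving on once a node is full.

    nodes is a list of (name, max_clients) tuples in priority order.
    Returns one entry per client: a node name, or None if every node is full.
    """
    load = {name: 0 for name, _ in nodes}
    assignments = []
    for _ in range(client_count):
        target = None
        for name, max_clients in nodes:
            if load[name] < max_clients:
                target = name
                break
        if target is not None:
            load[target] += 1
        assignments.append(target)
    return assignments


# Hypothetical setup matching the thread: three servers, max clients 2 each.
expected_spread = overflow_assign([("main", 2), ("node1", 2), ("node2", 2)], 6)
```

Under this model six clients would spread two per server, which is the result hoped for in the original post; per the explanation above, the storage node concept simply is not built around it.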



  • @Wayne-Workman

    Hi Wayne,
    Here is the location management screenshot.

    0_1491953785511_Fog Location management.jpg


  • Moderator

    Looking at the log, it appears that your storage groups may not be set up correctly.

    [04-11-17 4:06:27 pm] | Image Name: Student-E460
    [04-11-17 4:06:27 pm] * Found Image to transfer to 2 nodes
    [04-11-17 4:06:27 pm] * Attempting to perform Group -> Nodes image replication.
    [04-11-17 4:06:27 pm] | There are no other members to sync to.
    [04-11-17 4:06:27 pm] | Image Name: Student-Twist
    [04-11-17 4:06:27 pm] * Not syncing Image between groups
    [04-11-17 4:06:27 pm] | There are no other members to sync to.
    [04-11-17 4:06:27 pm] | Image Name: Student-T440-Old
    [04-11-17 4:06:27 pm] * Not syncing Image between groups
    [04-11-17 4:06:27 pm] | There are no other members to sync to.
    [04-11-17 4:06:27 pm] | Image Name: Student-E460
    [04-11-17 4:06:27 pm] * Not syncing Image between groups
    

    This part of the log may need some dev clarification (and some possible code changes). There is no need to sync because??? The file already exists on the target server or the FOG replicator is being a bit lazy??

    [04-11-17 4:06:30 pm] | Student-E460: No need to sync d1.partitions file to FOG-Ubuntu-Slave1
    [04-11-17 4:06:30 pm] | Student-E460: No need to sync d1.original.swapuuids file to FOG-Ubuntu-Slave1
    [04-11-17 4:06:30 pm] | Student-E460: No need to sync d1.original.fstypes file to FOG-Ubuntu-Slave1
    [04-11-17 4:06:29 pm] | Student-E460: No need to sync d1.minimum.partitions file to FOG-Ubuntu-Slave1
    [04-11-17 4:06:29 pm] | Student-E460: No need to sync d1.mbr file to FOG-Ubuntu-Slave1
    [04-11-17 4:06:29 pm] | Student-E460: No need to sync d1.fixed_size_partitions file to FOG-Ubuntu-Slave1
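
When the replication log gets long, a small tally of the "No need to sync" lines per image and node can make it easier to compare servers. This is a sketch that assumes the line layout shown in the excerpts above; it does not interpret why the replicator skipped a file.

```python
import re
from collections import Counter

# "[timestamp] <| or *> message", as in the excerpts above.
LINE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s+(?P<marker>[|*])\s+(?P<text>.*)$")
# "<image>: No need to sync <file> file to <node>"
NO_SYNC = re.compile(
    r"^(?P<image>[^:]+): No need to sync (?P<file>\S+) file to (?P<node>\S+)$"
)


def skipped_files(log_text):
    """Count 'No need to sync' entries per (image, node) pair."""
    counts = Counter()
    for raw in log_text.splitlines():
        line = LINE.match(raw.strip())
        if not line:
            continue  # blank or unrecognised line
        hit = NO_SYNC.match(line.group("text").strip())
        if hit:
            counts[(hit.group("image"), hit.group("node"))] += 1
    return counts


sample = """
[04-11-17 4:06:27 pm] | Image Name: Student-E460
[04-11-17 4:06:30 pm] | Student-E460: No need to sync d1.partitions file to FOG-Ubuntu-Slave1
[04-11-17 4:06:29 pm] | Student-E460: No need to sync d1.mbr file to FOG-Ubuntu-Slave1
"""
sample_counts = skipped_files(sample)
```

Running it against the log from each storage node and comparing the counts per image would show quickly whether one node is being skipped more than another.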
    
