FOG Multicast Problems / Partclone fails to finish



  • Thanks for checking things out Tom!

    Looks like stopping and starting the service on both the Server and Fog Storage Node solved the problem. The hosts are happy and the task is going!

    Cheers,

    Joe Gill


  • Senior Developer

    That’s what I suspected. It should be running now. If you wanna check?



  • @Tom-Elliott

    lcarr@Fog:~$ service FOGMulticastManager stop; sleep 5; service FOGMulticastManager start
     * Stopping FOG Computer Imaging Solution: FOGMulticastManager                  start-stop-daemon: warning: failed to kill 1362: No such process
    
    

    And hangs.



  • lcarr@Fog:~$ ps ax | grep udp
     3686 pts/3    S+     0:00 grep --color=auto udp
    

  • Senior Developer

    @Joe-Gill Seeing as FOGMulticastManager doesn’t appear to be running, what happens if you restart the service?

    service FOGMulticastManager stop; sleep 5; service FOGMulticastManager start



  • @Tom-Elliott

    I spoke too soon!! I had a batch fail on me again…

    /var/log/fog/multicast.log

    [06-22-16 6:02:36 pm]  | Sleep time has changed to 10 seconds
    
    :~$ ps aux | grep FOGMulti
    lcarr     3670  0.0  0.0  15944  2284 pts/3    S+   14:50   0:00 grep --color=auto FOGMulti
    

    Let me know if there are any steps you’d like me to try. I’ll be around until 5:30 MST this evening if you’d like to try and remote in. Thanks!!

    Joe Gill



  • Hi guys,

    Tom, thanks again for all the help! Everything is happy. The multicast we set up finished just fine. I’m making a new image today and imaging another lab. I doubt there’ll be any more issues with multicasting.

    I’ll re-post if more issues arise.

    Thanks!!

    Cheers,

    Joe Gill


  • Senior Developer

    Remoted in. The .22 was correct. The .17 was the “central server”, but .22 is where the images are stored.

    The issue(s):

    1. The DB connection used the fog user and its associated password, but that user had not been given privileges on the fog database.
    2. The main server (.17) had bind-address enabled.

    Once I updated the DB to allow the ‘fog’ user to access the fog database AND disabled the bind-address, the node (.22) was able to start up the FOG services (which had been failing because they couldn’t contact the database). I watched the multicast log in particular after setting up a multicast task and saw 3 of the 4 clients in that task link into the multicast side. (Still waiting to hear if partclone will work and if the 4th client worked properly.)

    I suspect this issue will be resolved, but please keep us posted.
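
For anyone hitting the same two problems, here is a minimal sketch of what the fix could look like. The exact statements run during the remote session are not shown in the thread, so everything below is an assumption: the function names are hypothetical, the `'%'` host wildcard is a guess (restrict it to the node address if you prefer), and on newer MySQL releases the account must already exist for that host before the GRANT will succeed.

```shell
#!/bin/sh
# Sketch only: not the exact commands used in the thread.

# Print SQL that grants the 'fog' user access to the 'fog' database.
# ASSUMPTION: the '%' host wildcard; narrow it (for example to 172.16.1.22)
# if only the storage node needs to connect.
fog_grant_sql() {
    cat <<'EOF'
GRANT ALL PRIVILEGES ON fog.* TO 'fog'@'%';
FLUSH PRIVILEGES;
EOF
}

# Comment out any active bind-address line in the given MySQL config file.
# A copy of the original is kept next to it with a .bak suffix.
disable_bind_address() {
    cfg_file="$1"
    sed -i.bak 's/^\([[:space:]]*\)bind-address/\1#bind-address/' "$cfg_file"
}
```

On the main server this might be used as `fog_grant_sql | mysql -u root -p` followed by `disable_bind_address /etc/mysql/my.cnf` and a MySQL restart. The config path is also an assumption and varies by distribution.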



  • @Tom-Elliott
    I’m here still


  • Senior Developer

    Trying to contact via chat also.


  • Senior Developer

    @Joe-Gill Well, the storage configuration you have says your storage node IS the master, which is not what the log files you’re presenting show.



  • @Tom-Elliott

    Hi Tom,

    65.x.x.x is our Public IP address on our router.
    172.16.1.17 is the address for our FOG Server.
    172.16.1.22 is the address for our FOG Storage Node.

    Thanks!

    Joe


  • Senior Developer

    @Joe-Gill Where is 172.16.1.22 located on that same storage node?

    Your log shows 65.x.x.x and 172.16.1.17, not 172.16.1.22 as listed on the storage node configuration.



  • Also, I did add back my FreeNAS storage node as an additional storage node, but it is not master. I can’t seem to wrap my head around this. I did try uploading a new image with the new version of partclone. I’m going to try re-pushing a multicast on Monday, but I don’t think it’s going to matter much.

    The one thing that is strange in this install is that when I initially set up the server, I had my FreeNAS storage as the only node until I had problems. Then I demoted the FreeNAS down to slave and installed the FOG storage node. My main image was originally stored on the FreeNAS server when it was the master, but I’ve since reimaged that machine several times onto the new FOG storage.



  • [Attached screenshot: 0_1466111073840_screenshot.jpg]


  • Developer

    @Joe-Gill You don’t need to get rid of the public address. That log line is only informational; FOG wouldn’t use that interface unless you configure it to do so.

    [06-16-16 8:37:57 pm] | This is not the master node

    Well, that’s something I’d say. Please check your storage configuration in the web gui…



  • I checked the /var/log/fog/multicast.log, and it still says:

    [06-16-16 8:37:57 pm] Interface Ready with IP Address: 65.XXX.XX.194
    [06-16-16 8:37:57 pm] Interface Ready with IP Address: 172.16.1.17
    [06-16-16 8:37:57 pm] Interface Ready with IP Address: Fog
    [06-16-16 8:37:57 pm] * Starting MulticastManager Service
    [06-16-16 8:37:57 pm] * Checking for new items every 10 seconds
    [06-16-16 8:37:57 pm] * Starting service loop
    [06-16-16 8:37:57 pm] | This is not the master node

    How do I get rid of the 65.xxx.xx.194 address? This is the outside (public) IP address on our router.
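
To make this kind of log check repeatable, a small helper along these lines may be useful. It is only a sketch: the function name is hypothetical, and it assumes the log lines keep the exact wording shown in the excerpt above.

```shell
#!/bin/sh
# Summarise a FOG multicast log: print each address the service reported
# as an interface, then warn if the service says it is not the master node.
# The log path is passed as the first argument.
summarize_multicast_log() {
    log_file="$1"
    # Print whatever follows "Interface Ready with IP Address: " on each line.
    sed -n 's/.*Interface Ready with IP Address: //p' "$log_file"
    # Flag the message that means multicast tasks will not start on this host.
    if grep -q 'This is not the master node' "$log_file"; then
        echo "WARNING: service reports it is not the master node"
    fi
}
```

For example, `summarize_multicast_log /var/log/fog/multicast.log` uses the log path quoted in this thread.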



  • And yes this is the master node. I removed my secondary storage node just to be sure. I’ll go check the logs since I did that.

    Restarted a Multicast with 2 machines and it’s stuck at the Partclone screen.



  • @Tom-Elliott

    ps -ef | grep 'FOGMulti' | grep -v grep

    Revealed nothing running.

    Should I try re-running the multicast?
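
The same check can be wrapped in a small function so that it gives a clear yes or no. This is a minimal sketch with a hypothetical function name; it reads `ps` output on standard input and, like the pipeline above, ignores the grep command itself.

```shell
#!/bin/sh
# Succeed (exit status 0) only if the ps output on stdin lists a
# FOGMulticastManager process other than the grep used to search for it.
has_multicast_manager() {
    grep -v 'grep' | grep -q 'FOGMulti'
}
```

A possible use is `ps -ef | has_multicast_manager && echo "running" || echo "not running"`.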


  • Senior Developer

    @Joe-Gill Good, now start it back up with service FOGMulticastManager start

