Multicasting Issues



  • So we are trying to do a multicast deployment. The task is started, and the clients are booted. All the clients eventually show a progress screen of stars:

    * * * * … ad infinitum …

    The server logs the same message over and over:

    Checking if I am the group manager
    I am the group manager

    And that’s it. These machines are all on the same VLAN, and multicasting is enabled on the switch. Yet nothing happens :(
    Our network admin has seen the clients join the multicast group, yet the server never picks up any of them. So where do I start looking? We are using Alcatel switches.



  • I found a post on another site (I can’t find it now to save my life) that suggested editing /etc/hosts and changing the IP
    addresses from 127.0.0.1 to whatever you set as the server’s IP address. I changed them both and it worked like a charm. If I come across that post again I’ll add it here.
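
A quick way to check whether a server is affected by the /etc/hosts issue described above is to see which addresses the hosts file assigns to the server's own name. This is only a rough sketch: the function names `hosts_addrs_for` and `maps_to_loopback` are made up for illustration, and it assumes the server name appears literally in the hosts file.

```shell
#!/bin/sh
# Hypothetical helper: print the addresses that a hosts file maps to a name.
# Arguments: $1 = host name, $2 = hosts file (defaults to /etc/hosts).
hosts_addrs_for() {
    awk -v name="$1" '
        { sub(/#.*/, "") }   # strip comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "${2:-/etc/hosts}"
}

# Succeeds when the name is mapped to a 127.x.x.x loopback address.
maps_to_loopback() {
    hosts_addrs_for "$1" "$2" | grep -q '^127\.'
}

# Example: warn if this server's own host name is mapped to loopback.
if maps_to_loopback "$(hostname)"; then
    echo "WARNING: $(hostname) is mapped to a loopback address in /etc/hosts"
fi
```

If the warning appears, the fix reported in the post above was to replace the loopback address on those lines with the server's real IP address.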


  • Developer

    Please, Todd, paste your mc_checkin.php code. I am trying to find out where the problem is.



  • I have the same problem. MadX, did you get resolution from Fernando’s posts?


  • Developer

    Hi,

    I have located the problem. When the client runs the fog script, the script calls a web service: mc_checkin.php
    [CODE]fog: line 108

    else
        echo -n " * Checking In...";
        queueinfo=`wget -q -O - "http://${web}service/mc_checkin.php?mac=$mac" 2>/dev/null`
        echo "Done";
        while [ "$queueinfo" != "##" ]
        do
            echo -n " * $queueinfo ";
            queueinfo=`wget -q -O - "http://${web}service/mc_checkin.php?mac=$mac" 2>/dev/null`
            sleep 5;
        done

    fi[/CODE]
    Line 110 calls the web service:
    queueinfo=`wget -q -O - "http://${web}service/mc_checkin.php?mac=$mac" 2>/dev/null`
    The script prints “Done” on the screen, but the queueinfo variable is empty. This is the problem.

    The mc_checkin.php web service isn’t working properly. Put checkpoints in its code using “echo” to see where the problem is.
    mc_checkin.php is in the /var/www/fog/service directory.
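
To see what the client sees, the same request can be made by hand from any machine and the reply classified the way the quoted loop treats it. This is a minimal sketch, not part of FOG: `classify_checkin` and `checkin_once` are hypothetical names, the labels "empty", "ready" and "waiting" are mine, and the base URL is an assumption that must match your own server.

```shell
#!/bin/sh
# Classify the text returned by mc_checkin.php, following the loop quoted
# above: "##" ends the wait loop, any other text is shown as a status
# message, and an empty reply is the failure case described in this post.
classify_checkin() {
    case "$1" in
        "")   echo "empty" ;;
        "##") echo "ready" ;;
        *)    echo "waiting: $1" ;;
    esac
}

# Hypothetical wrapper: query the service once for a given MAC address.
# $1 = base URL ending in "/" (assumed layout), $2 = client MAC address.
checkin_once() {
    reply=$(wget -q -O - "${1}service/mc_checkin.php?mac=${2}" 2>/dev/null)
    classify_checkin "$reply"
}

# Usage (placeholder values, replace with your own):
#   checkin_once "http://<server>/fog/" "<client-mac>"
```

An "empty" result from a workstation would suggest the fault is on the server side (the web service itself) rather than in the client's boot environment.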



  • Okay, we have now set up the server and 3 clients in a controlled environment.
    They all get their images - and boot into the FOG kernel.
    The screen gets to this point, i.e. there is no “Please Wait” screen:

    [url="/_imported_xf_attachments/0/44_IMG-20120217-00029.jpg?:"]IMG-20120217-00029.jpg[/url]



  • The FOGMulticastManager is on and logs every 10 seconds (“Checking if I am the group manager …”).
    (Checked with ps aux | grep Multicast.)
    Logging is going on.
    The udp-sender command is being executed (see the udp-sender gzip -d …).
    The clients don’t crash: they boot into their kernels and the stars appear.

    What I plan to do in the morning is connect 5-6 clients to a dumb switch (i.e. take VLANs out of the picture). We are currently running Alcatel switches and all the clients are on the same switch, but they are VLAN tagged. I will report the results.
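
The checks listed in this post can be gathered into a couple of small shell functions. This is a sketch under assumptions: the function names are made up, the log path is the one mentioned elsewhere in this thread, and the exact wording of the log lines is taken from the messages quoted in the first post.

```shell
#!/bin/sh
# Succeeds if a FOGMulticastManager process is listed by ps.
# The [F] bracket keeps grep from matching its own command line.
manager_running() {
    ps aux | grep -q '[F]OGMulticastManager'
}

# Summarise a multicast log: how many idle "group manager" checks it holds
# and whether a udp-sender command was ever logged.
# $1 = log file (defaults to the path mentioned in this thread).
summarise_mc_log() {
    log=${1:-/opt/fog/log/multicast.log}
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
    idle=$(grep -c 'Checking if I am the group manager' "$log")
    if grep -q 'udp-sender' "$log"; then sender=yes; else sender=no; fi
    echo "idle_checks=$idle udp_sender_logged=$sender"
}
```

A log with many idle checks and no udp-sender entry would point at the server never starting a session, whereas a logged udp-sender with clients still showing stars points at the client check-in or the network path.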


  • Developer

    Obviously, if unicast tasks work, the kernel shouldn’t be the problem.
    I use FOG 0.30; FOG 0.32, or whatever version you run, probably works in the same way.
    Life cycle of a multicast task (short version ;) ):
    1. You ask the server to run a multicast task (in the web interface).
    2. In the fog database:
       1. tasks table: one record per client. tasksType=C, taskState=0
       2. MulticastSession: one record. msState=0
       3. MulticastSessionAssoc: this table links one MulticastSession with “M” client tasks (1:M relation).
    3. The FOGMulticastManager service queries the database every 5 seconds to see whether there are new multicast tasks (you can see the FOGMulticastManager log in /opt/fog/log/multicast.log). If a task is new, it:
       1. Updates MulticastSession.msState=1
       2. Creates a udp-sender command (see multicast.log), which creates a multicast session on the server.
    4. A WOL is sent to the clients.
    5. The client wakes up.
    6. The client connects to the PXE server.
    7. The client downloads the kernel.
    8. The client loads the kernel.
    9. The client downloads the init.gz file.
    10. The client loads the init.gz file.
    11. The client runs the “fog” script. This script is in /bin.
       1. The script checks some parameters: OS, image file, …
       2. The client updates the DB, via the web service, and tasks.taskState=1
       3. The script mounts the server’s /images directory.
       4. The script does a few more things.
       5. The script runs the udp-receiver command.
       6. Blue screen “Please Wait”.
       7. All clients join the multicast session. The download process begins.
       8. The download process ends.
       9. The client updates the DB, via the web service, and tasks.taskState=2
    12. The FOGMulticastManager service updates the DB, and now MulticastSession.msState=2
    13. The multicast task ends.
    Some questions:
    • Is FOGMulticastManager on?
    • What does the FOGMulticastManager log say?
    • Is the udp-sender command running?
    • When does the client crash?



  • Yep … works a charm … obviously when doing 40+ machines at a time - the speed slows to about 40MB / min. This is why we are trying to get multicasting working … Thanks for the replies so far :)


  • Developer

    And can you deploy the image using unicast?



  • Everything starts up fine :( That’s the strange part - other than a vesafb error - but that’s just graphics & has never interfered before …


  • Developer

    [quote=“MadX, post: 1345, member: 456”]
    * * * * … ad infinitum …
    [/quote]
    And the previous messages?
