Multicasting Issues
-
So we are trying to do a multicast deployment. The task is started, and the clients are booted. All the clients eventually show a progress screen of stars:
* * * * * * * … ad infinitum …
The server logs the same message over and over:
Checking if I am the group manager
I am the group manager
And that’s it. These machines are all on the same VLAN, and multicasting is enabled on the switch. Yet nothing happens.
Our network admin has seen the clients connect to the mcast group, yet the server never picks up any of these clients … so where do I start looking? We are using Alcatel switches …
-
[quote=“MadX, post: 1345, member: 456”]
* * * * * * * … ad infinitum …
[/quote]
And the previous messages??
-
Everything starts up fine. That’s the strange part. There is a vesafb error, but that’s just graphics and has never interfered before …
-
And can you deploy the image using unicast?
-
Yep … works a charm. Obviously, when doing 40+ machines at a time, the speed slows to about 40 MB/min. This is why we are trying to get multicasting working. Thanks for the replies so far.
-
If unicast tasks work, the kernel shouldn’t be the problem.
I use FOG 0.30; FOG 0.32, or whatever version you run, probably works the same way.
Life cycle of a multicast task (short version):
[LIST=1]
[*]You ask the server to create a multicast task (in the web interface).
[*]In the FOG database:
[LIST=1]
[*]tasks table: one row per client. tasksType=C, taskState=0
[*]MulticastSession: one row. msState=0
[*]MulticastSessionAssoc: this table links one MulticastSession with “M” client tasks (1:M relation).
[/LIST]
[*]FOGMulticastManager service. This service queries the database every 5 seconds to check for new multicast tasks (you can see the FOGMulticastManager log in /opt/fog/log/multicast.log). If the task is new, it will:
[LIST=1]
[*]Update MulticastSession.msState=1
[*]Create a udp-sender command (see multicast.log). This creates a multicast session on the server.
[/LIST]
[*]The server sends a WOL packet to the clients.
[*]The client wakes up.
[*]The client connects to the PXE server.
[*]The client downloads the kernel.
[*]The client loads the kernel.
[*]The client downloads the init.gz file.
[*]The client loads the init.gz file.
[*]The client runs the “fog” script. This script is in /bin.
[LIST=1]
[*]The script checks some parameters: OS, image file, …
[*]The client updates the DB, via web service, and sets tasks.taskState=1
[*]The script mounts the /images directory from the server.
[*]The script does a few more things.
[*]The script runs the udp-receiver command.
[*]Blue screen “Please Wait”
[*]All clients join the multicast session. The download process begins.
[*]The download process ends.
[*]The client updates the DB, via web service, and sets tasks.taskState=2
[/LIST]
[*]The FOGMulticastManager service updates the DB, and now multicastSession.msState=2
[*]The multicast task ends.
[/LIST]
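The state numbers in the life cycle above can be kept straight with a small lookup. This is only a sketch: the values 0, 1 and 2 come from the steps above, while the function name and the readable labels are my own and are not part of FOG.

```shell
#!/bin/sh
# Hypothetical helper, not part of FOG: translate the taskState / msState
# numbers described in the life cycle above into readable labels.
describe_state() {
    case "$1" in
        0) echo "queued" ;;        # task or session created, nothing started yet
        1) echo "in progress" ;;   # session started, or client has checked in
        2) echo "complete" ;;      # download finished
        *) echo "unknown" ;;       # any value not mentioned in the list above
    esac
}

describe_state 1
```

When you look at the tasks and MulticastSession rows while a task is stuck, comparing the two state values shows how far through the list the task got.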
Some questions:
[LIST]
[*]Is FOGMulticastManager running?
[*]What does the FOGMulticastManager log say?
[*]Is the udp-sender command running?
[*]When does the client crash?
[/LIST]
-
The FOGMulticastManager is on and logs every 10 seconds (Checking if I am the group manager …); I checked with ps aux | grep Multicast.
There is logging going on.
The udp-sender command is being executed (see the udp-sender gzip -d …).
The clients don’t ‘crash’: they boot into their kernels and the stars appear.
What I am planning on doing in the morning is connecting 5-6 clients to a dumb switch (i.e. take the VLAN out of the picture). We are currently running Alcatel switches and all the clients are on the same switch, but are VLAN tagged. I will report the results.
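For anyone repeating these checks, they can be scripted roughly like this. It is a sketch only: the log path /opt/fog/log/multicast.log and the udp-sender name come from this thread, but the function name is made up and the exact wording of the log lines may differ on your version.

```shell
#!/bin/sh
# Hypothetical helper: report whether a multicast log mentions a udp-sender
# command at all. Pass the log path as the first argument.
log_has_udp_sender() {
    # grep -q prints nothing and exits 0 on the first match
    grep -q "udp-sender" "$1" 2>/dev/null
}

LOG="${1:-/opt/fog/log/multicast.log}"   # path as quoted earlier in the thread

if log_has_udp_sender "$LOG"; then
    echo "udp-sender command found in $LOG"
else
    echo "no udp-sender command in $LOG (or log not readable)"
fi

# Is the service process itself alive? The [M] stops grep matching itself.
ps aux | grep "[M]ulticast" || echo "no Multicast process found"
```

If the log shows a udp-sender line and the process is alive, the server side has at least reached the point of creating the session, which points the search towards the clients.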
-
[ATTACH=full]44[/ATTACH]
Okay, we have now set up the server and 3 clients in a controlled environment.
They all get their images - and boot into the FOG kernel.
The screen gets to this point, i.e. there is no “Please Wait” screen: [url=“/_imported_xf_attachments/0/44_IMG-20120217-00029.jpg?:”]IMG-20120217-00029.jpg[/url]
-
Hi,
I have located the problem. When the client runs the fog script, the script calls a web service: mc_checkin.php
[CODE]# fog script, starting at line 108
else
    echo -n " * Checking In...";
    # Ask the server whether this client may join the multicast session
    queueinfo=`wget -q -O - "http://${web}service/mc_checkin.php?mac=$mac" 2>/dev/null`
    echo "Done";
    # Keep polling until the server answers "##"
    while [ "$queueinfo" != "##" ]
    do
        echo -n " * $queueinfo ";
        queueinfo=`wget -q -O - "http://${web}service/mc_checkin.php?mac=$mac" 2>/dev/null`
        sleep 5;
    done
fi[/CODE]
Line 110 calls the web service:
[I]queueinfo=`wget -q -O - "http://${web}service/mc_checkin.php?mac=$mac" 2>/dev/null`[/I]
The script prints “Done” on the screen, but the queueinfo variable is empty. This is the problem: the mc_checkin.php web service isn’t working properly. Put checkpoints in its code using “echo” to see where it fails.
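One way to narrow this down is to make the same request by hand and look at what comes back. The sketch below assumes, based only on the loop quoted above, that "##" means the client may proceed and that any other non-empty text is a status message. The function name and the FOG_WEB and MAC variables are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper: classify a reply from mc_checkin.php, going only by
# the loop quoted above ("##" ends the wait, anything else is printed).
classify_checkin() {
    case "$1" in
        "")   echo "empty" ;;    # the failure described in this post
        "##") echo "ready" ;;    # the loop would exit here
        *)    echo "waiting" ;;  # the loop would print this text and retry
    esac
}

# Placeholders, set these to your own server and client before running:
#   FOG_WEB="192.168.1.10/fog/"  MAC="00:11:22:33:44:55"
if [ -n "${FOG_WEB:-}" ] && [ -n "${MAC:-}" ]; then
    reply=$(wget -q -O - "http://${FOG_WEB}service/mc_checkin.php?mac=${MAC}" 2>/dev/null)
    echo "mc_checkin.php reply is: $(classify_checkin "$reply")"
fi
```

An "empty" result from a normal workstation reproduces what the client sees, without having to boot a client each time.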
mc_checkin.php is in the /var/www/fog/service directory.
-
I have the same problem. MadX, did you get resolution from Fernando’s posts?
-
Todd, please paste your mc_checkin.php code. I’ll try to work out where the problem is.
-
I found another post on another site (can’t find it now to save my life) that suggested going to /etc/hosts and changing the IP addresses from 127.0.0.1 to whatever you set as the IP address for the server. I changed them both and it worked like a charm. If I come across that post again I’ll add it in.
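To see which entries that change would touch, something like the following lists the names a hosts file maps to a loopback address. It is a sketch under assumptions: the function name is made up, and which of these entries actually has to point at the server’s real IP is not stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print every hostname that a hosts file maps to an
# address starting with "127." (comments are ignored).
loopback_names() {
    awk '
        { sub(/#.*/, "") }                    # drop comments
        $1 ~ /^127\./ {                       # loopback IPv4 entries only
            for (i = 2; i <= NF; i++) print $i
        }
    ' "$1"
}

HOSTS="${1:-/etc/hosts}"
[ -r "$HOSTS" ] && loopback_names "$HOSTS"
```

If the server’s own hostname shows up in the output, that is presumably the kind of entry the other post was talking about.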