Multicast clients progress at different speeds
-
Using FOG 1.2.0, I tried to image 13 hosts with a single multicast task. They all finished at slightly different times. I thought that multicast tasks always go at the speed of the slowest client. See attached screenshot:
How is this possible?
-
The network traffic always moves at the speed of the slowest client. However, decompression is handled on the client, so the clients can write to disk at different speeds.
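To make that concrete, here is a toy model, not anything taken from FOG itself. It assumes every client receives the compressed stream at the same shared rate, and that each client still has some buffered data to decompress and write once the last packet arrives. The function name, the backlog sizes, and all of the numbers are made up for illustration.

```python
def client_finish_times(compressed_mb, network_mb_s, clients):
    """Toy model: shared transfer time plus a per-client flush time.

    clients maps a host name to (backlog_mb, write_mb_s), where backlog_mb
    is the data still waiting to be written when the transfer ends.
    All values are hypothetical.
    """
    # Every client receives the stream at the same (slowest-client) rate.
    transfer_s = compressed_mb / network_mb_s
    return {
        name: transfer_s + backlog_mb / write_mb_s
        for name, (backlog_mb, write_mb_s) in clients.items()
    }


times = client_finish_times(
    compressed_mb=1000,
    network_mb_s=50,
    clients={"fast-disk": (100, 200), "slow-disk": (100, 50)},
)
for name, seconds in times.items():
    print(f"{name}: {seconds:.1f} s")
```

Both hosts share the same transfer time, but the one with the slower disk finishes a little later, which would match the slightly different finish times in the screenshot.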
-
Thanks for the quick reply.
There were 5 hosts that did not even start, of which two were not even checked in. Yet the multicast started. I’m just confused about the mechanics of that. Is there a timeout, after which the multicast becomes impatient and drops the clients who do not check in, or do not accept data at all?
-
@dolf said:
Is there a timeout, after which the multicast becomes impatient and drops the clients who do not check in, or do not accept data at all?
Yes, there is a value that sets the impatience level.
Web Interface -> FOG Configuration -> FOG Settings -> Multicast Settings -> FOG_UDPCAST_MAXWAIT
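Roughly, the start decision can be pictured like the sketch below. This is not FOG's actual code. It assumes the setting behaves like a max-wait timer that starts counting when the first client joins, and the function and parameter names are hypothetical.

```python
def should_start(joined, expected, seconds_since_first_join, max_wait):
    """Decide whether a multicast session should begin sending.

    Assumed behaviour (a sketch only): start once every expected client
    has joined, or once max_wait seconds have passed since the first
    client joined, whichever comes first.
    """
    if joined == 0:
        # Nobody is listening yet, so keep waiting.
        return False
    if joined >= expected:
        # Everyone has checked in; no reason to wait any longer.
        return True
    # Some clients are missing: give up on them after max_wait seconds.
    return seconds_since_first_join >= max_wait


# 8 of 13 hosts joined, 100 s elapsed, max wait 600 s: keep waiting.
print(should_start(8, 13, 100, 600))
# Same hosts after 600 s: start without the missing five.
print(should_start(8, 13, 600, 600))
```

Under that assumption, hosts that never check in are simply left out once the timer expires, which would explain a session starting with only 8 of 13 clients.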
-
Are you sure?
-
Looks like it’s a 1.3.0 and up setting only? This happens to me all the time. I only use FOG Trunk. See my screenshot:
-
If you can, I’d recommend upgrading to trunk. There are a lot of improvements that I think might help with this issue.
-
Sounds risky, but I’ll try when I have time.
-
@dolf Not really. It’s safer if you have FOG virtualized and take snapshots regularly, but FOG trunk is pretty solid at the moment. I’ve been running FOG trunk in production for about a year now.
-
@dolf This should now be properly fixed. When editing the functions, I added parameters to the definitions but forgot to add them to the calls themselves. This is now fixed; please update and let me know if all is well.
Thank you,
-
@Tom-Elliott I sent you a msg as well. The error I posted is no longer present after cleaning multicast “stuff” from the FOG server and rebooting, but now I get to the partclone screen and it hangs.
Multicast log from gui
[02-03-16 9:30:29 am] | Task (17) Multi-Cast Task is already running PID 2461
[02-03-16 9:30:39 am] | 0 tasks to be cleaned
Service Master log
[02-03-16 9:26:47 am] service_signal_handler (30080) exiting.
[02-03-16 9:26:48 am] service_signal_handler (29843) received signal 2.
[02-03-16 9:26:48 am] service_signal_handler (29843) killing child (29861).
[02-03-16 9:26:48 am] service_signal_handler (29843) exiting.
[02-03-16 9:27:58 am] FOGImageReplicator Start
[02-03-16 9:27:58 am] FOGMulticastManager Start
[02-03-16 9:27:58 am] FOGImageReplicator Start
[02-03-16 9:27:58 am] FOGTaskScheduler Start
[02-03-16 9:27:58 am] FOGPingHosts Start
[02-03-16 9:27:58 am] FOGImageReplicator fork()ed child process (1356).
[02-03-16 9:27:58 am] FOGMulticastManager fork()ed child process (1357).
[02-03-16 9:27:58 am] FOGImageReplicator fork()ed child process (1355).
[02-03-16 9:27:58 am] FOGImageReplicator child process (1356) is running.
[02-03-16 9:27:58 am] FOGImageReplicator child process (1355) is running.
[02-03-16 9:27:58 am] FOGTaskScheduler child process (1354) is running.
[02-03-16 9:27:58 am] FOGPingHosts fork()ed child process (1358).
[02-03-16 9:27:58 am] FOGTaskScheduler fork()ed child process (1354).
[02-03-16 9:27:58 am] FOGPingHosts child process (1358) is running.
[02-03-16 9:27:58 am] FOGMulticastManager child process (1357) is running.
I rebooted between 9:26 and 9:27, as you can see from the log.
I might add that this is tasked to multicast from a remote storage node. I have not tested multicast from the server subnet yet, and I’m running trunk (6177).
I’d also like to add that after trying to run a multicast, the server shows “This is not the master storage node” under the multicast log until I completely clean mysql of multicastAssoc, MulticastgroupAssoc, and task type=8, followed by a reboot. Then the logs work correctly, as shown above.