Multicast clients progress at different speeds

dolf

Using FOG 1.2.0, I tried to image 13 hosts with a single multicast task. They all finished at slightly different times. I thought that multicast tasks always go at the speed of the slowest client. See attached screenshot:

[Attached screenshot: Screenshot from 2016-02-01 14:29:44.png]

How is this possible?

Tom Elliott

The network traffic always moves at the speed of the slowest client. However, decompression is handled on the client, so clients can write to disk at different speeds.

dolf

Thanks for the quick reply.
There were 5 hosts that did not even start, of which two were not even checked in. Yet the multicast started. I’m just confused about the mechanics of that. Is there a timeout, after which the multicast becomes impatient and drops the clients who do not check in, or do not accept data at all?

Wayne Workman

@dolf said:

Is there a timeout, after which the multicast becomes impatient and drops the clients who do not check in, or do not accept data at all?

Yes, there is a value that sets the impatience level.

Web Interface -> FOG Configuration -> FOG Settings -> Multicast Settings -> FOG_UDPCAST_MAXWAIT
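
As a rough illustration only (this is not FOG's or udpcast's actual code), the effect of a maximum-wait setting can be sketched in Python. The sketch assumes the wait is counted from the first client check-in and that the session then proceeds with whoever has checked in by that point. The function name, host names, and timings are hypothetical.

```python
def clients_in_session(expected, checkin_times, max_wait):
    """Return the hosts that take part in the multicast session.

    expected      -- host names the task was created for
    checkin_times -- dict of host -> seconds after the first check-in
                     (a host missing from the dict never checked in)
    max_wait      -- seconds the sender waits after the first check-in
    """
    return [
        host
        for host in expected
        if host in checkin_times and checkin_times[host] <= max_wait
    ]


# Hypothetical example: pc03 checks in too late, pc04 never checks in.
expected = ["pc01", "pc02", "pc03", "pc04"]
checkin_times = {"pc01": 0, "pc02": 12, "pc03": 700}
print(clients_in_session(expected, checkin_times, max_wait=600))
```

In this model the session starts with pc01 and pc02 only; hosts that miss the window are simply left out rather than holding up the others.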

dolf

Are you sure?

[Attached screenshot: Screenshot from 2016-02-01 16:39:40.png]

Wayne Workman

@dolf

Looks like it’s a 1.3.0 and up setting only? This happens to me all the time. I only use FOG Trunk. See my screenshot:

[Attached screenshot]

Tom Elliott

I’d recommend upgrading to trunk if you can. There are a lot of improvements that I think might help with this issue.

dolf

Sounds risky, but I’ll try when I have time.

Wayne Workman

@dolf Not really. It’s safer if you have FOG virtualized and take snapshots regularly, but FOG trunk is pretty solid at the moment. I’ve been running FOG trunk in production for about a year now.

Tom Elliott

@dolf This should now be properly fixed. When editing the functions I added parameters that I forgot to pass in the calls themselves. This is now fixed; please update and let me know if all is well.

Thank you,

Hanz

@Tom-Elliott I sent you a message as well. The error I posted is no longer present after cleaning the multicast “stuff” from the FOG server and rebooting, but now I get to the partclone screen and it hangs.

Multicast log from gui

[02-03-16 9:30:29 am] | Task (17) Multi-Cast Task is already running PID 2461
[02-03-16 9:30:39 am] | 0 tasks to be cleaned

Service Master log

[02-03-16 9:26:47 am] service_signal_handler (30080) exiting.
[02-03-16 9:26:48 am] service_signal_handler (29843) received signal 2.
[02-03-16 9:26:48 am] service_signal_handler (29843) killing child (29861).
[02-03-16 9:26:48 am] service_signal_handler (29843) exiting.
[02-03-16 9:27:58 am] FOGImageReplicator Start
[02-03-16 9:27:58 am] FOGMulticastManager Start
[02-03-16 9:27:58 am] FOGImageReplicator Start
[02-03-16 9:27:58 am] FOGTaskScheduler Start
[02-03-16 9:27:58 am] FOGPingHosts Start
[02-03-16 9:27:58 am] FOGImageReplicator fork()ed child process (1356).
[02-03-16 9:27:58 am] FOGMulticastManager fork()ed child process (1357).
[02-03-16 9:27:58 am] FOGImageReplicator fork()ed child process (1355).
[02-03-16 9:27:58 am] FOGImageReplicator child process (1356) is running.
[02-03-16 9:27:58 am] FOGImageReplicator child process (1355) is running.
[02-03-16 9:27:58 am] FOGTaskScheduler child process (1354) is running.
[02-03-16 9:27:58 am] FOGPingHosts fork()ed child process (1358).
[02-03-16 9:27:58 am] FOGTaskScheduler fork()ed child process (1354).
[02-03-16 9:27:58 am] FOGPingHosts child process (1358) is running.
[02-03-16 9:27:58 am] FOGMulticastManager child process (1357) is running.

I rebooted between 9:26 and 9:27, as you can see from the log.

I might add that this is tasked to multicast from a remote storage node. I have not tested multicast from the server subnet yet, and I’m running trunk per the 6177.

I’d also like to add that after trying to run a multicast, the server shows “This is not the master storage node” under the multicast log until I completely clean MySQL of multicastAssoc, MulticastgroupAssoc, and task type=8 and then reboot. After that, the logs work correctly, as shown above.
