Determining Which Client is the Bottleneck During Multicast



  • Let’s say that you have a computer lab with about 40 computers, and you are imaging them over a 1 Gbps connection through multicast. However, one network connection to one of the computers has a problem, and while it is negotiating at 1 Gbps with the switch like the other computers, actual throughput is 95% less than that. That one computer is the bottleneck for the whole lab.

    So, here would be the requests:
      1. A way in the FOG web console to determine which computer(s) are holding up the rest. Is there a way that FOG can determine which computer(s) are the problem?
      2. Once it has been determined which computer(s) are the problem, a way to kill off just these computers from the multicast task and let the other computers continue imaging at the higher speed.
    Are these realistic?


  • Moderator

    [quote="loosus456, post: 38867, member: 26317"]
    So, here would be the requests:
      1. A way in the FOG web console to determine which computer(s) are holding up the rest. Is there a way that FOG can determine which computer(s) are the problem?
      2. Once it has been determined which computer(s) are the problem, a way to kill off just these computers from the multicast task and let the other computers continue imaging at the higher speed.
    Are these realistic?[/quote]

    I use iperf to determine which computers have bad bandwidth. I'll try to add iperf to iPXE.

    Regards,
    Ch3i.


  • Senior Developer

    The way multicast sessions are killed/cancelled is through the individual tasks in the group that was kicked off (in the case where it was started as a task as such).

    If any one of the tasks is cancelled, the entire session is cancelled.



  • Gotcha.

    The only strange thing is that I have actually killed off systems with 1.2.0 before and it worked just fine. But I didn’t kill it with the web console. I actually just unplugged it, and the other computers paused for about 15 seconds. Then they picked up where they left off and at the higher speed.

    Is there a difference between a physical kill and a software kill?

    As for finding the problem connection, could FOG have something like a 10-second throughput test? If you applied a throughput task to a group (totally separate from the multicast task), it would run a 10-second unicast bandwidth test for each computer and iterate through the group until done. So 30 computers would take about 5 minutes to test.

    I realize that we could probably do this another way, but having it built into FOG with our groups already there and wake-on-LAN already there would be a hell of a lot easier. For our use, we are not really concerned about a few Mbps difference. But we occasionally have a connection that will negotiate at 1 Gbps but have throughput in the 5 Mbps range, so any throughput test should make it painfully obvious.
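    The per-group throughput test described above could be sketched outside of FOG roughly as follows. This is only an illustration, not a FOG feature: it assumes each client is already running an iperf3 server (`iperf3 -s`), and the function names and the "below half the group median" rule are hypothetical choices made for this example.

```python
import json
import statistics
import subprocess


def parse_mbps(iperf_json):
    """Extract receiver-side throughput in Mbit/s from iperf3 JSON output."""
    report = json.loads(iperf_json)
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def measure_mbps(host, seconds=10):
    """Run one unicast iperf3 test against `host` and return Mbit/s.

    Assumption: an iperf3 server is listening on the client.
    """
    out = subprocess.run(
        ["iperf3", "-c", host, "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_mbps(out)


def estimated_minutes(host_count, seconds=10):
    """Total run time when hosts are tested one after another."""
    return host_count * seconds / 60


def find_slow_hosts(results, fraction=0.5):
    """Return hosts whose throughput is below `fraction` of the group median."""
    median = statistics.median(results.values())
    return sorted(h for h, mbps in results.items() if mbps < fraction * median)


def run_group_test(hosts, seconds=10, measure=measure_mbps):
    """Test each host in turn, then flag the outliers."""
    results = {h: measure(h, seconds) for h in hosts}
    return results, find_slow_hosts(results)


if __name__ == "__main__":
    # Hypothetical host names; replace with the members of your group.
    lab = ["lab-pc-01", "lab-pc-02", "lab-pc-03"]
    print(f"Estimated run time: {estimated_minutes(len(lab)):.1f} min")
    results, slow = run_group_test(lab)
    for host, mbps in sorted(results.items()):
        flag = "  <-- slow" if host in slow else ""
        print(f"{host}: {mbps:.0f} Mbit/s{flag}")
```

    Comparing against the group median rather than a fixed number means a link that negotiates at 1 Gbps but only delivers around 5 Mbps stands out without having to decide in advance what "good" throughput is.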


  • Senior Developer

    The problem with killing off the problem system(s) and keeping the task running is that you have to kill the sender command and then recreate it without the problem system. The other problem is that the task will only send as fast as the slowest system can receive the data, meaning all systems will appear as if they are the problem system. On top of that, the way multicast works, there isn't really a mechanism to do what you're requesting. There's no way to see which system is causing the problem because there is no return of packets.
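    To make the "only as fast as the slowest system" point concrete, here is a deliberately simplified model using the numbers from the first post (40 clients on 1 Gbps links, one of them delivering 95% less). The 20 GB image size is a made-up figure for illustration, and the model ignores protocol overhead and disk speed.

```python
def session_rate_mbps(client_rates_mbps):
    """A multicast session runs at the rate of its slowest receiver."""
    return min(client_rates_mbps)


def imaging_minutes(image_gb, rate_mbps):
    """Transfer time in minutes, using decimal units (1 GB = 8000 megabits)."""
    return image_gb * 8000 / rate_mbps / 60


# 39 healthy clients at 1000 Mbit/s and one at 50 Mbit/s (95% less).
rates = [1000.0] * 39 + [50.0]
IMAGE_GB = 20  # hypothetical image size

with_slow = session_rate_mbps(rates)
without_slow = session_rate_mbps(rates[:-1])

print(f"With the slow client:    {imaging_minutes(IMAGE_GB, with_slow):.1f} min")
print(f"Without the slow client: {imaging_minutes(IMAGE_GB, without_slow):.1f} min")
```

    Every client in the session sees the same 50 Mbit/s, which is why watching the per-client speed during the multicast does not reveal which machine is the slow one.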

