Multicast very slow
-
@plegrand Also, I just looked into these switches, thinking they were old. They are not (at least performance-wise); they have better capacity than the Cisco small business SG300 switch.
So I have to ask: why are you running 100Mb/s to the desktops?
Do you have the capabilities to run a second network wire from the switch where the fog server is to the switch where the target computers are? (I’ll explain more if you can).
-
@george1421
With a target computer far from the FOG server it seems to work as well (screenshot: multicast-205.jpg).
-
@george1421
I’m not sure I understand correctly.
Actually, my diagram is not “really” accurate.
The FOG server is on a Gb port.
All desktop clients are on 100Mb ports.
I can’t do anything else; there are not enough Gb ports.
-
@george1421 I don’t have enough time for these tests at the moment, but I’ll run them in the near future.
-
@george1421 said in Multicast very slow:
Do you have the capabilities to run a second network wire from the switch where the fog server is to the switch where the target computers are? (I’ll explain more if you can).
That is pretty much already the case.
The switch (OS6250) to which all my target computers are connected is just below the switch where the FOG server is (OS6450).
They are directly connected by a 1Gb link.
So only the target computers are on 100Mb links.
-
@plegrand said in Multicast very slow:
I can’t do anything else; there are not enough Gb ports.
When I looked up the switch configuration it said that all ports are 10/100/1000 rated. Maybe I did not understand.
If the OS6250 switch is only 100Mb/s, then I understand why you have 100Mb/s to the desktops.
OK on delaying the test. There is a bottleneck someplace; we just need to find it.
-
@george1421
The 6450 has all its ports at 10/100/1000.
The 6250 does not; only its uplink ports are 1000.
-
@george1421 said in Multicast very slow:
Under a multicast you should get just slightly less than unicast speeds.
So what’s the benefit of multicast if it’s not faster, apart from not flooding the network?
I’m going to run some tests this morning with 2, 3, and 4 target computers.
I will let you know the results.
-
@george1421
So I ran some tests:
Multicast session (test) with 2 target computers: about 1.7GB/min rate
Multicast session (test) with 3 target computers: about 1.7GB/min rate
Multicast session (test) with 4 target computers:
The first 2 target computers didn’t start; I can’t understand why.
Then I removed these two computers from the session (test)
and created a new multicast session (test2) for these 2 computers.
Then there are 2 sessions running (test and test2) and all the target computers have about 1.15GB/min rate
All the computers have 2 partitions and strangely the first partition is slower than the second : about 700MB/min rate
-
@plegrand I’d say keep on doing more tests. If you see some machines not joining the session then you might want to cancel it and restart the FOG server just to have a clean test setup on every run.
All the computers have 2 partitions and strangely the first partition is slower than the second : about 700MB/min rate
From my point of view this is another hint that it’s not so much a network/switch issue but more down to the clients actually writing the data to disk. Some clients might have a dying disk that is causing the slowdown for all clients in one session. With multicast, the slowest link in the chain dictates the overall speed! While it causes far less network traffic than unicast for a group of computers, it has the caveat of being limited by the slowest participant.
-
@plegrand said in Multicast very slow:
So what’s the benefit of multicast if it’s not faster, apart from not flooding the network?
The answer is simple. Let’s say you want to send the same image to 5 computers and your base image is 20GB in size.
With multicast we send out one 20GB image, whether it is for 5, 20, or 100 systems. In the unicast situation, for those 5 systems you would have to transmit 100GB worth of data over your network. So from a network load standpoint you get less network impact with multicast.
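To put numbers on that comparison, here is a minimal sketch. The function name `server_payload_gb` is made up for illustration, and the model ignores protocol overhead and retransmissions:

```python
def server_payload_gb(image_gb: float, clients: int, multicast: bool) -> float:
    """Approximate data the server puts on the wire for one deployment.

    Simplified model: unicast sends one full copy per client,
    multicast sends a single copy no matter how many clients listen.
    """
    if multicast:
        return image_gb
    return image_gb * clients


if __name__ == "__main__":
    image_gb = 20  # base image size from the example above
    for clients in (5, 20, 100):
        unicast = server_payload_gb(image_gb, clients, multicast=False)
        multi = server_payload_gb(image_gb, clients, multicast=True)
        print(f"{clients} clients: unicast {unicast} GB, multicast {multi} GB")
```

The multicast figure stays flat as the client count grows, which is the whole point: the benefit is network load, not per-client speed.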
-
@plegrand said in Multicast very slow:
Then there are 2 sessions running (test and test2) and all the target computers have about 1.15GB/min rate
Just so I’m clear on this: when you were able to get 4 computers imaging, your transfer rate was 1.15GB/min? That’s still about 19MB/s, which is above the 100Mb/s (12.5MB/s) theoretical limit of the link.
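As a rough sanity check on that arithmetic, the conversion can be sketched like this. The helper names are made up for illustration, and decimal units are assumed (1GB = 1000MB, 1 byte = 8 bits):

```python
def gb_per_min_to_mb_per_s(rate_gb_min: float) -> float:
    """Convert a rate in GB/min to MB/s (decimal units assumed)."""
    return rate_gb_min * 1000 / 60


def mb_per_s_to_mbit_per_s(rate_mb_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return rate_mb_s * 8


if __name__ == "__main__":
    link_limit_mbit = 100  # nominal speed of the desktop ports
    for rate in (1.7, 1.15, 0.7):  # rates reported in the tests above
        mb_s = gb_per_min_to_mb_per_s(rate)
        mbit_s = mb_per_s_to_mbit_per_s(mb_s)
        print(f"{rate} GB/min = {mb_s:.1f} MB/s = {mbit_s:.0f} Mb/s "
              f"(link limit {link_limit_mbit} Mb/s = {link_limit_mbit / 8} MB/s)")
```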
All the computers have 2 partitions and strangely the first partition is slower than the second : about 700MB/min rate
I can explain this. The number is based on an incorrect calculation. The issue is that the first partition is pretty small, like 500MB. It transfers so fast that the speed numbers get skewed. The second partition typically holds the contents of the drive. You can see this if you look at the disk manager in Windows. Look at the size of the first partition.
I might need to explain how image multicasting works. There is one computer (the FOG server) that sends the image out. As each multicast client boots up, it checks in with the multicast sender through a discovery process. The FOG server configures the multicast sender service to wait for X clients to check in before starting or, once the first client has checked in, to wait XX seconds and then start even if not all clients have checked in. Once the multicast stream starts, no late clients can check in (they are blocked).
In the image stream, the FOG server sends out the first block of data and then stops. It waits for every multicast receiver (target computer) to respond with “OK!”. The FOG server will not send the next block until it hears “OK!” from every client. If something goes wrong and one client didn’t get the block correctly, it sends “Retrans” back to the FOG server, and the FOG server resends that block to the client while the others sit and wait until everyone replies with “OK!”. This is why we say multicasting can only go as fast as the slowest computer in the multicast stream.
Consider you have four 8-core desktops, all with SSD drives, and one system with a Pentium 4 and a slow HDD. If you imaged them all in one stream, the 8-core systems would image at the rate the Pentium 4 system can write data to its slow HDD. If you have a system with a failing hard drive, and the checksum of the block transferred to it doesn’t match the checksum of the block on the disk, it will send a “Retrans” command back to the FOG server while the other clients wait. The point is: when everything works, it works well; when you have one bad actor, everyone suffers.
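To illustrate the “slowest computer sets the pace” effect, here is a toy model of the lock-step behaviour described above. It is only a sketch of that description, not the actual multicast sender code: the `Client` class, the block size, the disk speeds, and the retransmit pattern are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    write_mb_s: float       # sustained disk write speed (hypothetical figure)
    retrans_every: int = 0  # asks for a resend on every Nth block; 0 = never


def session_seconds(clients, image_mb, block_mb=1.0, link_mb_s=12.5):
    """Estimate total time when the sender waits for every client after each block."""
    total = 0.0
    blocks = int(image_mb / block_mb)
    for i in range(1, blocks + 1):
        send_time = block_mb / link_mb_s
        # The round is over only when the slowest client has written the block.
        slowest_write = max(block_mb / c.write_mb_s for c in clients)
        round_time = send_time + slowest_write
        # Each "Retrans" costs everybody one extra round for this block.
        resends = sum(
            1 for c in clients if c.retrans_every and i % c.retrans_every == 0
        )
        total += round_time * (1 + resends)
    return total


if __name__ == "__main__":
    fast = [Client(f"ssd-{n}", write_mb_s=200) for n in range(4)]
    slow = Client("pentium4-hdd", write_mb_s=10)
    failing = Client("failing-disk", write_mb_s=200, retrans_every=10)

    image_mb = 1000
    for label, group in (
        ("fast only", fast),
        ("fast + slow disk", fast + [slow]),
        ("fast + failing disk", fast + [failing]),
    ):
        secs = session_seconds(group, image_mb)
        print(f"{label}: {secs:.0f} s, {image_mb / secs:.1f} MB/s for every client")
```

In this model, adding a single slow or failing machine lowers the rate for every client in the session, which is why it can be worth imaging a suspect machine on its own to see whether the rest speed up.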