Strange network activity and peaks during multicast
-
Hi!
We are experiencing strange network issues: short, random “bottleneck” effects. At first we thought the problem came from an outside source (too short to detect), but slowly it led us to FOG activity as the source.
After digging to the bottom of it, we could finally reproduce it this way: two machines imaging in a multicast setup. The server and the wall socket are on a gigabit link, but on the client side there is a managed switch, with patch cables from it to the clients. When imaging starts, server-side bandwidth fills up to 100 Mbps (strangely, not to 1 Gbps) and oscillates around that “max of 100 Mbps”. A few seconds after the start, network lags occur. Even IP phones drop the line or go choppy, which is unusual.
This effect comes and goes during imaging, mostly on 2-3 occasions per session. We will investigate further (try different switches, for example), but I decided to show the pattern from the server-side “bandwidth utilization” GUI.
The images show that, from the maxed-out 100 Mbps, there are sometimes peaks to over 1.5 Gbps. Those may be false values, but they are signs of where and when something happens. At those times the phones struggle to stay alive, streams or real-time connections drop, etc.
Why would multicast do this? For unrelated reasons we stepped from FOG 1.2 (where we definitely did not have this) to 1.3, then to 1.3.4. Unicast does not cause this effect.
Is there anything that would explain those peaks? And does FOG have some kind of throttle control?
-
Multicast, by its very nature, is a “selfish” protocol.
Unicast uses TCP transfers, which means only the requesting system will even see the data (this is particularly much faster on switches – which is why they are so much more common now – than on hubs). So the data that is sent gets an essentially direct path to its destination. In simpler terms, TCP doesn’t even start sending data until the requesting system makes a request confirming it needs the data. This essentially creates a tunnel, and the requested data is sent purely to the requestor.
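To make that concrete, here is a minimal Python sketch (not FOG’s actual code, just an illustration of the socket-level pattern) of unicast over TCP: nothing is transmitted until a client connects, and the data then flows only toward that one connected peer. The file name and port below are made-up example values.

```python
import socket

# Minimal TCP "unicast" sender: data is only written to the socket of a
# peer that has explicitly connected, so the traffic is forwarded toward
# that single destination only.
def serve_file_tcp(path, host="0.0.0.0", port=9000):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        conn, addr = srv.accept()        # nothing is sent until a client asks
        print("sending only to", addr)
        with conn, open(path, "rb") as f:
            while chunk := f.read(65536):
                conn.sendall(chunk)      # delivered to this one requestor only

if __name__ == "__main__":
    serve_file_tcp("image.bin")          # hypothetical file name
```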
Multicast uses UDP transfers. UDP works by just sending data across the network. It doesn’t care who the traffic is for and “spams” (for lack of a better term) the entire network path it can reach. We do have “waiters” for the multicast system that hold off sending data until a single host (or a number of hosts) requests it, but once the data starts sending, it goes everywhere it can get to.
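For contrast, here is a matching sketch of the UDP multicast side, again only an illustration of socket-level behaviour rather than what FOG/udpcast literally does: the sender fires datagrams at a multicast group address with no connection and no per-receiver acknowledgement, and receivers simply join the group. The group address, port and pacing are arbitrary example values.

```python
import socket
import struct
import time

GROUP, PORT = "239.1.2.3", 9001          # example multicast group/port

def multicast_sender(path):
    # UDP sender: just fires datagrams at the group address. There is no
    # connection and no per-receiver acknowledgement at this layer.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        with open(path, "rb") as f:
            while chunk := f.read(1400):  # keep datagrams under a typical MTU
                s.sendto(chunk, (GROUP, PORT))
                time.sleep(0.001)         # crude pacing; real tools throttle properly

def multicast_receiver():
    # Receiver joins the group; a switch that tracks these joins (IGMP
    # snooping) forwards the stream only to ports that joined, otherwise
    # the traffic is flooded.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("", PORT))
        mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                           socket.inet_aton("0.0.0.0"))
        s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        while True:
            data, addr = s.recvfrom(2048)
            # ... write data to disk; no acks are sent back at the UDP layer
```

The key design difference is that the sender has no idea who (if anyone) is receiving; delivery scope is decided by the network, which is why switch behaviour matters so much more for multicast than for unicast.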
-
We think the “newly” installed managed switch is the source of the problem (we have had one like it for ages at one of our sites). This one was set up in the hope that it would solve the “there is always a need for a new wall socket” problem. It is configured like this: some PCs, some phones, one NAS (all on 100 Mbps ports, but the NAS has gigabit), and some of its ports are set to gigabit for FOG usage. We suspect it may be replicating traffic in directions it should not, generating the network issue. I hope that will be dealt with soon (next week at the earliest).
But those peaks are strange and disturbing. Why are they there at all? And why is it maxed out at 100 Mbps if the requestors’ links are gigabit? Throttle control on FOG? (Both ends have gigabit ports, the CPUs are not overloaded, and the network is mostly idle at that time.)
Anyway, there were rumours about torrent imaging? Will it happen at some point?
-
The spikes you see aren’t ever going to be 100% accurate.
The bandwidth graph uses some basic math to figure out what is currently in use. Why is it maxed out at 100 Mbps? Is something imaging right now? Maybe images are replicating to different nodes? There could be any number of reasons.
Bandwidth checking uses math over a sampling interval to figure out the rate. Let’s say a small file is downloaded: if it is downloaded a bunch of times really fast, you should see your bandwidth reflect that change relatively accurately. But let’s say those repeated downloads have some variance, one time fast, one time slower. If the period in which we take the bandwidth check didn’t notice the change immediately, the next check will show the bandwidth a bit “faster” than the actual speed, because the prior period was less accurate. This is what creates the “spikes”.
The bandwidth checker, by the way, also uses bandwidth, so some of these spikes might very well be due to the check itself checking itself.
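As a rough illustration of that sampling effect (a toy model of the arithmetic only, not FOG’s actual bandwidth checker): if the reported rate is simply the byte-counter delta divided by the time between two polls, a poll that arrives late followed by one that arrives early will report a rate far above what the link actually carried.

```python
# Toy model of interval-based bandwidth sampling, assuming the reported
# rate is simply (byte counter delta) / (time delta) between polls.
samples = [
    # (seconds since start, total bytes seen on the interface counter)
    (0.0,           0),
    (1.0,  12_500_000),   # ~100 Mbps for one second
    (2.4,  30_000_000),   # this poll arrived late (1.4 s interval)
    (2.6,  42_500_000),   # next poll arrived early (0.2 s interval)
    (3.6,  55_000_000),
]

for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
    mbps = (b1 - b0) * 8 / (t1 - t0) / 1_000_000
    print(f"{t0:.1f}s -> {t1:.1f}s : {mbps:6.1f} Mbps")
```

In this toy run the real traffic never exceeds roughly 100 Mbps, yet the short 0.2-second interval is reported as 500 Mbps, which is the same kind of artifact as a 1.5 Gbps peak showing up on a gigabit port.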
-
Yes, the accuracy issue is what we thought. The peaks may still be real ones, though. The maxed-out part is real, I mean there was no other activity (no replication, etc.). It was done at an isolated time, on multiple occasions. That is why it looked strange. Next week we will run more tests with a different, maybe dumber switch to see what is going on. I will report back what we find. If nothing changed on the FOG side of multicast, only the in-between hardware can be the reason, I think.
-
@Foglalt said in Strange network activity and peaks during multicast:
there were rumours about torrent imaging?
The experimental torrent imaging method relied on a bunch of assumptions FOG used to make about disk layouts. We no longer make those assumptions, and FOG works a lot better overall for it. While we had it working, torrent-casting wasn’t as fast as multicasting OR unicasting a group of computers. But if I ever figure out a way to make torrent-casting work again, I’ll try it again.
-
@Junkhacker Thanks for the comment.