Multicast data address not change from one task to another one

Jose Cacho

Hello,
We have some performance problems with multicast on FOG v1.5.2 (also tested on v1.5.4). Our FOG server serves images to different VLANs and is used by a group of more than 20 technicians. We have noticed that multicast throughput drops dramatically (40MB/min, … 10MB/min) when more than one multicast task is running at the same time.
We recently migrated from v0.30, where multicast was stable and faster, and we have noticed that v0.30 changes the udp-sender IP address (--mcast-data-address) for each task, but this does not happen with v1.5.2.

Version 0.30 udp-sender running **with 3 active multicast tasks**:
[root@fog5 ~]# ps -eaf | grep udp
root     19399 24835  5 14:10 ?        00:19:11 /usr/local/sbin/udp-sender --file /images/aula-sc-ingtecnica-ss-matematicas2-docencia/d1p1.pcz --min-receivers 21 --portbase 63124 --interface eth0 --max-wait 300 --half-duplex --ttl 32 --nokbd --mcast-data-address 239.0.101.14 --max-bitrate 200m --start-timeout 3600
root     22331 24835  0 20:24 ?        00:00:00 /usr/local/sbin/udp-sender --file /images/aula-ehu-upv-enajenacion/d1p1.pcz --min-receivers 2 --portbase 63128 --interface eth0 --max-wait 300 --half-duplex --ttl 32 --nokbd --mcast-data-address 239.0.101.16 --max-bitrate 200m --start-timeout 3600
root     22338 24835  0 20:24 ?        00:00:00 /usr/local/sbin/udp-sender --file /images/aula-ehu-upv-enajenacion/d1p1.pcz --min-receivers 2 --portbase 63130 --interface eth0 --max-wait 300 --half-duplex --ttl 32 --nokbd --mcast-data-address 239.0.101.17 --max-bitrate 200m --start-timeout 3600
root     22350 16333  0 20:24 pts/2    00:00:00 grep udp
[root@fog5 ~]#
 
Version 1.5.2 udp-sender running **with 2 active multicast tasks**:
	
[root@fog7 ]# ps -eaf | grep udp
root     18006 10428  0 20:44 ?        00:00:00 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.104.1 --portbase 51712 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root     18084 10428  0 20:44 ?        00:00:00 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.104.1 --portbase 65366 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root     18136  1247  0 20:45 pts/0    00:00:00 grep --color=auto udp
[root@fog7 ]#

Could it be one of the causes of the performance problem?

george1421

@jose-cacho said in Multicast data address not change from one task to another one:

We recently migrated from v0.30, where multicast was stable and faster, and we have noticed that v0.30 changes the udp-sender IP address (--mcast-data-address) for each task, but this does not happen with v1.5.2.

The sender IP address being the same should not be the problem as long as the port number is different for each session.

Since you are running a newer version of FOG (1.5.2) than your original version 0.3, can we assume that you installed a new host OS on a new VM or physical server?

If it is a new server, are you sure the server is set up to achieve the best performance? Right now we don’t know where the bottleneck is, so we need to start ruling out where the problem isn’t.

I would like you to update your FOG server to 1.5.4, and then we need to do some tweaks to 1.5.4 (that will be in 1.5.5 when it’s released), or if you have a test server you can install the 1.5.5 working branch, which has the fixes in it. If you don’t want to install the 1.5.5 working branch, I have the tune-up instructions to make 1.5.4 run a bit smoother.

Then for benchmarking, what speed do you get for a single unicast image? What speed do you get with two simultaneous unicast images? (I realize you have an issue with multicasting, right now we need some benchmark numbers). What speeds do you get when 2 unicast streams are running to target computers at the opposite end of your network?

I would expect that with 2 unicast streams on a 1 GbE network you should be able to pull 5-6GB/min transfer rates. Above 2 simultaneous streams you will flood the 1GbE FOG server uplink and see degraded throughput.

Jose Cacho

Ok, thank you for the response.
About the server v1.5.2:
It is a new VM running RHEL 7.5.
8 CPU 2.67GHz, 16 GB RAM.
It has a 10GbE network (shared with some other VMs, but these days there is not much traffic).
The original version 0.3 is running (on RHEL 5.1) on the same VM host.

Sorry, I can’t upgrade to v1.5.4 now (we are preparing our classrooms for the next course).
We have another test server and I’m asking if there is another “work in process” to get it upgraded.

For benchmarking in our production server v1.5.2:

for a single unicast image I could get above 12GB/min (HP 800 G2 CPU i7 client with SSD and 1GB network - U030688)
for 2 unicast above 9GB/min (same image and host models - U030688/U030731)
for two unicast streams at the opposite end** of our network above 1GB/min (about 1.3GB/min same model, other image U030568/U030569).

** opposite end is a difficult term to define in our network. We are deploying from one server on our campus to three other campuses with links at 3, 6 and 10 GB. The test is to two hosts on the 6GB link (I think so).

In our FOG server 1.5.2 I’ve added the suggested settings of 256MB for php-fpm and I was thinking about adding the bitrate in the Storage configuration (as you can see in the udp-sender command lines, it was used in our prior 0.3 version). If we could do some tweaks without upgrading the version, it would be easier for us.

The question about the sender IP address arises from our network layout, because (please correct me if necessary) all the hosts in any multicast session would receive the packets of every running multicast task. Each computer would then discard the data not meant for it. So (under heavy load) this could result in lost packets due to saturation of the lower-bandwidth network links.

george1421

@jose-cacho said in Multicast data address not change from one task to another one:

Note: I’m not picking your post apart, but it has a lot of moving bits so here goes.

About the server v1.5.2:
It is a new VM running RHEL 7.5.
8 CPU 2.67GHz, 16 GB RAM.

Hopefully these are the VM Host stats, because if this is for FOG, it’s overkill and, depending on the VM host configuration, may actually be hurting your FOG server performance. Depending on the size of your network and the number of FOG clients you have installed, you should be able to do what you need with 2 vCPUs and 4GB of RAM. If you have more than 200 FOG clients then we would bump up the stats a bit.

Can you tell me how many CPU cores are in your VM Host server? Also, how many FOG clients do you have installed across your campus that are communicating with this FOG server?

The original version 0.3 is running (on RHEL 5.1) on the same VM host.

Good, this tells us the issue may not be in your infrastructure.

For benchmarking in our production server v1.5.2:

for a single unicast image I could get above 12GB/min (HP 800 G2 CPU i7 client with SSD and 1GB network - U030688)

for 2 unicast above 9GB/min (same image and host models - U030688/U030731)

The 12-13GB/min is typical on a 10 GbE network with current target hardware with SSD or NVMe drives. The numbers coming out of partclone are a bit deceiving in that they measure the entire imaging process, from file transfer to the target computer and decompression of the image to writing the image to disk. But it’s the best tool we have at the moment to indicate throughput. The 2 unicasts are a touch slower than I would expect, but not that far out. It could be related to the underlying disk subsystem in your virtualization server. Not something to be concerned about, really.

for two unicast streams at the opposite end** of our network above 1GB/min (about 1.3GB/min same model, other image U030568/U030569).

This gives me the idea that there is a restriction (bottleneck) someplace in between. In an ideal 1GbE network you should be able to achieve 6.1GB/min transfer rates. For a 100Mb/s network you should be able to achieve about 700MB/min. Again, we are measuring unicast to get an idea of your network’s capabilities without adding in any multicast overhead. So these are good benchmarking numbers.
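As a rough cross-check of the figures above, the raw line rate of a link can be converted to GB/min. This is only a sketch: the helper name is made up for illustration, decimal units are assumed (1 GB = 10^9 bytes), and protocol overhead, disk speed and decompression are ignored, which is why practical figures such as the 6.1GB/min quoted above sit below the raw ceiling.

```python
def raw_gb_per_min(link_mbit_per_s: float) -> float:
    """Upper bound on throughput in GB/min for a link speed given in Mb/s.

    Hypothetical helper: assumes decimal units and zero protocol overhead.
    """
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8
    return bytes_per_second * 60 / 1_000_000_000


# Raw ceilings for 100 Mb/s, 1 GbE and 10 GbE links.
for speed in (100, 1_000, 10_000):
    print(f"{speed:>6} Mb/s -> at most {raw_gb_per_min(speed):.2f} GB/min on the wire")
```

A measured rate far below the ceiling of the slowest link on the path, as in the 1.3GB/min test, points to a bottleneck somewhere other than the raw link speed.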

** opposite end is a difficult term to define in our network. We are deploying from one server on our campus to three other campuses with links at 3, 6 and 10 GB. The test is to two hosts on the 6GB link (I think so).

Just as a question, are you trying (or doing) multicasting across your WAN links?

In our FOG server 1.5.2 I’ve added the suggested settings of 256MB for php-fpm and I was thinking about adding the bitrate in the Storage configuration (as you can see in the udp-sender command lines, it was used in our prior 0.3 version). If we could do some tweaks without upgrading the version, it would be easier for us.

I don’t remember if php-fpm was added in 1.5.2 or 1.5.3, but ensuring the 256MB as well as the min and max services will help. The bitrate in the storage configuration only impacts FOG server to storage node replication. You can adjust the udp-sender command line by editing the FOG code by hand. Just navigate to /var/www/html/fog and grep for udp-sender; it should only be in one or two class files. You can add elements to the command line if you want.

The question about the sender IP address arises from our network layout, because (please correct me if necessary) all the hosts in any multicast session would receive the packets of every running multicast task. Each computer would then discard the data not meant for it. So (under heavy load) this could result in lost packets due to saturation of the lower-bandwidth network links.

You are correct: for multicast data, all clients listen to all data packets. The IP address is not a factor since it listens to every one. The combination of IP address + port number is what the target computer keys off from. Now you are entering into network switch configuration land. Ideally, on all of your switches you want to have IGMP snooping turned on for the VLANs where your target computers and FOG server exist. With IGMP snooping, the switch will only send the multicast data to the switch ports where you have a multicast subscriber. This roughly corresponds to PIM sparse mode rather than dense mode, and gives better switch performance as well as better multicast streams.

Jose Cacho

@george1421 Ok, bit by bit is easier.

Can you tell me how many CPU cores are in your VM Host server?

You are right, they are vCPUs. I’m asking about the CPU cores in our VM Host server and will come back with the information.

Also, how many FOG clients do you have installed across your campus that are communicating with this FOG server?

We have more than 7000 hosts on FOG today. And daily (by our last count, a year ago) there can be more than 2500 computers (switched on at the same time) communicating with this server.

Just as a question, are you trying (or doing) multicasting across your WAN links?

Yes, we are multicasting across our WAN links.
I use the term campus to identify the 4 different locations that have dedicated connections to the central data center (CPD). So the “opposite end” unicast test was done to one of these locations.

The IP address is not a factor since it listens to every one. The combination of IP address + port number is what the target computer keys off from. Now you are entering into network switch configuration land.

I’m sending this information to our network team (I suppose they know about it, but it is better to refresh the concepts). But, bearing in mind that I am not an expert (and excuse me in advance if my question is a bit stupid): if the same sender IP address is used for two different established multicast sessions on two different VLANs (one multicast task per VLAN), are all the data packets sent to both VLANs, with IGMP snooping getting rid of the data at the last switch? Or is IGMP smart enough to propagate the route to the subscribers of each session, so that the multicast data is “routed” efficiently from the FOG server’s switch?

We can deploy to 35 VLANs, and for us (for our FOG) all the locations are “the same network”. So I am thinking about some multicast tasks running near the FOG server and others running across the WAN links, with (maybe, if our network setup is not correct) the WAN links getting unnecessarily congested.

P.S.: Very good stuff. A lot of useful information for me, @george1421.

Jose Cacho

You are right, they are vCPUs. I’m asking about the CPU cores in our VM Host server and will come back with the information.

2 sockets of 6 cores (12 processors) with hyperthreading enabled (24 logical processors). Our FOG server has 8 vCPUs.

george1421

@jose-cacho said in Multicast data address not change from one task to another one:

2 sockets of 6 cores

The thing is, with 8 vCPUs allocated to the VM, the hypervisor needs to have 8 of the 12 cores available for the VM to be scheduled to execute. The other factor is how many VMs are on this VM Host server. While we are getting off the point of your initial post, my intuition is telling me that 8 vCPUs is too much and you might see better performance with 4 or 6 vCPUs. But at the moment only change one thing at a time.

When you say 2500 hosts, do all of them have the FOG client installed? If so, what is your client check-in time for the FOG client? If it is still set to 60 seconds, change that to 900 (15 minutes). That will dramatically drop the load on the FOG server.
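To put rough numbers on that advice, here is a minimal sketch of the average check-in rate for the 2500 hosts mentioned in this thread. It assumes the check-ins are spread evenly over the interval and counts each check-in as a single event; how many HTTP requests one check-in actually triggers is not covered here, and the helper name is made up for illustration.

```python
def checkins_per_second(clients: int, interval_seconds: int) -> float:
    """Average client check-ins per second, assuming evenly spread polling."""
    return clients / interval_seconds


# 2500 concurrently running clients, as reported earlier in the thread.
for interval in (60, 900):
    rate = checkins_per_second(2500, interval)
    print(f"check-in every {interval:>3} s -> about {rate:.1f} check-ins per second")
```

Under these assumptions, moving from 60 to 900 seconds reduces the average check-in rate by a factor of 15.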

Jose Cacho

@george1421

When you say 2500 hosts, do all of them have the FOG client installed?

Yes, all of them (7000) have the FOG client installed, but 2500 could be polling the server on an ordinary class day. At this time it is not a problem because all the schools are on summer break. In order to use the task reboot manager for unattended image deploys, we set the check-in time to 180 seconds. (So if a multicast deploy task is sent, the computers have time to reboot and get subscribed before the 300-second limit.)

my intuition is telling me that 8 vCPUs is too much and you might see better performance with 4 or 6 vCPUs. But at the moment only change one thing at a time.

Ok, 6 vCPUs was the setting until one month ago. We were running very slow tasks and I asked for more power on our server (6-8CPU - 12-16GB RAM). We were aware that it might not be the best choice, but we were “forced” to test it. We didn’t notice much better performance, and the plan is to go back to 6 vCPUs after the classrooms are ready for the new academic year.

While we are getting off the point of your initial post.

Ok, coming back to the point ;P. I have been talking with a member of our network team and he has given me some general information about our network. Our FOG server is attached to a “CORE” router (10GbE). From this central point there are connections to the four campuses mentioned. I have drawn a sketch map.
[Image: sketch map of the network]

Looking at the map, and remembering the benchmark tests, the hosts of the first unicast tests are in Campus 1, and the hosts of the last unicast test (at “the opposite end”) are in Campus 4.

So (if my memory serves me correctly), my network colleague has told me that IGMP does not use the port number (only the IP). And, today, we are not sure whether the router is able to “discard” or “efficiently route” the multicast data so that it reaches only the subscribers of an IP + port number multicast session. (Our most experienced person in tuning multicast on our network is on holiday.)

I am looking for the cause that doubles the time needed for a multicast task (with the v1.5.2 server not heavily used), compared with the v0.30 server.
For example, a multicast deploy to a group in “campus 3” took less than 4h (about 58GB) in April with v0.30, but yesterday with v1.5.2 it took more than 8h (about 67GB).
(Don’t get me wrong, I know it is very difficult to tune all the settings. In addition, I think our FOG implementation is not an easy one :). So, step by step.)
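Because the two deploys differ in size as well as in duration, it helps to compare them as average rates. The sketch below does only that arithmetic; it assumes decimal units (1 GB = 1000 MB), and since the durations were given as "less than 4h" and "more than 8h", the results are bounds rather than exact values. The helper name is made up for illustration.

```python
def mb_per_min(size_gb: float, hours: float) -> float:
    """Average rate in MB/min for a deploy of size_gb that takes the given hours."""
    return size_gb * 1000 / (hours * 60)


old_rate = mb_per_min(58, 4)  # v0.30 in April: at least this fast
new_rate = mb_per_min(67, 8)  # v1.5.2: at most this fast

print(f"v0.30 : >= {old_rate:.0f} MB/min")
print(f"v1.5.2: <= {new_rate:.0f} MB/min")
print(f"slowdown factor: at least {old_rate / new_rate:.2f}x")
```

Normalised this way, the effective rate fell by a factor of roughly 1.7 rather than a full 2, because the newer image is somewhat larger.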

On another note, to add some small test results to the multicast performance problem, we tried the Bitrate option (yes, it seems that setting it in the “Storage” options adds it to the udp-sender command):

In a “campus 1” deploy: 2.43GB/min with --max-bitrate 200m vs 4.24GB/min without it (the second test was run 5 minutes later on the same two hosts).

root     23705  2180  0 19:44 ?        00:00:00 /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 51604 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img

root     31218  2180  8 19:51 ?        00:00:12 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 52262 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img

I have not had the chance to test on the other “campuses”.

Jose Cacho

Here are some images giving an overview of our FOG server load today.
The active unicast tasks are properly queued when there are more than 10. This setting keeps our unicast tasks performing well.
But the multicast tasks get quite slow if they are not “alone” (“one by one”). And, as you can see in the attached images, we can easily reach five (or more) multicast groups at the same time.
– FOG Overloaded –

– FOG Managing overload –

@george1421 Thinking aloud: if the mcast-data-address is not part of the performance problem, the way forward could be to get the multicast tasks queued.

george1421

@jose-cacho said in Multicast data address not change from one task to another one:
I’ve been trying to think about how we can best debug this issue.

At this moment I’m just thinking out loud here: There have been many changes since 0.30. Partclone is now used instead of Partimage, and ZSTD is used as the standard image decompressor (even if gzip is picked for image capture). The FOS kernel (the customized Linux that runs on the target computer) has been updated a hundred times or so. Plus, all of the ancillary applications to FOG have been updated. The Linux OS of the FOG host server has been updated.

On the other side: The VM is running on the same infrastructure as the 0.30 instance. The image is taking the same data path between the VM host server and the target computers.

Well we know we can manually launch the udp-sender application on the FOG server with this command:

/usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 52262 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img

On the target computer there will be a udp-receiver command that will connect to the multicast stream initiated by the fog server. I don’t know the exact command that FOG is using but it should be close to this

udp-receiver --file /tmp/pig.tmp --nokbd --portbase 52262 --ttl 32 --mcast-rdv-address 239.0.107.1

The one thing I did notice is that the ttl is set to 32, so you can’t have more than 32 hops between the sender and receiver. Unless you have a really big campus then this shouldn’t come into play.

Now, if you schedule a debug capture or debug deploy and then PXE boot the target computer, you will be dropped to a Linux command prompt on the target computer, where you can key in commands like udp-receiver.

ref: https://www.udpcast.linux.lu/cmd.html

Jose Cacho

@george1421 Thank you very much for your thoughts and support. I agree with you, there have been many changes since our last version. And we will have to test the udpcast commands to get the best out of it (thanks). But I have some more tests and data.
We have run simultaneous “controlled” multicast tasks on different campuses and the network team has captured the traffic (port mirroring) on one of the multicast client computers. (Please let me know if you don’t understand something in this post. I am not used to writing about networking terms, and there could be a better way to explain it.)

The summary is:

All the traffic for the same multicast IP address reaches the computer’s NIC. It is not filtered by port.
As the multicast IP address is the same, the different multicast sessions to a campus are sent over the same data channel rather than being spread over the others (which would take advantage of the other data channels when the first one is at its maximum throughput). Note that the campuses are connected by 2, 3 or 4 different aggregated data channels, and the data is balanced across them to get the best overall throughput and performance. But the IP address is vital for this balancing. So, when one (or more) multicast sessions are running on a campus, all the multicast data is routed over the same data channel.
The FOG server’s CPUs go to 100% with only 2 simultaneous multicast tasks on Campus 1: one task of 9 computers and another of 41.

So, I’m thinking about:
A) Could you help us tweak FOG so that each multicast task uses a different IP?
B) (…And thinking aloud) If FOG needs to resend more packets and has to wait for an “overflowing” data channel, could this be the main cause of the CPU consumption?

And now, some additional data courtesy of our network team:

With port mirroring and a traffic capture on a multicast client computer, we can see that it receives the data of all the running multicast tasks.
From https://community.cisco.com/t5/switching/multicast-ports/td-p/854295

But you would do well to use different multicast IP addresses for different application because switches will distribute multicast packets according to the IP address (regardless of port).

So if you have two applications that use the same IP address but different ports, a machine that is interested in either application will have to listen to both sets of traffic and filter out the port it is not interested in. If they are using different IP addresses, the switch will do that for them.
(Actually, its a bit more complicated because the switch distributes according to groups of 32 addresses, so there may be some overlap even if the addresses are different … if the addresses fall in the same MAC group.)
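The "groups of 32 addresses" remark refers to the standard mapping of IPv4 multicast addresses onto Ethernet MAC addresses: only the low 23 bits of the IP address are copied into the MAC (after the fixed prefix 01:00:5e), so the 5 remaining variable bits are lost and 2^5 = 32 IP addresses share each MAC. The sketch below illustrates that mapping; the function name is made up for illustration, and the sample addresses are taken from the udp-sender commands earlier in this thread plus two constructed ones that collide.

```python
import ipaddress


def multicast_mac(ip: str) -> str:
    """Return the Ethernet MAC that an IPv4 multicast address maps to."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF        # keep only the low 23 bits of the IP
    mac = 0x01005E000000 | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


# Two addresses used by the v0.30 server: different MAC groups.
print(multicast_mac("239.0.101.14"))
print(multicast_mac("239.0.101.16"))

# Constructed examples that fall into the same MAC group as 239.0.101.14.
print(multicast_mac("239.128.101.14"))
print(multicast_mac("224.0.101.14"))
```

Addresses that differ only in their last octet, like the ones FOG hands out within a single range, always land in different MAC groups.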

Fernando Gietz

Hi,

I changed the multicasttask.class.php file to give a different IP to each multicast session, and the performance is better now.

One line in /var/www/html/fog/service/multicasttask.class.php:

#diff multicasttask.class.php multicasttask.class.php.ori 
421,423d420
< /* This line is added so that a different IP address is assigned to each multicast task */
< 	$address = long2ip(ip2long($address)+(( $this->getPortBase() / 2 + 1) % self::getSetting('FOG_MULTICAST_MAX_SESSIONS')));
< /* END OF CHANGE */

This line assigns dynamic multicast IPs to the sessions. To do so, the code uses two parameters of the server: the portbase (this port is created by FOG randomly) and FOG_MULTICAST_MAX_SESSIONS.

You can see the udp-sender commands:

Command: /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.106.12 --portbase 63764 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-ehu-upv-enajenacion/d1p1.img;

Command: /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 3 --max-wait 300 --mcast-data-address 239.0.106.31 --portbase 55994 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-W10-UEFI/d1p1.img;
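The arithmetic of the patched line can be followed with a small Python transliteration. This is a sketch, not FOG code: the function name is made up, and the base address 239.0.106.1 and a FOG_MULTICAST_MAX_SESSIONS value of 64 are assumptions that are not stated in this thread; they were picked because they reproduce the two addresses in the commands above from their port bases.

```python
import ipaddress


def session_address(base_address: str, port_base: int, max_sessions: int) -> str:
    """Mirror of the patched PHP line: base + ((port_base / 2 + 1) % max_sessions)."""
    offset = (port_base // 2 + 1) % max_sessions
    return str(ipaddress.IPv4Address(base_address) + offset)


# Assumed inputs, not confirmed in the thread.
BASE_ADDRESS = "239.0.106.1"
MAX_SESSIONS = 64

# Port bases taken from the two udp-sender commands above.
for port_base in (63764, 55994):
    print(port_base, "->", session_address(BASE_ADDRESS, port_base, MAX_SESSIONS))
```

Since the offset depends only on the port base modulo the session limit, two tasks whose port bases yield the same offset would still share an address; the patch makes that unlikely rather than impossible.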

Jose Cacho

Hi @george1421,

With the change made by my colleague @Fernando-Gietz, which uses a different multicast data address per task, we have improved the throughput when there are several multicast tasks running at the same time. So you can mark this as solved (I can’t find how to do it).

We are now focusing our attention on MySQL tuning, because (as you pointed out) now that the course has started, the polls from the FOG clients on the hosts push our CPUs “into the red zone”.

Just to keep the information in this post: I “remember” (I am back from holiday today) that our colleague from the network team told me about IGMP: v3 can avoid delivering multicast packets from specific sources to networks where there are no interested receivers, but v2 can’t. And, our router is running IGMP v2.

COMPARISON OF IGMPV1, IGMPV2 AND IGMPV3
Understanding difference between IGMPv2 and v3

Many thanks for your excellent support.

george1421

@Jose-Cacho said in Multicast data address not change from one task to another one:

v3 can avoid delivering multicast packets from specific sources to networks where there are no interested receivers, but v2 can’t. And, our router is running IGMP v2.

I’m not a network engineer, but I think that “IGMP Snooping” configured on the switches will supplement IGMP v2, to make it a bit more like v3 by only delivering the multicast stream to the stream subscribers.

george1421

@Jose-Cacho said in Multicast data address not change from one task to another one:

With the change made by my colleague @Fernando-Gietz, which uses a different multicast data address per task, we have improved the throughput when there are several multicast tasks running at the same time. So you can mark this as solved (I can’t find how to do it).

@Developers we might want to consider @Fernando-Gietz patches for the next release of FOG.

Tom Elliott

@Fernando-Gietz said in Multicast data address not change from one task to another one:

This line is added so that a different IP address is assigned to each multicast task

I’ve added the patch, with a little more checking involved. This has been added to both the working and working-1.6 branches. It tests whether the $address variable is set. If it is, it will calculate the address. Here’s the snippet of lines:

if ($address) {
    $address = long2ip(
        ip2long($address) + (
            (
                $this->getPortBase() / 2 + 1
            ) % self::getSetting('FOG_MULTICAST_MAX_SESSIONS')
        )
    );
}

Hopefully this will address the problem people have been seeing and allow the use of multiple sessions.
