Posts made by Jose Cacho

Jose Cacho

With the change made by my workmate @Fernando-Gietz regarding the use of a multicast data address, we have improved the throughput when several multicast tasks are running at the same time. So you can mark this as solved (I can’t find how to do it).

We are now focusing our attention on MySQL tuning because, as you pointed out, now that the course has started the polling from the FOG clients on the hosts brings our CPUs “to the red zone”.

Just to keep the information in this thread: I “remember” (I am back from holidays today) that our colleague from the network team told me about IGMP: v3 can avoid delivering multicast packets from specific sources to networks where there are no interested receivers, but v2 can’t. And our router is running IGMP v2.

COMPARISON OF IGMPV1, IGMPV2 AND IGMPV3
Understanding difference between IGMPv2 and v3

Many thanks for your excellent support.

Jose Cacho

@george1421 Thank you very much for your thoughts and support. I agree with you, there have been many changes since our last version. And we will have to test the udpcast commands to get the best out of them (thanks). But I have some more tests and data.
We have run simultaneous “controlled” multicast tasks on different campuses and the network team has captured the traffic (port mirroring) on one of the multicasted computers. (Please let me know if you don’t understand something in this post. I am not used to writing about networking, and there may be a better way to explain it.)

The summary is:

All the traffic for the same multicast IP address reaches the computer’s NIC. It is not filtered by port.
As the multicast IP address is the same, the different mcast sessions to a campus are sent over the same data channel and not over another one (which would take advantage of the other data channels when the first one is at its maximum throughput). Note that our campuses are connected by 2, 3 or 4 different aggregated data channels, and the data is balanced across them to get the best overall throughput and performance. But the IP is the key piece of data for getting it routed properly. So, when one (or more) multicast session is running on a campus, all the multicast data is routed over the same data channel.
The FOG server’s CPUs go to 100% with only 2 simultaneous multicast tasks on Campus 1: one task of 9 computers and another of 41.

So, I’m thinking about:
A) Could you help us tweak FOG so that each multicast task uses a different IP?
B) (…And thinking aloud) If FOG needs to resend more packets and has to wait for “an overflowed” data channel, could this be the main cause of the CPU consumption?
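To make point A concrete, here is a minimal sketch of what “a different IP per task” could look like: each concurrent task gets the base multicast address plus an offset. This is only an illustration and not FOG’s actual code; the function name `session_address` and the idea of indexing tasks from 0 are assumptions made for the example.

```python
import ipaddress


def session_address(base: str, task_index: int) -> str:
    """Return a multicast data address for one task, offset from a base address.

    Hypothetical helper for illustration only; FOG may derive addresses differently.
    """
    addr = ipaddress.IPv4Address(base) + task_index
    if not addr.is_multicast:
        raise ValueError(f"{addr} is not a multicast address")
    return str(addr)


# Three concurrent tasks get three distinct group addresses.
for index in range(3):
    print(session_address("239.0.104.1", index))
```

With distinct group addresses, the network can treat each session as a separate group instead of delivering every session to every receiver.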

And now, some additional data courtesy of our network team:

By port mirroring and capturing the traffic on a multicasted computer, we can see that it receives the data of all the running multicast tasks.
From https://community.cisco.com/t5/switching/multicast-ports/td-p/854295

But you would do well to use different multicast IP addresses for different application because switches will distribute multicast packets according to the IP address (regardless of port).

So if you have two applications that use the same IP address but different ports, a machine that is interested in either application will have to listen to both sets of traffic and filter out the port it is not interested in. If they are using different IP addresses, the switch will do that for them.
(Actually, its a bit more complicated because the switch distributes according to groups of 32 addresses, so there may be some overlap even if the addresses are different … if the addresses fall in the same MAC group.)
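The “groups of 32 addresses” remark in the quote comes from the standard mapping of IPv4 multicast addresses onto Ethernet MAC addresses: only the low 23 bits of the IP are copied into the MAC, so 32 different IPs share each MAC. A small sketch of that mapping (the function name `multicast_mac` is just a name chosen for this example):

```python
import ipaddress


def multicast_mac(ip: str) -> str:
    """Map an IPv4 multicast address to its Ethernet multicast MAC address.

    The MAC is the fixed prefix 01:00:5e followed by the low 23 bits of the IP.
    """
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep only the low 23 bits
    mac = (0x01005E << 24) | low23        # prepend the 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


# Two addresses used by the v0.30 server map to different MACs...
print(multicast_mac("239.0.101.16"), multicast_mac("239.0.101.17"))
# ...but addresses that differ only in the upper bits collide on the same MAC.
print(multicast_mac("239.0.104.1"), multicast_mac("224.128.104.1"))
```

So consecutive addresses such as 239.0.101.16 and 239.0.101.17 do not overlap at the MAC level; the overlap only appears between addresses that differ in the bits that are dropped.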

Jose Cacho

Here are some images giving an overview of our FOG server load today.
The active unicast tasks are properly queued if there are more than 10. This setting keeps our unicast tasks performing well.
But the multicast tasks get quite slow if they are not “alone” (“one by one”). And, as you can see in the attached images, we can easily reach five (or more) multicast groups at the same time.
– FOG Overloaded –

– FOG Managing overload –

@george1421 Thinking aloud: if the mcast-data-address is not part of the performance problem, the way forward could be to get the multicast tasks queued.

Jose Cacho

@george1421

When you say 2500 hosts, do all of them have the FOG client installed?

Yes, all of them (7000) have the FOG client installed. But 2500 could be polling the server on an ordinary class day. At this time it is not a problem because all the schools are on summer break. In order to use the task reboot manager for unattended image deploys, we set the check-in time to 180 seconds. (So if a multicast deploy task is sent, the computers have time to reboot and get subscribed before the 300-second limit.)
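As a rough back-of-the-envelope figure for that polling load, here is a small sketch. It assumes the check-ins are spread evenly over the interval and that each check-in counts as one request; the real client may make several requests per check-in, so treat the result as a lower bound.

```python
def checkins_per_second(clients: int, interval_s: float) -> float:
    """Average check-ins per second if clients are spread evenly over the interval."""
    return clients / interval_s


# Figures from this thread: 180-second check-in time.
for clients in (2500, 7000):
    rate = checkins_per_second(clients, 180)
    print(f"{clients} clients -> about {rate:.1f} check-ins per second")
```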

my intuition is telling me that 8 vCPU is much and you might see better performance with 4 or 6 vCPUs. But at the moment only change one thing at a time.

Ok, 6 vCPUs was the setting until a month ago. We were running very slow tasks and I asked for more power on our server (from 6 to 8 CPUs and from 12 to 16GB RAM). We were aware that it might not be the best option but we were “forced” to test it. We didn’t notice much better performance, and the plan is to go back to 6 vCPUs after the classrooms are ready for the new academic year.

While we are getting off point of your initial post.

Ok, coming back to the point ;P. I have been talking with one of our network team and he has given me some general information about our network. Our FOG server is attached to a “CORE” router (10GbE). From this central point there are connections to the four named campuses. I have drawn a sketch map.
[Image: sketch map of the network]

Looking at the map, and remembering the benchmark tests, the hosts of the first unicast tests are in Campus 1, and the hosts of the last unicast test (at “the opposite end”) are in Campus 4.

So (if my memory serves me correctly), my network workmate has told me that IGMP does not use the port number (only the IP). And, today, we are not sure whether the router is able to “discard” or “route efficiently” the multicast data so that it only reaches the subscribers of an IP + port number multicast session. (Our most experienced colleague in tuning multicast on our network is on holidays.)

I am looking for the cause that doubles the time needed for a multicast task with the v1.5.2 server (not heavily used), compared with the v0.30 server.
For example, a multicast deploy to a group in “campus 3” in April with v0.30 took less than 4h (about 58GB), but yesterday with v1.5.2 it took more than 8h (about 67GB).
(Don’t get me wrong, I know it is very difficult to tune all the settings. In addition, I think our FOG implementation is not an easy one :). So, step by step.)
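To put those two deploys on the same scale, the sizes and times can be turned into average rates. This is just the arithmetic on the figures above: since the v0.30 run took “less than” 4h and the v1.5.2 run “more than” 8h, the numbers are bounds, not exact values.

```python
def gb_per_min(size_gb: float, hours: float) -> float:
    """Average throughput in GB/min for an image of size_gb deployed in the given hours."""
    return size_gb / (hours * 60)


old = gb_per_min(58, 4)  # v0.30: under 4h, so the real rate is at least this
new = gb_per_min(67, 8)  # v1.5.2: over 8h, so the real rate is at most this

print(f"v0.30:  at least {old:.3f} GB/min")
print(f"v1.5.2: at most  {new:.3f} GB/min")
print(f"v0.30 was at least {old / new:.2f} times faster")
```

Taking the image sizes into account, the slowdown is at least about 1.7 times in throughput.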

On another note, to add some small test results to the multicast performance problem, we tried the Bitrate option (yes, it seems that when it is set in the “Storage” options it gets added to the udpsender command):

In “campus 1”, the deploy ran at 2.43GB/min vs 4.24GB/min (the second test was run 5 minutes later, without --max-bitrate 200m, to the same two hosts).

root     23705  2180  0 19:44 ?        00:00:00 /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 51604 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img

root     31218  2180  8 19:51 ?        00:00:12 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 52262 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img

I have not had the chance to test on the other “campuses”.

Jose Cacho

You are right, they are vCPUs. I’m asking about the CPU cores in our VM host server and will come back with the information.

The VM host has 2 sockets of 6 cores (12 processors) with hyperthreading enabled (24 logical processors). Our FOG server has 8 vCPUs.

Jose Cacho

@george1421 Ok, bit by bit is easier.

Can you tell me how many CPU cores are in your VM Host server?

You are right, they are vCPUs. I’m asking about the CPU cores in our VM host server and will come back with the information.

Also how many fog clients you have installed across your campus that is communicating with this FOG server?

We have more than 7000 hosts on FOG today. And daily (by our last count, a year ago) there could be more than 2500 computers (switched on at the same time) communicating with this server.

Just as a question, are you trying (or doing) multicasting across your WAN links?

Yes, we are multicasting across our WAN links.
I use the term campus to identify the 4 different locations that have dedicated connections to the central data center (CPD). So the “opposite end” unicast test was done to one of these locations.

The IP address is not a factor since it listens to every one. The combination of IP address + port number is what the target computer keys off from. Now you are entering into network switch configuration land.

I’m sending this information to our network team (I suppose they know about it, but it is better to refresh the concepts). But, bearing in mind that I am not an expert (and excuse me in advance if my question is a stupid one): if two different established multicast sessions on two different VLANs (one multicast task per VLAN) use the same sender IP address, are all the data packets sent to both VLANs, with the IGMP snooping code getting rid of the data on the last switch? Or is IGMP smart enough to propagate the route to the subscribers of each session, so that the multicast data is “routed” efficiently from the FOG server switch?

We can deploy to 35 VLANs and for us (for our FOG) all the locations are “the same network”. So I am thinking about some multicast tasks running near the FOG server and others running across the WAN links, and (maybe, if our network setup is not correct) the WAN links getting unnecessarily congested.

P.S.: Very good stuff. A lot of useful information for me, @george1421.

Jose Cacho

Ok, thank you for the response.
About the server v1.5.2:
It is a new VM running RHEL 7.5.
8 CPU 2.67GHz, 16 GB RAM.
It has a 10GbE network (shared with some other VMs, but these days there is not much traffic).
The original version 0.3 is running (on RHEL 5.1) on the same VM host.

Sorry, I can’t upgrade to v1.5.4 now (we are preparing our classrooms for the next course).
We have another test server, and I’m asking whether any other “work in progress” is running on it so that we can upgrade it.

Benchmarks on our production server (v1.5.2):

for a single unicast image I could get above 12GB/min (HP 800 G2 CPU i7 client with SSD and 1GB network - U030688)
for 2 unicast streams, above 9GB/min (same image and host models - U030688/U030731)
for two unicast streams at the opposite end** of our network, above 1GB/min (about 1.3GB/min, same model, other image - U030568/U030569).

** “Opposite end” is a difficult term to define in our network. We are deploying from one server on our campus to three other campuses with links at 3, 6 and 10 GB. The test was done to two hosts on the 6GB link (I think so).

In our FOG server 1.5.2 I’ve added the suggested settings of 256MB for php-fpm and I was thinking about adding the bitrate in the Storage configuration (as you can see in the udpsender command lines, it was used on our prior 0.3 version). If we could do some tweaks without upgrading the version, it would be easier for us.

The question about the sender IP address comes from our network side, because (please correct me if necessary) all the hosts in any multicast session would receive the packets of every running multicast task. Then each computer would discard the data not meant for it. So, under heavy load, it could result in lost packets due to saturation of the lower-bandwidth network links.

Jose Cacho

Hello,
We have some performance problems in multicast with FOG v1.5.2 (tested on v1.5.4 too). Our FOG serves images to different VLANs and can be used by a group of more than 20 technicians. We have realized that the multicast throughput drops dramatically (40MB/min,…10MB/min) when more than one multicast task is running at the same time.
As we have recently migrated from v0.30, where multicast was stable and faster, we have noticed that v0.30 changes the udpsender IP address for each task but this does not happen with v1.5.2.

Version 0.30 udpsender running **with 3 active multicast tasks**:
[root@fog5 ~]# ps -eaf | grep udp
root     19399 24835  5 14:10 ?        00:19:11 /usr/local/sbin/udp-sender --file /images/aula-sc-ingtecnica-ss-matematicas2-docencia/d1p1.pcz --min-receivers 21 --portbase 63124 --interface eth0 --max-wait 300 --half-duplex --ttl 32 --nokbd --mcast-data-address 239.0.101.14 --max-bitrate 200m --start-timeout 3600
root     22331 24835  0 20:24 ?        00:00:00 /usr/local/sbin/udp-sender --file /images/aula-ehu-upv-enajenacion/d1p1.pcz --min-receivers 2 --portbase 63128 --interface eth0 --max-wait 300 --half-duplex --ttl 32 --nokbd --mcast-data-address 239.0.101.16 --max-bitrate 200m --start-timeout 3600
root     22338 24835  0 20:24 ?        00:00:00 /usr/local/sbin/udp-sender --file /images/aula-ehu-upv-enajenacion/d1p1.pcz --min-receivers 2 --portbase 63130 --interface eth0 --max-wait 300 --half-duplex --ttl 32 --nokbd --mcast-data-address 239.0.101.17 --max-bitrate 200m --start-timeout 3600
root     22350 16333  0 20:24 pts/2    00:00:00 grep udp
[root@fog5 ~]#
 
Version 1.5.2 udpsender running **with 2 active multicast tasks**:
	
[root@fog7 ]# ps -eaf | grep udp
root     18006 10428  0 20:44 ?        00:00:00 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.104.1 --portbase 51712 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root     18084 10428  0 20:44 ?        00:00:00 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.104.1 --portbase 65366 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root     18136  1247  0 20:45 pts/0    00:00:00 grep --color=auto udp
[root@fog7 ]#

Could it be one of the causes of the performance problem?
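For anyone who wants to check their own server for the same pattern, here is a minimal sketch that groups the running udp-sender processes by their `--mcast-data-address` and reports any address shared by more than one session. The sample text is the v1.5.2 output above; the function name `sessions_by_address` is just a name chosen for this example, and in practice the text would come from `ps -eaf`.

```python
import re
from collections import defaultdict

# Sample input: the v1.5.2 "ps -eaf | grep udp" output from this post.
PS_OUTPUT = """\
root     18006 10428  0 20:44 ?        00:00:00 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.104.1 --portbase 51712 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root     18084 10428  0 20:44 ?        00:00:00 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.104.1 --portbase 65366 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root     18136  1247  0 20:45 pts/0    00:00:00 grep --color=auto udp
"""


def sessions_by_address(ps_text: str) -> dict:
    """Group udp-sender port bases by multicast data address."""
    groups = defaultdict(list)
    for line in ps_text.splitlines():
        if "udp-sender" not in line:
            continue  # skip the grep process and anything unrelated
        addr = re.search(r"--mcast-data-address\s+(\S+)", line)
        port = re.search(r"--portbase\s+(\d+)", line)
        if addr and port:
            groups[addr.group(1)].append(int(port.group(1)))
    return dict(groups)


# Addresses used by more than one concurrent session.
shared = {a: p for a, p in sessions_by_address(PS_OUTPUT).items() if len(p) > 1}
print(shared)
```

Run against the v0.30 output, the same check would report nothing, because each of the three sessions there has its own address.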