    Multicast data address not change from one task to another one

      Jose Cacho @george1421

      @george1421 Ok, bit by bit is easier.

      Can you tell me how many CPU cores are in your VM Host server?

      You are right, they are vCPUs. I have asked about the CPU cores in our VM host server and will come back with the information.

      Also, how many FOG clients do you have installed across your campus that are communicating with this FOG server?

      We have more than 7000 hosts in FOG today. On a given day (according to our last count, a year ago) more than 2500 computers could be switched on at the same time and communicating with this server.

      Just as a question, are you trying (or doing) multicasting across your WAN links?

      Yes, we are multicasting across our WAN links.
      I use the term “campus” to identify the 4 different locations that have dedicated connections to the central CPD (data processing center). So the “opposite end” unicast test was run against one of these locations.

      The IP address is not a factor since it listens to every one. The combination of IP address + port number is what the target computer keys off from. Now you are entering into network switch configuration land.

      I’m sending this information to our network team (I suppose they know about it, but it is better to refresh the concepts). Bearing in mind that I am not an expert (and excuse me in advance if my question is a stupid one): if the sender IP address is the same for two different established multicast sessions on two different VLANs (one multicast task per VLAN), are all the data packets sent to both VLANs, with IGMP snooping getting rid of the unwanted data at the last switch? Or is IGMP smart enough to propagate the route to the subscribers of each session, so that the multicast data is “routed” efficiently from the FOG server’s switch?

      We can deploy to 35 VLANs and for us (for our FOG) all the locations are “the same network”. So I am thinking about some multicast tasks running near the FOG server and others running across the WAN links, with the WAN links (maybe, if our network setup is not correct) getting unnecessarily congested.

      P.S.: Very good stuff. A lot of useful information for me, @george1421.

        Jose Cacho @Jose Cacho

        You are right, they are vCPUs. I have asked about the CPU cores in our VM host server and will come back with the information.

        2 sockets of 6 cores (12 cores) with hyperthreading enabled (24 logical processors). Our FOG server has 8 vCPUs.

          george1421 Moderator @Jose Cacho

          @jose-cacho said in Multicast data address not change from one task to another one:

          2 sockets of 6 cores

          The thing is, with 8 vCPUs allocated to the VM, the hypervisor needs to have 8 of the 12 cores available for the VM to be scheduled to execute. The other factor is how many VMs are on this VM host server. We are getting off the point of your initial post, but my intuition is telling me that 8 vCPUs is too many and you might see better performance with 4 or 6 vCPUs. But at the moment only change one thing at a time.

          When you say 2500 hosts, do all of them have the FOG client installed? If so, what is the check-in time for the FOG client? If it is still set to 60 seconds, change that to 900 (15 minutes). That will dramatically drop the load on the FOG server.
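To put rough numbers on that suggestion, here is a minimal Python sketch of the average check-in rate for 2500 active clients at a 60 second and a 900 second interval. It assumes check-ins are spread evenly over time and counts each check-in as a single request, which is a simplification and not a statement about how the FOG client actually behaves; `checkins_per_second` is a hypothetical helper name.

```python
def checkins_per_second(active_clients: int, checkin_interval_s: int) -> float:
    """Average check-ins per second, assuming they are spread evenly over time."""
    return active_clients / checkin_interval_s


# 2500 active clients, at the 60 s and 900 s intervals mentioned above.
for interval in (60, 900):
    rate = checkins_per_second(2500, interval)
    print(f"{interval:>4} s interval -> {rate:.1f} check-ins/s")
```

Under those assumptions the server goes from roughly 42 check-ins per second to fewer than 3.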

          Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

            Jose Cacho @george1421

            @george1421

            When you say 2500 hosts, do all of them have the FOG client installed?

            Yes, all of them (7000) have the FOG client installed, but 2500 could be polling the server on an ordinary class day. At the moment this is not a problem because all the schools are on summer break. In order to use the task reboot manager for unattended image deploys, we set the check-in time to 180 seconds. (So if a multicast deploy task is sent, the computers have time to reboot and subscribe before the 300-second limit.)

            my intuition is telling me that 8 vCPUs is too many and you might see better performance with 4 or 6 vCPUs. But at the moment only change one thing at a time.

            Ok, 6 vCPUs was the setting until one month ago. Our tasks were running very slowly and I asked for more power on our server (6 to 8 CPUs, 12 to 16 GB RAM). We were aware that it might not be the best choice, but we were “forced” to test it. We didn’t notice much better performance, and the plan is to go back to 6 vCPUs once the classrooms are ready for the new academic year.

            While we are getting off point of your initial post.

            Ok, coming back to the point ;P. I have been talking with a member of our network team and he has given me some general information about our network. Our FOG server is attached to a “CORE” router (10GbE). From this central point there are connections to the four campuses mentioned. I have drawn a sketch map.
            [Image: sketch map of the network]

            Looking at the map, and recalling the benchmark tests, the hosts of the first unicast tests are in Campus 1, and the hosts of the last unicast test (at “the opposite end” 🙂) are in Campus 4.

            So (if my memory serves me correctly), my colleague from the network team has told me that IGMP does not use the port number (only the IP address). And, today, we are not sure whether the router is able to “discard” or “route efficiently” the multicast data so that it only reaches the subscribers of an IP + port number multicast session. (Our most experienced person at tuning multicast on our network is on holiday.)

            I am looking for the cause that doubles the time needed for a multicast task (with the v1.5.2 server not heavily used), compared with the v0.30 server.
            For example, a multicast deploy to a group in “campus 3” took less than 4h in April with v0.30 (about 58GB), but more than 8h yesterday with v1.5.2 (about 67GB).
            (Don’t get me wrong, I know it is very difficult to tune all the settings. In addition, I think our FOG implementation is not an easy one :). So, step by step.)
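The two deploys quoted above can be turned into average throughput figures for an easier comparison. This is only a back-of-the-envelope Python sketch: it treats "less than 4h" and "more than 8h" as exactly 4 and 8 hours, assumes 1 GB = 1024 MB, and uses a hypothetical helper name, `throughput_mb_per_s`.

```python
def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average throughput in MB/s, assuming 1 GB = 1024 MB."""
    return size_gb * 1024 / (hours * 3600)


old = throughput_mb_per_s(58, 4)  # v0.30 deploy: about 58 GB in (less than) 4 h
new = throughput_mb_per_s(67, 8)  # v1.5.2 deploy: about 67 GB in (more than) 8 h

print(f"v0.30 : at least {old:.1f} MB/s")
print(f"v1.5.2: at most  {new:.1f} MB/s")
print(f"ratio : at least {old / new:.2f}x")
```

With those bounds the older server averaged at least about 4.1 MB/s and the newer one at most about 2.4 MB/s, so the slowdown per byte is a little under 2x once the larger image is taken into account.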

            On another note, to add some small test results to the multicast performance problem, we tried the Bitrate option (yes, it seems that when it is set in the “Storage” options it gets added to the udp-sender command):

            • In “campus 1”, a deploy ran at 2.43 GB/min with --max-bitrate 200m vs 4.24 GB/min without it (the second test was run 5 minutes later to the same two hosts)
            root     23705  2180  0 19:44 ?        00:00:00 /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 51604 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
            
            root     31218  2180  8 19:51 ?        00:00:12 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 52262 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
            
            • I have not had the chance to test on the other campuses.
              Jose Cacho

              Here are some images giving an overview of our FOG server load today.
              The active unicast tasks are properly queued when there are more than 10 of them. This setting keeps our unicast tasks performing well.
              But the multicast tasks get quite slow if they are not running “alone” (“one by one”). And, as you can see in the attached images, we can easily reach five (or more) multicast groups at the same time.
              – FOG Overloaded –
              0_1533663868194_000_fogOverloaded.jpg
              0_1533663886074_005_fogAtopOverloaded.jpg

              – FOG Managing overload –
              0_1533663980541_010_fogLoadManaging.jpg
              0_1533664000861_012_fogAtopLoadManaging.jpg 0_1533664011998_014_fogMulticastTasks.jpg
              0_1533664038607_016_fogUdpsender.jpg

              @george1421 Thinking aloud: if the mcast-data-address is not part of the performance problem, the way forward could be to have the multicast tasks queued.

                george1421 Moderator
                last edited by george1421

                @jose-cacho said in Multicast data address not change from one task to another one:
                I’ve been trying to think about how we can best debug this issue.

                At this moment I’m just thinking out loud here: There have been many changes since 0.30. Partclone is now used instead of Partimage, and ZSTD is used as the standard image decompressor (even if gzip is picked for image capture). The FOS kernel (the customized Linux that runs on the target computer) has been updated a hundred times or so. Plus all of the ancillary applications to FOG have been updated. The Linux OS of the FOG host server has been updated.

                On the other side: The VM is running on the same infrastructure as 0.30 instance. The image is taking the same data path between the VM host server and the target computers.

                Well we know we can manually launch the udp-sender application on the FOG server with this command:

                /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.107.1 --portbase 52262 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img

                On the target computer there will be a udp-receiver command that will connect to the multicast stream initiated by the fog server. I don’t know the exact command that FOG is using but it should be close to this

                udp-receiver --file /tmp/pig.tmp --nokbd --portbase 52262 --ttl 32 --mcast-rdv-address 239.0.107.1

                The one thing I did notice is that the ttl is set to 32, so you can’t have more than 32 hops between the sender and receiver. Unless you have a really big campus then this shouldn’t come into play.

                Now if you schedule a debug capture or debug deploy and then PXE boot the target computer, you will be dropped to a Linux command prompt on the target computer, where you can key in commands like udp-receiver.

                ref: https://www.udpcast.linux.lu/cmd.html

                  Jose Cacho @george1421

                  @george1421 Thank you very much for your thoughts and support. I agree with you, there have been many changes since our last version. And we will have to test the udpcast commands to get the best out of them (thanks). But I have some more tests and data.
                  We have run simultaneous “controlled” multicast tasks on different campuses, and the network team has captured the traffic (port mirroring) on one of the computers receiving the multicast. (Please let me know if you don’t understand something in this post. I am not used to writing about networking, and there may be a better way to explain it.)

                  The summary is:

                  1. All the traffic for the same multicast IP address reaches the computer’s NIC. It is not filtered by port.
                  2. As the multicast IP address is the same, the different multicast sessions to a campus are all sent over the same data channel, instead of using another one when the first is already at its maximum throughput. Note that each campus is connected by 2, 3 or 4 aggregated data channels, and the traffic is balanced across them to get the best overall throughput and performance. But the IP address is the key piece of data used to choose the route. So, when one (or more) multicast session is running on a campus, all the multicast data is routed over the same data channel.
                  3. The FOG server’s CPUs go to 100% with only 2 simultaneous multicast tasks on Campus 1: one task of 9 computers and another of 41.
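Point 1 is consistent with how a host joins a multicast group in the first place. The Python sketch below is a generic illustration, not the code udp-receiver uses: the membership request handed to the kernel (the `ip_mreq` structure behind `IP_ADD_MEMBERSHIP`) holds only the group address and a local interface address, with no port field. The port only appears in `bind()`, as a local filter on the receiving host. `membership_request` and `open_receiver` are hypothetical helper names.

```python
import socket
import struct


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq payload: group address + local interface, no port."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_receiver(group: str, port: int) -> socket.socket:
    """Join `group` and listen on `port` (not called here; needs a network)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))  # the port is only a local filter on this host
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


# The join request is the same whichever port base a session uses.
print(membership_request("239.0.107.1").hex())
```

Two sessions on 239.0.107.1 with port bases 51604 and 52262 therefore produce the same group join, so the network has nothing but the group address to forward on.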

                  So, I’m thinking about:
                  A) Could you help us tweak FOG so that each multicast task uses a different IP address?
                  B) (…And thinking aloud) If FOG needs to resend more packets and has to wait on an “overflowing” data channel, could this be the main cause of the CPU consumption?

                  And now, some additional data courtesy of our network team:

                  • With port mirroring and a traffic capture on a computer receiving a multicast, we can see that it receives the data of all the running multicast tasks
                    [Image: traffic capture on the receiving computer]

                  • From https://community.cisco.com/t5/switching/multicast-ports/td-p/854295

                  But you would do well to use different multicast IP addresses for different application because switches will distribute multicast packets according to the IP address (regardless of port).

                  So if you have two applications that use the same IP address but different ports, a machine that is interested in either application will have to listen to both sets of traffic and filter out the port it is not interested in. If they are using different IP addresses, the switch will do that for them.
                  (Actually, it’s a bit more complicated because the switch distributes according to groups of 32 addresses, so there may be some overlap even if the addresses are different … if the addresses fall in the same MAC group.)
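The "groups of 32 addresses" remark comes from the standard mapping of IPv4 multicast addresses onto Ethernet MAC addresses: the fixed prefix 01:00:5e followed by the low 23 bits of the IP address. The remaining 5 variable bits are dropped, so 32 IP groups share each MAC. A small Python sketch of that mapping follows; `multicast_mac` is a hypothetical helper name, and 239.128.107.1 is an address chosen here purely to show a collision.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC (01:00:5e + low 23 bits)."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    mac = (0x01005E << 24) | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


for group in ("239.0.107.1", "239.128.107.1", "239.0.106.12", "239.0.106.31"):
    print(f"{group:<15} -> {multicast_mac(group)}")
```

The first two addresses land on the same MAC, while consecutive addresses inside one /24, such as those a per-session offset would produce, stay distinct.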

                    Fernando Gietz Developer
                    last edited by Fernando Gietz

                    Hi,

                    I changed the multicasttask.class.php file to give different ips in each multicast session, and the performance is better now.

                    One line in /var/www/html/fog/service/multicasttask.class.php:

                    #diff multicasttask.class.php multicasttask.class.php.ori 
                    421,423d420
                    < /* This line is added so that a different IP address is assigned to each multicast task */
                    < 	$address = long2ip(ip2long($address)+(( $this->getPortBase() / 2 + 1) % self::getSetting('FOG_MULTICAST_MAX_SESSIONS')));
                    < /* END OF CHANGE */
                    

                    This line assigns dynamic multicast IPs to the sessions. To do so, the code uses two values from the server: the portbase (this port is generated randomly by FOG) and FOG_MULTICAST_MAX_SESSIONS.

                    You can see the udp-sender commands:

                    Command: /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 239.0.106.12 --portbase 63764 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-ehu-upv-enajenacion/d1p1.img;

                    Command: /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 3 --max-wait 300 --mcast-data-address 239.0.106.31 --portbase 55994 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-W10-UEFI/d1p1.img;
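The arithmetic of the patched line can be checked against the two commands above with a short Python sketch. The base address 239.0.106.1 and a FOG_MULTICAST_MAX_SESSIONS value of 64 are assumptions, not settings stated in this thread; they are used only because they reproduce both addresses shown. `session_address` is a hypothetical name, and `int()` stands in for the integer cast that PHP's `%` operator applies to its operands.

```python
import ipaddress


def session_address(base: str, portbase: int, max_sessions: int) -> str:
    """Mirror of: long2ip(ip2long($address) + ((portbase / 2 + 1) % max_sessions))."""
    offset = int(portbase / 2 + 1) % max_sessions
    return str(ipaddress.IPv4Address(base) + offset)


# Assumed settings: base 239.0.106.1 and 64 max sessions (not confirmed by the thread).
for portbase in (63764, 55994):
    print(portbase, "->", session_address("239.0.106.1", portbase, 64))
```

One property of the formula worth keeping in mind: two port bases that differ by a multiple of 2 × max_sessions (128 under the assumed setting) yield the same offset, so distinct addresses are likely but not guaranteed.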

                      Jose Cacho

                      Hi @george1421,

                      With the change made by my colleague @Fernando-Gietz, which gives each task its own multicast data address, we have improved the throughput when there are several multicast tasks running at the same time. So you can mark this as solved (I can’t find how to do it).

                      We are now focusing our attention on MySQL tuning, because (as you pointed out) now that the school year has started, the polling from the FOG clients on the hosts brings our CPUs “into the red zone”.

                      Just to keep the information in the thread: I “remember” (I am back from holiday today) that our colleague from the network team told me this about IGMP: v3 can avoid delivering multicast packets from specific sources to networks where there are no interested receivers, but v2 can’t. And our router is running IGMP v2.

                      COMPARISON OF IGMPV1, IGMPV2 AND IGMPV3
                      Understanding difference between IGMPv2 and v3

                      Many thanks for your excellent support.

                        george1421 Moderator @Jose Cacho

                        @Jose-Cacho said in Multicast data address not change from one task to another one:

                        v3 can avoid delivering multicast packets from specific sources to networks where there are no interested receivers, but v2 can’t. And, our router is running IGMP v2.

                        I’m not a network engineer, but I think that “IGMP Snooping” configured on the switches will supplement IGMP v2, to make it a bit more like v3 by only delivering the multicast stream to the stream subscribers.

                          george1421 Moderator @Jose Cacho

                          @Jose-Cacho said in Multicast data address not change from one task to another one:

                          With the change made by my workmate @Fernando-Gietz, regarding the use of a multicast data address, we have improved the throughput when there are several multicast tasks running at the same time. So you can mark this as solved (I can’t find how to do it).

                          @Developers we might want to consider @Fernando-Gietz’s patch for the next release of FOG.

                            Tom Elliott
                            last edited by Tom Elliott

                            @Fernando-Gietz said in Multicast data address not change from one task to another one:

                            This line is added so that a different IP address is assigned to each multicast task

                            I’ve added the patch, with a little more checking involved. This has been added to both the working and working-1.6 branches. It tests the value of the $address variable: if this variable is set, the new address is calculated. Here’s the snippet:

                            if ($address) {
                                $address = long2ip(
                                    ip2long($address) + (
                                        (
                                            $this->getPortBase() / 2 + 1
                                        ) % self::getSetting('FOG_MULTICAST_MAX_SESSIONS')
                                    )
                                );
                            }
                            

                            Hopefully this will address the problem people have been seeing and allow the use of multiple sessions.

                            Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

                            Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

                            Copyright © 2012-2024 FOG Project