    Multicast randomly hangs

    • Leonux

      Hi

      I have been testing 1.0.0 and now 1.0.1, and I congratulate the development team on the hard work. There are nice improvements at the interface level and also support for new features.

      I’m using Ubuntu Server 14.04 with kernel 3.13.0-24-generic, but I’m having problems with multicast. When I send the multicast task to a group with 3 clients, sometimes they all hang when starting the first partition (sda1), and other times they hang at random when starting a later partition (sda3 or sda4). I have been watching multicast.log.udpcast.16 but can’t find the source of the problem.

      My HD partitions:

      http://s27.postimg.org/txiu5tv7j/partitions.jpg

      Screen where multicast hangs (I think it could be at any partition, sda1 or sda2 and so on):

      http://s27.postimg.org/67teh4wu7/hangs.jpg

      multicast.log.udpcast.16 when it hung:

      http://s27.postimg.org/dp2lwcmdb/log.jpg

      I have many partitions, I know, but I already used the same HD configuration with a previous version of FOG (0.32) and never had this issue before.

      I selected the Windows 7 OS option and also Multiple Partition - Single Disk. The image works fine for a single-machine download; I only get the problem when downloading using multicast.

      Thank you.

      • Jaymes Driver (Developer)

        I can’t see your images; they are so tiny 😞

        Please upload your multicast log.

        I don’t like multicast… but that is because I think that unicast does imaging so well! I don’t like setting a group of 30 machines to image, only to find that just 2 completed and that I have to image the rest again in the morning because a client fell out in the middle of imaging.

        I prefer unicast. I can raise the number of machines I send to and send the same image to all of the hosts at the same time, in much the same fashion as multicast does, except that each machine politely waits its turn and finishes in its own time, instead of all the machines being held at the SAME download checkpoint until every one of them has reached it.

        I just feel that unicast is a better system.

        Now that I got to vent I would be happy to help you solve your multicast woes, but you can always use unicast to circumvent the problem until we resolve the issue permanently.

        WARNING TO USERS: My comments are written completely devoid of emotion, do not mistake my concise to the point manner as a personal insult or attack.

        • Leonux

          Hi and thank you for the quick response.

          I uploaded the 3 images and the multicast logs.

          Thank you

          Attachments: hangs.jpg (/_imported_xf_attachments/0/777_hangs.jpg), log.jpg (/_imported_xf_attachments/0/778_log.jpg), partitions.jpg (/_imported_xf_attachments/0/779_partitions.jpg), multicast.log.zip (/_imported_xf_attachments/0/780_multicast.log.zip)

          • Jaymes Driver (Developer)

            Thank you. Is there any possibility that your switchgear is stopping the multicast?

            I say this because the stopping point isn’t consistent, so I wonder whether the load you are putting on the switch is too much for it to handle. Can you put the client machines and the FOG machine on a hub for testing purposes?

              • Leonux

              I’m sorry, but I don’t have any hubs 😞

              I already used this Nortel switch in other tests I made with a previous version of FOG (0.32 or 0.33) and never had any problems with multicast 😞

              I have tried with a different image with only 2 or partitions but the problem remains …

              I don’t know if this information is important, but I was on version 1.0.0 and then upgraded to version 1.0.1.

              • Jaymes Driver (Developer)

                Hmm, okay. If you used your switches to image in the past, multicasting should still work: the multicast manager is still the same, and I don’t believe we made any changes to it. Let me get a test going and I will see if I can replicate the issue!

                • Leonux

                  OK, thanks. Was partclone also used in the previous versions of FOG?

                  It seems to me that some kind of syncing is failing. I suppose there is a trigger by which the server “knows” that all the machines belonging to a multicast session are ready to receive; either that fails, or the server detects a machine as disconnected when it isn’t.

                  The multicast never stops in the middle of a partition, or at least I haven’t seen that. It only hangs when the process is starting to replicate a new partition.

                  • kingofl337

                    I am having the same problem with multicast. It gets to the Partclone screen and hangs.
                    I also don’t see any sign of the multicast request in the log: both computers started up
                    to multicast, but I can’t find a record of it in the multicast log, /opt/fog/log/multicast.log.

                    Version: Ubuntu Server 14.01
                    FOG 1.0.1

                    Multicast was previously working.

                    • Tom Elliott

                      Is the FOGMulticastManager service actually running? And on the FOG server, is the udp-sender task actually running?

                      Try restarting the FOGMulticastManager service:
                      [code]service FOGMulticastManager restart[/code]

                      To check whether udp-sender is actually running, run:
                      [code]ps -ef|grep udp-sender[/code]
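
One caveat with the check above: the grep command can show up in its own results, so a single matching line does not always mean udp-sender is alive. A minimal sketch of a stricter check follows; the helper name count_udp_senders is made up for this example and is not part of FOG.

```shell
# Count udp-sender processes in `ps -ef`-style output read from stdin.
# count_udp_senders is a hypothetical helper name, not a FOG command.
count_udp_senders() {
    # Ignore the line that belongs to the grep command itself.
    awk '/udp-sender/ && !/grep/ { n++ } END { print n + 0 }'
}

# Typical use on the FOG server:
#   ps -ef | count_udp_senders
```

A result of 0 while clients are waiting at the Partclone screen would suggest that the sender never started.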

                      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG! Get in contact with me (chat bubble in the top right corner) if you want to join in.

                      Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

                      Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

                      • domii666

                        same problem…

                        • Leonux

                          Hi, and a good week to everyone,

                          I just checked whether udp-sender was running, and it wasn’t. I restarted FOGMulticastManager, restarted the clients, and multicast is working. Let’s see if all goes well until the end, since I have many partitions.

                          Last week I also managed to get multicast to start, but it still hung between partitions.

                          I will keep you updated in 1 hour or so.

                          • Leonux

                            Both client machines finished all partitions.

                            I have one client, let’s call it client1, with first boot = Network, and another, client2, with first boot = HD. After all partitions were restored, client2 booted to Windows and changed its hostname accordingly, but client1 entered the multicast task again and hung on the last partition. I don’t understand how this happened…

                            Now I’m running the same task on 3 clients. It has just started; let’s see how it goes.

                            • domii666

                              I get back “unrecognized service” after typing “sudo service FOGMulticastManager restart”.

                              greez

                              • Leonux

                                That’s because the service isn’t running. Try [CODE]sudo service FOGMulticastManager start[/CODE]

                                On Ubuntu I normally use “sudo service <servicename> start/stop”, but for some reason “sudo /etc/init.d/<servicename> start/stop” seems to behave somewhat differently, and better with some services!

                                • domii666

                                  I get “fail”.

                                  • Tom Elliott

                                    Try rerunning the installer.

                                      • Leonux

                                      I have been doing some more tests on this issue.

                                      I started my normal multicast task for a group with 3 clients. They all reached the “Starting to restore image…” window, but cloning doesn’t start.

                                      udp-sender is up and running. The problem seems to be that, while I only have 3 clients in the group, --min-receivers was 5; I suppose it has to be 3.

                                      I canceled the task and ran it again, and this time it started with --min-receivers 3. I’m going to wait for the cloning process to finish, just to confirm that it doesn’t hang between partitions.
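
For anyone who wants to check the same thing, here is a rough sketch of how the --min-receivers value could be pulled out of a udp-sender command line and compared with the group size. The function names min_receivers_of and receivers_match are invented for this example; only the --min-receivers flag comes from the log.

```shell
# Print the number that follows --min-receivers in a command line string.
# min_receivers_of is a hypothetical helper, not part of FOG or udpcast.
min_receivers_of() {
    printf '%s\n' "$1" | sed -n 's/.*--min-receivers[ =]\{1,\}\([0-9]\{1,\}\).*/\1/p'
}

# Succeed only when that number equals the expected client count ($2).
receivers_match() {
    [ "$(min_receivers_of "$1")" = "$2" ]
}

# Possible use on the FOG server, for a group of 3 clients:
#   receivers_match "$(ps -eo args | grep '[u]dp-sender')" 3 || echo "mismatch"
```

If the value is larger than the number of clients that actually join, the sender would be expected to keep waiting, which matches the hang described above.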

                                      • Leonux

                                        The task finished fine yesterday.
                                        But since --min-receivers is somehow not always equal to the number of members in the multicast group, today I:
                                        1. Uninstalled FOG, removed its folders and services, and erased the fog database.
                                        2. Checked out a new revision from https://svn.code.sf.net/p/freeghost/code/trunk
                                        3. Did a clean install of FOG.
                                        4. Registered 3 clients.
                                        5. Created 2 groups, one with 2 clients and the other with only one client.
                                        6. Started a multicast task for the 2-client group.
                                        The task started but hung at the beginning of the second partition (sda2).
                                        I uploaded the multicast.log; as you can see, --min-receivers somehow changed to 4, when initially it was 2.
                                        Thanks

                                        Attachment: multicast.log.zip (/_imported_xf_attachments/0/799_multicast.log.zip)

                                        • Felipe Solari

                                          I don’t know if you have got it working by now, but here is how I solved it.
                                          I wrote another thread about this in BUGS.
                                          It has to do with the way the fog.download script counts partitions.
                                          If you really need multicast with multiple partitions:

                                          • decompress init.xz (the same goes for init32.xz) with xz -d init.xz
                                          • mount the init file on a loop device with: mount -o loop init sometempdir/
                                          • go to sometempdir/bin with cd
                                          • edit fog.download and search for the part that does the multicast write for your “method” (mps or mpa)
                                          • look for the part that loops over each partition, and fix it so that it checks for the existence of the file
                                            (something like if [ ! -f "$imgpart" ]; then echo "Partition file missing …jumping"; sleep 1; else writeMulticastImage; fi )
                                            (look at the “not multicast” lines, or at the previous multicast method, the one for the plain Linux type)
                                          • save your file, go back to the directory that holds the init file, and unmount it (umount sometempdir/)
                                          • compress it again with xz -C crc32 init
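
The existence check from the steps above could look roughly like this. It is only a sketch: the real loop inside fog.download is not reproduced here, writeMulticastImage is replaced by a stand-in that just prints, and restore_partitions is a made-up wrapper name.

```shell
# Stand-in for the real restore step in fog.download; it only reports
# which file it would stream (assumption: the real one takes no such care).
writeMulticastImage() {
    echo "restoring $1"
}

# Walk the expected partition files and skip any that do not exist,
# instead of waiting on a multicast stream the server never sends.
# restore_partitions is a hypothetical name used only in this sketch.
restore_partitions() {
    for imgpart in "$@"; do
        if [ ! -f "$imgpart" ]; then
            echo "Partition file missing, skipping: $imgpart"
        else
            writeMulticastImage "$imgpart"
        fi
    done
}
```

The point of the guard is that client and server stay in agreement about how many streams there will be: the client only waits for partitions that have an image file.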

                                          Queue the task again to try it.

                                          You should briefly see the message “Partition file …”, and the correct files should then be streamed to the partclone/partimage program.

                                          Another related bug is in MulticastTask.class.php: if you have 10 or more partitions, you need natsort() instead of sort().
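
To see why the ordering matters once there are 10 or more partitions, here is a small shell analogy (not the PHP code itself). The file names are made up for illustration, and -V is GNU sort's version sort, used here in place of PHP's natsort().

```shell
# Hypothetical partition file names, one per line.
partition_names() {
    printf '%s\n' d1p1.img d1p2.img d1p10.img
}

# Plain byte-wise sort: d1p10.img lands before d1p2.img.
lexical_order() {
    partition_names | LC_ALL=C sort
}

# GNU sort's version sort keeps the embedded numbers in natural order.
natural_order() {
    partition_names | LC_ALL=C sort -V
}
```

With the plain ordering, the server would send partition 10 while the clients expect partition 2.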

                                          You may need a Linux shell programmer (not necessarily a very experienced one) to help.

                                          P.S. Also be sure to have the php-process extension for PHP installed, since killing multicast tasks uses posix_ functions. (That is for a CentOS / Red Hat server; on Ubuntu Server I have not tested or looked for the equivalent.)

                                          Copyright © 2012-2024 FOG Project