    full registration hangs at bzimage

      christian99x

      Thanks for the quick answer!

      Please post a picture of the pfSense settings here.

      It’s rather large…

      https://screenshots.firefoxusercontent.com/images/e08efdf4-40ed-4dde-9146-163a1291f95a.png

      Anything else you need to know?

      You need to know about things like DHCP, option 66 aka next-server and option 67 aka filename. Read up on this stuff, I’d suggest. The internet is full of great explanations on that stuff.

      Yes, I will definitely!

      So the issue is specific to that mainboard I’d say. Please boot up your Linux Mint and run lspci -nn | grep net. Post what you get on the screen here.

      03:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller [11ab:4362] (rev 20)

      Nope, different issue! Probably best if you open a complete new thread on this so we don’t mix up things! It’s a lot easier for everyone to follow if we don’t discuss several issues in one thread.

      I will try to solve this on my own first …

        Sebastian Roth Moderator

        @christian99x said:

        It’s rather large…
        https://screenshots.firefoxusercontent.com/images/e08efdf4-40ed-4dde-9146-163a1291f95a.png

        Looking good as far as I see. The last couple of settings above the Save button are important for PXE booting. “Next Server” (option 66), “Default BIOS file name” (option 67 for legacy BIOS systems). Seems fine.

        03:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller [11ab:4362] (rev 20)

        Ok, this is valuable information I suppose, though a quick search hasn’t brought anything up yet. The iPXE website even says this NIC is supported. And yes, we see it does kind of work, but as soon as it starts loading a big file over HTTP it seems to hang. I think we need to do a packet dump to see if there are network (congestion) errors happening. Get your client ready but don’t start it yet. On your FOG server install tcpdump (apt-get/yum install tcpdump). Then run the following command, substituting x.x.x.x with the client’s IP address: tcpdump -w /tmp/boot_issue.pcap host x.x.x.x

        Now boot up the client and wait until it hangs at 5% or whatever. Wait another 10-20 seconds and then stop tcpdump on the FOG server (Ctrl+C). Upload the /tmp/boot_issue.pcap file to your Dropbox/Google Drive and post a link here so we can check it out.

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

          christian99x

          https://www.dropbox.com/s/cwhgiscanusgx6o/boot_issue.pcap?dl=0

            george1421 Moderator @christian99x

            @christian99x Looking at the pcap we see it transfer undionly.kpxe without issue to the last block. At this point you should see the iPXE boot menu.

            I think where you are getting stuck at XX% is when FOS loads. In your pcap I’m not seeing the request for bzImage.

            Could you take a second computer and a small hub, or a small switch with a mirror port (like the SLM2008), and capture the traffic actually going in and out of the target computer? We really need to see the entire PXE booting conversation here. The tcpdumps from the FOG perspective only tell us what the FOG server is doing. We need to see, from the target computer’s perspective, what is getting to the target from the FOG server, DHCP server, tftpboot, etc.

            I know we are asking a lot here. Something abnormal is causing this to fail, at least compared to what we’ve seen historically.

            Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

              Sebastian Roth Moderator

              @christian99x Is this packet dump somehow being filtered after capturing? The only thing I see is TFTP and ARP traffic. Missing is DHCP (should at least see broadcasts) and HTTP packets.

              So either those were filtered out or your network is way more complex. Possibly DHCP server, client and FOG server are in three different network segments. That way we wouldn’t see the DHCP messages when capturing packets on the FOG server. But then… where are the HTTP packets? Maybe you filtered to only show UDP packets??
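
A quick way to check which protocols a capture actually contains is to count packets per protocol. The following is only a rough sketch, assuming the file is in the classic pcap format (what tcpdump -w normally writes), the link type is Ethernet and the traffic is IPv4. The function names classify_pcap and classify_frame are made up for illustration. Classification is by port number only, so treat the result as a hint: TFTP data packets move to ephemeral ports after the initial request to port 69 and end up counted as UDP-other.

```python
import struct
from collections import Counter


def classify_frame(frame: bytes) -> str:
    """Label one Ethernet frame by protocol, using port numbers as a heuristic."""
    if len(frame) < 14:
        return "other"
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype == 0x0806:
        return "ARP"
    if ethertype != 0x0800 or len(frame) < 34:
        return "other"  # not IPv4, or too short to hold an IPv4 header
    ihl = (frame[14] & 0x0F) * 4   # IPv4 header length in bytes
    proto = frame[23]              # IPv4 protocol field
    l4 = 14 + ihl                  # start of the TCP/UDP header
    if len(frame) < l4 + 4:
        return "other"
    ports = set(struct.unpack_from("!HH", frame, l4))
    if proto == 17:                # UDP
        if ports & {67, 68}:
            return "DHCP"
        if 69 in ports:
            return "TFTP"          # only the initial request uses port 69
        return "UDP-other"
    if proto == 6:                 # TCP
        return "HTTP" if 80 in ports else "TCP-other"
    return "other"


def classify_pcap(data: bytes) -> Counter:
    """Count packets per protocol label in a classic pcap file."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")
    counts = Counter()
    offset = 24                    # skip the 24-byte global header
    while offset + 16 <= len(data):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", data, offset)
        offset += 16
        counts[classify_frame(data[offset:offset + incl_len])] += 1
        offset += incl_len
    return counts
```

Usage would be something like classify_pcap(open("/tmp/boot_issue.pcap", "rb").read()). If the result shows no DHCP and no HTTP entries at all, the capture itself is incomplete rather than the boot process.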

                christian99x

                Is this packet dump somehow being filtered after capturing?

                No - I used the command you mentioned and uploaded the original file

                Possibly DHCP server, client and FOG server are in three different network segments.

                Not that I know, though I did not set it up…

                Could you take a second computer and a small hub, or a small switch with a mirror port (like the SLM2008), and capture the traffic actually going in and out of the target computer?

                I’ve never done anything like this and only have a rough idea of how to do it. I would really appreciate it if you could point me in the right direction.

                  christian99x

                  If I cancel with Ctrl+C and run ifstat in the iPXE shell, I receive the following message(s):

                  net0: 00:1a:92:9e:10:e1 using undionly on 0000:03:00.0 (open)
                  [Link:up, TX:150 TXE:1 RX:276 RXE:9]
                  [TXE: 1 x “Network unreachable (http://ipxe.org/28086011)”]
                  [RXE: 4 x “Operation not supported (http://ipxe.org/3c3f6303)”]
                  [RXE: 4 x “Error 0x42306001 (http://ipxe.org/42306001)”]
                  [RXE: 1 x “Invalid argument (http://ipxe.org/1c056002)”]
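
For anyone comparing several machines, the ifstat output above can be pulled apart with a few lines of Python. This is just a sketch based on the output format shown in this post; the function names parse_ifstat_counters and parse_ifstat_errors are made up, and the regular expressions may need adjusting for other iPXE builds.

```python
import re


def parse_ifstat_counters(text: str) -> dict:
    """Extract the TX/TXE/RX/RXE counters from iPXE ifstat output."""
    m = re.search(r"TX:(\d+)\s+TXE:(\d+)\s+RX:(\d+)\s+RXE:(\d+)", text)
    if m is None:
        raise ValueError("no counter line found")
    tx, txe, rx, rxe = (int(g) for g in m.groups())
    return {"tx": tx, "txe": txe, "rx": rx, "rxe": rxe}


def parse_ifstat_errors(text: str) -> list:
    """Return (direction, count, message, url) for each error detail line."""
    # Accept both straight and typographic quotes, since forum software
    # tends to convert them.
    pattern = r'\[(TXE|RXE):\s*(\d+)\s*x\s*["“](.*?)\s*\((\S+?)\)["”]\]'
    return [(d, int(n), msg, url) for d, n, msg, url in re.findall(pattern, text)]
```

In the output above the RXE detail lines add up to 4 + 4 + 1 = 9, which matches the RXE:9 counter, so the listing is complete.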

                    Sebastian Roth Moderator

                    @christian99x said in full registration hangs at bzimage:

                    [Link:up, TX:150 TXE:1 RX:276 RXE:9]

                    Looks good! TX and RX have reasonable numbers. Don’t worry about the TXE / RXE counts, those are not problematic errors.

                    I keep wondering why we don’t see the HTTP traffic in the packet dump?!

                      christian99x

                      I did another attempt with tcpdump and now I can see at least some HTTP traffic - maybe this helps:

                      https://www.dropbox.com/s/4sl1rog18suypmu/bootissue.pcap?dl=0

                      After ctrl-c tcpdump told me:

                      tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
                      97 packets captured
                      97 packets received by filter
                      0 packets dropped by kernel
                      67 packets dropped by interface

                      …still working on the mirror port capture…

                      Thanks!

                        Sebastian Roth Moderator

                        @christian99x Now we see a lot more in the packet dump! Yeah. So it does request boot.php, which is transferred just fine. Next is bg.png. Here we already see some first TCP retransmission packets, though it seems to finish properly. Then the bzImage transfer begins and seems OK for the first couple of data and acknowledge packets going back and forth. But within the first microseconds the transfer seems to stall completely. From my point of view this is because the client machine does not acknowledge the packets anymore. The interesting thing is that we see ACKs from the client 15, 30 and 45 seconds after the stall. So it kind of seems that the client is not “dead”.

                        Unfortunately there is not much we can do for you I think. I’d need access to such a machine and a lot of time to debug what is causing the network stall. It’s a driver issue within iPXE I reckon.

                        But to be sure we’d need to rule out other things. Can you try connecting the FOG server and this one client using a dumb mini switch or even a crossover cable? Does it do the same thing?
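
The stall pattern described above (a burst of packets, then client ACKs only 15, 30 and 45 seconds later) can be spotted by looking for large gaps between packet timestamps. A minimal sketch, assuming the timestamps (in seconds) have already been exported from the capture; the function name find_stalls and the one-second default threshold are arbitrary choices for illustration.

```python
from typing import List, Tuple


def find_stalls(timestamps: List[float], threshold: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) pairs of consecutive packets more than `threshold` seconds apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]
```

With timestamps like 0.0, 0.001, 0.002, 15.0, 30.0, 45.0 this reports three gaps, the first one starting at 0.002, which is where the transfer stopped flowing.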

                          Quazz Moderator

                          It might be worthwhile to try a different boot file (eg ipxe.pxe instead of undionly.kpxe) as well.

                            christian99x

                            @quazz said in full registration hangs at bzimage:

                            It might be worthwhile to try a different boot file (eg ipxe.pxe instead of undionly.kpxe) as well.

                            Changing the boot file to ipxe.pxe did the trick! The host performed the full registration successfully without any errors!

                            Should we continue? It will take me some time to do a mirror port capture (but I’m definitely okay doing this)

                              Sebastian Roth Moderator

                              @christian99x said:

                              Should we continue? It will take me some time to do a mirror port capture (but I’m definitely okay doing this)

                              Don’t worry about the mirror port. It’s definitely fine to use ipxe.pxe if it works for you. Some boot binaries work better (or only work at all) on certain hardware. Just see if you can boot all your hardware using ipxe.pxe. If so, just stick to that. We default to undionly.kpxe because that causes the fewest issues. But as we see, there are pieces of hardware around that don’t like the UNDI driver stuff.

                              @Quazz Thanks heaps for mentioning the other binaries. I had thought about this as well but forgot to mention it in my last post.

                              Copyright © 2012-2024 FOG Project