
rcu_sched self detected stall on CPU when Deploying

FOG Problems
  • W
    Wolfbane8653 Developer
    last edited by Wolfbane8653 Oct 14, 2019, 9:30 AM Oct 14, 2019, 2:06 PM

    FOG v1.5.4
    Kernel 4.19.64
    Device: HP DC7800 (YES OLD!)

    When deploying an image, “rcu_sched self detected stall on CPU” is displayed and some CPU statistics are shown. I figured it was a bad motherboard or CPU, so I replaced the entire board, and the same issue still occurred. I tried 3 other machines of the same model with no luck. The thread “rcu_sched self detected stall on CPU when capture” gave me the info I needed, but this apparently has not been fixed.

    Is this due to a conflict between the age of the hardware and the new kernel updates?

    This does not affect my other hardware at all; they all imaged perfectly with Kernel 4.19.64. Only the dc7800s have the issue.

    FYI:
    Kernel 4.19.64 64 – not working
    Kernel 4.19.48 64 – not working
    Kernel 4.19.36 64 – not working
    Kernel 4.19.6 64 – need to test
    Kernel 4.19.1 64 – need to test
    …
    Kernel 4.17.0 64 – not working
    Kernel 4.16.6 64 – works
    Kernel 4.15.2 64 – works

    HP Compaq dc7800p Small Form Factor
    Intel® Core™2 Duo CPU E6750 @ 2.66GHz
    6GB RAM
    160 GB HDD – WDC WD1600AAJS-00B4A0

    BIOS Version 786F1 v01.04 (later updated to v01.35)
    Motherboard 0AA8h

    2019-10-14_10-05-49.png

    • G
      george1421 Moderator
      last edited by Oct 14, 2019, 2:19 PM

      For this specific host, manually register it, then in the host definition add acpi=off to the kernel args field. Let's see where that gets us.

      Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

      • Q
        Quazz Moderator
        last edited by Oct 14, 2019, 2:24 PM

        The original problem in that thread was solved afaik.

        That said, an rcu_sched stall can be caused by many different things, and your case seems different.

        Since it’s directly preceded by ACPI errors, disabling ACPI as George suggested is a good start.

        • W
          Wolfbane8653 Developer
          last edited by Oct 14, 2019, 2:33 PM

          FYI 4.16.6 64 – works

          • Reset kernels back to v4.19.64
          • Deleted host from FOG database
          • Manual registration
          • Set Host Kernel Arguments to acpi=off
          • Set to Deploy
          • Kernel panic
            d59d6733-31fb-4df5-9ce3-1bd9da3c4cb8-image.png
          • Q
            Quazz Moderator @Wolfbane8653
            last edited by Oct 14, 2019, 2:36 PM

            @Wolfbane8653 Can you share a picture of the result with acpi=off?

            Also can you share the specifications of the system?

            • W
              Wolfbane8653 Developer
              last edited by Oct 14, 2019, 2:40 PM

              Kernel 4.17.0 64 – does not work.

              • Kernel 4.19.64 with acpi=off
                49df2ac0-baf5-4aa8-9c3c-99374a3a1a54-image.png
              • W
                Wolfbane8653 Developer
                last edited by Oct 14, 2019, 2:45 PM

                @Quazz –

                HP Compaq dc7800p Small Form Factor
                Intel® Core™2 Duo CPU E6750 @ 2.66GHz
                6GB RAM
                160 GB HDD – WDC WD1600AAJS-00B4A0

                BIOS Version786F1 v01.04
                Motherboard 0AA8h

                • G
                  george1421 Moderator @Wolfbane8653
                  last edited by Oct 14, 2019, 2:47 PM

                  @Wolfbane8653 Also make sure the firmware is updated on this target computer.

                  We just ran through debugging this issue with a new Intel Platinum processor. That processor had an issue with the number of cores being capped at 8. In this case the processor is old, so we should not be hitting that bug here.

                  • Q
                    Quazz Moderator @Wolfbane8653
                    last edited by Oct 14, 2019, 2:52 PM

                    @Wolfbane8653 A BIOS update may be required.

                    Other than that, trying out this kernel here might be interesting: https://drive.google.com/open?id=1ZiRWrrN3dv26bLwW8GAEdLtzGw5xkyQI

                    • W
                      Wolfbane8653 Developer
                      last edited by Oct 14, 2019, 3:29 PM

                      BIOS updated to v1.35 (Latest)
                      Kernel 4.19.64 still does not work.

                      The custom kernel still does not work either.
                      32de5071-ae8c-4d5e-8605-b933e9487a51-image.png

                      • Q
                        Quazz Moderator @Wolfbane8653
                        last edited by Quazz Oct 15, 2019, 2:09 AM Oct 15, 2019, 8:06 AM

                        @Wolfbane8653 Please try kernel argument tsc=unstable

                        Then try kernel argument clocksource=hpet
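
To see what each argument actually changes, it can help to compare the kernel's active clocksource before and after. The sketch below reads the standard Linux sysfs entries for that. It assumes Python 3 is available on whatever Linux system you run it on; the FOS imaging environment may not ship Python, in which case running `cat` on the same two files gives the same information. The function name is only illustrative.

```python
from pathlib import Path

# Standard sysfs location on Linux. It is a parameter of the function so the
# same code can be pointed at a copy of the directory for testing.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by the kernel."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    if (CLOCKSOURCE_DIR / "current_clocksource").exists():
        current, available = read_clocksources()
        print("current:  ", current)
        print("available:", ", ".join(available))
    else:
        print("no clocksource entries found under", CLOCKSOURCE_DIR)
```

On an affected machine you would expect the current clocksource to be something other than tsc once either argument is in effect.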

                        • G
                          george1421 Moderator @Wolfbane8653
                          last edited by Oct 15, 2019, 11:48 AM

                          @Wolfbane8653 I’m currently building a FOS Linux kernel without ACPI support to see if we can get past the rcu_sched issue. Tracking down this type of issue takes time because it's hardware/model specific. If you can get one of the kernel parameters that Quazz mentioned to work, that is the preferable route. I’ll post a link to the noacpi kernel when it's done building.

                          • G
                            george1421 Moderator
                            last edited by Oct 15, 2019, 12:51 PM

                            Here is a test kernel with no acpi functions supported: https://drive.google.com/open?id=1siERUC9h8MfQIXbqrQShKOHc55h5xK3q

                            Download it as bzImageNoACPI and move it to the /var/www/html/fog/service/ipxe directory on your FOG server. Then go into the host definition for the specific host with the rcu_sched error, enter bzImageNoACPI (watch the case) into the kernel field, and save the host configuration. Then PXE boot the target computer into imaging to see if we can get past the CPU stall.
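
Since the file name is case sensitive, a quick check like the one below can catch a mismatch before you PXE boot. This is only a sketch: the helper name is made up, and it simply compares the name typed into the kernel field against what is actually in the directory.

```python
import os


def check_kernel_name(directory, name):
    """Report whether `name` exists in `directory` with exactly that case."""
    entries = os.listdir(directory)
    if name in entries:
        return "ok"
    # Look for files that differ from the requested name only by case.
    near = sorted(e for e in entries if e.lower() == name.lower())
    if near:
        return "case mismatch: found " + ", ".join(near)
    return "missing"


if __name__ == "__main__":
    ipxe_dir = "/var/www/html/fog/service/ipxe"  # path from the post above
    if os.path.isdir(ipxe_dir):
        print(check_kernel_name(ipxe_dir, "bzImageNoACPI"))
    else:
        print("directory not found:", ipxe_dir)
```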

                            • W
                              Wolfbane8653 Developer @george1421
                              last edited by Wolfbane8653 Oct 15, 2019, 7:32 AM Oct 15, 2019, 1:25 PM

                              @Quazz

                              • tsc=unstable – works with Kernel 4.19.64
                              • clocksource=hpet – works with Kernel 4.19.64

                              So both of these commands work!

                              @george1421 – bzImageNoACPI causes a kernel panic. IDE is turned on in the BIOS. I do not use the RAID function on these machines.
                              8db818e6-c69c-48eb-b6dc-307b5f989e36-image.png

                              • Q
                                Quazz Moderator
                                last edited by Oct 15, 2019, 1:47 PM

                                I’m glad those commands worked.

                                So this is a problem that I think was introduced in the Spectre/Meltdown patches and only affects Core 2 CPUs.

                                I thought it was supposed to be fixed in Kernel 4.19, but apparently not.

                                • G
                                  george1421 Moderator @Wolfbane8653
                                  last edited by Oct 15, 2019, 2:09 PM

                                  @Wolfbane8653 said in rcu_sched self detected stall on CPU when Deploying:

                                  tsc=unstable – works with Kernel 4.19.64
                                  clocksource=hpet – works with Kernel 4.19.64

                                  Great job fixing it with the timing source. That is the solution. As for my noacpi kernel, I figured that would happen because I also removed the ACPI boot device drivers. It was a risk, but the right answer is the kernel parameters with the stock kernel. Well done!

                                  • W
                                    Wolfbane8653 Developer @george1421
                                    last edited by Oct 15, 2019, 2:25 PM

                                    So I’m guessing I’m going to need to edit all 100 of my units to have this argument? Or are you working on having a new bzImage for me to test?

                                    • Q
                                      Quazz Moderator @Wolfbane8653
                                      last edited by Quazz Oct 15, 2019, 8:47 AM Oct 15, 2019, 2:46 PM

                                      @Wolfbane8653 You can safely set this globally, unless you have even older CPUs.

                                      • W
                                        Wolfbane8653 Developer @Quazz
                                        last edited by Oct 15, 2019, 3:32 PM

                                        Current solution: set the kernel to 4.19.64 and set the global option in FOG Configuration --> FOG Settings --> General Settings --> Kernel ARGS to tsc=unstable
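
To confirm the global setting actually reaches a client, you can inspect the kernel command line it booted with (`/proc/cmdline` on Linux). The sketch below is a rough check only; the helper names are made up, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as 'quiet' map to None. Quoted values that contain
    spaces are not handled; this is only a rough check.
    """
    args = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        args[name] = value if sep else None
    return args


def has_arg(cmdline, name, value):
    """True if `name=value` appears on the command line."""
    return parse_cmdline(cmdline).get(name) == value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            booted = f.read()
    except OSError:
        booted = ""  # not Linux, or /proc is not mounted
    print("tsc=unstable present:", has_arg(booted, "tsc", "unstable"))
```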

                                        Luckily, this is the last year for these machines.

                                        Copyright © 2012-2024 FOG Project