Dell 5090 MFF restores during the capture process. HELP!


  • Hello,

    We used to use Clonezilla and still do occasionally but recently transitioned to FOG Project.
    We have been using FOG successfully for the past few weeks across the Dell 5060, 5080 and 5090 computers and laptops. But the 5090 MFF restarts while partition 3 is being captured (at about 8% complete).
    The only thing we do before the capture is restore the 5090 MFF from an existing Clonezilla image; then we start the capture with FOG.
    There is no error or anything displayed.
    We tried putting the system back to its original state and capturing again, and it still fails.
    Would really appreciate any help.

  • Moderator

    @handso said in Dell 5090 MFF restores during the capture process. HELP!:

    The Linux Kernel is 5.10.71 Tom Elliott arm64.

    I am just wondering about the arm64 part. Probably just what you copy & pasted when grabbing the version but I still want to bring this to attention.

    So I did the capture through the debug method. Unfortunately no error message outputted on the screen during the capture. It happens around 8% of disk 3 being cloned. It just exited directly and restarted.

    As George already said this is hardly ever the case with FOG. So we don’t have a simple step by step guide to debug this issue yet. There is a slight chance that updating to dev-branch can help because we updated the FOS inits to a newer buildroot version just recently. Though the Linux kernel version has not changed much (5.10.86) and I don’t think that’s gonna make the difference. But it’s still worth a try.

    If that doesn’t help I suggest you take a video of the screen while capturing. Make sure you set up the camera/smartphone (on a pile of books for example) to get a steady recording. Some cameras even allow for 60 fps videos. This way we might have a chance to see even a very brief error message flashing on the screen just before it reboots.


  • @george1421 I booted it into Windows and did a disk check and a memory check. Everything came back clean. I am kinda out of ideas to try. The only remaining one is the dev branch of FOG.

  • Moderator

    @handso said in Dell 5090 MFF restores during the capture process. HELP!:

    So I did the capture through the debug method.

    OK great, you have the latest kernel. So that rules out an older 4.19.x Linux kernel issue. I seemed to miss the point where you were capturing the image and it reboots. The title mentioned “restores”, so I thought it was during a deploy.
    Now that makes me think there is some kind of disk corruption on the source disk. Can you boot that source disk system into Windows and scan for hard disk issues? I seem to remember an issue when a FOG admin did an in-place upgrade of Windows on the golden image. For whatever reason it messed up the golden disk until the disk was checked for errors and compacted. It’s been so long now I don’t remember if that was the solution or to just recreate the golden image using the latest ISO.


  • @george1421 So I did the capture through the debug method. Unfortunately no error message outputted on the screen during the capture. It happens around 8% of disk 3 being cloned. It just exited directly and restarted.
    The Linux Kernel is 5.10.71 Tom Elliott arm64. Hope that helps. Appreciate the time.

  • Moderator

    @handso Well, definitely upgrading to the dev build will solve a problem later on. It still troubles me that we don’t know the version of the FOS Linux kernel. You can also get the version if you schedule a deploy but tick the debug checkbox. Then PXE boot the target computer. After a few screens of text you will be dropped to the FOS Linux command prompt. If you key in uname -a it will print out the version of the Linux kernel.

    As long as you are in debug mode, key in fog to start the deployment process in single step mode. You will need to press the enter key at each breakpoint. The hope is that you can catch the real error message before it does the reboot. I’m expecting something printed at random text locations on the partclone screen.
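
A minimal sketch of what could be keyed in at the FOS debug prompt to record the kernel details asked about above. It assumes the FOS shell supports the standard uname options -r and -m; the labels printed are illustrative only.

```shell
#!/bin/sh
# Kernel release string (the thread discusses 4.19.x vs 5.10.x).
echo "kernel release: $(uname -r)"
# Machine architecture; an Intel-based PC would normally report x86_64,
# which helps settle whether an "arm64" label was only a copy & paste slip.
echo "architecture:   $(uname -m)"
```

uname -a prints both values (and more) on a single line; splitting them just makes the two facts easier to read off a photo or video of the screen.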


  • @george1421 Thanks I will try to upgrade to the Dev build and give it a try. I am using the latest fog linux and kernel. I don’t have access to the system to provide you with exact number, but I do know I updated to the latest stable release. If there is anything else I can try please let me know.

  • Moderator

    @handso said in Dell 5090 MFF restores during the capture process. HELP!:

    FOG release 1.5.9 and Ubuntu 20.04 LTS.

    While this isn’t your problem at the moment, you should upgrade to the dev branch that will take your FOG build to 1.5.9.110 or later. MS changed some disk structures in Win10 20H1 and later that you will need the dev branch to fix.

    Now I’m still missing the version of FOS Linux you have. You can either get this answer via the web GUI in FOG Configuration -> Kernel update, or from the FOG server command line with this command: file /var/www/html/fog/service/ipxe/bzImage. This version should be 5.6.x or later, ideally the 5.10.x series for the latest hardware support.
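
A hedged sketch of the server-side version check described above. It assumes the output of file for a bzImage contains the text "version X.Y.Z" and that GNU sort with -V is available (as on Ubuntu). The helper names kernel_version and version_ge are hypothetical and not part of FOG.

```shell
#!/bin/sh
# Pull the kernel version number out of `file` output read from stdin.
# Assumption: the output contains "version <digits and dots>".
kernel_version() {
    sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

# True when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Path quoted in the thread; only checked when it actually exists.
img="/var/www/html/fog/service/ipxe/bzImage"
if [ -f "$img" ]; then
    ver="$(file "$img" | kernel_version)"
    if version_ge "$ver" 5.6; then
        echo "FOS kernel $ver is 5.6 or later"
    else
        echo "FOS kernel $ver is older than 5.6"
    fi
fi
```

On a machine without the FOG files the script prints nothing; the two helpers can still be tried on a pasted line of file output.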


  • @george1421 Thanks for the reply. We are using the latest FOG release 1.5.9 and Ubuntu 20.04 LTS.
    There were 90 gigs of space available on the FOG server. Is that not enough? Usually the MFF image takes only about 50 gigs of space.
    We did a memory check etc and everything came back clean.
    This issue happened with 2 MFF’s but never with any other system.

  • Moderator

    @handso It would be helpful to know the version of FOG as well as the FOS Linux kernel (Web UI-> FOG Configuration -> Kernel update).

    It’s very rare that FOG will just abort and reboot without giving an error. It kind of sounds like a hardware memory error and not something specific to FOG. But let’s see.

    Also make sure your FOG server is not out of storage space using df -h from the fog server linux console.
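
A minimal sketch of the storage check suggested above, turned into a pass/fail test. It assumes the image store lives at /images (adjust if yours differs) and uses the roughly 50 gig image size mentioned elsewhere in this thread as the threshold. The helper names free_gib and has_room are hypothetical.

```shell
#!/bin/sh
# Print the free space, in whole GiB, of the filesystem holding path $1.
# df -Pk gives POSIX single-line output in 1024-byte blocks; field 4 is "Available".
free_gib() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# True when the filesystem holding $1 has at least $2 GiB free.
has_room() {
    [ "$(free_gib "$1")" -ge "$2" ]
}

store="${1:-/images}"   # assumed FOG image store path
need_gib=50             # approximate MFF image size quoted in the thread

if [ -d "$store" ]; then
    if has_room "$store" "$need_gib"; then
        echo "$store has $(free_gib "$store") GiB free: enough for a ${need_gib} GiB image"
    else
        echo "$store has only $(free_gib "$store") GiB free: less than ${need_gib} GiB"
    fi
else
    echo "Path $store not found; pass your image store path as the first argument"
fi
```

df -h shows the same information in human-readable form; the script only adds the comparison against the expected image size.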
