Images not deploying to computers

  • I have four images that I’ve created in my FOG server:

    DellLat3350 - 4 | Single Disk - Resizable | Partclone Compressed | default | 23.71 GiB | 2020-08-31 16:08:20
    HP840Laptop - 2 | Single Disk - Resizable | Partclone Compressed | default | 33.65 GiB | 2020-08-24 18:32:12
    LenovoE540-550 - 3 | Single Disk - Resizable | Partclone Compressed | default | 24.10 GiB | 2020-08-31 16:09:20
    Win102004 - 1 | Single Disk - Resizable | Partclone Compressed | default | 23.61 GiB | 2020-08-24 18:02:26

    FOG v1.5.90-RC2.11
    bzImage Version: 4.15.2
    bzImage32 Version: 4.19.123

    Image 1 was created on a VM and works fine when deployed to any computer. I installed the base image on those other three computers, ran Windows updates to install their missing drivers, and updated a couple of drivers after. I then captured each of those images without issue, and set the computers aside in case I needed them again for whatever reason.

    If I try to use any of the others, I get errors during imaging, followed by a Windows error saying it can’t start. The “Windows_Startup_Error.png” is a screenshot from a VM I created and tried to install the Lenovo image onto. I checked and there is no /var/log/partclone.log file for me to look at.
    These are from the Dell laptop:

    These are from the Lenovo:

    And finally the VM with the Lenovo image:

  • I had tried that originally when I started working with all of this. Either I wasn’t doing something correctly, or it’s not built that way, because I never could get it to enable the built-in administrator account. I still have the files if you want to take a look at them.

  • Moderator

    @gadams Look into unattended installations (unattend.xml)

  • Alright, now that I’m back in the office we can continue this madness. After doing a bit of searching online and prepping some computers at home for sale, I do see why I should use Sysprep on my base image so the SID isn’t duplicated to all of the computers it’s deployed to.

    However I need to retain the admin and local user accounts that I’ve set up and I don’t want the computer to boot into OOBE or Audit Mode after the image is deployed. It might be outside the scope of this thread and/or forum, but is that possible?

  • @gadams

    From the latest pics in the post… The Fog server…
    Seems hard drive related for sure. Problems mounting the file system and other issues related to the file system.

  • Moderator

    @gadams OK, that is good. As I said, take the space away from /home and give it to /images. That guide will serve you well.

  • Thanks. This is what I followed to set everything up (making minor adjustments for my environment, of course). Is there something better I should look at?

  • Moderator

    @gadams I would say yes, since it will take less time to reinstall centos, update it and then install FOG than trying to fix what is broken and then repair the image files.

    When you recreate your FOG server, change your disk allocations manually, because the CentOS installer will put all of the extra disk space in /home where you don’t need it. Instead, change the mapping from /home to /images. That way your images directory will not be on the same partition as the OS, and it will get the most space on the disk. You will see what I mean if you opt to manually partition the disk.
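
    For anyone following along, one way to confirm the layout after the reinstall is to check whether /images really sits on its own filesystem and how much room it has. The sketch below is only an illustration in Python and is not part of FOG; the helper names `same_filesystem`, `free_gib` and `report` are made up for this example, and it assumes the mount points are /, /home and /images as described above.

```python
import os
import shutil


def same_filesystem(path_a, path_b):
    """True if both paths live on the same mounted filesystem (same device ID)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def report(paths=("/", "/home", "/images")):
    """Print free space for each path and whether it shares a filesystem with /."""
    for path in paths:
        if not os.path.isdir(path):
            print(f"{path}: not present")
            continue
        if same_filesystem(path, "/"):
            layout = "same filesystem as /"
        else:
            layout = "separate filesystem"
        print(f"{path}: {free_gib(path):.1f} GiB free, {layout}")
```

    Running `report()` on the server should show /images as a separate filesystem with most of the free space if the manual partitioning went as intended.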

  • So best to just delete the entire CentOS VM and start a new one?

  • Moderator

    @gadams Looks like you found the corruption: it is in your FOG server’s disk structure. Sounds like it’s time to start over.

  • This is a new FOG install that I created at the end of July, and these are the first systems to use it.

    The original golden image is on a VM on the same Proxmox system, and that is what I have been able to deploy to all of the computers without issue.

    I wasn’t using sysprep, because it’s on a VM, so I didn’t think it would matter.

    However, all of that is currently a much smaller issue compared to the server not being able to boot properly. I’ve been getting the below image for quite some time now, and even had to rollback to a snapshot from 7/14 to get it working again last month.

    I recaptured the golden image from the VM using the shutdown /s /t 0 command and left it to deploy to one of the Lenovo laptops. When I came in about an hour ago, it was giving me this message: 2020-09-03 13.03.47.png, which prompted me to check the site, only to find out I couldn’t get to it. I got in via SSH and rebooted it, which resulted in this: CentOS_FOG_Boot_Error.png, followed by this: CentOS_FOG_Emergency_Mode.png.

    And I’m sure all of this should go into a new thread, so just let me know if I should move it or not.

  • @george1421

    I can confirm those mixed kernel versions work fine. We also have dc7800’s; they can make it to the FOG menu, but they can’t register, deploy or capture with the newer kernels.

    When I first encountered DC7800 problems I rolled the kernel way back to one even older than 4.15.2 and I would get an error message saying kernel was too old at some point during the DC7800 registration process.
    4.15.2 was the first kernel I came to where it didn’t throw the kernel was too old error.

    With 4.15.2 I have been able to image, capture, etc etc all different kinds of laptops and workstations via legacy and UEFI PXE boot without any problems.

    Once the DC7800’s are gone I will go to the latest kernel. I hate those things.

    I am thinking there is some corruption with the captured image, maybe even a bad sector on the FOG server’s hard drive. Just a guess, but definitely not the kernel.

    what is up (PTM) @gadams 👍

  • Moderator

    @gadams said in Images not deploying to computers:

    The computers are all on the same subnet and there’s nothing interesting between them and the FOG server. The server is hosted on a Proxmox VM, so really the only things between it and the computers are a couple of switches.

    Understand what I’m doing is trying to build a truth table of what we know so far.

    Is this a new FOG install and these are the first systems you are trying or have you had this server in production before and this problem is just something new?

    Ok so there is nothing notable between the server and the clients.

    So if we capture a new image from a source computer plugged into the same switch as the proxmox server can you deploy the image right back to either the same system or same model using the same network cable?

    When you capture the image, does the task close out properly and get removed from the list of currently running tasks? (This will show us that the FOG server at least thinks it did its job well.)

    As Quazz said, when you go to capture your reference image make sure you have sysprep power off the golden image or if you are not sysprepping (why not??) use shutdown -s -t 0 to power off the golden image before capturing with FOG.

    If you can capture and deploy properly on the same switch as the FOG server and golden image, start moving your deployment computer one switch away from the FOG server’s switch and see if you get the same results.

    Something else to note: if you tick the debug checkbox before you schedule the deploy task, that will put the target computer into debug mode. PXE boot the target computer and after a few screens of text you will be dropped to a Linux command prompt. Key in fog to start the imaging process. You will need to press enter at each break point. The idea of debug mode is to be able to catch the actual message that partclone generates. That is the important message; everything else really doesn’t matter at this point.

  • @george1421 I created the group and assigned the kernel to it, then updated the default kernel to 5.6.18. I still have bzImage32 Version: 4.19.123 listed in there.

    The computers are all on the same subnet and there’s nothing interesting between them and the FOG server. The server is hosted on a Proxmox VM, so really the only things between it and the computers are a couple of switches.

  • Moderator

    @gadams If you used sysprep for the original base image, then it likely shut down the PC cleanly; thus avoiding the problem.

  • Moderator

    @gadams said in Images not deploying to computers:

    I have the 4.15 kernel in there because I have old HP dc5800 and dc7800 desktops that are using legacy BIOS and read that they work best with that kernel. I’m not entirely sure how to remove a kernel, though. I should still upgrade to the latest kernel though, right?

    This is a bit outside of the scope of your problem at the moment but here is what I would do in regards to the kernels.

    Copy bzImage to bzImage415 in the /var/www/html/fog/service/ipxe directory on the FOG server.

    Create a new FOG set group and call it Assign bzImage415 (or really whatever you like).

    Assign your HP dc5800 and dc7800 desktops to that group. This group should be fixed moving forward unless you plan on adding new (old) dc5800 and dc7800 to your environment.

    Then use that set group to configure the Group Kernel to bzImage415 (watch your case) for all of the computers in that group.

    Once that is done, you can/should upgrade the default kernel to 5.6.18 or later using FOG Configuration -> Kernel. This will help you with new hardware. I can’t say it will help you with your current issue.
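
    The first step above (keeping a copy of the current kernel under a new name before the default gets upgraded) could be scripted roughly like this. It is a sketch in Python rather than anything FOG ships; `copy_kernel` is a made-up helper name, the directory and file names are the ones from this post, and it assumes the script runs with enough rights to write there. File ownership may still need checking by hand afterwards.

```python
import shutil
from pathlib import Path


def copy_kernel(ipxe_dir="/var/www/html/fog/service/ipxe",
                source="bzImage", target="bzImage415"):
    """Copy the current kernel to a new name; refuse to overwrite an existing file."""
    src = Path(ipxe_dir) / source
    dst = Path(ipxe_dir) / target
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found")
    if dst.exists():
        raise FileExistsError(f"{dst} already exists, not overwriting")
    shutil.copy2(src, dst)  # copy2 also keeps timestamps and permission bits
    return dst
```

    Calling `copy_kernel()` with no arguments uses the paths from this post. The name given to the group kernel setting has to match the new file name exactly, including case.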

    Looking at your screen shots I see the files ARE there and have a size to them. This makes me think you have a communication issue between the fog server and the target computers, OR the images you have are corrupt.

    Are the target computers on the same subnet as the FOG server?

    Are they at the same site or is there something interesting in your network between the FOG server and target computers?

  • @Quazz I did find the Fastboot option in the power settings and turned it off on the base image. Still strange that the base image works fine no matter which computer it’s sent to, so there must be something with these laptops that’s getting corrupted during the upload.

  • @george1421 I have the 4.15 kernel in there because I have old HP dc5800 and dc7800 desktops that are using legacy BIOS and read that they work best with that kernel. I’m not entirely sure how to remove a kernel, though. I should still upgrade to the latest kernel though, right?

    I do have three partitions on each of the images and they do have almost identical sizes to them.



    This is the original master image the other three are based on:

  • Moderator

    @gadams It’s common to confuse the two, but I think Sebastian is referring to Fastboot on the Windows 10 installation itself. (in the software, not the BIOS).

    It’s an option under Power Management -> Behavior of power button (translated, could be called differently, it’s top on the left menu anyway)

    Then you need to click on the thing to show all options, requiring admin access.

    You can now uncheck “Fast boot”

    The reason “Fast boot” can cause issues is because it stores certain things in the “hibernation” file so it can more quickly load on next boot. Unfortunately this causes issues for FOG (and certain drivers…) and is pretty pointless for SSDs anyway.

    Alternatively, simply shutdown cleanly before capture. (Windows Key + R -> shutdown /s /t 0)

  • Moderator

    @gadams First thing I would recommend is to upgrade the FOS Linux kernel to 5.6.0 or later using the FOG Configuration -> Kernel menu.

    I also noticed that you are using a mixed version of the kernels, why is that (bzImage 4.15 and bzImage32 4.19)?

    Once upgraded, I would go back and recapture your master image again. Then look in /images/<image_name> on the FOG server and ensure you see all of the partitions you should have (i.e. if you have 4 partitions you should see d1p1.img, d1p2.img, d1p3.img, and d1p4.img), and that each file has a nonzero size. I’m going to guess something happened on the upload of the image and in some way it’s broken.
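
    The check described above can also be automated. What follows is a minimal sketch in Python, assuming the partition files are named d1p1.img, d1p2.img and so on as listed in this post; `check_image_dir` is a hypothetical helper name, and it only looks at presence and size, so it cannot tell whether the contents of a file are intact.

```python
from pathlib import Path


def check_image_dir(image_dir, expected_partitions):
    """Return a list of problems found with the d1pN.img files in image_dir."""
    problems = []
    for n in range(1, expected_partitions + 1):
        part = Path(image_dir) / f"d1p{n}.img"
        if not part.is_file():
            problems.append(f"{part.name}: missing")
        elif part.stat().st_size == 0:
            problems.append(f"{part.name}: empty (0 bytes)")
    return problems
```

    An empty list back from `check_image_dir("/images/<image_name>", 3)` (with the real image name filled in) means all three files exist and have a size. Anything else is worth a closer look before deploying.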

    These partclone errors (they are the most telling) give me the impression (if the d1xx.img files exist and are intact) of a network communication problem. But 7.2GB/min is a decent rate, so this makes me think the networking is probably not the issue, because if there WERE errors this transfer rate would be pretty low because of retransmissions of data.
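
    As a rough sanity check on that 7.2GB/min figure, it can be converted to a link speed. The small Python sketch below assumes decimal gigabytes and assumes the displayed rate reflects data actually crossing the wire; the number shown during imaging may instead count data written to disk after decompression, so treat the result as a ballpark only. The function name is made up for this example.

```python
def gb_per_min_to_mbit_per_s(gb_per_min):
    """Convert a rate in GB/min (decimal GB) to Mbit/s."""
    megabytes_per_second = gb_per_min * 1000 / 60
    return megabytes_per_second * 8


rate = gb_per_min_to_mbit_per_s(7.2)
print(f"{rate:.0f} Mbit/s")
```

    Under those assumptions the rate works out to about 960 Mbit/s, which is close to what a gigabit link can carry and fits the reasoning that heavy retransmissions are unlikely.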

    I think I would ignore the recovery bit for now because it looks like the image isn’t being fully transferred to the target computer. The image on the target computer is broken so the recovery panel is being displayed. There is no chance for recovering this image.