fog.deploy image option in PXE boot not resizing disk



  • Hello everyone!

    Been using Fog for a bit now but decided to create an account to resolve an issue with my setup. So as it stands now, the following processes work without any issues:

    • Registering host, assigning image, setting image task = good.
    • Quick register in PXE, reboot into NIC, auto-starts imaging = good.

    My setup is a small, isolated fog server with one gold image (win 10) that I’m pushing out to dozens of identical machines at a time. They do not need to be registered or managed by fog for my needs.

    However, if I try to use the fog.deployimage option in PXE to save the extra step of going back into the PXE menu after another reboot, the imaging process goes fine, but I end up with a hard drive that only has 22.34 GB of space (the size of the image) rather than the full 500 GB drive.

    I’m guessing there are settings I need to insert into the Parameters or Boot Options area of “fog.deployimage”, but I’m having a hard time finding documentation. Here’s what my parameters currently say:

    set username foglogin
    set password foglogin
    params
    param mac0 ${net0/mac}
    param arch ${arch}
    param username ${username}
    param password ${password}
    param qihost 1
    isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
    isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
    

    My “Boot Options” field is blank. Again, the imaging process starts and completes, but once I get into the system I have no free disk space because the partition doesn’t seem to auto-resize. None of the other imaging methods run into this issue, so I don’t think it’s the image itself.

    Thanks to anyone that can help!

    EDIT
    As requested here are my d1 file settings:

    *d1.fixed*
    
    :1:2:1:2
    
    *d1.minimum.partitions*
    
    label: gpt
    label-id: 783BF3D4-7584-46CD-BD4F-A6C3A205ADB9
    device: /dev/sda
    unit: sectors
    first-lba: 34
    last-lba: 976773134
    
    /dev/sda1 : start=        2048, size=     1331200, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=12A79277-1317-4197-8DD7-C1D35A81446B, name="EFI system partition", attrs="GUID:63"
    /dev/sda2 : start=     1333248, size=      262144, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=A37C02AC-BF15-47DA-895B-77BE9310A8D2, name="Microsoft reserved partition", attrs="GUID:63"
    /dev/sda3 : start=     1595392, size=    45477692, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=D62B301E-DC7F-4D51-9FC5-86F93D164FF3, name="Basic data partition"
    /dev/sda4 : start=   974745600, size=      855524, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=45D30F51-7028-45DC-A645-1AE317B8282C, name="attrs=\x22RequiredPartition GUID:63"
    
    
    *d1.partitions*
    
    label: gpt
    label-id: 783BF3D4-7584-46CD-BD4F-A6C3A205ADB9
    device: /dev/sda
    unit: sectors
    first-lba: 34
    last-lba: 976773134
    
    /dev/sda1 : start=        2048, size=     1331200, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=12A79277-1317-4197-8DD7-C1D35A81446B, name="EFI system partition", attrs="GUID:63"
    /dev/sda2 : start=     1333248, size=      262144, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=A37C02AC-BF15-47DA-895B-77BE9310A8D2, name="Microsoft reserved partition", attrs="GUID:63"
    /dev/sda3 : start=     1595392, size=   973150208, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=D62B301E-DC7F-4D51-9FC5-86F93D164FF3, name="Basic data partition"
    /dev/sda4 : start=   974745600, size=     2027520, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=45D30F51-7028-45DC-A645-1AE317B8282C, name="attrs=\x22RequiredPartition GUID:63"
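To make the difference between the two files easier to see, here is a minimal Python sketch that parses the partition lines and compares them. Only the start and size values are copied from the listings above; the helper names (`parse`, `compare`) are made up for illustration and are not part of FOG.

```python
import re

# Partition lines copied from d1.minimum.partitions (start and size only).
MINIMUM = """\
/dev/sda1 : start=        2048, size=     1331200
/dev/sda2 : start=     1333248, size=      262144
/dev/sda3 : start=     1595392, size=    45477692
/dev/sda4 : start=   974745600, size=      855524
"""

# Partition lines copied from d1.partitions (start and size only).
FULL = """\
/dev/sda1 : start=        2048, size=     1331200
/dev/sda2 : start=     1333248, size=      262144
/dev/sda3 : start=     1595392, size=   973150208
/dev/sda4 : start=   974745600, size=     2027520
"""

LINE = re.compile(r"^(\S+)\s*:\s*start=\s*(\d+),\s*size=\s*(\d+)")


def parse(dump):
    """Return {device: (start_sector, size_in_sectors)} for each partition line."""
    parts = {}
    for line in dump.splitlines():
        match = LINE.match(line.strip())
        if match:
            parts[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return parts


def compare(minimum, full):
    """Return (device, same_start, size_difference_in_sectors) per partition."""
    rows = []
    for dev, (start, size) in full.items():
        min_start, min_size = minimum[dev]
        rows.append((dev, start == min_start, size - min_size))
    return rows


if __name__ == "__main__":
    for dev, same_start, delta in compare(parse(MINIMUM), parse(FULL)):
        print(f"{dev}: same start={same_start}, size difference={delta} sectors")
```

In both files the start sectors are identical; only the sizes of sda3 and sda4 differ, and in d1.partitions sda3 ends exactly where sda4 begins.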
    

    Thanks!



  • @Sebastian-Roth

    Well, if they don’t match, that may be part of the issue. The layout is from two identical Optiplex 7060s with 500 GB drives. Top was one that imaged successfully, bottom is what I see when it doesn’t. All should match my golden image, which is also an Optiplex 7060 with a 500 GB drive. I’m not sure where the disconnect would be between fog’s settings, the image, and what ends up on the machine.

    Again I’ve more or less gotten around this by modifying my image. So I figure if this issue is unique to me it’s probably safe to let it go. If it does start poking up from others at least my info would be a good starting point.


  • Developer

    @androsszit Well we can still open this thread again. Do you want us to look into this or are you happy with the modified partition layout you fixed it with?

    It will be quite hard to replicate this issue as things are working great with my test machines/partition layout. I could try using the information you provided (d1.partitions …) but I have a feeling that there is something else to it. Lately we have seen at least two posts where expand didn’t work properly and I still assume it was a very weird issue in the binary captured d1.mbr which is very hard to come by (hex editing…).

    Edit: What partition layout is this really? From my point of view this doesn’t match the layout figures you posted above.



  • @Sebastian-Roth

    Hey, not sure if it matters now since the issue is closed, but I finally got a chance to post the images. Here are pics of the good (top) and the bad (bottom). Please ignore the Samsung on the 2nd one, that’s just a USB SSD I’m using.

    When it works:
    good imaging.PNG

    Lots of unallocated space:
    image resize bad.PNG


  • Moderator

    @Sebastian-Roth I also tested the iPXE Deploy Image against the web UI deploy image, and the instructions (kernel parameters) passed to FOS are identical. So, at least as far as FOS knows, there should be no difference in the way it deploys the image.


  • Developer

    @androsszit Just as a quick note: I ran a test because I had the test setup ready for another issue anyway, and for me it expanded perfectly fine on an unregistered host.



  • @Sebastian-Roth

    Problem is I overwrote my other image when I created the new one. I’ll look into creating another image and see if I can repeat the process. Thing is it doesn’t sound like the issue impacted anyone else and was pretty unique to me. I’m comfortable with closing it as fixed.

    Thanks again to everyone for all your help! Very pleased with the friendliness and speed of replies on this board!


  • Developer

    @androsszit said in fog.deploy image option in PXE boot not resizing disk:

    Looks like messing with the partitions may have fixed it! I redid the golden image with reduced partitions and pushed it out (via the PXE boot Deploy Image) to a couple brand new machines that had never communicated with fog before. Seems like they imaged fine and the drives were the correct size!

    That makes the whole story even more strange to me :-D I can’t see why it would make a difference in this case now when it would “fail” with the other partition layout?!

    Well, I’ll wait for your pictures now or we simply close this as fixed… up to you. :-)



  • @Sebastian-Roth

    Thanks for continuing to look into this! I’ll try to post the images when I can. I’ve got one for a “proper” install but I’m having to deploy machines right now so I haven’t had time to “break” one again! :P

    I’ll definitely look into modifying my golden image. I have no qualms about reducing the number of partitions since we don’t plan on using Dell’s recovery partition or anything of that sort.

    Edit:
    Looks like messing with the partitions may have fixed it! I redid the golden image with reduced partitions and pushed it out (via the PXE boot Deploy Image) to a couple brand new machines that had never communicated with fog before. Seems like they imaged fine and the drives were the correct size!


  • Developer

    @androsszit I’ll definitely look into this and do my own testing when I get a bit more time. Maybe tomorrow but can’t promise. From all you have tried so far it really looks like an odd bug in FOG/FOS but I have no idea yet. I’ll just try to replicate the steps you took and we’ll see.

    I’d still be interested to see what this partition layout looks like when you say it’s “properly” expanded versus when it’s not expanded. Can you take pictures of disk management and post them here?

    Just doing some maths:

    • 976773134 (last-lba, roughly the last usable sector that Linux would use on the original source disk) * 512 bytes (sector size) = 465.76 GiB, i.e. a 500 GB drive, ok.
    • 974745600 (last start sector, which FOG never moves AFAIK) * 512 bytes = 464.79 GiB -> this means that you won’t be able to deploy this image to a disk that is smaller than roughly 466 GiB (last start sector plus a ~990 MiB sda4 partition)
    • When you deploy to a much larger disk what happens is that sda3 will be as big as it has been on the source drive (because start of sda4 is not moved) and sda4 will be expanded to whatever is left on the drive.

    Even though there might be a bug that prevents expansion on unregistered deployments, I’d still highly recommend you look into (re)moving the fourth partition to make your partition layout work nicely for expansion.
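The sector arithmetic above can be double-checked with a few lines of Python. This is only a sketch: it assumes 512-byte sectors as in the post, and `sectors_to_gib` is a helper name made up for this example.

```python
SECTOR_SIZE = 512  # bytes per sector, as assumed in the post


def sectors_to_gib(sectors, sector_size=SECTOR_SIZE):
    """Convert a sector count to GiB (2**30 bytes)."""
    return sectors * sector_size / 2**30


last_lba = 976773134    # last-lba from d1.partitions
sda4_start = 974745600  # start sector of /dev/sda4
sda4_size = 2027520     # size of /dev/sda4 in d1.partitions

print(f"source disk:        {sectors_to_gib(last_lba):.2f} GiB")
print(f"sda4 starts at:     {sectors_to_gib(sda4_start):.2f} GiB")
print(f"sda4 size:          {sda4_size * SECTOR_SIZE / 2**20:.0f} MiB")
print(f"minimum target disk: {sectors_to_gib(sda4_start + sda4_size):.2f} GiB")
```

The figures are binary units (GiB/MiB), which is why a drive sold as 500 GB shows up as roughly 465.76.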



  • @george1421

    Well hopefully it’s an issue that may be addressed in the next update. Until then I will make sure to register my hosts before pushing the image.

    To confirm, I just finished reimaging (via Deploy Image in PXE) after unregistering, and it once again did not properly resize the drive.

    It’s bizarre, but thankfully for me doesn’t stop me from using fog, which outside of the occasional hiccup has been an awesome time saver. I hope that despite all the troubleshooting posts the devs know IT people like me really appreciate this tool!


  • Moderator

    @androsszit said in fog.deploy image option in PXE boot not resizing disk:

    For me this means that every imaging method I run works ONLY if I register the host first

    I’m still looking into the Deploy Image code. The parts that I traced out before actually call another script within FOS, called fog.capone, that I still need to decode. That appears to be a different path from the deploy image question at the end of full registration.

    I may be going down a rabbit hole here, but I want to understand the difference you say you are seeing (I haven’t been able to duplicate it with my fog server, it images and expands correctly no matter what path I choose).

    Edit I started writing this and removed it, but I’ll ask it anyway. That disk layout, is that an OEM created image you are using for your golden image? That 4th partition is raising the question for me.



  • @Sebastian-Roth

    Hmm. Well in my case all my machines are identical 500 GB seagate drives in Optiplex 7060s. As far as I can tell there should be no difference between the machines in terms of hardware or drivers. Even so, I find it strange that deployment through all (1) register host -> (2) deploy image methods work. It seems like skipping the registration part is what causes my issue. I’m confirming now though.

    Also, is there any way to change that for my image? I basically set up my “golden” machine, then did register host -> capture task and captured the image. I’m not sure how to change the partition starting points.



  • Well funny enough I just ran it again in order to capture screenshots and it worked properly. I think I know why though;

    This time even though I used the “deploy Image” again, I didn’t remove the now already registered host from fog. I think fog does a quick reg even if you just pick “Deploy Image”, because now that I’m back in the PXE menu “quick reg” and “full reg” are missing and I only see “Quick Host Deletion”. For me this means that every imaging method I run works ONLY if I register the host first. It appears that trying to bypass this step causes the issue. I’ll confirm by running “Host Deletion” and trying “Deploy Image” again but I’m 99% certain it will have the disk size issue again.

    Does anyone know why this is?


  • Developer

    @androsszit I’ve just had a quick look at the partition layout and from my point of view FOG was never able to properly expand this exact image/partition layout at any time, because the last partition starts at sector 974745600 and FOG up until today does not move partition starting points even if it would be possible. We still don’t go there because there are so many situations in which things can get messed up when we try to move partitions’ start locations.

    So as a quick idea I can only imagine it looking like it does expand properly on some models which have a different disk size than others. Might be wrong on that but it’s my first impression…
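As a rough illustration of the constraint described above, the following Python sketch computes how large each partition could become if no start sector is ever moved. It is a simplified model, not FOG's actual resize code: the function `max_sizes` is hypothetical, alignment and GPT backup structures are ignored, and the "bigger" disk is an invented example.

```python
def max_sizes(starts, last_lba):
    """Largest size (in sectors) each partition can take when starts are fixed.

    A partition can only grow up to the start of the next partition; the
    last partition can grow up to the last usable sector of the disk.
    """
    ordered = sorted(starts.items(), key=lambda item: item[1])
    limits = {}
    for index, (dev, start) in enumerate(ordered):
        if index + 1 < len(ordered):
            end = ordered[index + 1][1]  # next partition's start
        else:
            end = last_lba + 1           # end of the usable area
        limits[dev] = end - start
    return limits


# Start sectors taken from d1.partitions in the first post.
STARTS = {"sda1": 2048, "sda2": 1333248, "sda3": 1595392, "sda4": 974745600}

same_disk = max_sizes(STARTS, 976773134)        # last-lba of the source disk
bigger_disk = max_sizes(STARTS, 2 * 976773134)  # hypothetical disk, about twice as large

print("sda3 on same disk:  ", same_disk["sda3"])
print("sda3 on bigger disk:", bigger_disk["sda3"])
print("sda4 on same disk:  ", same_disk["sda4"])
print("sda4 on bigger disk:", bigger_disk["sda4"])
```

Under this model sda3 is capped at the same size on any disk, and only sda4 can absorb extra space, which is why (re)moving the last partition matters for expansion.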



  • @george1421

    Correct. That’s my confusion too on why it would cause issues. My golden image is a Dell Optiplex 7060 and my hosts are all the same. All are using 500GB hard drives. Only difference is the way I push the image. Everything seems to work except for the “Deploy Image” option in PXE menu. I’m running it again the “wrong” way so I can show the difference in drive partitions and so I have one to answer any other questions.

    I feel like this “Deploy Image” option is pulling settings from somewhere different, I’m doing something wrong, or it’s a bug.


  • Moderator

    @androsszit said in fog.deploy image option in PXE boot not resizing disk:

    However on the “wrongly” imaged machine all the unallocated space ends up on a recovery partition

    Stick with me here, we are trying to make sure we understand the exact situation. Is the wrongly imaged machine an exact match to a system that did image correctly, with the only difference between the two systems being how you launch imaging? I’m just trying to rule out hardware variations causing this issue. I’m still having a problem understanding why there is a difference, because the imaging process is exactly the same with regard to disk expansion.

    Note to self: These are the iPXE boot parameters for a quick image.

    #!ipxe
    set fog-ip 192.168.1.53
    set fog-webroot fog
    set boot-url http://${fog-ip}/${fog-webroot}
    kernel bzImage loglevel=4 initrd=init.xz root=/dev/ram0 rw ramdisk_size=270000 web=http://192.168.1.53/fog/ consoleblank=0 rootfstype=ext4 mdraid=true mac=00:00:00:00:00:00 ftp=192.168.1.53 storage=192.168.1.53:/images/ storageip=192.168.1.53 osid=9 irqpoll chkdsk=0 img=Dell3630Base imgType=n imgPartitionType=all imgid=39 imgFormat=5 capone=1 type=down
    imgfetch init.xz
    boot
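To compare the kernel parameters of two deployment paths, a small Python sketch like the one below can split each `kernel` line into key/value pairs and report the differences. The first line is the one from the note above; the second line is a made-up variant used only to exercise the diff, not a real capture from the web UI path. The function names are hypothetical.

```python
def parse_kernel_args(line):
    """Split an iPXE 'kernel ...' line into a dict of parameters.

    Flags without '=' (such as 'irqpoll' or 'rw') are stored as True.
    """
    tokens = line.split()
    if tokens and tokens[0] == "kernel":
        tokens = tokens[2:]  # drop the 'kernel' command and the image name
    args = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        args[key] = value if sep else True
    return args


def diff_args(first, second):
    """Return {key: (value_in_first, value_in_second)} for differing keys."""
    keys = sorted(set(first) | set(second))
    return {k: (first.get(k), second.get(k)) for k in keys if first.get(k) != second.get(k)}


QUICK_IMAGE = (
    "kernel bzImage loglevel=4 initrd=init.xz root=/dev/ram0 rw ramdisk_size=270000 "
    "web=http://192.168.1.53/fog/ consoleblank=0 rootfstype=ext4 mdraid=true "
    "mac=00:00:00:00:00:00 ftp=192.168.1.53 storage=192.168.1.53:/images/ "
    "storageip=192.168.1.53 osid=9 irqpoll chkdsk=0 img=Dell3630Base imgType=n "
    "imgPartitionType=all imgid=39 imgFormat=5 capone=1 type=down"
)

# Hypothetical second line, invented for this example only.
OTHER = QUICK_IMAGE.replace("mac=00:00:00:00:00:00", "mac=aa:bb:cc:dd:ee:ff").replace(" capone=1", "")

if __name__ == "__main__":
    for key, (a, b) in diff_args(parse_kernel_args(QUICK_IMAGE), parse_kernel_args(OTHER)).items():
        print(f"{key}: {a!r} vs {b!r}")
```

An empty result from `diff_args` would mean both paths hand FOS the same instructions.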
    


  • @george1421

    I just confirmed and it worked fine doing the full registration and imaging. Under disk management it correctly created a ~464GB primary partition that contains the image, page file, etc. However on the “wrongly” imaged machine all the unallocated space ends up on a recovery partition.

    I’ll post screen captures in a moment.



  • @george1421

    That’s a correct interpretation. Basically any method I use outside of PXE boot option of “Deploy Image” seems to push the image to the machine properly and also resize the disk properly. I’m doing a pretty simple setup and my “golden image” machine is identical to my new hosts, so no weird changes that need to be accounted for.

    Quick registration does work if I set it to auto-image when it completes (but it still requires the extra step of rebooting and going back into the PXE menu since I can’t have the machines default to PXE on bootup).

    I’ll run the full registration now and let you know the results, but I’m pretty sure I’ve used the option before without problems.


  • Moderator

    @androsszit So just to be clear, deploy image from the iPXE menu doesn’t expand the disk correctly, but registering the host and then deploying the image to it from the web UI works?

    I know this is asking a bit more from you, but if you do an iPXE menu full registration, it asks you at the end whether you want to deploy the image. If you answer yes to this, does it expand the disk correctly? This is the first I’ve heard about a different result from the iPXE Deploy Image and from the web UI deploy image.


