
    FOG 1.5.6: Auto resize is unpredictable

    • Cheetah2003C
      Cheetah2003 @Sebastian Roth
      last edited by Cheetah2003

      @Sebastian-Roth said in FOG 1.5.6: Auto resize is unpredictable:

      @Cheetah2003 Are you still keen to look into this?

      I’d be happy to. What do you want me to do?

      Also, for what it’s worth, I’m not sure multi-partition resizing is really necessary. I can’t really think of any use cases for this ‘feature.’

      The percentage thing described earlier sounds pretty dubious, especially if you’re capturing 5 partitions from a 50GB disk… and the recovery partition is 20% of that space (10GB)… you don’t need that taking 20% of a target drive. That would be kinda crazy.

      So really, IMHO, a percentage of the original drive captured from seems kinda not-useful. I still think this should be controllable entirely from the image specification. But I think that would require the image specification to actually pull info out of the captured image to offer the user options for how to handle the partitions contained within that image. Probably a pretty big rewrite of that entire part of the system. I’d love to see this, but yeah, it’s going to be a big task from my perspective.

      So I’ll be happy to peek/test whatever you need help with, as time permits, but I’m a little unsure of the goal.

      • S
        Sebastian Roth Moderator
        last edited by

        @Cheetah2003 A couple of posts down the road (four days earlier) I offered instructions on how to manually run the re-size calculation script. This is a good start to play with and get to see how this is all working. I am fairly sure this is not without flaw and it would be great if you are keen to look into it and suggest things you find.

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

        • E
          Eric Johnson
          last edited by

          I’m just joining in here, but we are seeing something like the same problem on FOG 1.5.6. We have Dell 3430s that we are getting ready for deployment this fall.

          The 3430 is a new model for us, and the first we have that doesn’t allow an MBR boot disk, only GPT. We got everything working with a GPT clonemaster which, for various ugly reasons, has partitions like this:

          [root@fog clonemaster10-lab-gpt]# cat d1.minimum.partitions
          label: gpt
          label-id: 701D9ABD-7D9A-11E9-B9AE-5254009E1079
          device: /dev/sda
          unit: sectors
          first-lba: 34
          last-lba: 257228766

          /dev/sda1 : start= 2048, size= 1124352, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=701D9AB9-7D9A-11E9-B9AE-5254009E1079
          /dev/sda2 : start= 1126400, size= 234728416, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=701D9ABA-7D9A-11E9-B9AE-5254009E1079
          /dev/sda3 : start= 255332352, size= 204800, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=701D9ABB-7D9A-11E9-B9AE-5254009E1079, name="attrs=\x22GUID:63"

          sda2 is the real windows 10 partition…

          cat d1.partitions
          label: gpt
          label-id: 701D9ABD-7D9A-11E9-B9AE-5254009E1079
          device: /dev/sda
          unit: sectors
          first-lba: 34
          last-lba: 257228766

          /dev/sda1 : start= 2048, size= 1124352, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=701D9AB9-7D9A-11E9-B9AE-5254009E1079
          /dev/sda2 : start= 1126400, size= 254204148, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=701D9ABA-7D9A-11E9-B9AE-5254009E1079
          /dev/sda3 : start= 255332352, size= 204800, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=701D9ABB-7D9A-11E9-B9AE-5254009E1079, attrs="GUID:63"

          cat d1.fixed_size_partitions
          :3:3

          All seemed great until we tried it on some new machines and, after fog/oobe/namechange/domainjoin, SOME of them wouldn’t let anyone log in. It turns out the middle partition didn’t get extended correctly in some cases, so Windows was out of disk space.

          I can multicast to 4 identical machines and on 3 of them /dev/sda2 gets resized correctly, but on one it doesn’t. And the one it fails on is not always the same… Funky, eh?

          When I do a debug deploy with ismajordebug=9 it always works…

          Next I was going to dig into my memory to rebuild an init.xz that has ismajordebug=9, to see if that makes the 4-host multicast work or points to the problem.

          Oh, and a manual run of /usr/share/fog/lib/procsfdisk.awk in debug mode seems to be producing the correct output.

          I’m vaguely wondering if $tmp_file2 is getting hosed somehow before fillSfdiskWithPartitions calls applySfdiskPartitions… But like I said, I cannot get the problem to replicate in majordebug mode yet.

          Would be glad to instrument out fog.download in any way you suggest.

          More tomorrow if I find anything useful.

          E

          • Q
            Quazz Moderator @Cheetah2003
            last edited by

            @Cheetah2003 OS-built recovery partitions have the partition flags that keep them fixed size.

            The reason for multi partition resize is in case you have your normal partition layout (fixed size 1-3 + Windows partition) and an additional data partition.

            eg

            /dev/sda1 200MB
            /dev/sda2 800MB
            /dev/sda3 200MB
            /dev/sda4 30GB
            /dev/sda5 200GB

            You can’t automagically know which of the last 2 partitions to resize and which to ignore. Windows needs room to breathe, but if you deploy this to a 2TB drive then having a 1.8TB windows partition and 200GB data partition feels silly.

            I agree that the current method isn’t good enough, of course, but it’s not without its logic.
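To make the trade-off concrete, here is a rough sketch of one possible proportional scheme applied to the example layout above. It is an illustration under stated assumptions only, and not necessarily what procsfdisk.awk does: the variable names, the 2TB target, and the use of decimal MB are all made up for the example.

```shell
#!/bin/bash
# Hypothetical illustration: grow the two resizable partitions in proportion
# to their captured sizes. All sizes are in decimal MB for simplicity.
target_mb=2000000                # assumed 2TB target disk
fixed_mb=$((200 + 800 + 200))    # sda1-sda3 keep their captured size
win_mb=30000                     # sda4, Windows partition
data_mb=200000                   # sda5, data partition

free_mb=$((target_mb - fixed_mb))        # space left for resizable partitions
resizable_mb=$((win_mb + data_mb))       # captured size of resizable partitions

# Each resizable partition keeps its share of the resizable space;
# the last one takes the remainder so nothing is lost to integer rounding.
new_win_mb=$((free_mb * win_mb / resizable_mb))
new_data_mb=$((free_mb - new_win_mb))

echo "sda4: ${new_win_mb} MB"
echo "sda5: ${new_data_mb} MB"
```

With these numbers the Windows partition ends up at roughly 261GB and the data partition at roughly 1.74TB, compared with about 1.8TB for Windows if only sda4 were grown.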

            Back to the topic of figuring out why partitions sometimes don’t resize: as far as I can tell, these resize issues only occur on GPT-based layouts.

            I’ll be looking over partition-funcs.sh with that in mind.

            • george1421G
              george1421 Moderator @Quazz
              last edited by

              @Quazz Do we know enough about the problem yet to say that it started with Windows 10 version XXXX? I’m a bit surprised that, if this is a GPT disk layout issue, we haven’t had this problem before now. Or is it related to changes in FOS that caused this issue to come up (like building FOS from a newer release of buildroot, causing packages to be updated)?

              Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

              • Q
                Quazz Moderator @george1421
                last edited by

                @george1421 There were some changes to GPT-related stuff, not a lot, but some:

                https://github.com/FOGProject/fos/commits/master/Buildroot/board/FOG/FOS/rootfs_overlay/usr/share/fog/lib

                I also think I remember a case where an existing image only started showing odd issues after updating FOG, so I’m currently leaning towards FOS, especially since I have experienced no problems on the latest Windows 10 versions at all.

                So I’m guessing there’s something funky going on under certain conditions, but not sure what. Given the ambiguity it might not even have anything to do with GPT, but since those were the only relevant changes to the files currently being examined it seems the most likely path all the same.

                • E
                  Eric Johnson
                  last edited by

                  I think I have found HOW the problem (or at least my problem) is happening, but still not clear on WHY…

                  /usr/share/fog/lib/partition-funcs.sh line 76, in applySfdiskPartitions, is where the resize occurs.

                  sfdisk $disk < $file >/dev/null 2>&1
                  [[ ! $? -eq 0 ]] && majorDebugEcho "sfdisk failed in (${FUNCNAME[0]})"

                  $file is an sfdisk input file built in processSfdisk via /usr/share/fog/lib/procsfdisk.awk and stored in $tmp_file2 = /tmp/sfdisk2.$$

                  But if $tmp_file2 is empty, $? from that sfdisk call is still 0 (i.e. a silent error). I found this via testing in a debug deploy.

                  I’m not sure why /tmp/sfdisk2.$$ is ending up empty semi-randomly. Still tracking that down. /tmp is a tmpfs filesystem and the target machine has 16G of RAM, so I doubt it is filling up…
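Since sfdisk reportedly exits with 0 on an empty input file, one way to surface the failure is to validate the file before feeding it to sfdisk. The following is a minimal sketch, not existing FOG code: sfdisk_input_ok is a hypothetical helper name, and the pattern assumes partition lines look like the ones in d1.partitions earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper, not part of FOG: succeeds only if the file exists,
# is non-empty, and contains at least one sfdisk partition line such as
# "/dev/sda1 : start= 2048, size= ...".
sfdisk_input_ok() {
    local file="$1"
    # -s is true only for an existing file with a size greater than zero
    [[ -s "$file" ]] || return 1
    # A header-only file (label, device, unit) would still be useless
    grep -Eq '^/dev/[^ ]+ *: *start=' "$file"
}

# Usage sketch (handleError is the function already used in partition-funcs.sh):
# sfdisk_input_ok "$file" || handleError "empty or malformed sfdisk input: $file"
```

Checking for an actual partition line, rather than only for a non-zero size, would also catch a file that contains just the header.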

                  • george1421G
                    george1421 Moderator @Eric Johnson
                    last edited by

                    @Eric-Johnson Just to collect a bit more data: in your FOG UI, under FOG Configuration->FOG Settings->TFTP Server->KERNEL RAMDISK SIZE, what is the value? Is it 127000? If so, does reliability change if you set it to 255000? This increases the amount of virtual disk FOS Linux has available during imaging.
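To rule out the ramdisk filling up, the free space on the filesystem that holds the temp files can be checked from a debug shell. A minimal sketch, assuming /tmp is where the sfdisk temp files live (as reported above); free_kib is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the available space, in KiB, on the filesystem
# that holds the given path. -P forces the one-line-per-filesystem POSIX
# output format, so the "Available" value is always the 4th field of line 2.
free_kib() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

free_kib /tmp
```

Running this just before and just after the temp file is written would show whether space is actually tight at that moment.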

                    • Cheetah2003C
                      Cheetah2003 @Quazz
                      last edited by

                      @Quazz said in FOG 1.5.6: Auto resize is unpredictable:

                      @Cheetah2003 OS-built recovery partitions have the partition flags that keeps them fixed size.

                      @Quazz Argh. As I said several times, this isn’t an OS-built recovery partition. I built it myself. Are you even reading my posts???

                      @Eric-Johnson Welcome. And yeah, what you’re describing sounds very similar to the issue I had with the previous version of FOG that required I move my recovery partition to be before the OS partition, making the OS partition last on the disk for resize to work properly.

                      @Sebastian-Roth Sure sure. I’ll do some experiments and report back any findings. Might be a few days, so I hope you’re not in a hurry.

                      • george1421G
                        george1421 Moderator
                        last edited by

                        The issue in this other thread seems to be related (maybe as a cousin) to this one. In that thread the drive is not being expanded again after being captured by FOG.

                        ref: https://forums.fogproject.org/topic/13479/install-windows-error-after-capturing-image

                        • E
                          Eric Johnson @george1421
                          last edited by

                          @george1421 it was indeed 127000… And of course, since bigger is better, I set it to 511000. Will report back on the effect! Thanks!

                          • E
                            Eric Johnson
                            last edited by Eric Johnson

                            TLDR: still not 100% convinced it is fixed…

                            Today’s fun started with metering init.xz like this:

                            diff -u /mnt/init-orig/usr/share/fog/lib/funcs.sh /mnt/init/usr/share/fog/lib/funcs.sh
                            --- /mnt/init-orig/usr/share/fog/lib/funcs.sh   2019-05-04 17:58:07.000000000 -0400
                            +++ /mnt/init/usr/share/fog/lib/funcs.sh        2019-07-10 12:29:53.000000000 -0400
                            @@ -1690,6 +1690,10 @@
                                 runPartprobe "$disk"
                             }
                             # Waits for enter if system is debug type
                            +Pause() {
                            +    echo " * Press [Enter] key to continue"
                            +    read  -p "$*"
                            +}
                             debugPause() {
                                 case $isdebug in
                                     [Yy][Ee][Ss]|[Yy])
                            
                            diff -u /mnt/init-orig/usr/share/fog/lib/partition-funcs.sh /mnt/init/usr/share/fog/lib/partition-funcs.sh
                            --- /mnt/init-orig/usr/share/fog/lib/partition-funcs.sh 2019-05-04 17:58:07.000000000 -0400
                            +++ /mnt/init/usr/share/fog/lib/partition-funcs.sh      2019-07-10 15:39:58.000000000 -0400
                            @@ -401,8 +401,15 @@
                                 #    majorDebugPause
                                 #fi
                                 #[[ $status -eq 0 ]] && applySfdiskPartitions "$disk" "$tmp_file1" "$tmp_file2"
                            +    processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" 
                            +        Pause
                                 processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" > "$tmp_file2"
                                 status=$?
                            +       echo $tmp_file2
                            +       ls -l $tmp_file2
                            +        cat $tmp_file2
                            +        Pause
                            +
                                 if [[ $ismajordebug -gt 0 ]]; then
                                     echo "Debug"
                                     majorDebugEcho "Trying to fill with the disk with these partititions:"
                            

                            This prints the output of processSfdisk, pauses, then does it for real, lists $tmp_file2, prints its contents, and pauses again.

                            In one sense this worked. Multicast to 4 machines and all came out right. Previous multicasts to the same four machines would have 1 or 2 displaying the failure… But with the metering… No failures.

                            Ok, so I decided: let’s just quit if $tmp_file2 is zero length… The next version of init.xz had this diff:

                            diff -u init-orig/usr/share/fog/lib/partition-funcs.sh init/usr/share/fog/lib/partition-funcs.sh
                            --- init-orig/usr/share/fog/lib/partition-funcs.sh      2019-05-04 17:58:07.000000000 -0400
                            +++ init/usr/share/fog/lib/partition-funcs.sh   2019-07-10 14:29:47.000000000 -0400
                            @@ -73,6 +73,7 @@
                                 local file="$2"
                                 [[ -z $disk ]] && handleError "No disk passed (${FUNCNAME[0]})\n   Args Passed: $*"
                                 [[ -z $file ]] && handleError "No file to receive from passed (${FUNCNAME[0]})\n   Args Passed: $*"
                            +    [[ ! -s $file ]] && handleError "in /usr/share/fog/lib/partition-funcs.sh fillSfdiskWithPartitions $tmp_file2 is zero length" #ESJ
                                 sfdisk $disk < $file >/dev/null 2>&1
                                 [[ ! $? -eq 0 ]] && majorDebugEcho "sfdisk failed in (${FUNCNAME[0]})"
                             }
                            @@ -403,6 +404,9 @@
                                 #[[ $status -eq 0 ]] && applySfdiskPartitions "$disk" "$tmp_file1" "$tmp_file2"
                                 processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" > "$tmp_file2"
                                 status=$?
                            +
                            +    [[ ! -s $tmp_file2 ]] && handleError "in /usr/share/fog/lib/partition-funcs.sh fillSfdiskWithPartitions $tmp_file2 is zero size" #ESJ
                            +
                                 if [[ $ismajordebug -gt 0 ]]; then
                                     echo "Debug"
                                     majorDebugEcho "Trying to fill with the disk with these partititions:"
                            

                            This checks $tmp_file2 in two places, once when it is created and once right before it is used. It would exit if $tmp_file2 was zero size, right?

                            Did a clone to the four machines… all worked well…
                            Did another clone… Crapola… One of the four didn’t resize. So it wasn’t zero length… But the other metering was not there so I don’t know what was there…

                            I have done a bunch of metered multicasts since then, all with no errors. A Heisenberg bug: if you look too closely, it always works… Sigh…

                            I am going to try a few with no metering and KERNEL RAMDISK SIZE set to 511000, per my supersized version of @george1421’s suggestion.

                            But I am feeling like I am missing something…

                            • george1421G
                              george1421 Moderator @Eric Johnson
                              last edited by george1421

                              @Eric-Johnson While you are way over my head with this coding, I can make a suggestion that may make debugging faster.

                              1. If you create a /images/dev/postinit script, you can patch (copy over from the FOG server to FOS) /usr/share/fog/lib/partition-funcs.sh on every boot of FOS. You don’t need to unpack and repack the inits. There is an example in the forum on how to patch (replace) fog.man.reg for a custom registration. ref: https://forums.fogproject.org/topic/9754/custom-full-host-registration-for-1-3-4/45

                              2. At the FOS Linux command prompt, if you give root a password with passwd (just some simple password like hello) and then get the IP address of FOS with ip addr show you can connect to FOS via putty/ssh from a second computer. Of course you need to be in debug mode to do this, but with putty you can copy/paste/debug from your normal workstation.
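The first suggestion could look roughly like the sketch below. This is an assumption-laden illustration, not the script from the linked thread: patch_fos_file is a hypothetical helper, and the location of the patched copy (shown here as $patch_dir) depends on how the FOG server's files are mounted inside FOS, so it must be adapted.

```shell
#!/bin/bash
# Hypothetical postinit-style helper: overwrite a script inside the running
# FOS environment with a patched copy, keeping a backup of the original.
# Both paths are supplied by the caller and are assumptions of this sketch.
patch_fos_file() {
    local patched="$1" target="$2"
    if [[ ! -r "$patched" ]]; then
        echo "no readable patched copy at $patched" >&2
        return 1
    fi
    # Keep the stock version around for comparison
    if [[ -f "$target" ]]; then
        cp -f "$target" "$target.orig" || return 1
    fi
    cp -f "$patched" "$target"
}

# Example call (patch_dir is a placeholder for wherever the patched file lives):
# patch_fos_file "$patch_dir/partition-funcs.sh" /usr/share/fog/lib/partition-funcs.sh
```

Because the copy happens on every boot of FOS, a changed partition-funcs.sh can be tested without unpacking and repacking init.xz.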

                              • S
                                Sebastian Roth Moderator
                                last edited by

                                @Eric-Johnson I think you are doing a really great job here trying to nail this down. I’d love to help you, but since I can’t replicate the issue the way you can, I can only assist by commenting here in the forum. The checks you added seem appropriate! Let’s try to get even more output and see what we find. Remove the output redirection in line 76 of /usr/share/fog/lib/partition-funcs.sh:

                                applySfdiskPartitions() {
                                    local disk="$1"
                                    local file="$2"
                                    [[ -z $disk ]] && handleError "No disk passed (${FUNCNAME[0]})\n   Args Passed: $*"
                                    [[ -z $file ]] && handleError "No file to receive from passed (${FUNCNAME[0]})\n   Args Passed: $*"
                                    sfdisk $disk < $file
                                    [[ ! $? -eq 0 ]] && majorDebugEcho "sfdisk failed in (${FUNCNAME[0]})"
                                }
                                

                                • E
                                  Eric Johnson
                                  last edited by

                                  So… current status.

                                  Since it worked all the time with my diff with a Pause I just removed my Pause in non-debug mode… Seriously.

                                  I have done 6 multicast deploys to 4 machines with this init and had no resize problems. Running with the original init showed 1-2 out of 4 resize failures.

                                  This of course is NOT what I would call a fix. Kinda a workaround. And I don’t know if this is fixing the problem or just making it less likely to happen, as this should not actually be fixing anything. I throw it out here as a workaround for folks who might run into the same problem, until we figure out what the hell is going on… 🙂

                                  diff -u /mnt/init-orig/usr/share/fog/lib/partition-funcs.sh /mnt/init/usr/share/fog/lib/partition-funcs.sh
                                  --- /mnt/init-orig/usr/share/fog/lib/partition-funcs.sh 2019-05-04 17:58:07.000000000 -0400
                                  +++ /mnt/init/usr/share/fog/lib/partition-funcs.sh      2019-07-11 09:22:49.000000000 -0400
                                  @@ -401,8 +401,15 @@
                                       #    majorDebugPause
                                       #fi
                                       #[[ $status -eq 0 ]] && applySfdiskPartitions "$disk" "$tmp_file1" "$tmp_file2"
                                  +    processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" 
                                  +#        Pause ESJ
                                       processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" > "$tmp_file2"
                                       status=$?
                                  +       echo $tmp_file2
                                  +       ls -l $tmp_file2
                                  +        cat $tmp_file2
                                  +#        Pause ESJ
                                  +
                                       if [[ $ismajordebug -gt 0 ]]; then
                                           echo "Debug"
                                           majorDebugEcho "Trying to fill with the disk with these partititions:"
                                  
                                  • E
                                    Eric Johnson @Sebastian Roth
                                    last edited by

                                    @Sebastian-Roth Doh. Good idea, should have thought of that. But since even just doing some other output (see my recent post) seems to make the problem vanish…

                                    Dis one is very odd.

                                    • Q
                                      Quazz Moderator
                                      last edited by

                                      I created a PR to fix some stuff and to hopefully handle errors better/more reliably.

                                      I don’t think it will necessarily fix this problem, but it should hopefully throw errors when we expect it to.

                                      https://github.com/FOGProject/fos/pull/26

                                      You can download the inits of it here: https://dev.fogproject.org/blue/organizations/jenkins/fos/detail/PR-26/2/artifacts

                                      if you’re interested in testing it out.

                                      • Q
                                        Quazz Moderator @Eric Johnson
                                        last edited by Quazz

                                        @Eric-Johnson Interesting. Could you add

                                        cat /proc/sys/kernel/random/entropy_avail
                                        

                                        before and after your output commands?

                                        Just a wild hunch

                                        If my hunch is correct, then I suspect you could replace your output commands with any other output commands and it would work. Alternatively, you could smash the keyboard a couple of times.
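A small wrapper makes it easy to take the before/after reading around any single command. This is only a sketch: with_entropy_report and the entropy_file variable are hypothetical names, and the path is the standard Linux one quoted above.

```shell
#!/bin/bash
# Hypothetical wrapper: print the kernel's available entropy before and after
# running a command. entropy_file can be overridden, e.g. for testing.
entropy_file="${entropy_file:-/proc/sys/kernel/random/entropy_avail}"

with_entropy_report() {
    local rc=0
    echo "entropy before: $(cat "$entropy_file" 2>/dev/null || echo n/a)"
    # Run the wrapped command and remember its exit status
    "$@" || rc=$?
    echo "entropy after: $(cat "$entropy_file" 2>/dev/null || echo n/a)"
    return "$rc"
}

# Usage sketch: with_entropy_report some_command arg1 arg2
```

The wrapper passes the command's exit status through, so it can be dropped in front of an existing call without changing later status checks.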

                                        • E
                                          Eric Johnson @Quazz
                                          last edited by

                                          @Quazz That was a crazy brilliant idea! 🙂

                                          I did the below: a first cat before anything, then processSfdisk to the screen, then a second cat, then a pause so I can read it. (The pause probably wasn’t important at that point; I think this would have shown any entropy increase. Please let me know if I am misunderstanding what you wanted to see.) On two unicast tests so far: 2260 before / 2261 after (essentially identical).

                                          Going to do a multicast test to all 4 in a sec. See if there is any variation.

                                          diff -u /mnt/init-orig/usr/share/fog/lib/partition-funcs.sh /mnt/init/usr/share/fog/lib/partition-funcs.sh
                                          --- /mnt/init-orig/usr/share/fog/lib/partition-funcs.sh 2019-05-04 17:58:07.000000000 -0400
                                          +++ /mnt/init/usr/share/fog/lib/partition-funcs.sh      2019-07-11 15:31:10.000000000 -0400
                                          @@ -401,8 +401,19 @@
                                               #    majorDebugPause
                                               #fi
                                               #[[ $status -eq 0 ]] && applySfdiskPartitions "$disk" "$tmp_file1" "$tmp_file2"
                                          +cat /proc/sys/kernel/random/entropy_avail
                                          +    processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" 
                                          +echo "after"
                                          +cat /proc/sys/kernel/random/entropy_avail
                                          +
                                          +        Pause #ESJ
                                               processSfdisk "$minf" filldisk "$disk" "$disk_size" "$fixed" "$orig" > "$tmp_file2"
                                               status=$?
                                          +       echo $tmp_file2
                                          +       ls -l $tmp_file2
                                          +        cat $tmp_file2
                                          +#        Pause ESJ
                                          +
                                               if [[ $ismajordebug -gt 0 ]]; then
                                                   echo "Debug"
                                                   majorDebugEcho "Trying to fill with the disk with these partititions:"
                                          
                                          • E
                                            Eric Johnson @Quazz
                                            last edited by Eric Johnson

                                            @Quazz grrr. no luck. entropy started pretty high and was only slightly higher after the call to processSfdisk in 8 out of 8 cases… 2258-2261 before and after in all cases.

                                            Was a great idea, worth trying though!
