Could not resize (expandPartition)
-
Hi there.
I have a FOG server running as follows:
Ubuntu 20.04
FOG 1.5.9
I am trying to deploy a single disk image containing a customized ubuntu-mate install.
The image clones successfully, but fails at the step of Resizing extfs volume /dev/sda1
The only useful information that I am getting is Please run e2fsck -f /dev/sda1 first. I ran the deployment in debug mode and ran that command first, but I am still getting the issue.
Any help would be greatly appreciated, as we are getting ready to convert a former fleet of Lenovo L440/L450 laptops to this image to turn them into VM terminals.
-
Additional info
Kernel Variables and settings
bzimage loglevel=4 initrd=init.xz root=dev/ram0 rw ramdisk_size=27500 web=https://(IPADDRESS)/fog/ consoleblank=0 rootfstype=ext4 nvme_core.default_ps_max_latency_us=0 mac=(MAC ADDRESS) ftp=(IP ADDRESS) storage=(IP ADDRESS)/images/ storageip=(IP Address) osid=50 irqpoll hostname=(MACID/HOSTNAME) chkdsk=0 img=MobileTClenovo imgType=n imgPartitiontype=all imgid=2 imgFormat=0 PIGZ_Comp=-9 hostearly=1 type=down
-
@omegaxis said in Could not resize (expandPartition):
The only useful information that i am getting is Please run e2fsck -f /dev/sda1 first. I ran deployment in debug mode and ran that command first but am still getting an issue.
Can you please take pictures of the initial deploy error as well as the output you get when running the e2fsck command and post the pictures here?
-
Soooo, the world’s most annoying thing happened this morning. I went in, ran a debug mode deploy to try and replicate everything using the settings that were not working last night, and everything worked. I am not considering this resolved yet, because I need to rebuild the image: I had it set to legacy mode instead of UEFI mode, which is what is needed. I will keep people updated on whether the UEFI variant still works.
-
@sebastian-roth And my issues are back.
Below are all the images requested.
-
Ok, so continuing on with this. It might be me being an idiot and things not quite working the way I think they do. The issue seems to be deploying to dissimilar hardware. If I deploy to like hardware, everything is fine. Given that it’s Ubuntu, this still doesn’t really make sense.
EDIT
NOPE, that wasn’t it. Just had an L450 fail as well, in the exact same way the X250 was failing.
EDIT AGAIN
Finally had the bright idea to remove the internet cable after imaging and try just booting. It works, but the resizing is not happening; I’m having to boot manually from a USB stick with GParted on it and manually resize after install. For obvious reasons, this is not a good solution.
-
@OmegaXis OK, thanks for the pictures and new information. We need one more detail. Please post the contents of d1.minimum.partitions, d1.partitions and d1.fixed_size_partitions of the captured image. All are text files and can be opened using your favorite editor. You find those files in /images/MobileTClenovo/ on your FOG server.
-
Minimum partitions file
d1.minimum.partitions
label: gpt
label-id: 5FC6A4FE-C9B9-48D1-85A1-04B915E5FD78
device: /dev/sda
unit: sectors
first-lba: 34
last-lba: 976773134
sector-size: 512

/dev/sda1 : start= 2048, size= 1048576, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=992E52CA-4E96-4DB0-813D-769534D59A54, name="EFI System Partition"
/dev/sda2 : start= 1050624, size= 23333108, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4, uuid=D884B35F-887F-4265-B910-8F3FAD6D44DA

D1 partitions
d1.partitions
label: gpt
label-id: 5FC6A4FE-C9B9-48D1-85A1-04B915E5FD78
device: /dev/sda
unit: sectors
first-lba: 34
last-lba: 976773134
sector-size: 512

/dev/sda1 : start= 2048, size= 1048576, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=992E52CA-4E96-4DB0-813D-769534D59A54, name="EFI System Partition"
/dev/sda2 : start= 1050624, size= 975720448, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4, uuid=D884B35F-887F-4265-B910-8F3FAD6D44DA

D1 Fixed_size
1
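
For anyone reading these files: the sizes are sector counts, and the files state sector-size: 512. A minimal sketch for turning them into GiB, assuming bash and awk are available; the function name sectors_to_gib is made up for this example and is not part of FOG.

```shell
#!/bin/bash
# Convert a sector count from an sfdisk-style dump into GiB.
# Assumption: 512-byte sectors unless a second argument says otherwise.
sectors_to_gib() {
    # $1 = number of sectors, $2 = sector size in bytes (optional, default 512)
    awk -v s="$1" -v b="${2:-512}" \
        'BEGIN { printf "%.2f\n", s * b / (1024 * 1024 * 1024) }'
}

sectors_to_gib 23333108     # sda2 in d1.minimum.partitions -> 11.13
sectors_to_gib 975720448    # sda2 in d1.partitions         -> 465.26
```

So the shrunk root filesystem needs roughly 11 GiB, while the captured layout describes a disk of about 465 GiB.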
-
@OmegaXis OK the partition layout looks simple and correct. Not much that can go wrong here.
Searching the forums I found this topic that seems related: https://forums.fogproject.org/topic/15599/resize-centos7-fails-e2fsck
We will try to replicate the issue and see what we find. @Wayne-Workman any chance you find some time to look into this?
-
@sebastian-roth It does look similar, but unfortunately there doesn’t seem to be a solution in that topic. I will try running chkdsk on the gold image and issuing a capture to see if that helps. Unfortunately my entire project is on hold until I can get this resolved.
-
Update: running chkdsk scans did not solve anything.
I’m now at a point where I can get about one in ten to image correctly, but it’s a different laptop every time. Worst part: nothing has changed.
There doesn’t seem to be any reason why the ones that work do and the ones that don’t don’t. But this is not a good solution.
-
So, a funny thing (not really funny as much as annoying) has started happening. I have thrown the same laptop (ThinkPad L450) at the FOG server 5 times now. All 5 times resulted in failure at the resizing of the sda2 partition.
For the purposes of sheer insanity, I threw it at the FOG server one more time expecting the exact same results. Unfortunately (or fortunately, still not sure), with that one laptop, it worked.
This whole thing seems to be hit or miss all of a sudden. When I had this setup in my home lab (minus the Lenovo laptops, I was using VMs and desktops instead)
-
@OmegaXis Thanks for the update, it’s very unfortunate you get different results when deploying to the exact same machine. This should not happen!!
I’ll have a look at it over the weekend and see what I can find out.
-
@sebastian-roth thanks
So after that one went through, everything seemed to start working fine again. Unfortunately, as it’s the weekend, it’s in a state at my office desk where I don’t trust it. I’m working this weekend on some other projects with my home lab. Let me know if you think of anything.
-
@OmegaXis Looking at resize2fs’s source code I see that the issue you face can be caused by four different conditions:
- Filesystem is marked with an error (e.g. not cleanly unmounted) - I am sure we can ignore this condition because you proved it’s not unclean by running e2fsck in debug mode and before capture!
- Filesystem is not in a valid state - Again I think we can dismiss this condition as e2fsck would let us know about this!
- Timestamp of the last filesystem check is older than timestamp of last mount - I think this is the most likely cause. If the BIOS battery is old and the system clock is way off (back in the past), the timestamp set on the last filesystem check (done by our FOS scripts just before running resize2fs) could be set to a really old date.
- Free blocks counter is larger than total blocks counter - Again, I am fairly sure this is not the trouble in your case because this would cause e2fsck to fail as well.
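
The timestamp condition can be checked by hand from a debug session. A minimal sketch, assuming that dumpe2fs (or tune2fs -l, which prints the same fields) is available where you run it and that date is GNU date with -d support. The function name check_fs_timestamps is made up for this example. It reads the superblock summary on stdin and compares the "Last checked" and "Last mount time" fields.

```shell
#!/bin/bash
# Compare "Last checked" with "Last mount time" from `dumpe2fs -h` output on stdin.
# Assumption: GNU date is available (needed for `date -d`).
check_fs_timestamps() {
    local info mount check mount_s check_s
    info=$(cat)
    mount=$(printf '%s\n' "$info" | sed -n 's/^Last mount time: *//p')
    check=$(printf '%s\n' "$info" | sed -n 's/^Last checked: *//p')
    # A missing field cannot be compared (and an empty string would parse as "today").
    if [ -z "$mount" ] || [ -z "$check" ]; then
        echo "unknown"
        return 2
    fi
    # Fields such as "n/a" do not parse as dates and are reported as unknown.
    mount_s=$(date -d "$mount" +%s 2>/dev/null) || { echo "unknown"; return 2; }
    check_s=$(date -d "$check" +%s 2>/dev/null) || { echo "unknown"; return 2; }
    if [ "$check_s" -lt "$mount_s" ]; then
        echo "check older than mount"
        return 1
    fi
    echo "ok"
}

# Usage in a debug session (device name taken from this thread):
#   dumpe2fs -h /dev/sda2 2>/dev/null | check_fs_timestamps
```

If this reports "check older than mount" right after a filesystem check, a system clock that is set far in the past is a plausible explanation.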
To work around the issue I would suggest you add a little post init script to force resize2fs to do its work and skip the checks. I don’t see why this could cause major trouble as the FOS scripts do a filesystem check beforehand and would error out in case something is really wrong with the filesystem.
Edit /images/dev/postinitscripts/fog.postinit on your FOG server and add a sed call to the end of the file. If you don’t have any post init scripts in there yet, the file would look like this with the added sed call:
    #!/bin/bash
    ## This file serves as a starting point to call your custom pre-imaging/post init loading scripts.
    ## <SCRIPTNAME> should be changed to the script you're planning to use.
    ## Syntax of post init scripts are
    #. ${postinitpath}<SCRIPTNAME>

    # modify resize2fs call
    sed -i -e 's/resize2fs \$part$/resize2fs -f \$part/g' /usr/share/fog/lib/funcs.sh
If you have other things in there, just add my last two lines to the end of your file. Make sure you copy and paste the line to not risk a typo! I tested this and it works.
Now schedule another deploy task and see if you can reproduce the issue again.
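
To see what that sed expression does before relying on it, it can be run against sample lines instead of the real file. A minimal sketch: the sample lines are hypothetical stand-ins, since the contents of funcs.sh are not shown in this thread, and patch_resize_call is just a name picked for the example.

```shell
#!/bin/bash
# Apply the same substitution as the post init script, but to stdin instead of funcs.sh.
patch_resize_call() {
    sed -e 's/resize2fs \$part$/resize2fs -f \$part/g'
}

# Hypothetical sample lines, not copied from the real funcs.sh:
printf '%s\n' '    resize2fs $part'     | patch_resize_call  # prints "    resize2fs -f $part"
printf '%s\n' '    resize2fs -f $part'  | patch_resize_call  # already patched, printed unchanged
printf '%s\n' '    resize2fs $part 10G' | patch_resize_call  # does not end in $part, printed unchanged
```

Only a line ending in a literal resize2fs $part is rewritten, and a line that already carries -f is left alone, so applying the expression twice does no harm.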
-
Hey Sebastian.
I think your theory on the BIOS clock is probably correct. Most of the laptops have been in storage for over a year. Once they started working, they had been plugged into network and power for a bit. I’ll take a look Monday at the other recommendations.
-
@omegaxis Because the root cause seems to be identified, I won’t pursue recreating the issue. Ping me directly if you find this to not be true. I’d need to know your Linux system layout & sizes, as well as source HDD size and destination HDD size. I can replicate all this stuff using VMs.
-
@wayne-workman
I’m willing to call this resolved given that it seems to have been a time sync issue.
-
Hi,
We had the exact same problem. We used a laptop with a 256GB NVMe SSD to create the image, then deployed to a 240GB SATA SSD, and we got that error.
We then tried pretty much every possible hack available on the internet to change the partition size, without success.
We then tried capturing the image on the 240GB SSD and deploying it to the 256GB one, and it worked.
We ran into that issue a few more times. Every time, the source disk only has around 30GB of data, but if the total size of the source disk is larger than the target disk, it will fail, no matter what.
-
@Patrix While you might have the same issue, it could be a totally different problem as well. We just don’t know enough of your specs to tell yet.
So if you want help, I ask you to open your very own topic and post all the information about your setup:
- FOG version
- Init version
- Linux OS and version of your FOG server
- OS installed on the host you want to capture/deploy
- contents of d1.minimum.partitions of the particular image you want to deploy to a smaller size disk