BTRFS: open_ctree failed after Ubuntu image deploy
-
Hello guys,
When uploading an Ubuntu 16.04 image with this partition setup (/etc/fstab):
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda3 during installation
UUID=dfa3bf1a-9ca1-4fc4-863f-72815db61539 / btrfs defaults,discard,relatime,subvol=@ 0 1
# /boot was on /dev/sda1 during installation
UUID=0a46a7fe-23c2-4e4c-a76a-7d2410311a25 /boot ext4 defaults 0 2
# /home was on /dev/sda5 during installation
UUID=dfa3bf1a-9ca1-4fc4-863f-72815db61539 /home btrfs defaults,discard,relatime,subvol=@home 0 2
# /opt was on /dev/sda6 during installation
UUID=719313d4-5d42-44e7-8cda-87c492b92ae6 /opt btrfs defaults,discard 0 2
# swap was on /dev/sda2 during installation
UUID=c5c9a2e6-885a-4ed1-aeb6-909043dae122 none swap sw 0 0
I get the above error after uploading the first partition.
FOG finishes the job though, but if I deploy the image to another computer, Ubuntu drops into busybox.
I was not able to find any solution, so I hope someone can help. Thanks
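(For anyone who wants to check the same thing: a rough, untested way to verify whether the deployed disk still carries the UUIDs this fstab expects is to compare blkid output against the installed fstab from the busybox/initramfs shell or a live CD. The device names below are only assumed from the fstab comments above.)
# filesystem UUIDs the kernel actually sees on the deployed disk
blkid /dev/sda1 /dev/sda2 /dev/sda3 /dev/sda5 /dev/sda6
# UUIDs the installed system expects; with the subvol=@ layout the file
# may live at /mnt/@/etc/fstab instead of /mnt/etc/fstab
mkdir -p /mnt && mount -o ro /dev/sda3 /mnt
grep UUID /mnt/etc/fstab /mnt/@/etc/fstab 2>/dev/null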
-
I’ve done a new installation of Ubuntu 16.04 with the default partition table (ext4 + swap) and it seems to be a hardware/kernel problem.
The computer is a Fujitsu Q556 with a Skylake chipset and an SSD drive. I will try to install other kernels and hope the problem disappears.
Any other solutions? -
@Oleg That’s an interesting one. The message
random: nonblocking pool is initialized
is printed by the kernel on boot (if kernel messages are turned on, which we don’t do by default). Very strange that you see this message while imaging. My guess is this message is not causing any trouble. But we seem to not properly set the filesystem UUIDs which are used in your fstab. We should definitely fix this.
From your description (you end up in a busybox shell) it should be the root partition’s filesystem UUID (on /dev/sda5) that we screw up. Can you please post the contents of
/images/<imagename>/d1.original.uuids
-
@Sebastian-Roth
I don’t have this file in my image folder.
There is a “d1.partitions”:
label: dos
label-id: 0x173d9cfe
device: /dev/sda
unit: sectors
/dev/sda1 : start= 2048, size= 5857280, type=83, bootable
/dev/sda2 : start= 5859328, size= 15624192, type=82
/dev/sda3 : start= 21483520, size= 19531776, type=83
/dev/sda4 : start= 41017342, size= 209051650, type=5
/dev/sda5 : start= 41017344, size= 19529728, type=83
/dev/sda6 : start= 60549120, size= 189519872, type=83
and the “d1.original.swapuuids”:
/dev/sda2 c5c9a2e6-885a-4ed1-aeb6-909043dae122
The “d1.has_grub” is empty
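(Side note: the “d1.partitions” content above looks like an sfdisk dump, so if anyone wants to cross-check it against the source machine, something like the following should reproduce it - a sketch only, with /dev/sda assumed from the dump above.)
# dump the partition table in the same format as d1.partitions
sfdisk --dump /dev/sda
# print the swap UUID that should end up in d1.original.swapuuids
blkid -s UUID -o value /dev/sda2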
-
@Tom-Elliott Seems like we only do
saveUUIDInformation
on resizable image types. Do you know why?
@Oleg Till we get this sorted in the code, can you try this: create a new image definition, set the image type to resizable, upload again and see if it is working after deploy.
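Until the code is fixed, another possible (untested) workaround would be to put the recorded UUIDs back by hand on the deployed machine from a live CD, since the fstab looks the filesystems up by UUID. Roughly - a sketch only, using the UUIDs from Oleg’s fstab; all filesystems must be unmounted, and btrfstune -U needs a reasonably recent btrfs-progs. /dev/sda5 is left out because it shares the root UUID in the fstab, which suggests it belongs to the same btrfs volume.
# ext4 /boot
tune2fs -U 0a46a7fe-23c2-4e4c-a76a-7d2410311a25 /dev/sda1
# swap (recreates the swap header with the expected UUID)
mkswap -U c5c9a2e6-885a-4ed1-aeb6-909043dae122 /dev/sda2
# btrfs / and /opt (rewrites metadata; run only on unmounted, healthy filesystems)
btrfstune -U dfa3bf1a-9ca1-4fc4-863f-72815db61539 /dev/sda3
btrfstune -U 719313d4-5d42-44e7-8cda-87c492b92ae6 /dev/sda6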
-
@Sebastian-Roth
I changed the image to resizable and did an upload. After downloading the image I’m stuck in busybox again.
here are the image-files:
d1.fixed_size_partitions:
2:3:4:5:6
d1.minimum.partitions:
label: dos
label-id: 0x173d9cfe
device: /dev/sda
unit: sectors
/dev/sda1 : start= 2048, size= 347702, type=83, bootable
/dev/sda2 : start= 5859328, size= 15624192, type=82
/dev/sda3 : start= 21483520, size= 19531776, type=83
/dev/sda4 : start= 41017342, size= 209051650, type=5
/dev/sda5 : start= 41017344, size= 19529728, type=83
/dev/sda6 : start= 60549120, size= 189519872, type=83
d1.partitions:
label: dos
label-id: 0x173d9cfe
device: /dev/sda
unit: sectors
/dev/sda1 : start= 2048, size= 5857280, type=83, bootable
/dev/sda2 : start= 5859328, size= 15624192, type=82
/dev/sda3 : start= 21483520, size= 19531776, type=83
/dev/sda4 : start= 41017342, size= 209051650, type=5
/dev/sda5 : start= 41017344, size= 19529728, type=83
/dev/sda6 : start= 60549120, size= 189519872, type=83
d1.original.fstypes:
/dev/sda1 extfs
d1.original.swapuuids is the same
and this is what my busybox looks like:
-
@Oleg said:
and this is what my busybox looks like…
Good that you posted this picture because, oh well, I guess I was on the wrong track with this. Do you get the same error on non-resizable image type or is it a different error before getting to the busybox shell?
-
@Oleg To me this looks like it could be an issue in the partclone.btrfs code. Hope it’s not but you never know.
Here it says:
Luckily all decent systems that support btrfs (like Ubuntu 14.04) will have btrfs tools included in the initramfs environment, so you can run btrfs commands from there and try to recover from the situation without the need to boot the system from alternative media, like a live CD.
So can you please boot up the system after deploy using a live CD and try the following commands:
btrfs-show-super -a /dev/sda3
btrfs check /dev/sda3
btrfs-find-root /dev/sda3
mkdir -p /mnt/sda3 && mount -t btrfs -o ro,recovery /dev/sda3 /mnt/sda3
btrfs restore -F -i -D -v /dev/sda3 /dev/null
-
@Sebastian-Roth said:
… Do you get the same error on non-resizable image type or is it a different error before getting to the busybox shell?
This is what I get:
The commands don’t work for me - I will try to fix that. But is it a btrfs problem?
My system is completely new with only a couple of files added. -
I’ve just tried to create an image with the latest Clonezilla but it’s the same.
-
Does it upload all the partitions or just /dev/sda1?
-
Also, just for clarification.
The first upload broke this system, from what I understand. Are you uploading the broken system or are you ensuring the system is operational between uploads?
-
@Tom-Elliott
In FOG - yes, all partitions have been uploaded.
The system is running fine before uploading.
I mean if I set up a clean Ubuntu 16.04 server with btrfs and the partition table I mentioned, then I get the same error. Tested with another Fujitsu computer, which is a couple of years old. -
Then the last question I have, I suppose.
Have you updated to the latest FOG Version and retried uploading?
-
@Tom-Elliott
Sorry, I should have mentioned that at first - the last try was with trunk 8046. -
@Oleg said:
But is it btrfs-problem? … I’ve just tried to create an Image with latest clonezilla but it’s the same.
As you can see from your tests, it seems to be a partclone/Clonezilla issue. This confirms it as well. Although I really wonder why I can’t find anything about this on the web… There should be other people running into this issue!!
Possibly I will be able to do some tests over the weekend. Keep us posted if you find anything new on this.
-
@Oleg Starting to set up my VM to test your issue, I am a bit confused about the partition layout. While I am not saying that this is causing the error, I am wondering why:
- sda1 (/boot) is about 3 GB - not bad but usually you don’t need that much for it
- sda3 (/) is around 9.5 GB - might be enough but I’d use a little more
- sda5 (/home) is around 9.5 GB - this is where users store all their data… usually need a lot more
- sda6 (/opt) is around 90 GB - usually /opt is for optional software. Do you install that many custom tools?
I am not saying that this layout is wrong. Depending on your requirements it might be very useful this way. Just saying that this is not the way I’d partition my disk.
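(For reference, the GB figures above come straight from the sector counts in d1.partitions, assuming 512-byte sectors - a quick sketch:)
# size in GiB = sectors * 512 / 1024^3
awk 'BEGIN { printf "sda1 (/boot): %.1f GiB\n", 5857280   * 512 / 1024^3 }'   # ~2.8
awk 'BEGIN { printf "sda3 (/):     %.1f GiB\n", 19531776  * 512 / 1024^3 }'   # ~9.3
awk 'BEGIN { printf "sda5 (/home): %.1f GiB\n", 19529728  * 512 / 1024^3 }'   # ~9.3
awk 'BEGIN { printf "sda6 (/opt):  %.1f GiB\n", 189519872 * 512 / 1024^3 }'   # ~90.4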
-
@Sebastian-Roth this seems relevant to this discussion as well.
It should be fixed in the partclone version FOG uses, but it sounds like a btrfsck might be useful to try.
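(For what it’s worth, with current btrfs-progs the fsck equivalent is “btrfs check”, which is read-only by default. A hedged sketch, assuming /dev/sda3 is the deployed btrfs root as earlier in this thread, run from a live CD:)
# read-only consistency check, does not modify the filesystem
btrfs check /dev/sda3
# only as a last resort, and only after a backup - --repair can make things worse:
# btrfs check --repair /dev/sda3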
-
@Quazz Good find man! Although I am wondering if this is the exact same issue as they are talking about an issue with “lzo compressed btrfs volumes” which @Oleg does not seem to have according to his
/etc/fstab.
I had a bit of time while I was waiting for some other installations today, so I set up an Ubuntu 16.04 server (should be close enough to the scenario with Oleg’s Ubuntu desktop), booted it a couple of times without an issue, uploaded an image and deployed it again. My Ubuntu server is coming up and seems normal, but taking a look at
/var/log/kern.log
I see a lot of these messages:
BTRFS error (device sda3): bad tree block start 0 40345712
BTRFS error (device sda3): bad tree block start 0 40484864
BTRFS error (device sda3): bad tree block start 0 40091648
BTRFS error (device sda3): bad tree block start 0 40108032
BTRFS error (device sda3): bad tree block start 0 40042496
...
Notice the different numbers at the end of the lines. I am not sure what that means. Guess we need to do some more research on this as it does not seem to be a showstopper in my case. I don’t see the
BTRFS: open_ctree failed
but I have some other btrfs-related messages:
BTRFS info (device sda3): read error corrected: ino 1 off 125304832 (dev /dev/sda3 sector 261120)
BTRFS info (device sda3): read error corrected: ino 1 off 125308928 (dev /dev/sda3 sector 261128)
BTRFS info (device sda3): read error corrected: ino 1 off 125313024 (dev /dev/sda3 sector 261136)
BTRFS info (device sda3): read error corrected: ino 1 off 125317120 (dev /dev/sda3 sector 261144)
Anyone keen to dig into this and take a look at the partclone code as well? I’d love to, but I guess I won’t find the time in the near future.
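In the meantime, since those messages say “read error corrected”, btrfs apparently repaired the affected blocks from redundant copies. A rough way to check whether anything is still wrong on such a deployed system (a sketch only; paths assumed from the log snippets above):
# count the btrfs errors logged since boot
grep -c 'BTRFS error' /var/log/kern.log
# re-read and verify all checksums on the mounted root filesystem (-B waits until done)
btrfs scrub start -B /
# per-device error counters - ideally all zeros
btrfs device stats /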
-
@Sebastian-Roth
Thanks for your suggestion! For normal use yours is better - in our case we have only a couple of applications which store their data in /opt. For sda2 and sda3 I think I will follow your suggestion.
Yes, you’re right - in my setup I don’t have the “lzo compressed” option in the fstab.
In your case the system comes up, in mine it does not. I will look further today to narrow down the issue. I think if it’s a partclone code issue, the solution could take a while?! I’m asking because then I would have to switch to another filesystem.