BTRFS: open_ctree failed after ubuntu image deploy
-
@Oleg said:
and this is what my busybox looks like…
Good that you posted this picture because, oh well, I guess I was on the wrong track with this. Do you get the same error on non-resizable image type or is it a different error before getting to the busybox shell?
-
@Oleg To me this looks like it could be an issue in the partclone.btrfs code. Hope it’s not but you never know.
Here it says:
Luckily all decent systems that support btrfs (like Ubuntu 14.04) will have btrfs tools included in the initramfs environment, so you can run btrfs commands from there and try to recover from the situation without the need to boot the system from an alternative media, like a live CD.
So can you please boot up the system after deploy using a live CD and try the following commands:
btrfs-show-super -a /dev/sda3
btrfs check /dev/sda3
btrfs-find-root /dev/sda3
mkdir -p /mnt/sda3 && mount -t btrfs -o ro,recovery /dev/sda3 /mnt/sda3
btrfs restore -F -i -D -v /dev/sda3 /dev/null
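For anyone who wants to capture the output of these checks in one go, here is a minimal sketch. The function name btrfs_diag_cmds is made up for illustration; it only prints the five commands above for a given device and runs nothing itself. As far as I know, newer btrfs-progs releases provide btrfs inspect-internal dump-super in place of btrfs-show-super, and newer kernels use the usebackuproot mount option in place of recovery, so the list may need adjusting depending on the live CD.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG or btrfs-progs): print the diagnostic
# commands from this post for one device. Nothing is executed here.
btrfs_diag_cmds() {
    dev="$1"                       # e.g. /dev/sda3
    mnt="/mnt/$(basename "$dev")"  # mount point derived from the device name
    cat <<EOF
btrfs-show-super -a $dev
btrfs check $dev
btrfs-find-root $dev
mkdir -p $mnt && mount -t btrfs -o ro,recovery $dev $mnt
btrfs restore -F -i -D -v $dev /dev/null
EOF
}

# Show the command list for the root partition discussed in this thread.
btrfs_diag_cmds /dev/sda3
```

Saving the output to a file (for example btrfs_diag_cmds /dev/sda3 > diag.sh) and running it with sh -x diag.sh 2>&1 | tee diag.log should leave a log that can be posted here.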
-
@Sebastian-Roth said:
… Do you get the same error on non-resizable image type or is it a different error before getting to the busybox shell?
This is what I get:
The commands don’t work for me. I will try to fix that. But is it a btrfs problem?
My system is completely new, with only a couple of files added.
-
I’ve just tried to create an image with the latest Clonezilla, but the result is the same.
-
Does it upload all the partitions or just /dev/sda1?
-
Also, just for clarification.
The first upload broke this system, from what I understand. Are you uploading the broken system or are you ensuring the system is operational between uploads?
-
@Tom-Elliott
In FOG, yes, all partitions have been uploaded.
The system is running fine before uploading.
I mean, if I set up a clean Ubuntu 16.04 Server with BTRFS and the partition table I mentioned, then I get the same error. Tested with another Fujitsu computer, which is a couple of years old.
-
Then the last question I have, I suppose.
Have you updated to the latest FOG Version and retried uploading?
-
@Tom-Elliott
Sorry, I should have mentioned that at first: my last try was with Trunk 8046.
-
@Oleg said:
But is it a btrfs problem? … I’ve just tried to create an image with the latest Clonezilla, but the result is the same.
As you can see from your tests, it seems to be a partclone/Clonezilla issue. This confirms this as well. Although I really wonder why I can’t find anything about this on the web… There should be other people running into this issue!!
Possibly I will be able to do some tests over the weekend. Keep us posted if you find anything new on this.
-
@Oleg While getting my VM set up to test your issue, I got a bit confused about the partition layout. While I am not saying that this is causing the error, I am wondering why:
- sda1 (/boot) is about 3 GB - not bad but usually you don’t need that much for it
- sda3 (/) is around 9.5 GB - might be enough but I’d use a little more
- sda5 (/home) is around 9.5 GB - this is where users store all their data… usually need a lot more
- sda6 (/opt) is around 90 GB - usually /opt is for optional software. Do you install that many custom tools?
I am not saying that this layout is wrong. Depending on your requirements it might be very useful this way. Just saying that this is not the way I’d partition my disk.
-
@Sebastian-Roth this seems relevant to this discussion as well.
Should be fixed in the partclone version FOG uses, but it sounds like a btrfsck might be useful to try.
-
@Quazz Good find man! Although I am wondering if this is the exact same issue, as they are talking about an issue with “lzo compressed btrfs volumes”, which @Oleg does not seem to have according to his /etc/fstab …
I had a bit of time while I was waiting for some other installations today, so I set up Ubuntu 16.04 server (should be close enough to the scenario with Oleg’s Ubuntu desktop), booted it a couple of times without an issue, uploaded an image and deployed it again. My Ubuntu server is coming up and seems normal, but taking a look at /var/log/kern.log I see a lot of these messages:
BTRFS error (device sda3): bad tree block start 0 40345712
BTRFS error (device sda3): bad tree block start 0 40484864
BTRFS error (device sda3): bad tree block start 0 40091648
BTRFS error (device sda3): bad tree block start 0 40108032
BTRFS error (device sda3): bad tree block start 0 40042496
...
Notice the different numbers at the end of the lines. I am not sure what that means. Guess we need to do some more research on this, as it does not seem to be a showstopper in my case. I don’t see the BTRFS: open_ctree failed message, but I have some other btrfs related messages:
BTRFS info (device sda3): read error corrected: ino 1 off 125304832 (dev /dev/sda3 sector 261120)
BTRFS info (device sda3): read error corrected: ino 1 off 125308928 (dev /dev/sda3 sector 261128)
BTRFS info (device sda3): read error corrected: ino 1 off 125313024 (dev /dev/sda3 sector 261136)
BTRFS info (device sda3): read error corrected: ino 1 off 125317120 (dev /dev/sda3 sector 261144)
Anyone keen to dig into this and take a look at the partclone code as well? I’d love to, but I guess I won’t find the time in the near future.
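To get a quick overview of how many of these messages show up, something like the following could help. It is only a sketch: btrfs_log_summary is a hypothetical helper name, and the patterns assume the message format shown in the excerpts above (BTRFS error, info or warning followed by the device in parentheses). Lines without a device, such as BTRFS: open_ctree failed, are not counted.

```shell
#!/bin/sh
# Hypothetical helper: count BTRFS kernel messages per device and message type.
# Reads log lines on stdin and prints "count device level: message" lines,
# most frequent first.
btrfs_log_summary() {
    # Keep only BTRFS messages that name a device.
    grep -E 'BTRFS (error|info|warning) \(device [^)]+\):' |
    # Reduce each line to "device level: leading lowercase words of the message",
    # which drops the varying numbers at the end.
    sed -E 's/.*BTRFS (error|info|warning) \(device ([^)]+)\): ([a-z ]+[a-z]).*/\2 \1: \3/' |
    # Count identical summaries and sort by count, highest first.
    sort | uniq -c | sort -rn
}
```

It can be fed the log directly, for example btrfs_log_summary < /var/log/kern.log. Running it before and after a deploy might show whether the number of bad tree block start messages keeps growing.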
-
@Sebastian-Roth
Thanks for your suggestion! For normal use yours is better. In our case we have only a couple of applications, which store their data in /opt. For sda2 and sda3 I think I will follow your suggestion.
Yes, you’re right: in my setup I don’t have the “lzo compressed” options in the fstab.
In your case the system comes up; in mine it does not. I will look further today to narrow down the issue.
I think if it’s a partclone code issue, the solution could take a while?! I’m asking because then I would have to switch to another filesystem.
-
I feel I should at least kind of chime in a little bit.
The issue here is not in any way, shape, or form related to the message as described in the title. The “random: nonblocking pool is initialized” message is simply a kernel debug statement telling you that the pool used to provide random data in non-blocking form has been initialized. This is NOT what is causing the failure to boot after upload, nor is it interfering with BTRFS in any way.
I think @Quazz is right, at least in that we can perform a btrfs filesystem check. I doubt it will fix anything, though. See, @Sebastian-Roth has successfully imaged a system using a similar layout to yours, and while there are a few concerning error messages, the system is still operational. Maybe something else is causing issues?
-
@Tom-Elliott said in Upload image "random: nonblocking pool is initialized":
See, @Sebastian-Roth has successfully imaged a system using a similar layout to yours, and while there are a few concerning error messages, the system is still operational. Maybe something else is causing issues?
While you’re right that my system seemed to boot up properly after cloning, I am still very concerned about those messages I posted. I just started up the system again. It booted OK, but I get a couple of these bad tree block start messages every minute now. I am trying to get in contact with the Clonezilla developers about this, as I think this is not a very special case and will hit us from time to time. I don’t think we should do raw imaging with btrfs filesystems just to work around this issue.
PS: Tom is right about the title. @Oleg would you mind changing the title to something appropriate?
-
@Sebastian-Roth Have you tried a fsck? There might be some useful info coming out of that, if anything. Although this seems like a partclone issue, more info is always nice.
-
@Sebastian-Roth
No problem, I changed the topic to something more related to the discussed issue.
Today I tried to clone that image with other fstab mount options. I removed discard because I read that this option should not be used with BTRFS, and also tried with clear_cache and nospace_cache, but with no success.
-
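Since the fstab options keep coming up in this thread, here is a small sketch for listing the options of all btrfs entries at a glance. btrfs_fstab_opts is a hypothetical helper name; it flags discard and any compress option simply because those are the ones discussed here, not because they are known to cause the problem.

```shell
#!/bin/sh
# Hypothetical helper: list the mount options of every btrfs entry in an fstab.
# Reads fstab content on stdin and prints "mountpoint: options [flagged option]".
btrfs_fstab_opts() {
    awk '
        /^[[:space:]]*#/ { next }          # skip comment lines
        $3 == "btrfs" {                    # field 3 is the filesystem type
            flag = ""
            n = split($4, opts, ",")       # field 4 is the comma-separated option list
            for (i = 1; i <= n; i++) {
                # Flag the options discussed in this thread.
                if (opts[i] == "discard" || opts[i] ~ /^compress/)
                    flag = flag " [" opts[i] "]"
            }
            print $2 ": " $4 flag
        }
    '
}
```

On the deployed system this would be run as btrfs_fstab_opts < /etc/fstab, and comparing the result before upload and after deploy could rule out the mount options as a difference.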
@Quazz said in Upload image "random: nonblocking pool is initialized":
Have you tried a fsck? There might be some useful info coming out of that, if anything. Although this seems like a partclone issue, more info is always nice.
There are about a dozen or so btrfs tools (like btrfs, btrfs-find-root, btrfs-debug-tree and some more) to examine and fix those kinds of filesystems. Unfortunately I haven’t played with those tools before and don’t really know enough about btrfs to find out what is causing this BTRFS: open_ctree failed.
-
@Oleg what if you use those added arguments to the host kernel args?