On Tue, Feb 11, 2020 at 2:02 PM Simeon Felis <arch-general@sfelis.de> wrote:
On 10.02.20 at 01:22, Chris Murphy wrote:
You might just go straight to ARM, and try to mount -o ro and see if it mounts it OK. I think the error messages you got from Btrfs previously had to do with the bogus GPT error messages - which we don't know why that happened.
Unfortunately it still does not mount:
mount -o ro /dev/disk/by-label/URAID /mnt/URAID/
mount: /mnt/URAID: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error.
[ 182.039688] usb 2-2: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[ 182.071047] usb 2-2: New USB device found, idVendor=152d, idProduct=0567, bcdDevice=52.03
[ 182.071063] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 182.071076] usb 2-2: Product: External USB 3.0
[ 182.071089] usb 2-2: Manufacturer: JMicron
[ 182.071101] usb 2-2: SerialNumber: 20170331000C3
[ 182.074585] usb-storage 2-2:1.0: USB Mass Storage device detected
[ 182.079212] usb-storage 2-2:1.0: Quirks match for vid 152d pid 0567: 5000000
[ 182.079424] scsi host0: usb-storage 2-2:1.0
[ 183.130129] scsi 0:0:0:0: Direct-Access External USB3.0 DISK03 5203 PQ: 0 ANSI: 6
[ 183.131024] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[ 183.131252] sd 0:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[ 183.131267] sd 0:0:0:0: [sda] 4096-byte physical blocks
[ 183.131904] sd 0:0:0:0: [sda] Write Protect is off
[ 183.131919] sd 0:0:0:0: [sda] Mode Sense: 2b 00 00 00
[ 183.132258] scsi 0:0:0:1: Direct-Access External USB3.0 DISK04 5203 PQ: 0 ANSI: 6
[ 183.139512] sd 0:0:0:0: [sda] No Caching mode page found
[ 183.139528] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 183.140186] sd 0:0:0:1: [sdb] Very big device. Trying to use READ CAPACITY(16).
[ 183.140467] sd 0:0:0:1: [sdb] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[ 183.140483] sd 0:0:0:1: [sdb] 4096-byte physical blocks
[ 183.141135] sd 0:0:0:1: [sdb] Write Protect is off
[ 183.141151] sd 0:0:0:1: [sdb] Mode Sense: 2b 00 00 00
[ 183.142195] sd 0:0:0:1: [sdb] No Caching mode page found
[ 183.142211] sd 0:0:0:1: [sdb] Assuming drive cache: write through
[ 183.151939] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 183.152161] sd 0:0:0:1: Attached scsi generic sg1 type 0
[ 183.308231] sdb: sdb1
[ 183.308429] sda: sda1
[ 183.310527] sd 0:0:0:1: [sdb] Attached SCSI disk
[ 183.311834] sd 0:0:0:0: [sda] Attached SCSI disk
[ 183.572564] BTRFS: device label URAID devid 8 transid 1254643 /dev/sdb1
[ 183.575573] BTRFS: device label URAID devid 7 transid 1254643 /dev/sda1
[ 228.067813] BTRFS info (device sda1): disk space caching is enabled
[ 228.067827] BTRFS info (device sda1): has skinny extents
[ 228.072861] BTRFS critical (device sda1): unable to find logical 4306137776128 length 4096
[ 228.081639] BTRFS critical (device sda1): unable to find logical 4306137776128 length 4096
[ 228.090173] BTRFS critical (device sda1): unable to find logical 4306137776128 length 4096
[ 228.098571] BTRFS critical (device sda1): unable to find logical 4306137776128 length 4096
[ 228.107030] BTRFS critical (device sda1): unable to find logical 4306137776128 length 4096
[ 228.115469] BTRFS critical (device sda1): unable to find logical 4306137776128 length 4096
[ 228.123928] BTRFS error (device sda1): failed to read chunk root
[ 228.160012] BTRFS error (device sda1): open_ctree failed
apt-cache show btrfs-progs | grep Version
Version: 4.20.1-2
uname -a
Linux omv 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
My suggestion is to take this upstream; it smells like a bug. And there's a good chance a bug fix can be backported, since 4.19 is a long-term kernel. The only way bugs ever get fixed is if they get reported to the proper upstream.
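For an upstream report, it may help to condense the kernel log to the Btrfs messages only. The following is a minimal sketch, assuming a POSIX shell with grep, sed, sort and uniq available; the function name btrfs_summary is hypothetical and not part of any Btrfs tooling. It strips the timestamp prefix so that identical messages collapse, then counts each distinct message.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs): summarise BTRFS kernel
# messages from dmesg-style text read on stdin.
btrfs_summary() {
    # Keep only lines mentioning BTRFS.
    grep 'BTRFS' |
        # Drop the leading "[ timestamp]" so repeated messages become identical.
        sed 's/^\[ *[0-9.]*] *//' |
        # Count each distinct message, most frequent first.
        sort | uniq -c | sort -rn
}
```

It could be used as `dmesg | btrfs_summary`. With the log quoted above, the repeated "unable to find logical 4306137776128 length 4096" message would collapse into a single line with its count, which keeps the report short while the full dmesg can still be attached separately.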
I'm going to get some large external drives, copy the data, and go back to old-school software RAID.
Btrfs is not portable in my eyes. If this kind of thing happens, I would strongly recommend waiting another 5 years before using it anywhere.
Good luck with that. I monitor the upstream mdadm and LVM lists, and bugs and regressions are a fact of life. File systems are hard. The older any file system gets, the more non-deterministic it becomes. The more you mix kernel versions, the more that non-determinism explodes, and it's even greater when moving the file system across architectures. That doesn't mean the problem isn't a bug; it just increases the chance of bug exposure. And this isn't going to get more reliable unless there are reliable bug reports, where people persevere through the tedious task of making the software better.
The reality is, your data is intact. This is not a data loss scenario. And you have metadata and data checksumming to verify your data. If you give that up, you will have to completely trust the limited error detection abilities of the drives. Any corruption there will be propagated to user space and replicated into backups, silently.
I'm pretty sure this is not an Arch Linux issue; it might be a Raspbian issue.
I think it's a straight-up Btrfs bug. But it should be reported upstream to find out what's going on. Offhand, I don't see a relevant patch between 4.19.95 and 4.19.103. If you write up the email, put me in the cc and I can fill in some of the gaps and hopefully get it the proper attention.
--
Chris Murphy