Sorry, Valentine's preparation.

On 07.02.20 at 23:10, Chris Murphy wrote:
OK so neither fdisk nor gdisk have any complaints about the GPT. And yet the kernel is complaining. That's wrong and weird.
Have the drives always been in these USB enclosures, for the life of this Btrfs file system? They've always been connected to x86 and ARM while in these USB enclosures?
The two disks were brand new and added to the raid. Other drives were removed due to their old age and failing SMART. They were solely used in this btrfs raid and only on x86_64 and ARM. I can't say if I provisioned both disks on the same platform. So one disk was maybe provisioned while still hooked up to Arch Linux 5.x, the second one might have been provisioned with raspbian 4.19.
The two drives have the same sector count, 7814037168. Why don't they have identical partition maps? The /dev/sdb drive says partition 1 starts at 2048, and yet "first usable sector is 34" and that drive has 3693 sectors free.
/dev/sda 7814037134-2048=7814035086, 7814035086*512=4000785964032
/dev/sdb 7814035455-2048=7814033407, 7814033407*512=4000785104384
The Btrfs super provided for one of them (I can't tell which it's for)
dev_item.total_bytes 4000785104896
If that's the value for /dev/sda, it's wrong but safe. i.e. the partition is bigger than what Btrfs says it should be. And btrfs will live inside its own dev size constraints. However, if it's the value for /dev/sdb, it's wrong and not safe, because the partition is exactly 512 bytes smaller than Btrfs thinks it should be. But we need clarification. Can you provide the super block for both /dev/sda and /dev/sdb unambiguously with the above reported gdisk/fdisk outputs? Note the /dev/sda /dev/sdb designations can change if the drives have been disconnected/reconnected or the system rebooted since those commands were issued. So it's important to make certain which partition map goes with which super, because no matter what something is not exactly correct and probably should be fixed.
I'm gonna reference their UUIDs from fdisk. Also I'm gonna use Arch Linux 5.5 x86_64 in favour of raspbian 4.19 unless otherwise instructed.

btrfs raid1 disks UUIDs:
63E9CA8E-F1B4-8A41-9C21-F058C1AC0783
9904ABA2-B9F8-4544-9699-9935CE8A7B1F

Oh, by the way it's a Seagate and Western Digital. Damn I thought I bought identical ones.

63E9CA8E-F1B4-8A41-9C21-F058C1AC0783
====================================
# LANG=C fdisk -l /dev/sdb
Disk /dev/sdb: 3,65 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: USB3.0 DISK03
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 63E9CA8E-F1B4-8A41-9C21-F058C1AC0783

Device     Start        End    Sectors Size Type
/dev/sdb1   2048 7814037134 7814035087 3,7T Linux filesystem

--> (7814037134-2048)*512 = 4000785964032 of available bytes

# btrfs inspect-internal dump-super /dev/sdb1 | grep total_bytes
total_bytes             8001571065856
dev_item.total_bytes    4000785960960

--> 4000785960960-4000785964032 = -3072
--> broken?

9904ABA2-B9F8-4544-9699-9935CE8A7B1F
====================================
LANG=C fdisk -l /dev/sdc
Disk /dev/sdc: 3,65 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: USB3.0 DISK04
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 9904ABA2-B9F8-4544-9699-9935CE8A7B1F

Device     Start        End    Sectors Size Type
/dev/sdc1   2048 7814035455 7814033408 3,7T Linux filesystem

--> (7814035455-2048)*512 = 4000785104384 of available bytes

# btrfs inspect-internal dump-super /dev/sdc1 | grep total_bytes
total_bytes             8001571065856
dev_item.total_bytes    4000785104896

--> 4000785104896-4000785104384 = 512
--> safe?

So here one could create the backup GPT...
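A side note on the arithmetic in the "-->" lines: fdisk's End column is inclusive, so its Sectors column equals End - Start + 1, which is one sector (512 bytes) more than End - Start. The following is only a minimal Python sketch that redoes the comparison with the inclusive count, using nothing but the numbers quoted above. The names partition_bytes and slack_bytes are made up for illustration, and it assumes dev_item.total_bytes is meant to be compared against the full partition size.

```python
SECTOR_SIZE = 512  # logical sector size reported by fdisk above

# Start/End sectors from fdisk and dev_item.total_bytes from dump-super,
# copied from the output quoted in this mail.
devices = {
    "/dev/sdb1": {"start": 2048, "end": 7814037134, "dev_total_bytes": 4000785960960},
    "/dev/sdc1": {"start": 2048, "end": 7814035455, "dev_total_bytes": 4000785104896},
}


def partition_bytes(start, end, sector_size=SECTOR_SIZE):
    """Size of a partition whose Start and End sectors are both inclusive."""
    return (end - start + 1) * sector_size


def slack_bytes(dev):
    """Partition size minus dev_item.total_bytes.

    A negative result would mean Btrfs believes the device is larger
    than the partition it lives on.
    """
    return partition_bytes(dev["start"], dev["end"]) - dev["dev_total_bytes"]


for name, dev in devices.items():
    print(name, partition_bytes(dev["start"], dev["end"]), slack_bytes(dev))
```

Under that reading the sketch gives 3584 bytes of slack for /dev/sdb1 and 0 for /dev/sdc1, so dev_item.total_bytes would fit inside both partitions. This should still be cross-checked against the gdisk output before acting on it.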
And also as a sanity test:
sudo btrfs rescue super -v /dev/sda
# btrfs rescue super -v /dev/sdb1
All Devices:
        Device: id = 8, name = /dev/sdc1
        Device: id = 7, name = /dev/sdb1

Before Recovering:
        [All good supers]:
                device name = /dev/sdc1
                superblock bytenr = 65536

                device name = /dev/sdc1
                superblock bytenr = 67108864

                device name = /dev/sdc1
                superblock bytenr = 274877906944

                device name = /dev/sdb1
                superblock bytenr = 65536

                device name = /dev/sdb1
                superblock bytenr = 67108864

                device name = /dev/sdb1
                superblock bytenr = 274877906944

        [All bad supers]:

All supers are valid, no need to recover
It only needs to be run on one of the devices; but both need to be present and unmounted. This is a read only command to verify all six supers are valid.
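For anyone following along, the "six supers" are the three superblock copies on each of the two devices. A small Python sketch of where those copies sit, using only the bytenr values printed by btrfs rescue super above. The 4096-byte superblock size and the helper names human and supers_that_fit are assumptions for illustration, not taken from the tool's output.

```python
SUPER_INFO_SIZE = 4096  # assumed size in bytes of one superblock copy

# Superblock offsets exactly as printed by `btrfs rescue super -v` above.
SUPER_OFFSETS = [65536, 67108864, 274877906944]


def human(n):
    """Render a byte offset in the largest binary unit that divides it evenly."""
    for unit, size in (("GiB", 1024**3), ("MiB", 1024**2), ("KiB", 1024)):
        if n % size == 0:
            return f"{n // size} {unit}"
    return f"{n} B"


def supers_that_fit(device_bytes):
    """Offsets whose superblock copy lies entirely within a device of this size."""
    return [off for off in SUPER_OFFSETS if off + SUPER_INFO_SIZE <= device_bytes]


for off in SUPER_OFFSETS:
    print(off, human(off))

# dev_item.total_bytes reported for /dev/sdc1 in this thread.
print(len(supers_that_fit(4000785104896)))
```

The offsets come out as 64 KiB, 64 MiB and 256 GiB, and all three fit on a 4 TB device, which is consistent with three good supers being reported per drive.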
Last, I wonder if there's some weird bug. Any chance you can update the Pi to kernel 4.19.97-1-ARCH and see if the same kernel GPT error messages occur? You won't need to mount the volume (and I don't recommend trying yet anyway), just connect the devices individually and report back what the kernel says about each drive as you connect them. I do find it a bit hard to believe a GPT related bug could exist in 4.19.75...but worth a shot. For what it's worth, I did an upgrade to 4.19.97 on my Pi and it's fine.
The raspbian on my pi4 now has 4.19.97:

uname -a
Linux omv 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux

The GPT error messages are gone...
Anyway, I wanna be extremely deliberate. It's a bit tedious. But having monitored the linux-raid@, LVM, linux-xfs@ and linux-btrfs@ upstream lists, it's extremely common to get user-induced data loss by getting panicky and doing things too fast. And also I'm intentionally verbose so as to invite anyone reading to call b.s. - it's really easy to make simple stupid mistakes.
Yeah, it would be a bit sad since I don't have 4TB spare yet.

Since the outputs of dump-* are pretty large and did not make it to the list, here is a link for downloading:
https://nextcloud.sfelis.de/s/SJEGp2gCGKenDsj

The diffs showed that the dump-s outputs were pretty much the same, but for dump-t there are a few differences.