[arch-general] btrfs kernel incompatibility?

Simeon Felis arch-general at sfelis.de
Sun Feb 9 15:14:58 UTC 2020


Sorry for the delay, Valentine's Day preparations.

On 07.02.20 at 23:10, Chris Murphy wrote:
> OK so neither fdisk nor gdisk have any complaints about the GPT. And
> yet the kernel is complaining. That's wrong and weird.
> 
> Have the drives always been in these USB enclosures, for the life of
> this Btrfs file system? They've always been connected to x86 and ARM
> while in these USB enclosures?

The two disks were brand new when they were added to the raid. Other drives were removed because of their age and failing SMART. The new disks have only ever been used in this btrfs raid, and only on x86_64 and ARM.

I can't say whether I provisioned both disks on the same platform. One disk may have been provisioned while still hooked up to Arch Linux 5.x, and the second one may have been provisioned with raspbian 4.19.

> 
> The two drives have identical sectors, 7814037168. Why don't they have
> identical partition maps? The /dev/sdb drive says partition 1 starts
> at 2048, and yet "first usable sector is 34" and that drive has 3693
> sectors free.
> 
> /dev/sda
> 7814037134-2048=7814035086, 7814035086*512=4000785964032
> 
> /dev/sdb
> 7814035455-2048=7814033407, 7814033407*512=4000785104384
> 
> 
> The Btrfs super provided for one of them (I can't tell which it's for)
> 
> dev_item.total_bytes    4000785104896
> 
> If that's the value for /dev/sda, it's wrong but safe. i.e. the
> partition is bigger than what Btrfs says it should be. And btrfs will
> live inside its own dev size constraints. However, if it's the value
> for /dev/sdb, it's wrong and not safe, because the partition is
> exactly 512 bytes smaller than Btrfs thinks it should be. But we need
> clarification. Can you provide the super block for both /dev/sda and
> /dev/sdb unambiguously with the above reported gdisk/fdisk outputs?
> Note the /dev/sda /dev/sdb designations can change if the drives have
> been disconnected/reconnected or the system rebooted since those
> commands were issued. So it's important to make certain which
> partition map goes with which super, because no matter what something
> is not exactly correct and probably should be fixed.

I'm gonna refer to the disks by their UUIDs (disk identifiers) from fdisk. Also, I'm gonna use Arch Linux 5.5 x86_64 instead of raspbian 4.19 unless otherwise instructed.

btrfs raid1 disks UUIDs:
63E9CA8E-F1B4-8A41-9C21-F058C1AC0783
9904ABA2-B9F8-4544-9699-9935CE8A7B1F
Oh, by the way, they are a Seagate and a Western Digital. Damn, I thought I had bought identical ones.


63E9CA8E-F1B4-8A41-9C21-F058C1AC0783
====================================

# LANG=C fdisk -l /dev/sdb
Disk /dev/sdb: 3,65 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: USB3.0 DISK03   
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 63E9CA8E-F1B4-8A41-9C21-F058C1AC0783

Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 7814037134 7814035087  3,7T Linux filesystem

--> (7814037134-2048)*512 = 4000785964032 of available bytes

# btrfs inspect-internal dump-super /dev/sdb1 | grep total_bytes
total_bytes		8001571065856
dev_item.total_bytes	4000785960960

--> 4000785960960-4000785964032 = -3072 --> broken?



9904ABA2-B9F8-4544-9699-9935CE8A7B1F
====================================

# LANG=C fdisk -l /dev/sdc
Disk /dev/sdc: 3,65 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: USB3.0 DISK04   
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 9904ABA2-B9F8-4544-9699-9935CE8A7B1F

Device     Start        End    Sectors  Size Type
/dev/sdc1   2048 7814035455 7814033408  3,7T Linux filesystem

--> (7814035455-2048)*512 = 4000785104384 of available bytes
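
As a cross-check of the two partition maps, here is a minimal Python sketch. It assumes a default GPT layout (512-byte logical sectors, 128 entries of 128 bytes each), which is an assumption and not something read from these disks. The helper names are made up for illustration. Under that assumption it reproduces the "first usable sector is 34" and "3693 sectors free" figures quoted above, and the Sectors column printed by fdisk.

```python
# Sketch: check the fdisk numbers against a default GPT layout.
# Assumptions (not read from the disks): 512-byte logical sectors,
# 128 partition entries of 128 bytes each.
SECTOR = 512
GPT_ENTRY_SECTORS = 128 * 128 // SECTOR  # 32 sectors for the entry array


def usable_range(total_sectors):
    """Return (first_usable, last_usable) LBA for a default GPT."""
    # LBA 0 = protective MBR, LBA 1 = GPT header, then the entry array
    first = 1 + 1 + GPT_ENTRY_SECTORS
    # the last LBA holds the backup header, preceded by the backup entries
    last = total_sectors - 1 - 1 - GPT_ENTRY_SECTORS
    return first, last


def free_sectors(total_sectors, start, end):
    """Unallocated sectors before and after a single partition."""
    first, last = usable_range(total_sectors)
    return (start - first) + (last - end)


TOTAL = 7814037168  # sectors per disk, from fdisk
partitions = {
    "63E9CA8E... (sdb1)": (2048, 7814037134),
    "9904ABA2... (sdc1)": (2048, 7814035455),
}

print("usable LBA range:", usable_range(TOTAL))
for name, (start, end) in partitions.items():
    # start and end are inclusive, so the sector count is end - start + 1
    print(name, "sectors:", end - start + 1,
          "free:", free_sectors(TOTAL, start, end))
```

If the layout assumption holds, the first partition ends exactly on the last usable sector, while the second one stops 1679 sectors short of it.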



# btrfs inspect-internal dump-super /dev/sdc1 | grep total_bytes
total_bytes		8001571065856
dev_item.total_bytes	4000785104896

--> 4000785104896-4000785104384 = 512 --> safe? So here one could create the backup GPT...
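
The comparison above can also be sketched in Python. One thing to note: fdisk's Start and End are inclusive, and its Sectors column equals End - Start + 1, so a plain End - Start difference is one sector (512 bytes) short. The sketch below uses the inclusive count. The dictionary layout and function name are only illustrative, and the numbers are copied from the fdisk and dump-super outputs in this mail.

```python
# Sketch: compare each partition size with the btrfs dev_item.total_bytes.
# All numbers are copied from the fdisk / dump-super output in this mail.
SECTOR = 512


def partition_bytes(start, end):
    """Size in bytes of a partition whose start/end LBAs are inclusive."""
    return (end - start + 1) * SECTOR


devices = {
    "sdb1": {"start": 2048, "end": 7814037134, "dev_total": 4000785960960},
    "sdc1": {"start": 2048, "end": 7814035455, "dev_total": 4000785104896},
}
fs_total = 8001571065856  # total_bytes from the superblock

for name, d in devices.items():
    part = partition_bytes(d["start"], d["end"])
    slack = part - d["dev_total"]  # >= 0 means btrfs fits in the partition
    print(f"{name}: partition={part} btrfs={d['dev_total']} slack={slack}")

# The filesystem-wide total should be the sum of the per-device totals.
print("sum matches:", sum(d["dev_total"] for d in devices.values()) == fs_total)
```

If the inclusive count is the right basis, sdb1 has 3584 bytes of slack and sdc1 matches dev_item.total_bytes exactly, rather than being 512 bytes short.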


> 
> And also as a sanity test:
> 
> sudo btrfs rescue super -v /dev/sda

# btrfs rescue super -v /dev/sdb1
All Devices:
	Device: id = 8, name = /dev/sdc1
	Device: id = 7, name = /dev/sdb1

Before Recovering:
	[All good supers]:
		device name = /dev/sdc1
		superblock bytenr = 65536

		device name = /dev/sdc1
		superblock bytenr = 67108864

		device name = /dev/sdc1
		superblock bytenr = 274877906944

		device name = /dev/sdb1
		superblock bytenr = 65536

		device name = /dev/sdb1
		superblock bytenr = 67108864

		device name = /dev/sdb1
		superblock bytenr = 274877906944

	[All bad supers]:

All supers are valid, no need to recover
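
The three superblock bytenr values per device are the usual btrfs superblock copies at 64 KiB, 64 MiB and 256 GiB. A small Python sketch of how those offsets come about, assuming the offset formula as I remember it from the kernel sources (treat it as an assumption, not as the reference implementation); the Python function names are illustrative:

```python
# Sketch: where btrfs keeps its superblock copies.
# Assumption: primary copy at 64 KiB, mirror n at 16 KiB << (12 * n),
# at most 3 copies, each 4096 bytes long.
SUPER_INFO_SIZE = 4096
MAX_MIRRORS = 3


def sb_offset(mirror):
    """Byte offset of superblock copy `mirror` (0 = primary)."""
    if mirror == 0:
        return 64 * 1024
    return (16 * 1024) << (12 * mirror)


def expected_supers(device_bytes):
    """Offsets of the copies that fit on a device of the given size."""
    return [
        sb_offset(m)
        for m in range(MAX_MIRRORS)
        if sb_offset(m) + SUPER_INFO_SIZE <= device_bytes
    ]


# dev_item.total_bytes of the smaller device, from dump-super
print(expected_supers(4000785104896))
```

For a 4 TB device all three copies fit, which gives the six supers (three per device) listed in the output above.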


> 
> It only needs to be run on one of the devices; but both need to be
> present and unmounted. This is a read only command to verify all six
> supers are valid.
> 
> Last, I wonder if there's some weird bug. Any chance you can update
> the Pi to kernel 4.19.97-1-ARCH and see if this same kernel GPT error
> messages happen? You won't need to mount the volume (and I don't
> recommend trying yet anyway), just connect the devices individually
> and report back what the kernel says about each drive as you connect
> them. I do find it a bit hard to believe a GPT related bug could exist
> in 4.19.75...but worth a shot. For what it's worth, I did an upgrade
> to 4.19.97 on my Pi and it's fine.

Raspbian on my Pi 4 now runs 4.19.97:
# uname -a
Linux omv 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux

The GPT error messages are gone...


> 
> Anyway, I wanna be extremely deliberate. It's a bit tedious. But
> having monitored linux-raid@, LVM, linux-xfs@, linux-btrfs@ upstream
> lists, it's extremely common to get user induced data loss by getting
> panicky and doing things too fast. And also I'm intentionally verbose
> so as to invite anyone reading to call b.s. - it's really easy to make
> simple stupid mistakes.
> 

Yeah, losing the data would be a bit sad, since I don't have 4 TB to spare yet.


Since the outputs of dump-* are pretty large and did not make it to the list, here is a link for downloading them:

https://nextcloud.sfelis.de/s/SJEGp2gCGKenDsj

The diffs showed that the dump-super outputs of the two disks were pretty much the same, but for dump-tree there are a few differences.

