On Sun, Dec 12, 2010 at 9:29 AM, Dieter Plaetinck <dieter@plaetinck.be> wrote:
Anthony and other interested folks, I've been looking a bit further, and it seems like btrfs support shouldn't be too hard to implement. It actually seems simpler than LVM (because LVM has 3 levels: PV, VG and LV; btrfs has just the btrfs itself (~default subvolume) and other subvolumes). Subvolumes don't get a new device file, but I'll probably use something like /dev/sda:$spec to denote what's what. If $spec is a number, it will become the mount option subvolid=$spec; otherwise subvol=$spec. From what I can tell, though, the ids aren't used often, and it seems more robust to me to use names anyway.
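The /dev/sda:$spec idea above can be sketched roughly as follows. This is only an illustration in Python, not AIF code: the function name `subvol_mount_option` is hypothetical, and it assumes the device path itself contains no colon.

```python
from typing import Optional, Tuple


def subvol_mount_option(device_spec: str) -> Tuple[str, Optional[str]]:
    """Split 'DEVICE:SPEC' and map SPEC to a btrfs mount option.

    Assumption: the device path contains no ':' of its own, so the
    first colon separates the device from the subvolume spec.
    """
    device, sep, spec = device_spec.partition(":")
    if not sep or not spec:
        # No spec given: mount whatever the filesystem's default is.
        return device, None
    if spec.isascii() and spec.isdigit():
        # Purely numeric spec is treated as a subvolume id.
        return device, f"subvolid={spec}"
    # Anything else is treated as a subvolume name.
    return device, f"subvol={spec}"
```

For example, `subvol_mount_option("/dev/sda:home")` yields the device `/dev/sda` and the option `subvol=home`. One side effect of this rule is that a subvolume whose name is all digits could only be addressed by id.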
yeah, that would work fine; should be simpler than LVM. the problem with mounting by name is that it only works when "name" sits directly in the btrfs root (the real root, subvolid=0). i.e. /subvol works, but /nested/subvol does not. the hook i'm soon to release doesn't support names; it's just too inflexible. btw, for clarity to anyone else, the default subvol is not the same as the btrfs root (though initially they are the same). the default subvol is whichever subvol is marked as the _mount_ default (and later mountable via `subvol=.` or with no subvol option at all)... the real root will always be subvolid = 0 or 5.
* what requirements does your btrfs_advanced mkinitcpio hook imply? what does AIF need to do, other than just running mkfs.btrfs, to get the full potential out of btrfs/your hook? please explain why a default btrfs configuration does not suffice. does it have something to do with https://btrfs.wiki.kernel.org/index.php/UseCases#Can_a_snapshot_be_replaced_... ?
it's sort of related to that i think. the reeeeeaaaaalllly messy part is what to do when a user has installed the system into the btrfs root instead of a dedicated subvol. the issue is that the btrfs root is not movable/editable/replaceable; all other subvols can be moved/renamed/deleted/etc... except the root. thus, there is no clean way to programmatically "move" the system (in preparation for rollback/managing snapshots/etc.). everything in / must be rm -rf'ed manually or it will ultimately become dead space. i've brought this up probably 5 different times on the list but never got any response :-( the hook (and other impls i'd assume) uses the btrfs root for volume management, the "sub-root". the actual "system root" is just one of many subvols in the pool, and may change between reboots. at the very least, if AIF created a subvol, marked it as default, and installed into that subvol, my hook could then safely "rotate" the user into a more advanced configuration... i just need the system in a subvol. the only difference the user sees from this procedure (dedicated subvol by default) is a "mysterious" directory when they run "btrfs subvolume list" that doesn't seem to exist :-) because it's actually underneath their /. but really, under no circumstances do i think the system should be installed into the btrfs root; i wouldn't even offer it at install time. if the user wants that they can do it themselves... they will be happy it's in a subvol.
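The "create a subvol, mark it as default, install into it" step could look roughly like the sketch below. It only builds command lines and parses text; nothing is executed. The helper names, the mount point `/mnt/pool` and the subvol name `system` are made up for illustration, and the parser assumes `btrfs subvolume list` prints lines of the form `ID <n> ... path <name>`, which should be checked against the btrfs-progs version in use.

```python
from typing import List, Optional


def create_command(pool_mount: str, name: str) -> List[str]:
    """argv for creating subvol `name` under the mounted btrfs root."""
    return ["btrfs", "subvolume", "create", f"{pool_mount}/{name}"]


def parse_subvolume_id(list_output: str, name: str) -> Optional[int]:
    """Find the id of `name` in `btrfs subvolume list` output.

    Assumed line format: 'ID <n> ... path <name>'.
    """
    for line in list_output.splitlines():
        head, sep, path = line.partition(" path ")
        fields = head.split()
        if sep and len(fields) >= 2 and fields[0] == "ID":
            if path.strip() == name and fields[1].isdigit():
                return int(fields[1])
    return None


def set_default_command(pool_mount: str, subvol_id: int) -> List[str]:
    """argv for marking a subvol as the mount default."""
    return ["btrfs", "subvolume", "set-default", str(subvol_id), pool_mount]


# Hypothetical sample of what the list command might print.
sample = "ID 256 top level 5 path system\nID 257 top level 5 path home\n"
subvol_id = parse_subvolume_id(sample, "system")
commands = [create_command("/mnt/pool", "system")]
if subvol_id is not None:
    commands.append(set_default_command("/mnt/pool", subvol_id))
```

After these two commands, a plain mount of the device (no subvol option) would land in the new subvol, and the installer would unpack the system there instead of into the btrfs root.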
* I've read a bit more about btrfs and I think an implementation like this will suffice for most users:
- allow creation of a btrfs on top of 1-n block devices (user can pick raid levels for data and metadata)
- allow creation of 0-m subvolumes
- each subvolume, as well as the default, can get an arbitrary mountpoint, as well as specific mount options like compress, ssd, etc.
if I understood correctly, that is.
yup, i think that's everything for now! ssd should enable automatically when btrfs detects non-rotating media. and ssd_spread is for cheaper flash i believe... i forget what the reason was. for compress, we should be sure to note the CPU overhead of zlib (though LZO patches will be in the next kernel i believe, exciting), though for many systems it may not matter.
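As a rough picture of what such an installer model might hold, here is a small Python sketch. The class and field names are invented for illustration and are not AIF's data model; the only external facts relied on are that mkfs.btrfs takes `-d` and `-m` for the data and metadata profiles, and that `subvol=` is a btrfs mount option.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Subvolume:
    name: Optional[str]          # None means "the default subvolume"
    mountpoint: str
    options: List[str] = field(default_factory=list)  # e.g. ["compress"]

    def mount_options(self) -> str:
        """Comma-separated option string for mount -o / fstab."""
        opts = list(self.options)
        if self.name is not None:
            opts.insert(0, f"subvol={self.name}")
        return ",".join(opts) or "defaults"


@dataclass
class BtrfsLayout:
    devices: List[str]                       # 1-n block devices
    data_profile: Optional[str] = None       # e.g. "raid1"
    metadata_profile: Optional[str] = None   # e.g. "raid1"
    subvolumes: List[Subvolume] = field(default_factory=list)  # 0-m

    def mkfs_command(self) -> List[str]:
        """argv for creating the filesystem across all devices."""
        if not self.devices:
            raise ValueError("at least one block device is required")
        cmd = ["mkfs.btrfs"]
        if self.data_profile:
            cmd += ["-d", self.data_profile]
        if self.metadata_profile:
            cmd += ["-m", self.metadata_profile]
        return cmd + list(self.devices)


layout = BtrfsLayout(
    devices=["/dev/sda", "/dev/sdb"],
    data_profile="raid1",
    metadata_profile="raid1",
    subvolumes=[
        Subvolume(None, "/"),
        Subvolume("home", "/home", ["compress"]),
    ],
)
```

Note that mounting by `subvol=<name>` carries the limitation mentioned earlier in the thread (the name has to sit directly in the btrfs root), so a real implementation might prefer ids.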
However, to be fully compatible with your hook, I will probably "strongly recommend" creating a subvolume __active and mounting that as /. Right? Anything I missed?
in the newer setup __active isn't used anymore; i don't intend to develop on that configuration anymore, and will phase it out for everyone in favor of this upcoming release. the new structure looks like this:

---------------------------------------------------------------------------------
/var/lib/btrfsadm
|-- boot
|   |-- extlinux.conf
|   `-- vesamenu.c32
|-- HEAD -> refs/rw/PRI
|-- pool
|   |-- FREE -> /dev/disk/by-label/btrfs-pool-free
|   `-- SELF -> /dev/disk/by-label/btrfs-pool-self
|-- refs
|   |-- ro
|   |   |-- log
|   |   |   |-- 1291021356 -> ../../../vols/260
|   |   |   |-- 1291056164 -> ../../../vols/261
|   |   |   `-- 1291102035 -> ../../../vols/262
|   |   `-- usr
|   |       `-- ORIG -> ../../../vols/260
|   `-- rw
|       |-- PRI -> ../../vols/262
|       |-- SEC -> ../../vols/261
|       `-- usr
`-- vols
    |-- 260
    |   |-- boot
    |   |   |-- kernel26-fallback.img
    |   |   |-- kernel26.img
    |   |   |-- System.map26
    |   |   `-- vmlinuz26
    |   `-- fs (THIS IS A SUBVOL)
    |-- 261
    |   |-- boot
    |   |   |-- kernel26-fallback.img
    |   |   |-- kernel26.img
    |   |   |-- System.map26
    |   |   `-- vmlinuz26
    |   `-- fs (THIS IS A SUBVOL)
    `-- 262
        |-- boot
        |   |-- kernel26-fallback.img
        |   |-- kernel26.img
        |   |-- kxloader.img
        |   |-- System.map26
        |   `-- vmlinuz26
        `-- fs (THIS IS THE ACTIVE SYSTEM ROOT)
---------------------------------------------------------------------------------

so... while much more involved, it's still very simple and 1000x more flexible. heavily inspired by the .git directory setup. a quick breakdown:

/boot
    this is the real boot device; can be a separate partition/disk, multiple disks, or on the same btrfs FS (currently extlinux only). also used for a 2-stage boot -- a kernel based "bootramfs" bootloader is used to mount, find, and kexec the real kernel within a snapshot, since standard bootloaders can't see inside subvols yet.

/HEAD
    a symlink to a symlink. HEAD points to the active ref (or directly to a subvol, the git equivalent of a "detached head"), which points to a particular subvol. at any given time, when the system is running, HEAD will _always_ point to the current subvol in use.

/pool
    symlinks to ourself (SELF -- the active btrfs pool), and any others (FREE will be used in the future, if available, to "steal" devices; this will enable hot spares and automatic array repair).

/refs
    a hierarchy of symlinks into the /vols directory. for every subvol the user has, a symlink will exist in here. there will also be some system managed ones (such as "log"... which is autosnap on reboot, if enabled). ORIG=snapshot after install, PRI=primary system root, SEC=the previous system root. the user can manage these with the upcoming btrfsadm tool.

/vols
    all the actual subvols, named by id.

the above `tree` shows a "detached boot" state... where boot is outside the fs. this setup enables extlinux (and others potentially) to perform kernel level rollbacks without the use of a 2-stage boot process, but requires /boot (from within the system) to be a symlink:

    # mount
    ...
    /dev/sda on /var/lib/btrfsadm type btrfs (rw,noatime,subvolid=0)
    ...
    # ls -l /boot
    lrwxrwxrwx 1 root root 26 Nov 29 03:11 /boot -> var/lib/btrfsadm/HEAD/boot

this way, mkinitcpio and friends work, and copy the kernel to the proper detached boot by dereferencing HEAD. also, since extlinux can follow symlinks, simply pointing to HEAD or other refs in extlinux.conf works (the path must be under 255 chars). ultimately this is a workaround for bootloaders unable to handle btrfs or btrfs subvols, but it works very well, and is easy to move to an "inclusive boot" later on when bootloader support is better.

---------------------------------------------------------------------------------

i know that's a lot of information, and probably more than needed, but i've been meaning to write it down anyway :-) let me know how you think that could jibe with AIF.

C Anthony
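To illustrate the HEAD -> ref -> subvol chain from the breakdown above, here is a small Python sketch that follows the symlinks one hop at a time. It is not part of the hook or of btrfsadm; the function name `resolve_head` and the 40-hop limit are assumptions made for the example.

```python
import os
from typing import List


def resolve_head(admin_root: str) -> List[str]:
    """Follow HEAD hop by hop and return each path it passes through.

    With the layout shown in the tree, HEAD -> refs/rw/PRI -> vols/262,
    so the result would be [<root>/refs/rw/PRI, <root>/vols/262].
    A "detached head" would give a single hop straight into vols/.
    """
    current = os.path.join(admin_root, "HEAD")
    hops: List[str] = []
    while os.path.islink(current):
        target = os.readlink(current)
        # Relative targets are resolved against the link's own directory.
        current = os.path.normpath(
            os.path.join(os.path.dirname(current), target)
        )
        hops.append(current)
        if len(hops) > 40:  # arbitrary guard against symlink loops
            raise OSError("too many levels of symbolic links")
    return hops
```

The last element is the directory holding the active `boot` and `fs`, which is the same path that /boot -> var/lib/btrfsadm/HEAD/boot ends up dereferencing. The sketch uses `os.path.normpath`, so it assumes the intermediate directories (refs, rw, vols) are real directories rather than symlinks themselves.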