[arch-projects] [devtools] [PATCH] makechrootpkg: Be recursive when deleting btrfs subvolumes.

Fri Feb 10 20:30:01 UTC 2017

On Fri, 10 Feb 2017 09:57:32 -0500,
Dave Reisner wrote:
> > +	LC_ALL=C btrfs subvolume show "$1" | sed -n 's/^\tSubvolume ID:\s*//p'
> 
> This looks like you're parsing human readable output to get the subvol
> ID. Is this really the only way to get this information?

As far as I can tell, yes.

> > +)
> > +
> > +# Usage: btrfs_subvolume_list_all $FILEPATH
> > +#
> > +# Given $FILEPATH somewhere on a mounted btrfs filesystem, print the
> > +# ID and full path of every subvolume on the filesystem, one per line
> > +# in the format "$ID $PATH".
> > +btrfs_subvolume_list_all() (
> > +	set -o pipefail
> > +	local mountpoint
> > +	mountpoint="$(df --output=target "$1" | sed 1d)" || return $?
> 
> "return $?" is a long winded way of just "return". The status is
> implied.

Huh.  TIL.  Thanks.

> > +	LC_ALL=C btrfs subvolume list -a "$mountpoint" | sed -r 's|^ID ([0-9]+) .* path (<FS_TREE>/)?(\S*).*|\1 \3|'
> 
> You might want to validate that you actually get an integer back here,
> rather than just assuming the calls succeed. You cannot rely on sed to
> provide a useful exit status.

I suppose that would be a good idea.  The `set -o pipefail` a couple
of lines up will catch the cases where the failure is from `btrfs`;
the check would be for if btrfs's output format changes.

I really should have called it out in a comment and the commit message
that that regex is the place where I most expect btrfs to break us in
the future.  The output format is a space-separated sequence of "key
value key value..."; the problem is that both keys and values can
contain space, and there's no escaping or indication of when this
happens.  The reason I tacked `^` in the expression is that there's
already a "parent ID" key.  The continued success of the regex assumes
that there will never be another key or value that contains " path",
and that there is no space in the path value.

So yeah, I probably should add a check.

> > +	id="$(btrfs_subvolume_id "$subvolume")" || return $?
> > +	all="$(btrfs_subvolume_list_all "$subvolume")" || return $?
> > +	path="$(sed -n "s/^$id //p" <<<"$all")"
> 
> Rather than injecting unvalidated data into a sed program, I'd suggest
> using awk:
> 
> path=$(awk -v id="$1" '$1 == id { sub($1 FS, ""); print }' <<<"$all")

Sure.  Sed was simpler, and I figured that if `btrfs` starts
misbehaving, the whole thing is hosed anyway.

> > +	subvolumes=($(btrfs_subvolume_list "$dir")) || return $?
> 
> This is a broken way to read lines into an array, because you're not
> reading lines at all -- you're reading words.
> 
> > +	for subvolume in "${subvolumes[@]}"; do
> > +		btrfs subvolume delete "$dir/$subvolume" || return $?
> > +	done

Yeah, but the lines are guaranteed to be words because of the `(\S*)`
in the regexp in `btrfs_subvolume_list_all`.

But I should have written it as

	subvolumes="$(btrfs_subvolume_list "$dir")" || return
	for read -r subvolume; do
		btrfs subvolume delete "$dir/$subvolume" || return
	done <<<"$subvolumes"

-- 
Happy hacking,
~ Luke Shumaker