[pacman-dev] [PATCHv2 1/2] contrib: adding pacsize

Dave Reisner d at falconindy.com
Wed Mar 5 18:05:45 EST 2014


On Wed, Mar 05, 2014 at 09:55:45PM +0100, Pierre Neidhardt wrote:
> Printing package size is useful for maintenance. Indeed, the first entry on the
> wiki is focused on this topic:
> 
>   https://wiki.archlinux.org/index.php/Pacman_Tips#Maintenance
> 
> None of the proposed solutions will allow you to:
> - select packages;
> - work on the output of other commands yielding a list of packages;
> - change the sorting;
> - be locale independent;
> - print a grand total;
> - be fast (most solutions waste a lot of time -- only expac is faster);
> - not rely on third-party tools.
> 
> Pacsize is a POSIX shell script that is generic enough to enclose all these
> features (and more).
> 
> Adding a 'pacsize' script eliminates the unneeded abundance of workarounds for
> this simple matter.
> 
> Signed-off-by: Pierre Neidhardt <ambrevar at gmail.com>
> ---
>  contrib/.gitignore    |   1 +
>  contrib/Makefile.am   |   3 +
>  contrib/README        |   4 ++
>  contrib/pacsize.sh.in | 168 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 176 insertions(+)
>  create mode 100644 contrib/pacsize.sh.in
> 
> diff --git a/contrib/.gitignore b/contrib/.gitignore
> index a181813..9cecd5e 100644
> --- a/contrib/.gitignore
> +++ b/contrib/.gitignore
> @@ -7,6 +7,7 @@ paclist
>  paclog-pkglist
>  pacscripts
>  pacsearch
> +pacsize
>  pacsysclean
>  rankmirrors
>  updpkgsums
> diff --git a/contrib/Makefile.am b/contrib/Makefile.am
> index f6ca3f1..8c5c6da 100644
> --- a/contrib/Makefile.am
> +++ b/contrib/Makefile.am
> @@ -12,6 +12,7 @@ BASHSCRIPTS = \
>  	paclist \
>  	paclog-pkglist \
>  	pacscripts \
> +	pacsize \
>  	pacsysclean \
>  	rankmirrors \
>  	updpkgsums
> @@ -38,6 +39,7 @@ EXTRA_DIST = \
>  	paclist.sh.in \
>  	pacscripts.sh.in \
>  	pacsearch.in \
> +	pacsize.sh.in \
>  	pacsysclean.sh.in \
>  	rankmirrors.sh.in \
>  	updpkgsums.sh.in \
> @@ -102,6 +104,7 @@ paclist: $(srcdir)/paclist.sh.in
>  paclog-pkglist: $(srcdir)/paclog-pkglist.sh.in
>  pacscripts: $(srcdir)/pacscripts.sh.in
>  pacsearch: $(srcdir)/pacsearch.in
> +pacsize: $(srcdir)/pacsize.sh.in
>  pacsysclean: $(srcdir)/pacsysclean.sh.in
>  rankmirrors: $(srcdir)/rankmirrors.sh.in
>  updpkgsums: $(srcdir)/updpkgsums.sh.in
> diff --git a/contrib/README b/contrib/README
> index ae33bb2..4f5c17f 100644
> --- a/contrib/README
> +++ b/contrib/README
> @@ -31,6 +31,10 @@ pacsearch - a colorized search combining both -Ss and -Qs output. Installed
>  packages are easily identified with a *** and local-only packages are also
>  listed.
>  
> +pacsize - display the size of packages. Duplicates are removed if any. The local
> +database is queried first; if the package is not found, the sync database is
> +then used for lookup.
> +
>  pacsysclean - lists installed packages sorted by size.
>  
>  rankmirrors - ranks pacman mirrors by their connection and opening speed.
> diff --git a/contrib/pacsize.sh.in b/contrib/pacsize.sh.in
> new file mode 100644
> index 0000000..9be800d
> --- /dev/null
> +++ b/contrib/pacsize.sh.in
> @@ -0,0 +1,168 @@
> +#!/bin/sh
> +# pacsize -- display package sizes
> +#
> +# Copyright (C) 2014 Pierre Neidhardt <ambrevar at gmail.com>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License
> +# as published by the Free Software Foundation; either version 2
> +# of the License, or (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +readonly myname='pacsize'
> +readonly myver='@PACKAGE_VERSION@'
> +
> +calc_total () {
> +	awk '{
> +	total += $1
> +	print
> +}
> +END {
> +	printf ("%7s KIB TOTAL\n", total)
> +}'
> +}
> +
> +error () {
> +	echo "$@" >&2
> +}
> +
> +## Print size and name. We strip the arguably useless decimals. This makes
> +## output lighter.
> +filter () {
> +	awk -F ": " \
> +		'$0 ~ "^Name" {
> +	pkg = $2
> +}
> +$0 ~ "^Installed Size" {
> +	gsub (/[\.,][^ ]*/, "")
> +	split($2, a, " ")
> +	printf ("%4d%s %s\n", a[1], a[2], pkg)
> +}'
> +}
> +
> +remove_duplicates () {
> +	awk '! table[$0]++'
> +}
> +
> +usage () {
> +    cat <<EOF
> +Usage: ${1##*/} [OPTIONS] PACKAGES
> +       ${1##*/} -a [OPTIONS]
> +
> +Display the size of PACKAGES. Duplicates are removed if any. The local database
> +is queried first; if the package is not found, the sync database is then used
> +for lookup.

Duplicates seem rather unexpected given the explanation that follows.
You're querying either the local DB *or* the sync DB as a fallback. If
there are duplicates, it's an implementation bug, no?

> +
> +Options:
> +
> +  -a: Process all installed packages.
> +  -h: Show this help.
> +  -n: Sort output by name.
> +  -s: Sort output by size.
> +  -t: Print total.
> +
> +Examples:
> +
> +  $ ${1##*/} -ast
> +    Convenient way to keep track of big packages.
> +
> +  $ ${1##*/} \$(pactree -ld1 linux)
> +    Print the size of linux and all its direct dependencies.
> +
> +  $ ${1##*/} -st \$(pacman -Qdtq)
> +    Print a grand total of orphan packages, and sort by size.
> +EOF
> +}
> +
> +version () {
> +	echo "$myname $myver"
> +	echo 'Copyright (C) 2014 Pierre Neidhardt <ambrevar at gmail.com>'
> +}
> +
> +opt_sort=false
> +opt_all=false
> +opt_total=false
> +
> +while getopts ":ahnstv" opt; do
> +	case $opt in
> +		a)
> +			opt_all=true ;;
> +		h)
> +			usage "$0"
> +			exit ;;
> +		n)
> +			opt_sort="sort -uk3" ;;
> +		s)
> +			opt_sort="sort -uh" ;;
> +		t)
> +			opt_total="calc_total" ;;
> +		v)
> +			version "$0"
> +			exit ;;

We seem to use -V more than -v to mean version.

> +		?)
> +			usage "$0"
> +			exit 1 ;;
> +	esac
> +done
> +
> +shift $(($OPTIND - 1))
> +
> +## All-packages mode.
> +## We use a dedicated algorithm which is much faster than per-package mode.
> +## Unfortunately there is no easy way to select packages with this method.
> +if $opt_all; then
> +	DBPath="$(awk -F = '/^ *DBPath/{print $2}' @sysconfdir@/pacman.conf 2>/dev/null)"

What about leading tabs? What about trailing space and tabs? What about
whitespace between the '=' and the actual value? I'm fairly sure that
the -d test which follows this fails in pretty much all cases.
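
A rough sketch of a more tolerant parse, assuming the usual "key = value"
layout of pacman.conf. The name parse_dbpath is only illustrative, and this
still ignores Include directives and section scoping:

```shell
# Hypothetical helper: print the DBPath value from a pacman.conf-style file.
# Accepts leading blanks or tabs, blanks around '=', and trailing blanks.
# Commented-out lines ("#DBPath = ...") do not match.
parse_dbpath () {
	awk '
	/^[ \t]*DBPath[ \t]*=/ {
		val = $0
		sub(/^[^=]*=[ \t]*/, "", val)  # drop the key, "=" and leading blanks
		sub(/[ \t]+$/, "", val)        # drop trailing blanks
		print val
		exit                           # first match wins
	}' "$1"
}
```

Called as DBPath=$(parse_dbpath @sysconfdir@/pacman.conf), an empty result
would mean the option is unset and the default should apply.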

> +	[ ! -d "$DBPath" ] && DBPath="@localstatedir@/lib/pacman"
> +
> +	if [ ! -d "$DBPath/local/" ]; then
> +		error "Could not find local database in $DBPath/local/."

If pacman.conf contains a DBPath which doesn't exist, the error message
here will be rather odd, as it'll show the compile time default and not
the path from pacman.conf.
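
One possible way to keep the message honest is to treat "configured but
missing" as its own error instead of silently falling back. A minimal sketch,
where resolve_dbpath and its two arguments are assumptions made for
illustration:

```shell
# Hypothetical helper: choose the DB path to use.
# $1 = value parsed from pacman.conf (may be empty)
# $2 = compile-time default
resolve_dbpath () {
	if [ -n "$1" ] && [ ! -d "$1" ]; then
		# Report the path the user actually configured.
		echo "DBPath '$1' from pacman.conf is not a directory." >&2
		return 1
	fi
	# Fall back to the default only when nothing was configured.
	echo "${1:-$2}"
}
```

Whether a bad DBPath should be fatal or merely warned about is a design
choice; the point is only that the reported path matches the one that failed.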

> +		exit 1
> +	fi
> +
> +	awk 'BEGIN {
> +	split("B KiB MiB GiB TiB PiB EiB ZiB YiB", unit)
> +}
> +/^%NAME%/ {

The whole line is %NAME%, so there's no need for a regex here; a plain
string comparison will do.

> +	getline
> +	pkg=$0

This can be done in one step: getline pkg

> +}
> +/^%SIZE%/ {
> +	getline
> +	size = $0

Likewise: getline size
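
Combined with the %NAME% remark, the two rules could look roughly like this.
It is a sketch only: print_sizes is an illustrative name, and it prints raw
byte counts rather than scaled units.

```shell
# Hypothetical helper: print "<size in bytes> <name>" for each desc file given.
print_sizes () {
	awk '
	# Exact string match on the marker line, no regex needed.
	$0 == "%NAME%" { getline pkg }
	# "getline var" reads the next line straight into var and leaves $0 alone.
	$0 == "%SIZE%" { getline size; print size, pkg }
	' "$@"
}
```

It would be invoked the same way as the awk call in the patch, for example
print_sizes "$DBPath"/local/*/desc.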

> +	i = 1
> +	while (size > 2048) {
> +		size /= 1024
> +		i++
> +	}
> +	printf ("%4d%s %s\n", size, unit[i], pkg)
> +}' "$DBPath"/local/*/desc | ($opt_sort || cat) | ($opt_total || cat)

These subshells aren't wanted. You should be using command grouping
instead.
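
For illustration, the tail of the pipeline with brace grouping might look
like the following. The variables keep the patch's convention of holding
either "false" or a command; postprocess is a made-up wrapper name.

```shell
opt_sort=false    # or a sort command line, e.g. "sort -n"
opt_total=false   # or the name of a function such as calc_total

# Hypothetical wrapper around the two optional pipeline stages.
postprocess () {
	# { ...; } groups commands without explicitly requesting a subshell.
	# If the variable is "false", the fallback cat passes input through.
	{ $opt_sort || cat; } | { $opt_total || cat; }
}
```

Note the mandatory space after the opening brace and the semicolon before the
closing one.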

> +	exit
> +fi
> +
> +## Per-package mode.
> +if [ $# -eq 0 ]; then
> +	error "Missing argument."
> +	usage "$0"
> +	exit 1
> +fi
> +
> +if ! command -v pacman >/dev/null 2>&1; then

I find it very strange that you check for pacman (the project that
might distribute this script) but never check for awk or sort.
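
If a check stays at all, it would be more consistent to cover every external
tool the script runs. A minimal sketch, with require_cmds as a hypothetical
helper name (dropping the check entirely is the other obvious option):

```shell
# Hypothetical helper: report every missing command on stderr and
# return non-zero if at least one is missing.
require_cmds () {
	missing=0
	for cmd in "$@"; do
		if ! command -v "$cmd" >/dev/null 2>&1; then
			echo "'$cmd' not found." >&2
			missing=1
		fi
	done
	return "$missing"
}
```

Usage would be something like: require_cmds pacman awk sort cut || exit 1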

> +	error "'pacman' not found."
> +	exit 1
> +fi
> +
> +{
> +	## If package is not found locally (-Q), we use the sync database (-S). We
> +	## use LC_ALL=C to make sure pacman output is not localized.
> +	buffer=$(LC_ALL=C pacman -Qi "$@" 2>&1 1>&3 3>&- | cut -f2 -d "'")
> +	[ -n "$buffer" ] && LC_ALL=C pacman -Si $buffer

Not only are you parsing the output of pacman and the internal format of
the ALPM db, you're also parsing *error* output from pacman? So much
groan...
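
One way to avoid scraping stderr is to rely on exit status alone, since
pacman -Q and pacman -Si both return non-zero for an unknown package. A rough
sketch, where pkg_source is an invented name:

```shell
# Hypothetical helper: print "local" or "sync" depending on which database
# knows package $1; return 1 if neither does. No pacman output is parsed.
pkg_source () {
	if pacman -Qq "$1" >/dev/null 2>&1; then
		echo local
	elif pacman -Si "$1" >/dev/null 2>&1; then
		echo sync
	else
		return 1
	fi
}
```

This costs at least one pacman invocation per package, so it trades speed for
robustness, and it does nothing about parsing the -Qi/-Si output itself.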

> +} 3>&1 | filter | ($opt_sort || remove_duplicates) | ($opt_total || cat)

More unnecessary subshells.

> +
> +# vim: set noet:
> -- 
> 1.9.0
> 
> 

