[pacman-dev] [PATCH] pacsysclean: Add new contrib script

Eric Bélanger snowmaniscool at gmail.com
Mon Aug 8 21:19:36 EDT 2011


On Mon, Aug 8, 2011 at 6:30 PM, Dan McGee <dpmcgee at gmail.com> wrote:
> On Sun, Aug 7, 2011 at 4:14 PM, Eric Bélanger <snowmaniscool at gmail.com> wrote:
>> pacsysclean sort installed packages by decreasing installed size. It's
>> useful for finding large unused package when doing system clean-up. This
>> script is an improved version of other similar scripts posted on the
>> forums. Thanks goes to Dave as I reused the size_to_human function from his
>> paccache script.
>>
>> Signed-off-by: Eric Bélanger <snowmaniscool at gmail.com>
>>
>> ---
>>
>> If you can think of a better name, feel free to suggest one.
>> ---
>>  contrib/.gitignore     |    1 +
>>  contrib/Makefile.am    |    5 ++-
>>  contrib/pacsysclean.in |   87 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 92 insertions(+), 1 deletions(-)
>>  create mode 100755 contrib/pacsysclean.in
>>
>> diff --git a/contrib/.gitignore b/contrib/.gitignore
>> index 1bd145f..19b81e0 100644
>> --- a/contrib/.gitignore
>> +++ b/contrib/.gitignore
>> @@ -6,5 +6,6 @@ paclist
>>  paclog-pkglist
>>  pacscripts
>>  pacsearch
>> +pacsysclean
>>  wget-xdelta.sh
>>  zsh_completion
>> diff --git a/contrib/Makefile.am b/contrib/Makefile.am
>> index 10b03a2..754096d 100644
>> --- a/contrib/Makefile.am
>> +++ b/contrib/Makefile.am
>> @@ -5,7 +5,8 @@ OURSCRIPTS = \
>>        paclist \
>>        paclog-pkglist \
>>        pacscripts \
>> -       pacsearch
>> +       pacsearch \
>> +       pacsysclean
>>
>>  OURFILES = \
>>        bash_completion \
>> @@ -21,6 +22,7 @@ EXTRA_DIST = \
>>        paclist.in \
>>        pacscripts.in \
>>        pacsearch.in \
>> +       pacsysclean.in \
>>        vimprojects \
>>        zsh_completion.in \
>>        README
>> @@ -59,6 +61,7 @@ paclist: $(srcdir)/paclist.in
>>  paclog-pkglist: $(srcdir)/paclog-pkglist.in
>>  pacscripts: $(srcdir)/pacscripts.in
>>  pacsearch: $(srcdir)/pacsearch.in
>> +pacsysclean: $(srcdir)/pacsysclean.in
>>  pactree: $(srcdir)/pactree.in
>>  zsh_completion: $(srcdir)/zsh_completion.in
>>
>> diff --git a/contrib/pacsysclean.in b/contrib/pacsysclean.in
>> new file mode 100755
>> index 0000000..e393e24
>> --- /dev/null
>> +++ b/contrib/pacsysclean.in
>> @@ -0,0 +1,87 @@
>> +#!/bin/bash
>> +
>> +# pacsysclean - Sort installed packages by decreasing installed size. Useful for system clean-up.
>> +#
>> +# Copyright (C) 2011 Eric Bélanger <eric at archlinux.org>
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License
>> +# as published by the Free Software Foundation; either version 2
>> +# of the License, or (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
> cut from here
>> +export TEXTDOMAIN='pacman'
>> +export TEXTDOMAINDIR='/usr/share/locale'
>> +
>> +# determine whether we have gettext; make it a no-op if we do not
>> +if ! type gettext &>/dev/null; then
>> +       gettext() {
>> +               echo "$@"
>> +       }
>> +fi
> to here. You aren't using gettext() and we don't support it in contrib anyway.

OK. I saw other scripts with this at the top, so I thought it
was standard stuff.

>
>> +
>> +usage() {
>> +       echo "$0 - Sort installed packages by decreasing installed size."
>> +       echo
>> +       echo "Usage: $0 [options]"
>> +       echo
>> +       echo "Options:"
>> +       echo "  -a               List all packages (Default)"
>> +       echo "  -e               List unrequired explicitely installed packages"
> spelling, explicitly. Slightly related is using "not required" in the
> description as unrequired is not really a word (but it makes sense as
> a one-word flag, just not as a definition).
>
> Wouldn't it make more sense to allow any options pacman -Q allows on
> filtering, rather than just trying to emulate 1? I can see people
> wanting to do -Qdt, Qet, -Qm, etc.

Probably. I could change the -e option so that it accepts an argument
instead and use that as the pacman query option.
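
A rough sketch of what that could look like, for discussion only. The
option letter -o and the function name parse_query_opts are
placeholders that are not in the patch; the idea is simply that the
argument gets appended to the base -Qq query.

```shell
#!/bin/bash

# Hypothetical: "-o <filters>" passes extra pacman -Q filter letters through,
# e.g. "-o et" -> "-Qqet", "-o dt" -> "-Qqdt", "-o m" -> "-Qqm".
# With no option given, every package is listed (-Qq), as in the patch.
parse_query_opts() {
    local opts="-Qq" OPTIND=1 flag
    while getopts ':o:' flag; do
        case $flag in
            o) opts="-Qq${OPTARG}" ;;
            *) return 1 ;;   # unknown option or missing argument
        esac
    done
    printf '%s\n' "$opts"
}

# Possible use in the script:
#   PACMAN_OPTS=$(parse_query_opts "$@") || { usage; exit 1; }
```

This does not check that the letters are valid -Q filters; pacman
itself would reject a bad combination.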

>
>> +       echo "  -h, --help       Show this help message and exit"
>> +}
>> +
>> +size_to_human() {
>> +       awk -v size="$1" '
>> +       BEGIN {
>> +               suffix[1] = "KiB"
>> +               suffix[2] = "MiB"
>> +               suffix[3] = "GiB"
>> +               suffix[4] = "TiB"
>> +               count = 1
>> +
>> +               while (size > 1024) {
>> +                       size /= 1024
>> +                       count++
>> +               }
>> +
>> +               sizestr = sprintf("%.2f", size)
>> +               sub(/.?0+$/, "", sizestr)
>> +               printf("%s %s", sizestr, suffix[count])
>> +       }'
> Isn't this fairly expensive to invoke awk each time you call it? This
> seems bash-math-able. It also already differs from Dave's
> implementation as he added the low 'B' suffix, and neither of these
> have the 'PiB' suffix that our formatter in pacman has.

It isn't expensive. Here, with 1100 packages installed, it takes 8
seconds to execute the last while loop. I could also make the human
readable format optional if it's an issue.

After stripping the trailing '.00' from the installed size reported by
pacman, it's bash-math-able, but you lose the decimal places, as
bash can only do integer division. I guess that shouldn't be a big
problem, as estimated sizes are good enough for this script. If we make
the human-readable size optional, we could display the sizes in KB as
reported by pacman by default and have the human-readable size done in
bash. Alternatively, I could add the 'B' and 'PiB' suffixes to the
size_to_human function if we decide to keep it. Let me know which
method would be preferable.
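
To make the bash-only option concrete, here is a minimal sketch. It
assumes the input is the size in KiB as printed by pacman -Qi (for
example "2048.00"), truncates instead of rounding, and adds the PiB
suffix; it is not the version from the patch.

```shell
#!/bin/bash

# Integer-only variant: no awk call, but no decimal places either.
size_to_human() {
    local size=${1%.*}                  # "2048.00" -> "2048"
    local suffixes=(KiB MiB GiB TiB PiB)
    local i=0

    # Divide by 1024 until the value fits the current unit,
    # stopping at the largest known suffix.
    while (( size >= 1024 && i < ${#suffixes[@]} - 1 )); do
        size=$(( size / 1024 ))
        i=$(( i + 1 ))
    done

    printf '%d %s\n' "$size" "${suffixes[i]}"
}
```

With this, "size_to_human 1536.00" gives "1 MiB" rather than
"1.5 MiB", which is the loss of precision mentioned above.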

>
>> +}
>> +
>> +PACMAN_OPTS="-Qq"
>> +if [ -n "$1" ]; then
>> +       case "$1" in
>> +               -a) PACMAN_OPTS="-Qq" ;;
>> +               -e) PACMAN_OPTS="-Qetq" ;;
>> +               -h|--help) usage; exit 0 ;;
>> +               *) usage; exit 1 ;;
>> +       esac
>> +fi
>> +
>> +TEMPDIR=$(mktemp -d /tmp/cleanup-script.XXXX)
>> +cd $TEMPDIR
>> +
>> +# Sort installed packages by decreasing installed size. Useful for system clean-up.
>> +for package in $(pacman $PACMAN_OPTS); do
>> +       echo $(pacman -Qi $package |grep 'Installed Size' |awk '{print $4}') $package
> I believe $(pacman -Qiet) would work just fine, right? And save you
> several invocations of commands making this a lot more efficient, as
> long as you properly navigate the output.
>

I'm not sure what you mean. If you're talking about replacing:
$(pacman -Qi $package |grep 'Installed Size' |awk '{print $4}')
by
$(pacman -Qiet)
then it won't work. The current expression gives the installed size, so
I end up with two columns: one with the sizes and one with the
corresponding packages. What you suggest would just output a lot of
junk, which would make it more difficult to sort and to keep track of
which size goes with which package.
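
For reference, one possible reading of "properly navigate the output"
is to call pacman once and filter the combined -Qi output in a single
awk pass. The sketch below is only an illustration of that idea: the
function name pair_sizes is made up, and it assumes the English field
labels "Name" and "Installed Size", so pacman would have to be run
with LC_ALL=C.

```shell
#!/bin/bash

# Reads `pacman -Qi`-style output on stdin and prints "<size> <name>"
# for each package, i.e. the same two columns the loop produces.
pair_sizes() {
    awk -F' *: ' '
        $1 == "Name"           { name = $2 }
        $1 == "Installed Size" { split($2, parts, " "); print parts[1], name }
    '
}

# Possible use in the script (one pacman call instead of one per package):
#   LC_ALL=C pacman -Qiet | pair_sizes | sort -g -r
```

Forcing LC_ALL=C here would also sidestep the locale problem, since
the labels no longer depend on the user's language.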


> This also won't work as written if you are in a different locale; I
> highly recommend testing every pacman script by enabling zh_CN.UTF-8
> in /etc/locale.gen, regenerating locales, and then executing via
> 'LANG=zh_CN.UTF-8 ./my_awesome_script.sh'.
>

OK, will do.

>> +done | sort -g -r -o raw.txt
>> +
>> +N=0
>> +while read LINE ; do
>> +       N=$((N+1))
>> +       size_to_human $(echo -n $LINE |cut -d' ' -f1) >> sorted-list.txt
>> +       echo -n ' : ' >> sorted-list.txt
>> +       echo -n $LINE |cut -d' ' -f2 >> sorted-list.txt
>> +done < raw.txt
>> +
>> +echo "Files saved to $TEMPDIR"
>> --
>> 1.7.6
>
>

