[arch-general] Updated scripts for removing duplicates from /var/cache/pacman/pkg

David C. Rankin drankinatty at suddenlinkmail.com
Thu Nov 3 00:41:34 EDT 2011


All,

   I don't know if anybody uses these scripts, but I've updated them to handle 
the stray packages with the nonconforming filenames in the form of:

<name>-num-num-arch.pkg.tar.xz

   I've additionally added a check to optimize the scripts a bit by checking for 
at least the presence of one actual duplicate before executing the removal code. 
Prior to the overhead of the nonconforming file check, it wasn't really needed, 
but after the nonconforming filename code was added, you get a meaningful time 
benefit from the check. The scripts are here:

http://www.3111skyline.com/dl/arch/scripts/fduparch.sh
http://www.3111skyline.com/dl/arch/scripts/fduppkg
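
   To make the filename handling and the pre-check concrete, here is a rough 
sketch. It is not copied from fduppkg. It assumes the usual pacman layout of 
name-version-release-arch.pkg.tar.xz, where only the name may contain extra 
hyphens, and the helper names pkg_name and has_dups are made up for this 
example:

```shell
# Hypothetical helpers, not taken from fduppkg.
# pkg_name: strip the directory, the .pkg.tar.* suffix, then the
# version, release and arch fields, leaving only the package name.
pkg_name() {
    local f=${1##*/}
    f=${f%.pkg.tar.*}
    printf '%s\n' "${f%-*-*-*}"
}

# has_dups: read filenames on stdin; succeed if any package name
# occurs more than once.
has_dups() {
    local f
    while IFS= read -r f; do
        pkg_name "$f"
    done | sort | uniq -d | grep . >/dev/null
}

pkg_name b43-fwcutter-013-1-i686.pkg.tar.xz      # prints: b43-fwcutter

if printf '%s\n' boost-1.47.0-1-i686.pkg.tar.xz boost-1.47.0-2-i686.pkg.tar.xz | has_dups
then
    echo "duplicates present, removal pass is worth running"
fi
```

   The idea is simply that a cheap name-only scan can decide whether the more 
expensive removal code needs to run at all.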

   For those not familiar with the scripts, they simply parse the files in 
/var/cache/pacman/pkg and move any duplicates (old versions) to a separate 
directory (/home/backup/pkg-1 by default). This does 2 things: (1) it cleans the 
pkg directory so that only current packages are present and (2) by moving 
duplicates to /home it frees space on the / partition. The files on the /home 
dir can be kept for a period of time or simply deleted.

   The pair of scripts function as one script. The "wrapper" script 
'fduparch.sh' simply calls 'fduppkg' setting the search dir and duplicate dir 
according to the directory array. fduppkg generically just removes duplicates 
(saving the most recent file) from dir_1 -> dir_2. They provide output advising 
of the duplicates found as well as logging detailed information about what was 
done to /home/backup/log/pkgdups.log.bz2.
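
   As a minimal sketch of the "keep the most recent, move the rest" idea (again 
not the real fduppkg code), the function below reads package filenames and 
prints every file that is not the newest one for its package name. Two 
assumptions are mine: "newest" is decided with GNU sort -V version ordering, 
which may differ from how fduppkg decides, and the function name list_dups is 
hypothetical. It needs bash 4 for the associative array:

```shell
# list_dups: read package filenames on stdin and print each file that is
# superseded by a later one with the same package name.
# Assumption: GNU `sort -V` ordering stands in for "most recent".
list_dups() {
    local f base name
    local -A newest=()
    sort -V | while IFS= read -r f; do
        base=${f%.pkg.tar.*}          # drop the .pkg.tar.xz suffix
        name=${base%-*-*-*}           # drop version, release and arch
        if [[ -n ${newest[$name]:-} ]]; then
            # An older file for this name was seen earlier: it is a duplicate.
            printf '%s\n' "${newest[$name]}"
        fi
        newest[$name]=$f
    done
}

printf '%s\n' \
    boost-1.47.0-2-i686.pkg.tar.xz \
    boost-1.47.0-1-i686.pkg.tar.xz \
    boost-libs-1.47.0-2-i686.pkg.tar.xz \
    boost-libs-1.47.0-1-i686.pkg.tar.xz \
    b43-fwcutter-013-1-i686.pkg.tar.xz | list_dups
```

   Each name printed this way would then be a candidate to move from dir_1 to 
dir_2.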

   By default, fduppkg is called 3 times:

/var/cache/pacman/pkg -> /home/backup/pkg-1
/home/backup/pkg-1    -> /home/backup/pkg-2
/home/backup/pkg-2    -> /home/backup/pkg-del

   If you simply want a single pass, edit fduparch.sh and remove the extra 
directories from:

DIRLIST=( /var/cache/pacman/pkg /home/backup/pkg-1 /home/backup/pkg-2 
/home/backup/pkg-del )

   Making it:

DIRLIST=( /var/cache/pacman/pkg /home/backup/pkg-1 )
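
   The way the wrapper pairs up adjacent DIRLIST entries can be pictured 
roughly like this. It is only a sketch, not the fduparch.sh source: the 
function name build_calls is hypothetical, and it just prints the fduppkg 
command lines (in the form shown in the output further down) instead of 
running them:

```shell
DIRLIST=( /var/cache/pacman/pkg /home/backup/pkg-1 /home/backup/pkg-2 /home/backup/pkg-del )

# build_calls LOG DIR...: print one fduppkg command line per adjacent
# pair of directories (source dir -> duplicates dir).
build_calls() {
    local log=$1; shift
    local dirs=( "$@" )
    local i
    for (( i = 0; i < ${#dirs[@]} - 1; i++ )); do
        printf 'fduppkg %s -d %s -l %s\n' "${dirs[i]}" "${dirs[i+1]}" "$log"
    done
}

build_calls pkgdups.log "${DIRLIST[@]}"
```

   With four directories this prints three command lines. With the two-entry 
DIRLIST it prints only one.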

   The scripts are reasonably commented, so they are easy to follow. The only 
requirement is that both be placed in /usr/local/bin (or just modify the script 
location in fduparch.sh). I just soft link them to /usr/local/bin. Obviously, 
since they are moving files from the pkg directory, they must be run as root or 
the user must have sudo privileges. Output is verbose by default.

   The output is shown below. The pkg [index] is simply the file index number of 
that file in the list of 2743 below:

17:27 archangel:~> fduparch.sh

   calling:  'fduppkg /var/cache/pacman/pkg -d /home/backup/pkg-1 -l pkgdups.log'

Total packages to screen:  2743
Removing duplicates from:  /var/cache/pacman/pkg
Duplicates directory:      /home/backup/pkg-1
Log file location:         pkgdups.log
Verbose mode set:          [use -q to stop pkg output | -s to stop all output]
<snip>
pkg [ 116] boost                       dup => boost-1.47.0-2-i686.pkg.tar.xz
pkg [ 118] boost-libs                  dup => boost-libs-1.47.0-2-i686.pkg.tar.xz
<snip>

754  duplicates moved to /home/backup/pkg-1


   calling:  'fduppkg /home/backup/pkg-1 -d /home/backup/pkg-2 -l pkgdups.log'


Total packages to screen:  2527
Removing duplicates from:  /home/backup/pkg-1
Duplicates directory:      /home/backup/pkg-2
Log file location:         pkgdups.log
Verbose mode set:          [use -q to stop pkg output | -s to stop all output]
<snip>
pkg [  55] b43-fwcutter                dup => b43-fwcutter-013-1-i686.pkg.tar.xz
pkg [  75] boost                       dup => boost-1.47.0-1-i686.pkg.tar.xz
pkg [  77] boost-libs                  dup => boost-libs-1.47.0-1-i686.pkg.tar.xz
<snip>

601  duplicates moved to /home/backup/pkg-del


   Package Disk Usage Summary

     4.7G   /var/cache/pacman/pkg
     3.4G   /home/backup/pkg-1
     1.1G   /home/backup/pkg-2
     1.3G   /home/backup/pkg-del

   That's all there is to it. The log entries look like this:

Nov 02 18:16:52 killerz fduppkg Removing duplicates from: /var/cache/pacman/pkg
Number of packages: 3545
`/var/cache/pacman/pkg/accountsservice-0.6.14-1-i686.pkg.tar.xz' -> 
`/home/backup/pkg-1/accountsservice-0.6.14-1-i686.pkg.tar.xz'
removed `/var/cache/pacman/pkg/accountsservice-0.6.14-1-i686.pkg.tar.xz'
`/var/cache/pacman/pkg/arch-wiki-docs-20100914-1-any.pkg.tar.xz' -> 
`/home/backup/pkg-1/arch-wiki-docs-20100914-1-any.pkg.tar.xz'
removed `/var/cache/pacman/pkg/arch-wiki-docs-20100914-1-any.pkg.tar.xz'
`/var/cache/pacman/pkg/ati-dri-7.11-2-i686.pkg.tar.xz' -> 
`/home/backup/pkg-1/ati-dri-7.11-2-i686.pkg.tar.xz'
removed `/var/cache/pacman/pkg/ati-dri-7.11-2-i686.pkg.tar.xz'
<snip>

So if you ever have any questions about what was done, it is there for your 
review in the log file.

They take about 60 seconds or less to run. Very handy for cleaning the pkg dir 
and also for keeping a 'last known' set of good packages around. I agree, no 
need for a 2nd or 3rd set, but that's just kind of hung around since they were 
first developed. Give them a try. The scripts are here:

http://www.3111skyline.com/dl/arch/scripts/fduparch.sh
http://www.3111skyline.com/dl/arch/scripts/fduppkg

Then make sure they are executable (chmod 0755 will do). Then just link them from 
wherever you save them to /usr/local/bin as follows:

ln -s /path/to/fduparch.sh /usr/local/bin/fduparch
ln -s /path/to/fduppkg /usr/local/bin/fduppkg

Then just execute the fduparch link and watch it run. With the addition of the 
check for nonconforming filenames, you will pick up another 300-600 duplicates 
out of your package directory. Anyway.

Enjoy! Send any bugs to me. Thanks!


-- 
David C. Rankin, J.D.,P.E.

