[arch-general] Script for finding and "moving" older duplicate packages from /var/cache/pacman/pkg
Listmates,

I like to keep the current set of updated packages in /var/cache/pacman/pkg, but not all of the older duplicates as well. However, I don't want to delete the old packages either, in case I need to downgrade a package. I simply want to move the older packages to a backup location on a different partition so I'm not filling up / with several extra gigabytes of older packages.

To do this I adapted a script that I used to clean duplicates from my local rpm repository and changed it to deal with package-name-v.e.r-arch.pkg.tar.gz instead of rpm-name-v.e.r-s.u.b-arch.rpm. (Thankfully Arch just uses the pkg directory instead of a nested spiderweb under /var/cache/zypp/packages/.....)

The script for finding and moving older duplicate packages is named fduppkg and is fairly self-explanatory. Running it without arguments gives the options:

23:33 supersff:~/scripts/file> fduppkg

  Usage: <search dir> [ -d -l logfile -s -v ]

  Searches <search dir> for duplicate pkgs and moves duplicate files to
  [dup_dir] or <search dir>/duplicates by default.

    -d | --dupdir    Used to specify directory to hold duplicate rpms
    -l | --logfile   Specify the log file name (default ./duplicates.log)
    -s | --silent    Don't output anything to stdout, just log results
    -v | --verbose   Output information showing which files are kept and not moved

You should note that the actual check for older/newer uses the file modification date to make the decision rather than parsing the version number digit-by-digit. So if you have done something nutty like moving all the packages out of /var/cache and then copying them back so all the file modification times are the same, this script won't work. (I could see it happening)

If you have a need for something like this, you can grab the script at:

http://www.3111skyline.com/download/Archlinux/scripts/fduppkg

Since the script is fairly generic for packages, if you have a number of directories you want to operate on, you can just call the script from another script that sets the command line inputs. Example:

http://www.3111skyline.com/download/Archlinux/scripts/fdup-archpkg

I generally just create a link to the script in /usr/local/bin so it is in the search path. Also, if you just want to tear it apart to learn some bash scripting, I have commented it so it is somewhat readable. Enjoy.

-- David C. Rankin, J.D.,P.E. Rankin Law Firm, PLLC 510 Ochiltree Street Nacogdoches, Texas 75961 Telephone: (936) 715-9333 Facsimile: (936) 715-9339 www.rankinlawfirm.com
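For readers who only want the idea, the modification-time approach described above can be sketched in a few lines of shell. This is not the fduppkg source. The function names pkg_key and move_older_pkgs are made up for illustration, and the sketch assumes every file is named name-pkgver-pkgrel-arch.pkg.tar.gz and that no file name contains a newline (it reads the output of ls).

```shell
#!/bin/sh
# Illustrative sketch only (hypothetical names, not taken from fduppkg).

# Print "name/arch" for a file named name-pkgver-pkgrel-arch.pkg.tar.gz.
pkg_key() (
    base=${1##*/}              # strip the directory part
    base=${base%.pkg.tar.gz}   # zenity-2.26.0-1-x86_64
    arch=${base##*-}           # x86_64
    base=${base%-*}            # drop arch
    base=${base%-*}            # drop pkgrel
    base=${base%-*}            # drop pkgver
    printf '%s/%s\n' "$base" "$arch"
)

# Keep the most recently modified file per name/arch in $1,
# move every other file for that name/arch to $2.
move_older_pkgs() (
    src=$1 dest=$2 seen=''
    mkdir -p -- "$dest" || exit 1
    # ls -t lists newest first, so the first file seen per key is kept.
    ls -t -- "$src"/*.pkg.tar.gz 2>/dev/null | while IFS= read -r f; do
        key=$(pkg_key "$f")
        case "$seen" in
            *"|$key|"*) mv -- "$f" "$dest"/ ;;   # an newer file was already kept
            *)          seen="$seen|$key|" ;;    # first (newest) file for this key
        esac
    done
)
```

It would be called as: move_older_pkgs /var/cache/pacman/pkg /home/backup/pkg-old. As with the script described above, the result is only meaningful if the modification times reflect the order in which the packages were downloaded.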
On Monday 24 August 2009 11:51:45 pm David C. Rankin wrote:
<snip>
The script for finding and moving older duplicate packages is named fduppkg and is fairly self-explanatory. Running without arguments gives the options:
BASH never fails to amaze me: a 3116-element array to handle the packages, and BASH never batted an eye. It just kept on trucking....

[00:41 archangel:/home/david/scripts/file] # ./fdup-archpkg -v
<big snip>
${SARRAY[3114]} zenity-2.26.0-1-x86_64.pkg.tar.gz
  Search filename: zenity  Search Architecture: x86_64
  Duplicate filename: zenity  Duplicate Architecture: x86_64
  Keeping: /var/cache/pacman/pkg/zenity-2.26.0-2-x86_64.pkg.tar.gz
  Duplicate found: 1
  Search filename: zenity  Search Architecture: x86_64
  Duplicate filename: zenity  Duplicate Architecture: x86_64
  `/var/cache/pacman/pkg/zenity-2.26.0-1-x86_64.pkg.tar.gz' -> `/home/backup/pkg-old/zenity-2.26.0-1-x86_64.pkg.tar.gz'
  removed `/var/cache/pacman/pkg/zenity-2.26.0-1-x86_64.pkg.tar.gz'
  Duplicate found: 2
${SARRAY[3116]} zip-3.0-1-x86_64.pkg.tar.gz
  Search filename: zip  Search Architecture: x86_64
  Duplicate filename: zip  Duplicate Architecture: x86_64
  Keeping: /var/cache/pacman/pkg/zip-3.0-1-x86_64.pkg.tar.gz
  Duplicate found: 1
Done!

[00:43 archangel:/home/david/scripts/file] # du -hcs /var/cache/pacman/pkg/
3.0G    /var/cache/pacman/pkg/
3.0G    total
[00:48 archangel:/home/david/scripts/file] # du -hcs /home/backup/pkg-old/
5.2G    /home/backup/pkg-old/
5.2G    total

There's an extra 5.2G back ;-)

-- David C. Rankin, J.D.,P.E.
A while back i wrote a similar script which extracts the .PKGINFO file from each package in one's cache. slow, but i think this is a more accurate way to compare versions.

as-is, it will cp $N versions back from the installed package (inclusive) to $WD/saveme, where N and WD are defined in the script. i use it to save the N versions, clean out my cache, then move them back. but i think you could easily modify it to suit your use case: instead of moving N versions to saveme, you could echo the filename into a saveme.lst, then mv anything not in saveme.lst to your oldpkg directory.

or you could just take my little .PKGINFO / awk trickery and merge it into your script... or you could just ignore me and go with mod times :).

anyways, more info is always better, right? i've attached the script for you to peruse.

thanks,
pat

On 08/25/09 at 12:51am, David C. Rankin wrote:
On Monday 24 August 2009 11:51:45 pm David C. Rankin wrote:
<snip>
The script for finding and moving older duplicate packages is named fduppkg and is fairly self-explanatory. Running without arguments gives the options:
-- patrick brisbin
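The attached script is not reproduced in this archive, so the following is only a guess at the kind of .PKGINFO / awk step being described, not Patrick's code. It assumes .PKGINFO holds "key = value" lines such as "pkgname = zenity" and "pkgver = 2.26.0-2"; the function name pkginfo_field is invented for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field from .PKGINFO text
# supplied on stdin. Assumes lines of the form "key = value".
pkginfo_field() {
    # $1 = field name, e.g. pkgname or pkgver
    awk -F ' = ' -v key="$1" '$1 == key { print $2; exit }'
}
```

Given the .PKGINFO text of a package on stdin, "pkginfo_field pkgver" would print the version string recorded by makepkg, which can then be compared as a version (for instance with pacman's vercmp utility) instead of relying on file modification times.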
Patrick Brisbin wrote:
A while back i wrote a similar script which extracts the .PKGINFO file from each package in one's cache. slow, but I think this is a more accurate way to compare versions.
We had that discussion on arch-dev-public a while ago. makepkg always puts .PKGINFO at the beginning of the tarball. Also, tar and bsdtar both have options to ensure that they stop extracting once the file is found; the .PKGINFO file will therefore be extracted almost instantly, independent of package size. Look at the tar or bsdtar manpages; unfortunately I forgot exactly how it works.
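For bsdtar, the option in question is -q (--fast-read), which stops reading the archive as soon as each named member has been matched. A minimal sketch, assuming bsdtar is installed and the member is stored under the exact name .PKGINFO; the function name read_pkginfo is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print a package's .PKGINFO without unpacking the rest.
#   -x  extract
#   -O  write the member to stdout instead of to disk
#   -q  fast read: stop once .PKGINFO has been matched
#   -f  archive to read
read_pkginfo() {
    bsdtar -xOqf "$1" .PKGINFO
}
```

For example: read_pkginfo /var/cache/pacman/pkg/zip-3.0-1-x86_64.pkg.tar.gz. GNU tar does not accept --fast-read; as far as I know its nearest equivalent is --occurrence, but that is not tested here.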
On 08/25/09 at 04:45pm, Thomas Bächler wrote:
Patrick Brisbin wrote:
A while back i wrote a similar script which extracts the .PKGINFO file from each package in one's cache. slow, but I think this is a more accurate way to compare versions.
We had that discussion on arch-dev-public a while ago. makepkg always puts .PKGINFO at the beginning of the tarball. Also, tar and bsdtar both have options to ensure that they stop extracting once the file is found; the .PKGINFO file will therefore be extracted almost instantly, independent of package size. Look at the tar or bsdtar manpages; unfortunately I forgot exactly how it works.
WOW!

tar -xf on a 3 GB cache -> 1m 42s
bsdtar -qxf on the same cache -> 0m 9s

awesome.

-- patrick brisbin
On Tue, Aug 25, 2009 at 5:03 PM, Patrick Brisbin<pbrisbin@gmail.com> wrote:
On 08/25/09 at 04:45pm, Thomas Bächler wrote:
Patrick Brisbin wrote:
A while back i wrote a similar script which extracts the .PKGINFO file from each package in one's cache. slow, but I think this is a more accurate way to compare versions.
We had that discussion on arch-dev-public a while ago. makepkg always puts .PKGINFO at the beginning of the tarball. Also, tar and bsdtar both have options to ensure that they stop extracting once the file is found; the .PKGINFO file will therefore be extracted almost instantly, independent of package size. Look at the tar or bsdtar manpages; unfortunately I forgot exactly how it works.
WOW!
tar -xf on a 3 GB cache -> 1m 42s
bsdtar -qxf on the same cache -> 0m 9s
awesome.
according to the man pages I just read online, --fast-read should work for both.
On Tuesday 25 August 2009 10:03:13 am Patrick Brisbin wrote:
WOW!
tar -xf on a 3 GB cache -> 1m 42s
bsdtar -qxf on the same cache -> 0m 9s
awesome.
Now that's cool. I can't wait to see what kind of fun we can have with that ;-)

Thanks Patrick, Thomas, Xavier.

-- David C. Rankin, J.D.,P.E.
On Tuesday the 25th at 17:03, Patrick Brisbin wrote:
tar -xf on a 3 GB cache -> 1m 42s
bsdtar -qxf on the same cache -> 0m 9s
If you ran both of them on the same set of files, BSD tar was certainly quicker because it took advantage of the files being cached thanks to the first run of GNU tar. Now, it may also be that BSD tar /is/ faster --we all know that GNU programs are crippled with too many features for their own good ;-).

Do you get the same results if you run the same commands a couple of times in succession, discarding the first run?

-- Fred
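The "discard the first run" idea can be sketched as a small shell helper, so that the page cache is warm before anything is timed. The name time_warm is invented for this example, and it relies on date +%s, which only gives whole-second resolution.

```shell
#!/bin/sh
# Hypothetical helper: run a command once untimed (to warm the page cache),
# then time it $1 more times and print the elapsed seconds of each run.
time_warm() (
    runs=$1; shift
    "$@" >/dev/null 2>&1              # warm-up run, result discarded
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
)
```

To compare the two tools, one would wrap each extraction loop over the cache in its own shell function and pass that function's name to time_warm, once for tar and once for bsdtar.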
On 08/26/09 at 12:02am, Frédéric Perrin wrote:
On Tuesday the 25th at 17:03, Patrick Brisbin wrote:
tar -xf on a 3 GB cache -> 1m 42s
bsdtar -qxf on the same cache -> 0m 9s
If you ran both of them on the same set of files, BSD tar was certainly quicker because it took advantage of the files being cached thanks to the first run of GNU tar. Now, it may also be that BSD tar /is/ faster --we all know that GNU programs are crippled with too many features for their own good ;-).
Do you get the same results if you run the same commands a couple of times in succession, discarding the first run?
FWIW, i ran them again in succession two times each:

tar -xv      1m 55s
tar -xv      1m 37s
bsdtar -qxv  0m 9.5s
bsdtar -qxv  0m 9.5s

also, --fast-read yielded "invalid option" with tar.

-- patrick brisbin
participants (5): David C. Rankin, Frédéric Perrin, Patrick Brisbin, Thomas Bächler, Xavier