list of package names, versions[, descriptions]

Greg Minshall minshall at acm.org
Wed Oct 2 15:32:58 UTC 2019


Eli,

> That would sort of slam our server, though.

yes, my whole goal is to not slam the servers.

i'm not sure i can explain why i find having the complete list, with
descriptions, local on my machine useful.  but, i do.  "search locally,
build globally" somehow works well for me.  (one rationalization might
be that searching is inherently more interactive than building, so
random network latencies, etc., during building are less annoying than
during searching.)

anyway, grant me the desire to maintain, offline, a complete list of AUR
packages, version numbers, descriptions.

let's say that i've managed, over a period of a week or so, to download
the entire database (or, at least, the "rows" in which i am interested:
package name, version, description) into my own local database.

then, a week later, i'd like to *update* my local database with what's
changed in the AUR repository.  how would i proceed?  as things
currently stand, iiuc (always a dubious proposition), i'd need to again
download the entire database.

on the other hand, if there were a packages-vers.gz (*), i could
download that, then compare the package names and versions in it with
those in my local database, and schedule to download the database
entries for those packages whose version numbers had changed (as well as
those packages in packages-vers.gz that are new; and at the same time
delete those packages in my local database that are no longer in
packages-vers.gz); one can visualize this code.
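
a minimal sketch of that comparison, in python.  nothing here is an existing AUR interface: the "name version" line format for packages-vers.gz and the helper names (parse_vers_lines, load_vers_gz, plan_update) are all assumptions made for illustration.

```python
import gzip


def parse_vers_lines(lines):
    """Parse 'name version' lines (a hypothetical format) into a dict."""
    versions = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, version = line.split(None, 1)
        versions[name] = version
    return versions


def load_vers_gz(path):
    """Read a gzip-compressed name/version list (hypothetical file)."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return parse_vers_lines(fh)


def plan_update(local, remote):
    """Compare two name -> version dicts and decide what to do.

    Returns the packages whose entries should be (re)downloaded
    (new or changed version) and those to delete locally (gone remotely).
    """
    new = remote.keys() - local.keys()
    gone = local.keys() - remote.keys()
    changed = {n for n in remote.keys() & local.keys()
               if remote[n] != local[n]}
    return {"fetch": sorted(new | changed), "delete": sorted(gone)}
```

only the names under "fetch" would then need a per-package request to the server; everything else is resolved locally.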

my presumption is that this would be much lighter on server resources
than downloading the entire database each week.  and, depending on the
"churn" in the repository (which you'll know), it might even be very
light.

and, i think this could be useful for general use.  i may only care
about descriptions, but if someone cares about dependencies,
maintainers, etc., they would still use the version-number mechanism
(again, see (*) below) to determine which packages have changed, and
only download the information from those changed packages.

thanks again.

cheers, Greg

ps -- thanks for the pointer to expac.  i'll look at converting to that.
no one ever accused me of writing overly-efficient code... :)

(*) NB:

note that, for "true consistency", using "version" depends on the
assumption that if the *metadata* for a package in the database changes
then the *version* of the package itself also changes.  that assumption
is likely to be invalid at least occasionally, maybe often.

if "last modified" time in the database is updated when any of the
metadata changes, that would be better to use than package version
number.

if "last modified" time isn't updated when (some) metadata is updated,
one could also run an md5sum(1) over (a textual representation of) each
package's database entry, and provide packages-md5sums.gz, say.  i'll
note that a simple test shows that adding an md5sum to each line
inflates the size of the file considerably:
: % ls -skh packages*.gz
: 1.5M packages-md5sums.gz  344K packages.gz

the inflation for version numbers and/or "last modified time" (as
seconds since the epoch) would probably be less, maybe double the size
of packages.gz?
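
the md5sum idea above might be sketched as follows, in python.  the field names ("Name", "Version", "Description") and the sorted key=value textual representation are assumptions for illustration, not the actual database layout; any stable representation would do, as long as both sides agree on it.

```python
import hashlib


def entry_digest(entry):
    """md5 over a canonical text form of one package entry.

    Keys are sorted so the digest does not depend on field order.
    """
    text = "\n".join(f"{key}={entry[key]}" for key in sorted(entry))
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_line(entry):
    """One line of a hypothetical packages-md5sums file: 'name digest'."""
    return f"{entry['Name']} {entry_digest(entry)}"
```

a client would compare these digests against its stored ones exactly as it would compare version numbers, so a description-only change is caught even when the version stays the same.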

