[PATCH] paccache: add --age-atime and --age-mtime options

---
I would feel a lot more confident about using the paccache systemd service
if it kept packages based on age instead of just the few most recent.

This patch adds the functionality to skip candidates that are not older
(in terms of atime or mtime) than some specified age. It seems to work, but
I'm not exactly a bash expert, so please review with care. I'd appreciate if
this could be merged!

 doc/paccache.8.txt |  5 ++++
 src/paccache.sh.in | 58 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/doc/paccache.8.txt b/doc/paccache.8.txt
index db81283..c9e3807 100644
--- a/doc/paccache.8.txt
+++ b/doc/paccache.8.txt
@@ -38,6 +38,11 @@ Options
   Scan for packages for a specific architecture. Default is to scan for
   all architectures.

+*\--age-atime <age>*::
+*\--age-mtime <age>*::
+  Only consider packages for removal with atime respectively mtime older than
+  specified. The age can be given as '10d', '1m', '1y', '1y1m' etc.
+
 *-c, \--cachedir <dir>*::
   Specify a different cache directory. This option can be used more than once.
   Default is to use the cache directory configured in 'pacman.conf'.
diff --git a/src/paccache.sh.in b/src/paccache.sh.in
index 012ba9f..8ee5792 100644
--- a/src/paccache.sh.in
+++ b/src/paccache.sh.in
@@ -27,6 +27,7 @@ declare -r myver='@PACKAGE_VERSION@'
 declare -a cachedirs=() candidates=() cmdopts=() whitelist=() blacklist=()
 declare -i delete=0 dryrun=0 filecount=0 move=0 needsroot=0 totalsaved=0 verbose=0
+declare -i age_atime=0 age_mtime=0
 declare delim=$'\n' keep=3 movedir= scanarch=

 QUIET=0
@@ -40,6 +41,29 @@ die() {
   exit 1
 }

+# Parses the age --age-atime and --age-mtime arguments
+parse_age() {
+  declare -i age=0
+  if [[ $2 =~ ^[[:space:]]*([0-9]+[dmy][[:space:]]*)+$ ]]; then
+    # Add spaces to facilitate splitting
+    temp=${2//d/d }
+    temp=${temp//m/m }
+    temp=${temp//y/y }
+    read -a temp <<< "${temp[*]}"
+    for a in ${temp[@]}; do
+      num=${a:0: -1}
+      case ${a: -1} in
+        d) age=$(( age + num )) ;;
+        m) age=$(( age + num * 30 )) ;;
+        y) age=$(( age + num * 365 )) ;;
+      esac
+    done
+  else
+    die "argument '%s' to option '%s' must be of the form '([0-9]+[dmy])+'" "$2" "$1"
+  fi
+  echo $(( age * 24 * 60 * 60 ))
+}
+
 # reads a list of files on stdin and prints out deletion candidates
 pkgfilter() {
   # there's whitelist and blacklist parameters passed to this
@@ -174,6 +198,10 @@ Usage: ${myname} <operation> [options] [targets...]
     -r, --remove          remove candidate packages.

   Options:
+        --age-atime <age>
+        --age-mtime <age> keep packages with an atime/mtime that is not at least
+                          <age> ago, where <age> is given as '10d', '1m', '1y',
+                          '1y1m' etc.
     -a, --arch <arch>     scan for "arch" (default: all architectures).
     -c, --cachedir <dir>  scan "dir" for packages. can be used more than once.
                           (default: read from @sysconfdir@/pacman.conf).
@@ -200,7 +228,8 @@ version() {
 OPT_SHORT=':a:c:dfhi:k:m:qrsuVvz'
 OPT_LONG=('arch:' 'cachedir:' 'dryrun' 'force' 'help' 'ignore:' 'keep:' 'move'
-    'nocolor' 'quiet' 'remove' 'uninstalled' 'version' 'verbose' 'null')
+    'nocolor' 'quiet' 'remove' 'uninstalled' 'version' 'verbose' 'null'
+    'age-atime:' 'age-mtime:')

 if ! parseopts "$OPT_SHORT" "${OPT_LONG[@]}" -- "$@"; then
   exit 1
@@ -210,6 +239,18 @@ unset OPT_SHORT OPT_LONG OPTRET
 while :; do
   case $1 in
+    --age-atime)
+      age_atime=$(parse_age "$1" "$2")
+      if (( $? )); then
+        exit 1
+      fi
+      shift ;;
+    --age-mtime)
+      age_mtime=$(parse_age "$1" "$2")
+      if (( $? )); then
+        exit 1
+      fi
+      shift ;;
     -a|--arch)
       scanarch=$2
       shift ;;
@@ -319,6 +360,21 @@ for cachedir in "${cachedirs[@]}"; do
   popd &>/dev/null
 done

+# remove any candidates that are not old enough yet
+if (( $age_atime || $age_mtime )); then
+  currtime=$(date +%s)
+  for cand in "${candidates[@]}"; do
+    IFS=';' read -d '' -a temp <<< $(stat --format '%X;%Y' "$cand")
+    if (( ( $(( $currtime - ${temp[0]} )) > $age_atime ) && \
+      ( $(( $currtime - ${temp[1]} )) > $age_mtime ) \
+    )); then
+      candtemp+=("$cand")
+    fi
+  done
+  candidates=("${candtemp[@]}")
+  unset candtemp
+fi
+
 if (( ! ${#candidates[*]} )); then
   msg 'no candidate packages found for pruning'
   exit 0
--
2.19.0
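
For readers who want to check the unit arithmetic of parse_age in isolation, here is a minimal standalone sketch. It is not the code from the patch: the name parse_age_demo is hypothetical, it walks the string with BASH_REMATCH instead of splitting on spaces, it does not accept embedded whitespace, and it forces base 10 so that a value such as '08d' is not read as octal. Months are 30 days and years 365 days, as in the patch.

```shell
#!/usr/bin/env bash
# parse_age_demo is a hypothetical helper, not part of paccache.
# It converts specs such as '10d' or '1y1m' to seconds
# (d = 1 day, m = 30 days, y = 365 days, as in the patch above).
parse_age_demo() {
    local spec=$1 num unit
    local -i days=0

    # Reject anything that is not a sequence of <number><unit> pairs.
    [[ $spec =~ ^([0-9]+[dmy])+$ ]] || return 1

    # Consume one <number><unit> pair from the front on each iteration.
    while [[ $spec =~ ^([0-9]+)([dmy])(.*)$ ]]; do
        num=${BASH_REMATCH[1]}
        unit=${BASH_REMATCH[2]}
        spec=${BASH_REMATCH[3]}
        case $unit in
            d) days=$(( days + 10#$num )) ;;
            m) days=$(( days + 10#$num * 30 )) ;;
            y) days=$(( days + 10#$num * 365 )) ;;
        esac
    done

    echo $(( days * 24 * 60 * 60 ))
}

parse_age_demo 10d     # 10 days in seconds
parse_age_demo 1y1m    # 395 days in seconds
```

With these definitions '1y1m' is 365 + 30 = 395 days, i.e. 395 * 86400 seconds.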
Hey, Excerpts from wisp3rwind's message of September 15, 2018 0:58:
> ---
> I would feel a lot more confident about using the paccache systemd service
> if it kept packages based on age instead of just the few most recent.
>
> This patch adds the functionality to skip candidates that are not older
> (in terms of atime or mtime) than some specified age. It seems to work, but
> I'm not exactly a bash expert, so please review with care. I'd appreciate if
> this could be merged!
So I would overall be okay with adding something like this, but there are some changes I would want to have made first. First of all, is there any specific case where you would need both supported? Because only having mtime support sounds like it should be good enough?
>  doc/paccache.8.txt |  5 ++++
>  src/paccache.sh.in | 58 +++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/doc/paccache.8.txt b/doc/paccache.8.txt
> index db81283..c9e3807 100644
> --- a/doc/paccache.8.txt
> +++ b/doc/paccache.8.txt
> @@ -38,6 +38,11 @@ Options
>    Scan for packages for a specific architecture. Default is to scan for
>    all architectures.
>
> +*\--age-atime <age>*::
> +*\--age-mtime <age>*::
> +  Only consider packages for removal with atime respectively mtime older than
> +  specified. The age can be given as '10d', '1m', '1y', '1y1m' etc.
> +
I would much rather these be called min-.time, "age-mtime" feels rather non-obvious.
>  *-c, \--cachedir <dir>*::
>    Specify a different cache directory. This option can be used more than once.
>    Default is to use the cache directory configured in 'pacman.conf'.
> diff --git a/src/paccache.sh.in b/src/paccache.sh.in
> index 012ba9f..8ee5792 100644
> --- a/src/paccache.sh.in
> +++ b/src/paccache.sh.in
> @@ -27,6 +27,7 @@ declare -r myver='@PACKAGE_VERSION@'
>  declare -a cachedirs=() candidates=() cmdopts=() whitelist=() blacklist=()
>  declare -i delete=0 dryrun=0 filecount=0 move=0 needsroot=0 totalsaved=0 verbose=0
> +declare -i age_atime=0 age_mtime=0
>  declare delim=$'\n' keep=3 movedir= scanarch=
>
>  QUIET=0
> @@ -40,6 +41,29 @@ die() {
>    exit 1
>  }
>
> +# Parses the age --age-atime and --age-mtime arguments
> +parse_age() {
> +  declare -i age=0
> +  if [[ $2 =~ ^[[:space:]]*([0-9]+[dmy][[:space:]]*)+$ ]]; then
> +    # Add spaces to facilitate splitting
> +    temp=${2//d/d }
> +    temp=${temp//m/m }
> +    temp=${temp//y/y }
> +    read -a temp <<< "${temp[*]}"
> +    for a in ${temp[@]}; do
> +      num=${a:0: -1}
> +      case ${a: -1} in
> +        d) age=$(( age + num )) ;;
> +        m) age=$(( age + num * 30 )) ;;
> +        y) age=$(( age + num * 365 )) ;;
> +      esac
> +    done
> +  else
> +    die "argument '%s' to option '%s' must be of the form '([0-9]+[dmy])+'" "$2" "$1"
> +  fi
> +  echo $(( age * 24 * 60 * 60 ))
> +}
> +
>  # reads a list of files on stdin and prints out deletion candidates
>  pkgfilter() {
>    # there's whitelist and blacklist parameters passed to this
> @@ -174,6 +198,10 @@ Usage: ${myname} <operation> [options] [targets...]
>      -r, --remove          remove candidate packages.
>
>    Options:
> +        --age-atime <age>
> +        --age-mtime <age> keep packages with an atime/mtime that is not at least
> +                          <age> ago, where <age> is given as '10d', '1m', '1y',
> +                          '1y1m' etc.
>      -a, --arch <arch>     scan for "arch" (default: all architectures).
>      -c, --cachedir <dir>  scan "dir" for packages. can be used more than once.
>                            (default: read from @sysconfdir@/pacman.conf).
> @@ -200,7 +228,8 @@ version() {
>  OPT_SHORT=':a:c:dfhi:k:m:qrsuVvz'
>  OPT_LONG=('arch:' 'cachedir:' 'dryrun' 'force' 'help' 'ignore:' 'keep:' 'move'
> -    'nocolor' 'quiet' 'remove' 'uninstalled' 'version' 'verbose' 'null')
> +    'nocolor' 'quiet' 'remove' 'uninstalled' 'version' 'verbose' 'null'
> +    'age-atime:' 'age-mtime:')
>
>  if ! parseopts "$OPT_SHORT" "${OPT_LONG[@]}" -- "$@"; then
>    exit 1
> @@ -210,6 +239,18 @@ unset OPT_SHORT OPT_LONG OPTRET
>  while :; do
>    case $1 in
> +    --age-atime)
> +      age_atime=$(parse_age "$1" "$2")
> +      if (( $? )); then
> +        exit 1
> +      fi
> +      shift ;;
> +    --age-mtime)
> +      age_mtime=$(parse_age "$1" "$2")
> +      if (( $? )); then
> +        exit 1
> +      fi
> +      shift ;;
>      -a|--arch)
>        scanarch=$2
>        shift ;;
> @@ -319,6 +360,21 @@ for cachedir in "${cachedirs[@]}"; do
>    popd &>/dev/null
>  done
>
> +# remove any candidates that are not old enough yet
> +if (( $age_atime || $age_mtime )); then
Variables do not need to be dollar prefixed in arithmetic expansion.
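
A small illustration of that point, as a sketch with made-up values: inside (( )) bash treats bare names as variables, so the condition from the patch can drop the dollar signs. The function name any_age_set is hypothetical.

```shell
#!/usr/bin/env bash
# Values chosen only for illustration.
declare -i age_atime=0 age_mtime=86400

# any_age_set is a hypothetical helper: it succeeds when at least one
# threshold is non-zero. No '$' is needed on the names inside (( )).
any_age_set() {
    (( age_atime || age_mtime ))
}

if any_age_set; then
    echo "age filtering enabled"
fi
```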
> +  currtime=$(date +%s)
> +  for cand in "${candidates[@]}"; do
> +    IFS=';' read -d '' -a temp <<< $(stat --format '%X;%Y' "$cand")
There's no reason to have this be read into an array; something like
this would be simpler:

    IFS=';' read atime mtime < <(stat --format '%X;%Y' "$cand")
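
To try the suggested form outside of paccache, a minimal sketch follows. It assumes GNU stat (for --format, %X and %Y) and uses a temporary file as a stand-in for a cached package; the -r flag on read is an addition for safety and is not part of the suggestion.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a cached package file.
cand=$(mktemp)

# %X is the atime and %Y the mtime, both in seconds since the epoch.
# Process substitution feeds the single output line to read, which
# splits it on ';' into two plain variables.
IFS=';' read -r atime mtime < <(stat --format '%X;%Y' "$cand")

printf 'atime=%s mtime=%s\n' "$atime" "$mtime"
rm -f "$cand"
```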
> +    if (( ( $(( $currtime - ${temp[0]} )) > $age_atime ) && \
> +      ( $(( $currtime - ${temp[1]} )) > $age_mtime ) \
I think this should work?:

    if (( currtime - atime > min_atime )) && \
       (( currtime - mtime > min_mtime ))
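
The suggested condition can be exercised with fixed numbers. This is only a sketch: is_old_enough is a hypothetical wrapper, and min_atime and min_mtime are treated as minimum ages in seconds, as in the first version of the patch.

```shell
#!/usr/bin/env bash
# is_old_enough is a hypothetical wrapper around the suggested test.
# Arguments: current time, atime, mtime, minimum atime age, minimum mtime age
# (all in seconds).
is_old_enough() {
    local currtime=$1 atime=$2 mtime=$3 min_atime=$4 min_mtime=$5
    (( currtime - atime > min_atime )) &&
        (( currtime - mtime > min_mtime ))
}

# Both ages (900 and 800) exceed 500, so this file stays a candidate.
is_old_enough 1000 100 200 500 500 && echo "candidate"

# The atime age is only 100, so this file is kept.
is_old_enough 1000 900 200 500 500 || echo "kept"
```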
> +    )); then
> +      candtemp+=("$cand")
> +    fi
> +  done
> +  candidates=("${candtemp[@]}")
> +  unset candtemp
> +fi
> +
>  if (( ! ${#candidates[*]} )); then
>    msg 'no candidate packages found for pruning'
>    exit 0
> --
> 2.19.0
--
Sincerely,
  Johannes Löthberg
  PGP Key ID: 0x50FB9B273A9D0BB5
  PGP Key FP: 5134 EF9E AF65 F95B 6BB1 608E 50FB 9B27 3A9D 0BB5
  https://theos.kyriasis.com/~kyrias/
--- Hi,
> So I would overall be okay with adding something like this, but there are
> some changes I would want to have made first. First of all, is there any
> specific case where you would need both supported? Because only having mtime
> support sounds like it should be good enough?
In fact, my use case would only involve atime. mtime is probably not a very
useful value, since it will usually be the date the package was built. The
timestamp I'm actually interested in is installation time. Even for a
`noatime`-mounted drive (where atime will be the download time), atime will
be close to when the package was first installed.

I agree that `--min-{a,m}time` is much more descriptive, changed that.

@Eli: Thanks for the pointers to `touch`/`find`. I dropped my own parser in
favor of invoking `date` (which uses the same parser as `touch`). However,
because I want to combine the retention according to `--keep` and
`--min-{a,m}time`, candidate selection cannot be done directly by `find`, but
still needs to be done manually (in the awk script). Also, not creating a new
`stat` process for every single file appears to improve performance of this
feature quite a lot (subjectively, didn't benchmark).

 doc/paccache.8.txt |  5 ++++
 src/paccache.sh.in | 61 ++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/doc/paccache.8.txt b/doc/paccache.8.txt
index db81283..196bb49 100644
--- a/doc/paccache.8.txt
+++ b/doc/paccache.8.txt
@@ -38,6 +38,11 @@ Options
   Scan for packages for a specific architecture. Default is to scan for
   all architectures.

+*\--min-atime <age>*::
+*\--min-mtime <age>*::
+  Only consider packages for removal with atime respectively mtime older than
+  specified. The age can be given as '10d', '1m', '1y', '1y1m' etc.
+
 *-c, \--cachedir <dir>*::
   Specify a different cache directory. This option can be used more than once.
   Default is to use the cache directory configured in 'pacman.conf'.
diff --git a/src/paccache.sh.in b/src/paccache.sh.in
index 012ba9f..70e30e0 100644
--- a/src/paccache.sh.in
+++ b/src/paccache.sh.in
@@ -27,6 +27,7 @@ declare -r myver='@PACKAGE_VERSION@'
 declare -a cachedirs=() candidates=() cmdopts=() whitelist=() blacklist=()
 declare -i delete=0 dryrun=0 filecount=0 move=0 needsroot=0 totalsaved=0 verbose=0
+declare -i min_atime=0 min_mtime=0
 declare delim=$'\n' keep=3 movedir= scanarch=

 QUIET=0
@@ -45,13 +46,23 @@ pkgfilter() {
   # there's whitelist and blacklist parameters passed to this
   # script after the block of awk.

-  awk -v keep="$1" -v scanarch="$2" '
+  awk -v keep="$1" -v scanarch="$2" -v min_atime="$3" -v min_mtime="$4" '
     function basename(str) {
       sub(".*/", "", str);
       return str;
     }

-    function parse_filename(filename, parts, count, i, pkgname, arch) {
+    function parse_filename(filename,
+        atime, mtime, parts, count, i, pkgname, arch) {
+
+      if (0 + min_atime + min_mtime != 0) {
+        # atime and mtime are in the first two columns and the
+        # separator is a single space
+        split(filename, parts, " ")
+        atime = parts[1]
+        mtime = parts[2]
+        filename = substr(filename, length(atime) + length(mtime) + 3)
+      }

       count = split(basename(filename), parts, "-")

@@ -69,8 +80,12 @@ pkgfilter() {

       if ("" == packages[pkgname,arch]) {
         packages[pkgname,arch] = filename
+        atimes[pkgname,arch] = atime
+        mtimes[pkgname,arch] = mtime
       } else {
         packages[pkgname,arch] = packages[pkgname,arch] SUBSEP filename
+        atimes[pkgname,arch] = atimes[pkgname,arch] SUBSEP atime
+        mtimes[pkgname,arch] = mtimes[pkgname,arch] SUBSEP mtime
       }
     }

@@ -101,12 +116,19 @@ pkgfilter() {
       # enforce architecture match if specified
       if (!scanarch || scanarch == idx[2]) {
         count = split(packages[idx[1], idx[2]], pkgs, SUBSEP)
+        split(atimes[idx[1], idx[2]], atime, SUBSEP)
+        split(mtimes[idx[1], idx[2]], mtime, SUBSEP)
         for(i = 1; i <= count - keep; i++) {
-          print pkgs[i]
+          # If checking file age, potentially keep more candidates
+          if ((0 + min_atime == 0 || (strtonum(atime[i]) < 0 + min_atime)) &&
+              (0 + min_mtime == 0 || (strtonum(mtime[i]) < 0 + min_mtime)) \
+            ) {
+            print pkgs[i]
+          }
         }
       }
     }
-  }' "${@:3}"
+  }' "${@:5}"
 }

 m4_include(../lib/size_to_human.sh)
@@ -174,6 +196,12 @@ Usage: ${myname} <operation> [options] [targets...]
     -r, --remove          remove candidate packages.

   Options:
+        --min-atime <time>
+        --min-mtime <time> keep packages with an atime/mtime that is not older
+                          than the time given, even if this means keeping more
+                          than specified through the '--keep' option. Accepts
+                          arguments according to 'info "Date input formats"',
+                          e.g. '30 days ago'.
     -a, --arch <arch>     scan for "arch" (default: all architectures).
     -c, --cachedir <dir>  scan "dir" for packages. can be used more than once.
                           (default: read from @sysconfdir@/pacman.conf).
@@ -200,7 +228,8 @@ version() {
 OPT_SHORT=':a:c:dfhi:k:m:qrsuVvz'
 OPT_LONG=('arch:' 'cachedir:' 'dryrun' 'force' 'help' 'ignore:' 'keep:' 'move'
-    'nocolor' 'quiet' 'remove' 'uninstalled' 'version' 'verbose' 'null')
+    'nocolor' 'quiet' 'remove' 'uninstalled' 'version' 'verbose' 'null'
+    'min-atime:' 'min-mtime:')

 if ! parseopts "$OPT_SHORT" "${OPT_LONG[@]}" -- "$@"; then
   exit 1
@@ -210,6 +239,18 @@ unset OPT_SHORT OPT_LONG OPTRET
 while :; do
   case $1 in
+    --min-atime)
+      min_atime=$(date -d "$2" +%s)
+      if (( $? )); then
+        die "argument to option --min-atime must be of the form described by 'info \"Date input formats\" '."
+      fi
+      shift ;;
+    --min-mtime)
+      min_mtime=$(date -d "$2" +%s)
+      if (( $? )); then
+        die "argument to option --min-mtime must be of the form described by 'info \"Date input formats\" '."
+      fi
+      shift ;;
     -a|--arch)
       scanarch=$2
       shift ;;
@@ -308,8 +349,14 @@ for cachedir in "${cachedirs[@]}"; do
   # note that these results are returned in an arbitrary order from awk, but
   # they'll be resorted (in summarize) iff we have a verbosity level set.
   IFS=$'\n' read -r -d '' -a cand < \
-    <(printf '%s\n' "$PWD"/*.pkg.tar!(*.sig) | pacsort --files |
-      pkgfilter "$keep" "$scanarch" \
+    <( if (( min_atime || min_mtime )); then
+        find "$PWD" -name '*.pkg.tar*.sig' -prune -o \( -name '*.pkg.tar*' -printf '%A@ %T@ %p\n' \) |
+          pacsort --files --key 3 --separator ' '
+      else
+        printf '%s\n' "$PWD"/*.pkg.tar!(*.sig) |
+          pacsort --files
+      fi |
+      pkgfilter "$keep" "$scanarch" "$min_atime" "$min_mtime" \
       "${#whitelist[*]}" "${whitelist[@]}" \
       "${#blacklist[*]}" "${blacklist[@]}")
--
2.19.0
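
The core idea of this revision (turn the option argument into an epoch cutoff with `date`, let one `find` call print the timestamps, and filter in awk) can be tried on its own. The following is a simplified sketch, not the patch itself: it assumes GNU date and GNU find, leaves out pacsort and the `--keep` logic, checks mtime only, and uses made-up file names in a temporary directory. It also compares with `$2 + 0` rather than `strtonum`, so that it does not depend on gawk.

```shell
#!/usr/bin/env bash
# Temporary directory with made-up package files, for illustration only.
workdir=$(mktemp -d)
touch -d '40 days ago' "$workdir/foo-1.0-1-x86_64.pkg.tar.xz"
touch "$workdir/foo-1.1-1-x86_64.pkg.tar.xz"
touch "$workdir/foo-1.1-1-x86_64.pkg.tar.xz.sig"

# Convert a date(1) description into seconds since the epoch.
min_mtime=$(date -d '30 days ago' +%s) || exit 1

# find prints "atime mtime path" for every package that is not a signature;
# awk keeps the paths whose mtime lies before the cutoff.
old=$(find "$workdir" -name '*.pkg.tar*.sig' -prune -o \
        \( -name '*.pkg.tar*' -printf '%A@ %T@ %p\n' \) |
    awk -v min_mtime="$min_mtime" '
        {
            mtime = $2 + 0                  # force numeric comparison
            path = $0
            sub(/^[^ ]+ [^ ]+ /, "", path)  # strip the two timestamp columns
            if (mtime < min_mtime)
                print path
        }')

printf '%s\n' "${old##*/}"
rm -rf "$workdir"
```

Only the file dated 40 days back is older than the 30-day cutoff, so it is the only path printed.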
On 9/14/18 6:58 PM, wisp3rwind wrote:
> ---
> I would feel a lot more confident about using the paccache systemd service
> if it kept packages based on age instead of just the few most recent.
>
> This patch adds the functionality to skip candidates that are not older
> (in terms of atime or mtime) than some specified age. It seems to work, but
> I'm not exactly a bash expert, so please review with care. I'd appreciate if
> this could be merged!
>
>  doc/paccache.8.txt |  5 ++++
>  src/paccache.sh.in | 58 +++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/doc/paccache.8.txt b/doc/paccache.8.txt
> index db81283..c9e3807 100644
> --- a/doc/paccache.8.txt
> +++ b/doc/paccache.8.txt
> @@ -38,6 +38,11 @@ Options
>    Scan for packages for a specific architecture. Default is to scan for
>    all architectures.
>
> +*\--age-atime <age>*::
> +*\--age-mtime <age>*::
> +  Only consider packages for removal with atime respectively mtime older than
> +  specified. The age can be given as '10d', '1m', '1y', '1y1m' etc.
> +
>  *-c, \--cachedir <dir>*::
>    Specify a different cache directory. This option can be used more than once.
>    Default is to use the cache directory configured in 'pacman.conf'.
> diff --git a/src/paccache.sh.in b/src/paccache.sh.in
> index 012ba9f..8ee5792 100644
> --- a/src/paccache.sh.in
> +++ b/src/paccache.sh.in
> @@ -27,6 +27,7 @@ declare -r myver='@PACKAGE_VERSION@'
>  declare -a cachedirs=() candidates=() cmdopts=() whitelist=() blacklist=()
>  declare -i delete=0 dryrun=0 filecount=0 move=0 needsroot=0 totalsaved=0 verbose=0
> +declare -i age_atime=0 age_mtime=0
>  declare delim=$'\n' keep=3 movedir= scanarch=
>
>  QUIET=0
> @@ -40,6 +41,29 @@ die() {
>    exit 1
>  }
>
> +# Parses the age --age-atime and --age-mtime arguments
> +parse_age() {
> +  declare -i age=0
> +  if [[ $2 =~ ^[[:space:]]*([0-9]+[dmy][[:space:]]*)+$ ]]; then
> +    # Add spaces to facilitate splitting
> +    temp=${2//d/d }
> +    temp=${temp//m/m }
> +    temp=${temp//y/y }
> +    read -a temp <<< "${temp[*]}"
> +    for a in ${temp[@]}; do
> +      num=${a:0: -1}
> +      case ${a: -1} in
> +        d) age=$(( age + num )) ;;
> +        m) age=$(( age + num * 30 )) ;;
> +        y) age=$(( age + num * 365 )) ;;
> +      esac
> +    done
> +  else
> +    die "argument '%s' to option '%s' must be of the form '([0-9]+[dmy])+'" "$2" "$1"
> +  fi
> +  echo $(( age * 24 * 60 * 60 ))
> +}
This seems extremely complex, and I can think of two alternatives that
would be a lot simpler.

First, find -mtime 2, with a suitable glob pattern, by replacing

    printf '%s\n' "$PWD"/*.pkg.tar!(*.sig)

when calculating the "candidates" array.

Second:

    compare_file="$(mktemp -t paccache_timestamp.XXXXXX)"
    touch -d '2 days ago' "$compare_file"

and then instead of this:
> +# remove any candidates that are not old enough yet
> +if (( $age_atime || $age_mtime )); then
> +  currtime=$(date +%s)
> +  for cand in "${candidates[@]}"; do
> +    IFS=';' read -d '' -a temp <<< $(stat --format '%X;%Y' "$cand")
> +    if (( ( $(( $currtime - ${temp[0]} )) > $age_atime ) && \
> +      ( $(( $currtime - ${temp[1]} )) > $age_mtime ) \
> +    )); then
> +      candtemp+=("$cand")
> +    fi
> +  done
> +  candidates=("${candtemp[@]}")
> +  unset candtemp
> +fi
    for cand in "${candidates[@]}"; do
        if [[ $cand -nt $compare_file ]]; then
            candtemp+=("$cand")
        fi
    done

The find command accepts a number of days, the touch command accepts a
fairly decent "natural language" description.

--
Eli Schwartz
Bug Wrangler and Trusted User
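
Both alternatives can be tried in a scratch directory. The sketch below rests on a few assumptions that are not taken from the thread: the file names are made up, GNU touch, mktemp and find are available, and the goal is to keep as removal candidates only the files *older* than the reference. For that reason it tests with -ot and uses `-mtime +2` (more than two whole days old), where `-mtime 2` alone would match only files exactly two days old.

```shell
#!/usr/bin/env bash
# Scratch directory with made-up package files.
workdir=$(mktemp -d)
touch -d '10 days ago' "$workdir/old.pkg.tar.xz"
touch "$workdir/new.pkg.tar.xz"

# Reference file whose mtime marks the cutoff.
compare_file=$(mktemp -t paccache_timestamp.XXXXXX)
touch -d '2 days ago' "$compare_file"

candidates=("$workdir"/*.pkg.tar.xz)
candtemp=()
for cand in "${candidates[@]}"; do
    # -ot compares modification times: true when $cand is older
    # than the reference file.
    if [[ $cand -ot $compare_file ]]; then
        candtemp+=("$cand")
    fi
done
candidates=("${candtemp[@]}")
unset candtemp
printf '%s\n' "${candidates[@]##*/}"

# The find-based alternative: packages (not signatures) last modified
# more than two whole days ago.
older_by_find=$(find "$workdir" -name '*.pkg.tar*' ! -name '*.sig' -mtime +2)
printf '%s\n' "${older_by_find##*/}"

rm -rf "$workdir" "$compare_file"
```

Both approaches select only the file dated ten days back. Note that -ot and -nt look at mtime only, so an atime-based cutoff would still need `stat` or `find -printf`.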
participants (3)
- Eli Schwartz
- Johannes Löthberg
- wisp3rwind