[arch-projects] [dbscripts] [PATCH 0/3] Fix ambiguous uses of
This was sort of cobbled together and not really tested, so I'm not 100% sure it will work, but it looks okay, so I am posting this to get more eyes on it. I think I've actually gotten this to work properly, which is yay, and support multiple extensions, which is meh but we may need this as Luke said, if we ever decide to switch over. Which means it is probably "more proper" to do so... As a bonus, we get to micro-optimize a few external calls away which saves us a handful of forked processes and should be just as fast not counting a fraction of a second gained for all those forks? Eli Schwartz (3): db-update: replace external find command with bash globbing ftpdir-cleanup,sourceballs: replace external find command with bash globbing Globally set $PKGEXT to a bash extended glob representing valid choices. config | 3 ++- cron-jobs/ftpdir-cleanup | 24 ++++++++++++++++++------ cron-jobs/sourceballs | 21 ++++++++++++++++----- db-functions | 13 +++++++++++-- db-update | 9 ++++++--- 5 files changed, 53 insertions(+), 17 deletions(-) -- 2.16.1
Don't bother emitting errors. bash doesn't show globbing errors if it cannot read a directory to try globbing there. And the former code never aborted on errors anyway, as without `set -o pipefail` the sort command swallowed the return code. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- db-functions | 4 ++++ db-update | 9 ++++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/db-functions b/db-functions index e8949d7..f0f8980 100644 --- a/db-functions +++ b/db-functions @@ -2,6 +2,10 @@ . /usr/share/makepkg/util.sh +# global shell options for enhanced bash scripting +shopt -s globstar nullglob + + # Some PKGBUILDs need CARCH to be set CARCH="x86_64" diff --git a/db-update b/db-update index 45755a4..fa024b3 100755 --- a/db-update +++ b/db-update @@ -9,9 +9,12 @@ if (( $# >= 1 )); then fi # Find repos with packages to release -if ! staging_repos=($(find "${STAGING}" -mindepth 1 -type f -name "*${PKGEXT}" -printf '%h\n' | sort -u)); then - die "Could not read %s" "$STAGING" -fi +for f in "${STAGING}"/**/*${PKGEXT}; do + f="${f%/*}" + if [[ -d $f ]] && ! in_array "$f" "${staging_repos[@]}"; then + staging_repos+=("$f") + fi +done repos=() for staging_repo in ${staging_repos[@]##*/}; do -- 2.16.1
This fully removes the use of find from the codebase, leads to a micro-optimization in a couple cases, and ensures that $PKGEXT is consistently treated as a shell globbing character (which is important because it is used as one). Of the eight instances in these files: - One was unnecessary as `cat` can natively consume all files passed to it and no directory traversal was in use. - Two were unnecessary as they were hardcoded to read a single file.... - Another four were only being used to strip leading directory paths, and can be replaced by globstar and ${filepath##*/} - The final two were checking the modification time of the files, and can be replaced with touch(1) and [[ -nt ]]. Although this introduces an additional temporary file, this is not such a big deal. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- cron-jobs/ftpdir-cleanup | 24 ++++++++++++++++++------ cron-jobs/sourceballs | 21 ++++++++++++++++----- 2 files changed, 34 insertions(+), 11 deletions(-) diff --git a/cron-jobs/ftpdir-cleanup b/cron-jobs/ftpdir-cleanup index 2f3d5aa..2d33047 100755 --- a/cron-jobs/ftpdir-cleanup +++ b/cron-jobs/ftpdir-cleanup @@ -38,7 +38,11 @@ for repo in ${PKGREPOS[@]}; do continue fi # get a list of actual available package files - find "${FTP_BASE}/${repo}/os/${arch}" -xtype f -name "*${PKGEXT}" -printf '%f\n' | sort > "${WORKDIR}/repo-${repo}-${arch}" + for f in "${FTP_BASE}"/${repo}/os/${arch}/*${PKGEXT}; do + if [[ -f $f ]]; then + printf '%s\n' "${f##*/}" + fi + done | sort > "${WORKDIR}/repo-${repo}-${arch}" # get a list of package files defined in the repo db bsdtar -xOf "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" | awk '/^%FILENAME%/{getline;print}' | sort > "${WORKDIR}/db-${repo}-${arch}" @@ -61,10 +65,12 @@ for repo in ${PKGREPOS[@]}; do done done -# get a list of all available packages in the pacakge pool -find "$FTP_BASE/${PKGPOOL}" -name "*${PKGEXT}" -printf '%f\n' | sort > "${WORKDIR}/pool" +# get a list of all available packages in the package pool +for f in "$FTP_BASE/${PKGPOOL}"/*${PKGEXT}; do + printf '%s\n' "${f##*/}" +done | sort > "${WORKDIR}/pool" # create a list of packages in our db -find "${WORKDIR}" -maxdepth 1 -type f -name 'db-*' -exec cat {} \; | sort -u > "${WORKDIR}/db" +cat "${WORKDIR}"/db-* 2>/dev/null | sort -u > "${WORKDIR}/db" old_pkgs=($(comm -23 "${WORKDIR}/pool" "${WORKDIR}/db")) if [ ${#old_pkgs[@]} -ge 1 ]; then @@ -75,8 +81,14 @@ if [ ${#old_pkgs[@]} -ge 1 ]; then done fi -old_pkgs=($(find ${CLEANUP_DESTDIR} -type f -name "*${PKGEXT}" -mtime +${CLEANUP_KEEP} -printf '%f\n')) -if [ ${#old_pkgs[@]} -ge 1 ]; then +unset old_pkgs +touch -d "${CLEANUP_KEEP} days ago" "${WORKDIR}/cleanup_timestamp" +for f in "${CLEANUP_DESTDIR}"/**/*${PKGEXT}; do + if [[ ${WORKDIR}/cleanup_timestamp -nt $f ]]; then + old_pkgs+=("${f##*/}") + fi +done +if (( ${#old_pkgs[@]} > 1 )); then msg "Removing old packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do msg2 "${old_pkg}" diff --git a/cron-jobs/sourceballs b/cron-jobs/sourceballs index 9ab4e98..5844817 100755 --- a/cron-jobs/sourceballs +++ b/cron-jobs/sourceballs @@ -46,7 +46,11 @@ for repo in ${PKGREPOS[@]}; do done # Create a list of all available source package file names -find "${FTP_BASE}/${SRCPOOL}" -xtype f -name "*${SRCEXT}" -printf '%f\n' | sort -u > "${WORKDIR}/available-src-pkgs" +for f in "${FTP_BASE}"/${SRCPOOL}/*${SRCEXT}; do + if [[ -f $f ]]; then + printf '%s\n' "${f##*/}" + fi +done | sort -u > "${WORKDIR}/available-src-pkgs" # Check for all packages if we need to build a source package for repo in ${PKGREPOS[@]}; do @@ -117,8 +121,8 @@ for repo in ${PKGREPOS[@]}; do done # Cleanup old source packages -find "${WORKDIR}" -maxdepth 1 -type f -name 'expected-src-pkgs' -exec cat {} \; | sort -u > "${WORKDIR}/expected-src-pkgs.sort" -find "${WORKDIR}" -maxdepth 1 -type f -name 'available-src-pkgs' -exec cat {} \; | sort -u > "${WORKDIR}/available-src-pkgs.sort" +sort -u "${WORKDIR}/expected-src-pkgs" > "${WORKDIR}/expected-src-pkgs.sort" +sort -u "${WORKDIR}/available-src-pkgs" > "${WORKDIR}/available-src-pkgs.sort" old_pkgs=($(comm -23 "${WORKDIR}/available-src-pkgs.sort" "${WORKDIR}/expected-src-pkgs.sort")) if [ ${#old_pkgs[@]} -ge 1 ]; then @@ -133,8 +137,15 @@ if [ ${#old_pkgs[@]} -ge 1 ]; then done fi -old_pkgs=($(find ${SOURCE_CLEANUP_DESTDIR} -type f -name "*${SRCEXT}" -mtime +${SOURCE_CLEANUP_KEEP} -printf '%f\n')) -if [ ${#old_pkgs[@]} -ge 1 ]; then +unset old_pkgs +touch -d "${SOURCE_CLEANUP_KEEP} days ago" "${WORKDIR}/cleanup_timestamp" +for f in "${SOURCE_CLEANUP_DESTDIR}"/*${SRCEXT}; do + if [[ ${WORKDIR}/cleanup_timestamp -nt $f ]]; then + old_pkgs+=("${f##*/}") + fi +done + +if (( ${#old_pkgs[@]} > 1 )); then msg "Removing old source packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do msg2 "${old_pkg}" -- 2.16.1
On Thu, 15 Feb 2018 22:45:03 -0500, Eli Schwartz via arch-projects wrote:
diff --git a/cron-jobs/ftpdir-cleanup b/cron-jobs/ftpdir-cleanup index 2f3d5aa..2d33047 100755 --- a/cron-jobs/ftpdir-cleanup +++ b/cron-jobs/ftpdir-cleanup ... -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} > 1 )); then
That should either be >= 1 or > 0.
diff --git a/cron-jobs/sourceballs b/cron-jobs/sourceballs index 9ab4e98..5844817 100755 --- a/cron-jobs/sourceballs +++ b/cron-jobs/sourceballs ... -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} > 1 )); then
Likewise. -- Happy hacking, ~ Luke Shumaker
On 02/17/2018 02:29 PM, Luke Shumaker wrote:
On Thu, 15 Feb 2018 22:45:03 -0500, Eli Schwartz via arch-projects wrote:
diff --git a/cron-jobs/ftpdir-cleanup b/cron-jobs/ftpdir-cleanup index 2f3d5aa..2d33047 100755 --- a/cron-jobs/ftpdir-cleanup +++ b/cron-jobs/ftpdir-cleanup ... -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} > 1 )); then
That should either be >= 1 or > 0.
diff --git a/cron-jobs/sourceballs b/cron-jobs/sourceballs index 9ab4e98..5844817 100755 --- a/cron-jobs/sourceballs +++ b/cron-jobs/sourceballs ... -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} > 1 )); then
Likewise.
Thanks. I've also noticed this whole patchset terribly breaks the testsuite. Which is sort of expected. We are overloading PKGEXT to mean something dbscripts specific, but that then (I think?) gets imported into makepkg during the testsuite builds. I'm going to rename it to PKGEXTS as that serves a number of purposes: it avoids clashing with makepkg, it is more descriptive of its actual purpose, and it provides a free semantic warning to future readers of the code that this variable is meant to be more than one thing, and extra care *must* be taken when using it. But the real issue is that we then use this variable to complete ${pkgnames[@]/%/${PKGEXT}} which works, sort of, as it coincidentally globs okay with ? but is technically quite wrong for the above mentioned reasons. Really, once pacman 5.1 is released containing my fix that makes --packagelist finally useful for the first time ever, this will automatically be fixed, as the use of print_all_package_names will simply return full filename paths and there will be no need to glob something that matches both filenames from your patch "Update tests to check for glob regression". The testsuite keeps looking for files that match some random dbscripts glob which has nothing to do with the hardcoded .pkg.tar.xz in a stock makepkg.conf, and the testsuite seems to be subtly buggy. Looks like PKGEXT is also used for complicated things in checkPackageDB, with more unquoted [ ] paths as well as grep -q "${pkgfile%${PKGEXT}}" which actually *breaks* with extglob. Because extended globs don't fall back on being a string literal -- which is behavior I approve of. So this needs to use ${pkgname}-${pkgver}-${pkgarch}. We can either have $pkgfile not include $PKGEXT, use more is_globfile (which is not actually available in the testsuite as db-functions is not sourced and will break everything if you try since it runs mktemp with the bats TMPDIR or something), or rename PKGEXT, and have the testsuite use the PKGEXT from makepkg.conf since that is what it will use anyway when running makepkg... -- Eli Schwartz Bug Wrangler and Trusted User
This can be anything makepkg.conf accepts, therefore it needs to be able to match all that. Document the fact that this has *always* been some sort of glob, and update the two cases where this was (not!) being evaluated by bash [[ ... ]], to use a proxy function is_globfile() Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- config | 3 ++- db-functions | 11 ++++++++--- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/config b/config index d2c1942..7a90fc6 100644 --- a/config +++ b/config @@ -25,7 +25,8 @@ TMPDIR="/var/tmp" ARCHES=(x86_64) DBEXT=".db.tar.gz" FILESEXT=".files.tar.gz" -PKGEXT=".pkg.tar.?z" +# bash glob listing allowed extensions. Note that db-functions turns on extglob. +PKGEXT=".pkg.tar.@(gz|bz2|xz|lzo|lrz|Z)" SRCEXT=".src.tar.gz" # Allowed licenses: get sourceballs only for licenses in this array diff --git a/db-functions b/db-functions index f0f8980..7cf8444 100644 --- a/db-functions +++ b/db-functions @@ -3,7 +3,7 @@ . /usr/share/makepkg/util.sh # global shell options for enhanced bash scripting -shopt -s globstar nullglob +shopt -s extglob globstar nullglob # Some PKGBUILDs need CARCH to be set @@ -20,6 +20,11 @@ restore_umask () { umask $UMASK >/dev/null } +# Check if a file exists, even if the file uses wildcards +is_globfile() { + [[ -f $1 ]] +} + # just like mv -f, but we touch the file and then copy the content so # default ACLs in the target dir will be applied mv_acl() { @@ -378,8 +383,8 @@ check_pkgrepos() { local pkgver="$(getpkgver ${pkgfile})" || return 1 local pkgarch="$(getpkgarch ${pkgfile})" || return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT} ]] && return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT}.sig ]] && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT}.sig && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1 -- 2.16.1
Comes with fancy checkmarks from travis saying that the testsuite passed: https://github.com/archlinux/dbscripts/commits/pkgext-real-wildcards Eli Schwartz (5): Use even more bashisms. Fix overloading PKGEXT to mean two things. db-update: replace external find command with bash globbing ftpdir-cleanup,sourceballs: replace external find command with bash globbing Globally set $PKGEXT to a bash extended glob representing valid choices. config | 3 ++- cron-jobs/devlist-mailer | 6 +++--- cron-jobs/ftpdir-cleanup | 36 ++++++++++++++++++++++++------------ cron-jobs/integrity-check | 2 +- cron-jobs/sourceballs | 31 +++++++++++++++++++++---------- cron-jobs/update-web-db | 6 +++--- db-functions | 13 +++++++++++-- db-move | 4 ++-- db-update | 17 +++++++++++------ test/cases/db-repo-add.bats | 6 +++--- test/cases/db-update.bats | 3 +-- test/cases/ftpdir-cleanup.bats | 6 +++--- test/lib/common.bash | 6 ++++-- 13 files changed, 89 insertions(+), 50 deletions(-) -- 2.16.2
Catch some cases that were missed in the previous run. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- This patch is new + refactor some changes from: ftpdir-cleanup,sourceballs: replace external find command with bash globbing cron-jobs/devlist-mailer | 6 +++--- cron-jobs/ftpdir-cleanup | 14 +++++++------- cron-jobs/integrity-check | 2 +- cron-jobs/sourceballs | 12 ++++++------ cron-jobs/update-web-db | 6 +++--- 5 files changed, 20 insertions(+), 20 deletions(-) diff --git a/cron-jobs/devlist-mailer b/cron-jobs/devlist-mailer index ca2e46b..65a5483 100755 --- a/cron-jobs/devlist-mailer +++ b/cron-jobs/devlist-mailer @@ -7,17 +7,17 @@ LIST="arch-dev-public@archlinux.org" FROM="repomaint@archlinux.org" SUBJECT="Repository Maintenance $(date +"%d-%m-%Y")" -if [ $# -ge 1 ]; then +if (( $# >= 1 )); then SUBJECT="$1 $(date +"%d-%m-%Y")" fi -if [ $# -ge 2 ]; then +if (( $# >= 2 )); then LIST="$2" fi stdin="$(cat)" #echo used to strip whitespace for checking for actual data -if [ -n "$(echo $stdin)" ]; then +if [[ -n "$(echo $stdin)" ]]; then echo "Subject: $SUBJECT To: $LIST diff --git a/cron-jobs/ftpdir-cleanup b/cron-jobs/ftpdir-cleanup index 2f3d5aa..c771950 100755 --- a/cron-jobs/ftpdir-cleanup +++ b/cron-jobs/ftpdir-cleanup @@ -9,11 +9,11 @@ clean_pkg() { if ! ${CLEANUP_DRYRUN}; then for pkg in "$@"; do - if [ -h "$pkg" ]; then + if [[ -h $pkg ]]; then rm -f "$pkg" "$pkg.sig" else mv_acl "$pkg" "$CLEANUP_DESTDIR/${pkg##*/}" - if [ -e "$pkg.sig" ]; then + if [[ -e $pkg.sig ]]; then mv_acl "$pkg.sig" "$CLEANUP_DESTDIR/${pkg##*/}.sig" fi touch "${CLEANUP_DESTDIR}/${pkg##*/}" @@ -34,7 +34,7 @@ ${CLEANUP_DRYRUN} && warning 'dry run mode is active' for repo in ${PKGREPOS[@]}; do for arch in ${ARCHES[@]}; do - if [ ! -f "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" ]; then + if [[ ! -f ${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT} ]]; then continue fi # get a list of actual available package files @@ -43,7 +43,7 @@ for repo in ${PKGREPOS[@]}; do bsdtar -xOf "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" | awk '/^%FILENAME%/{getline;print}' | sort > "${WORKDIR}/db-${repo}-${arch}" missing_pkgs=($(comm -13 "${WORKDIR}/repo-${repo}-${arch}" "${WORKDIR}/db-${repo}-${arch}")) - if [ ${#missing_pkgs[@]} -ge 1 ]; then + if (( ${#missing_pkgs[@]} >= 1 )); then error "Missing packages in [%s] (%s)..." "$repo" "$arch" for missing_pkg in ${missing_pkgs[@]}; do msg2 "${missing_pkg}" @@ -51,7 +51,7 @@ for repo in ${PKGREPOS[@]}; do fi old_pkgs=($(comm -23 "${WORKDIR}/repo-${repo}-${arch}" "${WORKDIR}/db-${repo}-${arch}")) - if [ ${#old_pkgs[@]} -ge 1 ]; then + if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old packages from [%s] (%s)..." "$repo" "$arch" for old_pkg in ${old_pkgs[@]}; do msg2 "${old_pkg}" @@ -67,7 +67,7 @@ find "$FTP_BASE/${PKGPOOL}" -name "*${PKGEXT}" -printf '%f\n' | sort > "${WORKDI find "${WORKDIR}" -maxdepth 1 -type f -name 'db-*' -exec cat {} \; | sort -u > "${WORKDIR}/db" old_pkgs=($(comm -23 "${WORKDIR}/pool" "${WORKDIR}/db")) -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old packages from package pool..." for old_pkg in ${old_pkgs[@]}; do msg2 "${old_pkg}" @@ -76,7 +76,7 @@ if [ ${#old_pkgs[@]} -ge 1 ]; then fi old_pkgs=($(find ${CLEANUP_DESTDIR} -type f -name "*${PKGEXT}" -mtime +${CLEANUP_KEEP} -printf '%f\n')) -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do msg2 "${old_pkg}" diff --git a/cron-jobs/integrity-check b/cron-jobs/integrity-check index 211a24b..f1e75ab 100755 --- a/cron-jobs/integrity-check +++ b/cron-jobs/integrity-check @@ -7,7 +7,7 @@ dirname="$(dirname $0)" script_lock -if [ $# -ne 1 ]; then +if (( $# != 1 )); then die "usage: ${0##*/} <mailto>" fi mailto=$1 diff --git a/cron-jobs/sourceballs b/cron-jobs/sourceballs index 9ab4e98..25a8abb 100755 --- a/cron-jobs/sourceballs +++ b/cron-jobs/sourceballs @@ -21,7 +21,7 @@ renice +10 -p $$ > /dev/null for repo in ${PKGREPOS[@]}; do for arch in ${ARCHES[@]}; do # Repo does not exist; skip it - if [ ! -f "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" ]; then + if [[ ! -f ${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT} ]]; then continue fi bsdtar -xOf "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" \ @@ -81,7 +81,7 @@ for repo in ${PKGREPOS[@]}; do mkdir -p -m0770 "${WORKDIR}/pkgbuilds/${repo}-${pkgarch}" arch_svn export -q "${SVNREPO}/${pkgbase}/repos/${repo}-${pkgarch}" \ "${WORKDIR}/pkgbuilds/${repo}-${pkgarch}/${pkgbase}" >/dev/null 2>&1 - if [ $? -ge 1 ]; then + if (( $? >= 1 )); then failedpkgs+=("${pkgbase}-${pkgver}${SRCEXT}") continue fi @@ -89,7 +89,7 @@ for repo in ${PKGREPOS[@]}; do # Build the actual source package pushd "${WORKDIR}/pkgbuilds/${repo}-${pkgarch}/${pkgbase}" >/dev/null makepkg --nocolor --allsource --ignorearch --skippgpcheck --config "${dirname}/makepkg.conf" >"${WORKDIR}/${pkgbase}.log" 2>&1 - if [ $? -eq 0 ] && [ -f "${pkgbase}-${pkgver}${SRCEXT}" ]; then + if (( $? == 0 )) && [[ -f ${pkgbase}-${pkgver}${SRCEXT} ]]; then mv_acl "${pkgbase}-${pkgver}${SRCEXT}" "${FTP_BASE}/${SRCPOOL}/${pkgbase}-${pkgver}${SRCEXT}" # Avoid creating the same source package for every arch echo "${pkgbase}-${pkgver}${SRCEXT}" >> "${WORKDIR}/available-src-pkgs" @@ -121,7 +121,7 @@ find "${WORKDIR}" -maxdepth 1 -type f -name 'expected-src-pkgs' -exec cat {} \; find "${WORKDIR}" -maxdepth 1 -type f -name 'available-src-pkgs' -exec cat {} \; | sort -u > "${WORKDIR}/available-src-pkgs.sort" old_pkgs=($(comm -23 "${WORKDIR}/available-src-pkgs.sort" "${WORKDIR}/expected-src-pkgs.sort")) -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old source packages..." ${SOURCE_CLEANUP_DRYRUN} && warning 'dry run mode is active' for old_pkg in ${old_pkgs[@]}; do @@ -134,7 +134,7 @@ if [ ${#old_pkgs[@]} -ge 1 ]; then fi old_pkgs=($(find ${SOURCE_CLEANUP_DESTDIR} -type f -name "*${SRCEXT}" -mtime +${SOURCE_CLEANUP_KEEP} -printf '%f\n')) -if [ ${#old_pkgs[@]} -ge 1 ]; then +if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old source packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do msg2 "${old_pkg}" @@ -142,7 +142,7 @@ if [ ${#old_pkgs[@]} -ge 1 ]; then done fi -if [ -f "${WORKDIR}/makepkg-fail.log" ]; then +if [[ -f ${WORKDIR}/makepkg-fail.log ]]; then msg "Log of failed packages" cat "${WORKDIR}/makepkg-fail.log" fi diff --git a/cron-jobs/update-web-db b/cron-jobs/update-web-db index 23af067..c859ce0 100755 --- a/cron-jobs/update-web-db +++ b/cron-jobs/update-web-db @@ -27,7 +27,7 @@ renice +5 -p $$ > /dev/null echo "%s: Updating DB at %s" "$cmd" "$(date)" >> "${LOGOUT}" # source our virtualenv if it exists -if [ -f "$ENVPATH" ]; then +if [[ -f "$ENVPATH" ]]; then . "$ENVPATH" fi @@ -47,7 +47,7 @@ for repo in ${REPOS[@]}; do for arch in ${ARCHES[@]}; do repo_lock ${repo} ${arch} || exit 1 dbfile="/srv/ftp/${repo}/os/${arch}/${repo}${dbfileext}" - if [ -f "${dbfile}" ]; then + if [[ -f ${dbfile} ]]; then mkdir -p "${WORKDIR}/${repo}/${arch}" cp "${dbfile}" "${WORKDIR}/${repo}/${arch}/${repo}${dbfileext}" fi @@ -60,7 +60,7 @@ pushd $SPATH >/dev/null for repo in ${REPOS[@]}; do for arch in ${ARCHES[@]}; do dbcopy="${WORKDIR}/${repo}/${arch}/${repo}${dbfileext}" - if [ -f "${dbcopy}" ]; then + if [[ -f ${dbcopy} ]]; then echo "Updating ${repo}-${arch}" >> "${LOGOUT}" ./manage.py reporead ${flags} ${arch} "${dbcopy}" >> "${LOGOUT}" 2>&1 echo "" >> "${LOGOUT}" -- 2.16.2
Hi Eli, Disclaimer: the following is a bit subtle topic, so I hope it doesn't spur a lot of off-topic. On 19 February 2018 at 20:11, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote:
Catch some cases that were missed in the previous run.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> ---
This patch is new + refactor some changes from: ftpdir-cleanup,sourceballs: replace external find command with bash globbing
cron-jobs/devlist-mailer | 6 +++--- cron-jobs/ftpdir-cleanup | 14 +++++++------- cron-jobs/integrity-check | 2 +- cron-jobs/sourceballs | 12 ++++++------ cron-jobs/update-web-db | 6 +++--- 5 files changed, 20 insertions(+), 20 deletions(-)
Is there any performance or other technical benefit to using more bashisms? Reason being, that I am slowly going through different parts of Arch making it zsh friendly. While keeping the code brief and legible, of course. Guessing that I've picked the wrong hobby? Thanks Emil
On Tue, Feb 20, 2018 at 11:59:49AM +0000, Emil Velikov via arch-projects wrote:
Hi Eli,
Disclaimer: the following is a bit subtle topic, so I hope it doesn't spur a lot of off-topic.
On 19 February 2018 at 20:11, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote:
Catch some cases that were missed in the previous run.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> ---
This patch is new + refactor some changes from: ftpdir-cleanup,sourceballs: replace external find command with bash globbing
cron-jobs/devlist-mailer | 6 +++--- cron-jobs/ftpdir-cleanup | 14 +++++++------- cron-jobs/integrity-check | 2 +- cron-jobs/sourceballs | 12 ++++++------ cron-jobs/update-web-db | 6 +++--- 5 files changed, 20 insertions(+), 20 deletions(-)
Is there any performance or other technical benefit to using more bashisms?
The scripts run under bash, so why not take advantage of bash features? For example, bash's [[ and (( are less error prone and more featureful than the POSIX [, and builtins like mapfile and read (POSIX read has an extremely limited featureset) make I/O far simpler tasks. There's plenty more to like... Please don't try to talk about performance and shell in the same sentence. These are not performance-sensitive scripts, and shell is not a language to use when performance (of almost any kind) is relevant.
Reason being, that I am slowly going through different parts of Arch making it zsh friendly. While keeping the code brief and legible, of course.
Feel free to exemplify how conversion from bash to zsh has aided your goals while retaining portability to a supermajority of Arch systems. $ pacman -Q zsh error: package 'zsh' was not found
Guessing that I've picked the wrong hobby?
Almost certainly.
Thanks Emil
On 02/20/2018 06:59 AM, Emil Velikov wrote:
Disclaimer: the following is a bit subtle topic, so I hope it doesn't spur a lot of off-topic.
Eh, I don't mind.
Is there any performance or other technical benefit to using more bashisms?
Reason being, that I am slowly going through different parts of Arch making it zsh friendly. While keeping the code brief and legible, of course. Guessing that I've picked the wrong hobby?
I think you'll probably find that few people write zsh scripts for non-interactive use. I'm not really sure what the point would be, considering it has a nonstandard syntax (bash is ubiquitous, zsh is not), and many people who would know bash would not know zsh (like me for example). AFAIK zsh should more or less run either bash or POSIX sh scripts just fine if you invoke it via a symlink named `sh` or `bash`, because zsh has a bash compatibility mode. I have no idea whether that bash compatibility mode fixes subtle things like the fact that zsh arrays are 1-indexed while bash arrays are 0-indexed, but if I had to guess, probably not. ... I can see some compelling reasons to write scripts targeting POSIX sh as a baseline, which is being *sh* friendly, not zsh friendly. But, for projects that make heavy use of bashisms anyways, I dislike using POSIX because it implies that sh will be supported in any way when it really won't be. Essentially, I prefer to go "all in". As for why you'd want them, bashisms generally look cleaner IMHO, and they add a great deal of power and flexibility to the shell. Things like [[ ... ]] are just a lot more sane in basically every way, shell arithmetic uses proper operators, etc. -- Eli Schwartz Bug Wrangler and Trusted User
On 20 February 2018 at 14:23, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 02/20/2018 06:59 AM, Emil Velikov wrote:
Disclaimer: the following is a bit subtle topic, so I hope it doesn't spur a lot of off-topic.
Eh, I don't mind.
Is there any performance or other technical benefit to using more bashisms?
Reason being, that I am slowly going through different parts of Arch making it zsh friendly. While keeping the code brief and legible, of course. Guessing that I've picked the wrong hobby?
I think you'll probably find that few people write zsh scripts for non-interactive use. I'm not really sure what the point would be, considering it has a nonstandard syntax (bash is ubiquitous, zsh is not), and many people who would know bash would not know zsh (like me for example).
AFAIK zsh should more or less run either bash or POSIX sh scripts just fine if you invoke it via a symlink named `sh` or `bash`, because zsh has a bash compatibility mode. I have no idea whether that bash compatibility mode fixes subtle things like the fact that zsh arrays are 1-indexed while bash arrays are 0-indexed, but if I had to guess, probably not.
...
I can see some compelling reasons to write scripts targeting POSIX sh as a baseline, which is being *sh* friendly, not zsh friendly. But, for projects that make heavy use of bashisms anyways, I dislike using POSIX because it implies that sh will be supported in any way when it really won't be. Essentially, I prefer to go "all in".
As for why you'd want them, bashisms generally look cleaner IMHO, and they add a great deal of power and flexibility to the shell. Things like [[ ... ]] are just a lot more sane in basically every way, shell arithmetic uses proper operators, etc.
Seems like I wasn't clear enough: The goal is not to appease zsh - but a step closer to POSIX sh friendly. I've been staring and writing bash (closer to POSIX sh really) scripts for over a decade, haven't seen what makes X cleaner over Y. Yet that's subjective, unlike the original argument - consistency rules ;-) Thanks Emil
On 02/20/2018 12:24 PM, Emil Velikov wrote:
Seems like I wasn't clear enough: The goal is not to appease zsh - but a step closer to POSIX sh friendly.
I've been staring and writing bash (closer to POSIX sh really) scripts for over a decade, haven't seen what makes X cleaner over Y. Yet that's subjective, unlike the original argument - consistency rules ;-)
If you're working for "POSIX sh friendly", why are you mentioning zsh in the first place? As for targeting POSIX sh, if you can do that then sure. I have personal scripts written for sh when it makes sense (and my /bin/sh is symlinked to dash, so I actually use sh). But yeah, consistency rules. -- Eli Schwartz Bug Wrangler and Trusted User
PKGEXT is a makepkg variable referring to a fixed filename suffix, but we were also using it to mean a bash glob referring to candidate filenames. This is wrong, so rename it to PKGEXTS which is more descriptive of its purpose. Exclude the testsuite from this change, as the testsuite actually uses PKGEXT for its intended purpose. Fix the testsuite to consistently use PKGEXT, as it hardcoded the file extension in several cases, and extract its value from the makepkg.conf we ship. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- This is new, and renders some old things obsolete. I see no need to jump through more hoops than strictly necessary, plus, lots of this was technically broken anyway -- but it worked because again, the testsuite is not using PKGEXT in the dbscripts sense, rather in the makepkg sense. I'm hoping some of the logic there can be replaced with my --packagelist fixes for makepkg 5.1 as well. config | 2 +- cron-jobs/ftpdir-cleanup | 6 +++--- db-functions | 4 ++-- db-move | 4 ++-- db-update | 8 ++++---- test/cases/db-repo-add.bats | 6 +++--- test/cases/db-update.bats | 3 +-- test/cases/ftpdir-cleanup.bats | 6 +++--- test/lib/common.bash | 6 ++++-- 9 files changed, 23 insertions(+), 22 deletions(-) diff --git a/config b/config index d2c1942..5bb3b16 100644 --- a/config +++ b/config @@ -25,7 +25,7 @@ TMPDIR="/var/tmp" ARCHES=(x86_64) DBEXT=".db.tar.gz" FILESEXT=".files.tar.gz" -PKGEXT=".pkg.tar.?z" +PKGEXTS=".pkg.tar.?z" SRCEXT=".src.tar.gz" # Allowed licenses: get sourceballs only for licenses in this array diff --git a/cron-jobs/ftpdir-cleanup b/cron-jobs/ftpdir-cleanup index c771950..630efa8 100755 --- a/cron-jobs/ftpdir-cleanup +++ b/cron-jobs/ftpdir-cleanup @@ -38,7 +38,7 @@ for repo in ${PKGREPOS[@]}; do continue fi # get a list of actual available package files - find "${FTP_BASE}/${repo}/os/${arch}" -xtype f -name "*${PKGEXT}" -printf '%f\n' | sort > "${WORKDIR}/repo-${repo}-${arch}" + find "${FTP_BASE}/${repo}/os/${arch}" -xtype f -name "*${PKGEXTS}" -printf '%f\n' | sort > "${WORKDIR}/repo-${repo}-${arch}" # get a list of package files defined in the repo db bsdtar -xOf "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" | awk '/^%FILENAME%/{getline;print}' | sort > "${WORKDIR}/db-${repo}-${arch}" @@ -62,7 +62,7 @@ for repo in ${PKGREPOS[@]}; do done # get a list of all available packages in the pacakge pool -find "$FTP_BASE/${PKGPOOL}" -name "*${PKGEXT}" -printf '%f\n' | sort > "${WORKDIR}/pool" +find "$FTP_BASE/${PKGPOOL}" -name "*${PKGEXTS}" -printf '%f\n' | sort > "${WORKDIR}/pool" # create a list of packages in our db find "${WORKDIR}" -maxdepth 1 -type f -name 'db-*' -exec cat {} \; | sort -u > "${WORKDIR}/db" @@ -75,7 +75,7 @@ if (( ${#old_pkgs[@]} >= 1 )); then done fi -old_pkgs=($(find ${CLEANUP_DESTDIR} -type f -name "*${PKGEXT}" -mtime +${CLEANUP_KEEP} -printf '%f\n')) +old_pkgs=($(find ${CLEANUP_DESTDIR} -type f -name "*${PKGEXTS}" -mtime +${CLEANUP_KEEP} -printf '%f\n')) if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do diff --git a/db-functions b/db-functions index e8949d7..e8eb2bc 100644 --- a/db-functions +++ b/db-functions @@ -374,8 +374,8 @@ check_pkgrepos() { local pkgver="$(getpkgver ${pkgfile})" || return 1 local pkgarch="$(getpkgarch ${pkgfile})" || return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT} ]] && return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT}.sig ]] && return 1 + [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS} ]] && return 1 + [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS}.sig ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1 diff --git a/db-move b/db-move index 37a9884..fb7ebac 100755 --- a/db-move +++ b/db-move @@ -49,7 +49,7 @@ for pkgbase in ${args[@]:2}; do for pkgname in ${pkgnames[@]}; do for tarch in ${tarches[@]}; do - getpkgfile "${ftppath_from}/${tarch}/"${pkgname}-${pkgver}-${pkgarch}${PKGEXT} >/dev/null + getpkgfile "${ftppath_from}/${tarch}/"${pkgname}-${pkgver}-${pkgarch}${PKGEXTS} >/dev/null done done continue 2 @@ -95,7 +95,7 @@ for pkgbase in ${args[@]:2}; do for pkgname in ${pkgnames[@]}; do for tarch in ${tarches[@]}; do - pkgpath=$(getpkgfile "${ftppath_from}/${tarch}/"${pkgname}-${pkgver}-${pkgarch}${PKGEXT}) + pkgpath=$(getpkgfile "${ftppath_from}/${tarch}/"${pkgname}-${pkgver}-${pkgarch}${PKGEXTS}) pkgfile="${pkgpath##*/}" ln -s "../../../${PKGPOOL}/${pkgfile}" ${ftppath_to}/${tarch}/ diff --git a/db-update b/db-update index 45755a4..a8d885a 100755 --- a/db-update +++ b/db-update @@ -9,7 +9,7 @@ if (( $# >= 1 )); then fi # Find repos with packages to release -if ! staging_repos=($(find "${STAGING}" -mindepth 1 -type f -name "*${PKGEXT}" -printf '%h\n' | sort -u)); then +if ! staging_repos=($(find "${STAGING}" -mindepth 1 -type f -name "*${PKGEXTS}" -printf '%h\n' | sort -u)); then die "Could not read %s" "$STAGING" fi @@ -32,7 +32,7 @@ for repo in ${repos[@]}; do if ! check_repo_permission "${repo}"; then die "You don't have permission to update packages in %s" "$repo" fi - pkgs=($(getpkgfiles "${STAGING}/${repo}/"*${PKGEXT})) + pkgs=($(getpkgfiles "${STAGING}/${repo}/"*${PKGEXTS})) if (( $? == 0 )); then for pkg in ${pkgs[@]}; do if [[ -h ${pkg} ]]; then @@ -70,10 +70,10 @@ done for repo in ${repos[@]}; do msg "Updating [%s]..." "$repo" - any_pkgs=($(getpkgfiles "${STAGING}/${repo}/"*-any${PKGEXT} 2>/dev/null)) + any_pkgs=($(getpkgfiles "${STAGING}/${repo}/"*-any${PKGEXTS} 2>/dev/null)) for pkgarch in ${ARCHES[@]}; do add_pkgs=() - arch_pkgs=($(getpkgfiles "${STAGING}/${repo}/"*-${pkgarch}${PKGEXT} 2>/dev/null)) + arch_pkgs=($(getpkgfiles "${STAGING}/${repo}/"*-${pkgarch}${PKGEXTS} 2>/dev/null)) for pkg in ${arch_pkgs[@]} ${any_pkgs[@]}; do pkgfile="${pkg##*/}" msg2 "${pkgfile} (${pkgarch})" diff --git a/test/cases/db-repo-add.bats b/test/cases/db-repo-add.bats index 9fde381..ac91058 100644 --- a/test/cases/db-repo-add.bats +++ b/test/cases/db-repo-add.bats @@ -31,7 +31,7 @@ __movePackageToRepo() { releasePackage extra ${pkgbase} for arch in ${arches[@]}; do __movePackageToRepo extra ${pkgbase} ${arch} - db-repo-add extra ${arch} ${pkgbase}-1-1-${arch}.pkg.tar.xz + db-repo-add extra ${arch} ${pkgbase}-1-1-${arch}${PKGEXT} done done @@ -54,7 +54,7 @@ __movePackageToRepo() { add_pkgs=() for pkgbase in ${pkgs[@]}; do __movePackageToRepo extra ${pkgbase} ${arch} - add_pkgs+=("${pkgbase}-1-1-${arch}.pkg.tar.xz") + add_pkgs+=("${pkgbase}-1-1-${arch}${PKGEXT}") done db-repo-add extra ${arch} ${add_pkgs[@]} done @@ -73,7 +73,7 @@ __movePackageToRepo() { for pkgbase in ${pkgs[@]}; do releasePackage extra ${pkgbase} __movePackageToRepo extra ${pkgbase} any - db-repo-add extra any ${pkgbase}-1-1-any.pkg.tar.xz + db-repo-add extra any ${pkgbase}-1-1-any${PKGEXT} done for pkgbase in ${pkgs[@]}; do diff --git a/test/cases/db-update.bats b/test/cases/db-update.bats index 1da7eef..349b195 100644 --- a/test/cases/db-update.bats +++ b/test/cases/db-update.bats @@ -151,8 +151,7 @@ load ../lib/common local p releasePackage extra 'pkg-any-a' for p in "${STAGING}"/extra/*${PKGEXT}; do - unxz $p - xz -0 ${p%%.xz} + printf '%s\n' "Not a real package" | gpg -v --detach-sign --no-armor --use-agent - > "${p}.sig" done run db-update [ "$status" -ne 0 ] diff --git a/test/cases/ftpdir-cleanup.bats b/test/cases/ftpdir-cleanup.bats index 6280ce0..fd485f3 100644 --- a/test/cases/ftpdir-cleanup.bats +++ b/test/cases/ftpdir-cleanup.bats @@ -82,7 +82,7 @@ __checkRepoRemovedPackage() { db-remove extra any pkg-any-a ftpdir-cleanup - local pkg1='pkg-any-a-1-1-any.pkg.tar.xz' + local pkg1="pkg-any-a-1-1-any${PKGEXT}" checkRemovedPackage extra 'pkg-any-a' for arch in ${arches[@]}; do __checkRepoRemovedPackage extra 'pkg-any-a' ${arch} @@ -138,8 +138,8 @@ __checkRepoRemovedPackage() { ftpdir-cleanup - local pkgfilea="pkg-simple-a-1-1-${arch}.pkg.tar.xz" - local pkgfileb="pkg-simple-b-1-1-${arch}.pkg.tar.xz" + local pkgfilea="pkg-simple-a-1-1-${arch}${PKGEXT}" + local pkgfileb="pkg-simple-b-1-1-${arch}${PKGEXT}" for arch in ${arches[@]}; do touch -d "-$(expr ${CLEANUP_KEEP} + 1)days" ${CLEANUP_DESTDIR}/${pkgfilea}{,.sig} done diff --git a/test/lib/common.bash b/test/lib/common.bash index 540e403..1b257ab 100644 --- a/test/lib/common.bash +++ b/test/lib/common.bash @@ -34,9 +34,9 @@ __buildPackage() { pkgarches=($(. PKGBUILD; echo ${arch[@]})) for tarch in ${pkgarches[@]}; do if [ "${tarch}" == 'any' ]; then - PKGDEST=${pkgdest} makepkg -c + PKGDEST=${pkgdest} makepkg --config "$MAKEPKG_CONF" -c else - PKGDEST=${pkgdest} CARCH=${tarch} makepkg -c + PKGDEST=${pkgdest} CARCH=${tarch} makepkg --config "$MAKEPKG_CONF" -c fi done @@ -82,6 +82,8 @@ setup() { local pkg local r local a + MAKEPKG_CONF="$PWD/../cron-jobs/makepkg.conf" + PKGEXT="$(. "$MAKEPKG_CONF"; printf '%s' "$PKGEXT")" TMP="$(mktemp -d)" -- 2.16.2
Don't bother emitting errors. bash doesn't show globbing errors if it cannot read a directory to try globbing there. And the former code never aborted on errors anyway, as without `set -o pipefail` the sort command swallowed the return code. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- v2: use mapfile as suggested by Luke, rather than running in_array in a loop. db-functions | 4 ++++ db-update | 11 ++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/db-functions b/db-functions index e8eb2bc..394c7a2 100644 --- a/db-functions +++ b/db-functions @@ -2,6 +2,10 @@ . /usr/share/makepkg/util.sh +# global shell options for enhanced bash scripting +shopt -s globstar nullglob + + # Some PKGBUILDs need CARCH to be set CARCH="x86_64" diff --git a/db-update b/db-update index a8d885a..db12df8 100755 --- a/db-update +++ b/db-update @@ -9,9 +9,14 @@ if (( $# >= 1 )); then fi # Find repos with packages to release -if ! staging_repos=($(find "${STAGING}" -mindepth 1 -type f -name "*${PKGEXTS}" -printf '%h\n' | sort -u)); then - die "Could not read %s" "$STAGING" -fi +mapfile -t -d '' staging_repos < <( + for f in "${STAGING}"/**/*${PKGEXTS}; do + f="${f%/*}" + if [[ -d $f ]]; then + printf '%s\0' "$f" + fi + done | sort -uz +) repos=() for staging_repo in ${staging_repos[@]##*/}; do -- 2.16.2
On Mon, 19 Feb 2018 15:11:43 -0500, Eli Schwartz via arch-projects wrote:
--- a/db-update +++ b/db-update @@ -9,9 +9,14 @@ if (( $# >= 1 )); then fi
# Find repos with packages to release -if ! staging_repos=($(find "${STAGING}" -mindepth 1 -type f -name "*${PKGEXTS}" -printf '%h\n' | sort -u)); then - die "Could not read %s" "$STAGING" -fi +mapfile -t -d '' staging_repos < <( + for f in "${STAGING}"/**/*${PKGEXTS}; do + f="${f%/*}" + if [[ -d $f ]]; then + printf '%s\0' "$f" + fi + done | sort -uz +)
repos=() for staging_repo in ${staging_repos[@]##*/}; do
Isn't [[ -d ]] there redundant? If globbing gave us $dir/file, of course $dir is a directory! Meanwhile, this dropped the `-type f` check, though I'm not sure how important that was. Shouldn't this be written as: mapfile -t -d '' staging_repos < <( for f in "${STAGING}"/**/*${PKGEXTS}; do if [[ -f $f && ! -h $f ]]; then printf '%s\0' "${f/*}" fi done | sort -uz ) The original `find` command rejected symlinks; I don't know if that's an important property; but that's what the `&& ! -h $f` bit is for. -- Happy hacking, ~ Luke Shumaker
On 02/19/2018 04:53 PM, Luke Shumaker wrote:
Isn't [[ -d ]] there redundant? If globbing gave us $dir/file, of course $dir is a directory!
True. I think I still had that in from some point where I hadn't enabled nullglob yet.
Meanwhile, this dropped the `-type f` check, though I'm not sure how important that was.
Shouldn't this be written as:
mapfile -t -d '' staging_repos < <( for f in "${STAGING}"/**/*${PKGEXTS}; do if [[ -f $f && ! -h $f ]]; then printf '%s\0' "${f/*}" fi done | sort -uz )
The original `find` command rejected symlinks; I don't know if that's an important property; but that's what the `&& ! -h $f` bit is for.
It is not important, the find command only checked if the file itself was a symlink but if there is another package file in the same directory then we still add those staging repos. Meanwhile, we check later on for `die "Package %s is a symbolic link"`. So I guess technically it would make more sense to stage the package and then utilize the explicit error message rather than silently dropping the package altogether (but only sometimes) simply because we didn't think to use -xtype. At this stage in the game, we're just trying to assemble a list of the packages that the uploader is asserting they want to db-update. We perform all actual validation later on. -- Eli Schwartz Bug Wrangler and Trusted User
Don't bother emitting errors. bash doesn't show globbing errors if it cannot read a directory to try globbing there. And the former code never aborted on errors anyway, as without `set -o pipefail` the sort command swallowed the return code. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- v3: remove extraneous use of [[ -d ... ]] db-functions | 4 ++++ db-update | 8 +++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/db-functions b/db-functions index e8eb2bc..394c7a2 100644 --- a/db-functions +++ b/db-functions @@ -2,6 +2,10 @@ . /usr/share/makepkg/util.sh +# global shell options for enhanced bash scripting +shopt -s globstar nullglob + + # Some PKGBUILDs need CARCH to be set CARCH="x86_64" diff --git a/db-update b/db-update index a8d885a..4e17184 100755 --- a/db-update +++ b/db-update @@ -9,9 +9,11 @@ if (( $# >= 1 )); then fi # Find repos with packages to release -if ! staging_repos=($(find "${STAGING}" -mindepth 1 -type f -name "*${PKGEXTS}" -printf '%h\n' | sort -u)); then - die "Could not read %s" "$STAGING" -fi +mapfile -t -d '' staging_repos < <( + for f in "${STAGING}"/**/*${PKGEXTS}; do + printf '%s\0' "${f%/*}" + done | sort -uz +) repos=() for staging_repo in ${staging_repos[@]##*/}; do -- 2.16.2
This fully removes the use of find from the codebase, leads to a micro-optimization in a couple cases, and ensures that $PKGEXT is consistently treated as a shell globbing character (which is important because it is used as one). Of the eight instances in these files: - One was unnecessary as `cat` can natively consume all files passed to it and no directory traversal was in use. - Two were unnecessary as they were hardcoded to read a single file.... - Another four were only being used to strip leading directory paths, and can be replaced by globstar and ${filepath##*/} - The final two were checking the modification time of the files, and can be replaced with touch(1) and [[ -nt ]]. Although this introduces an additional temporary file, this is not such a big deal. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- v2: split out unrelated bash cleanup changes into patch 1/5 cron-jobs/ftpdir-cleanup | 22 +++++++++++++++++----- cron-jobs/sourceballs | 19 +++++++++++++++---- 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/cron-jobs/ftpdir-cleanup b/cron-jobs/ftpdir-cleanup index 630efa8..ff65d46 100755 --- a/cron-jobs/ftpdir-cleanup +++ b/cron-jobs/ftpdir-cleanup @@ -38,7 +38,11 @@ for repo in ${PKGREPOS[@]}; do continue fi # get a list of actual available package files - find "${FTP_BASE}/${repo}/os/${arch}" -xtype f -name "*${PKGEXTS}" -printf '%f\n' | sort > "${WORKDIR}/repo-${repo}-${arch}" + for f in "${FTP_BASE}"/${repo}/os/${arch}/*${PKGEXTS}; do + if [[ -f $f ]]; then + printf '%s\n' "${f##*/}" + fi + done | sort > "${WORKDIR}/repo-${repo}-${arch}" # get a list of package files defined in the repo db bsdtar -xOf "${FTP_BASE}/${repo}/os/${arch}/${repo}${DBEXT}" | awk '/^%FILENAME%/{getline;print}' | sort > "${WORKDIR}/db-${repo}-${arch}" @@ -61,10 +65,12 @@ for repo in ${PKGREPOS[@]}; do done done -# get a list of all available packages in the pacakge pool -find "$FTP_BASE/${PKGPOOL}" -name "*${PKGEXTS}" -printf '%f\n' | sort > "${WORKDIR}/pool" +# get a list of all available packages in the package pool +for f in "$FTP_BASE/${PKGPOOL}"/*${PKGEXTS}; do + printf '%s\n' "${f##*/}" +done | sort > "${WORKDIR}/pool" # create a list of packages in our db -find "${WORKDIR}" -maxdepth 1 -type f -name 'db-*' -exec cat {} \; | sort -u > "${WORKDIR}/db" +cat "${WORKDIR}"/db-* 2>/dev/null | sort -u > "${WORKDIR}/db" old_pkgs=($(comm -23 "${WORKDIR}/pool" "${WORKDIR}/db")) if (( ${#old_pkgs[@]} >= 1 )); then @@ -75,7 +81,13 @@ if (( ${#old_pkgs[@]} >= 1 )); then done fi -old_pkgs=($(find ${CLEANUP_DESTDIR} -type f -name "*${PKGEXTS}" -mtime +${CLEANUP_KEEP} -printf '%f\n')) +unset old_pkgs +touch -d "${CLEANUP_KEEP} days ago" "${WORKDIR}/cleanup_timestamp" +for f in "${CLEANUP_DESTDIR}"/**/*${PKGEXTS}; do + if [[ ${WORKDIR}/cleanup_timestamp -nt $f ]]; then + old_pkgs+=("${f##*/}") + fi +done if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do diff --git a/cron-jobs/sourceballs b/cron-jobs/sourceballs index 25a8abb..8f089b3 100755 --- a/cron-jobs/sourceballs +++ b/cron-jobs/sourceballs @@ -46,7 +46,11 @@ for repo in ${PKGREPOS[@]}; do done # Create a list of all available source package file names -find "${FTP_BASE}/${SRCPOOL}" -xtype f -name "*${SRCEXT}" -printf '%f\n' | sort -u > "${WORKDIR}/available-src-pkgs" +for f in "${FTP_BASE}"/${SRCPOOL}/*${SRCEXT}; do + if [[ -f $f ]]; then + printf '%s\n' "${f##*/}" + fi +done | sort -u > "${WORKDIR}/available-src-pkgs" # Check for all packages if we need to build a source package for repo in ${PKGREPOS[@]}; do @@ -117,8 +121,8 @@ for repo in ${PKGREPOS[@]}; do done # Cleanup old source packages -find "${WORKDIR}" -maxdepth 1 -type f -name 'expected-src-pkgs' -exec cat {} \; | sort -u > "${WORKDIR}/expected-src-pkgs.sort" -find "${WORKDIR}" -maxdepth 1 -type f -name 'available-src-pkgs' -exec cat {} \; | sort -u > "${WORKDIR}/available-src-pkgs.sort" +sort -u "${WORKDIR}/expected-src-pkgs" > "${WORKDIR}/expected-src-pkgs.sort" +sort -u "${WORKDIR}/available-src-pkgs" > "${WORKDIR}/available-src-pkgs.sort" old_pkgs=($(comm -23 "${WORKDIR}/available-src-pkgs.sort" "${WORKDIR}/expected-src-pkgs.sort")) if (( ${#old_pkgs[@]} >= 1 )); then @@ -133,7 +137,14 @@ if (( ${#old_pkgs[@]} >= 1 )); then done fi -old_pkgs=($(find ${SOURCE_CLEANUP_DESTDIR} -type f -name "*${SRCEXT}" -mtime +${SOURCE_CLEANUP_KEEP} -printf '%f\n')) +unset old_pkgs +touch -d "${SOURCE_CLEANUP_KEEP} days ago" "${WORKDIR}/cleanup_timestamp" +for f in "${SOURCE_CLEANUP_DESTDIR}"/*${SRCEXT}; do + if [[ ${WORKDIR}/cleanup_timestamp -nt $f ]]; then + old_pkgs+=("${f##*/}") + fi +done + if (( ${#old_pkgs[@]} >= 1 )); then msg "Removing old source packages from the cleanup directory..." for old_pkg in ${old_pkgs[@]}; do -- 2.16.2
The current glob `*.pkg.tar.?z` is both less restrictive and more restrictive than makepkg, as it accepts any valid unicode character. To be more exact, it's almost completely orthogonal to the one in makepkg. makepkg only accepts .tar.gz, .tar.bz2, .tar.xz, .tar.lzo, .tar.lrz, and .tar.Z and most of those fail to match against a two-char compression type. dbscripts accepts .pkg.tar.💩z which incidentally is what I think of cherry-picking xz and gz as supported methods. Since this can be anything makepkg.conf accepts, it needs to be able to match all that, unless we decide to perform additional restrictions in which case we should still explicitly list each allowed extension. Using bash extended globbing allows us to do this relatively painlessly. Document the fact that this has *always* been some sort of glob, and update the two cases where this was (not!) being evaluated by bash [[ ... ]], to use a not-elegant-at-all proxy function is_globfile() to evaluate globs *before* testing if they exist. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- No changes, just a reworded commit message. config | 3 ++- db-functions | 11 ++++++++--- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/config b/config index 5bb3b16..0d33de0 100644 --- a/config +++ b/config @@ -25,7 +25,8 @@ TMPDIR="/var/tmp" ARCHES=(x86_64) DBEXT=".db.tar.gz" FILESEXT=".files.tar.gz" -PKGEXTS=".pkg.tar.?z" +# bash glob listing allowed extensions. Note that db-functions turns on extglob. +PKGEXTS=".pkg.tar.@(gz|bz2|xz|lzo|lrz|Z)" SRCEXT=".src.tar.gz" # Allowed licenses: get sourceballs only for licenses in this array diff --git a/db-functions b/db-functions index 394c7a2..e36d43b 100644 --- a/db-functions +++ b/db-functions @@ -3,7 +3,7 @@ . /usr/share/makepkg/util.sh # global shell options for enhanced bash scripting -shopt -s globstar nullglob +shopt -s extglob globstar nullglob # Some PKGBUILDs need CARCH to be set @@ -20,6 +20,11 @@ restore_umask () { umask $UMASK >/dev/null } +# Check if a file exists, even if the file uses wildcards +is_globfile() { + [[ -f $1 ]] +} + # just like mv -f, but we touch the file and then copy the content so # default ACLs in the target dir will be applied mv_acl() { @@ -378,8 +383,8 @@ check_pkgrepos() { local pkgver="$(getpkgver ${pkgfile})" || return 1 local pkgarch="$(getpkgarch ${pkgfile})" || return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS} ]] && return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS}.sig ]] && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXTS} && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXTS}.sig && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1 -- 2.16.2
On Mon, 19 Feb 2018 15:11:45 -0500, Eli Schwartz via arch-projects wrote:
Document the fact that this has *always* been some sort of glob, and update the two cases where this was (not!) being evaluated by bash [[ ... ]], to use a not-elegant-at-all proxy function is_globfile() to evaluate globs *before* testing if they exist. ... @@ -378,8 +383,8 @@ check_pkgrepos() { local pkgver="$(getpkgver ${pkgfile})" || return 1 local pkgarch="$(getpkgarch ${pkgfile})" || return 1
- [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS} ]] && return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS}.sig ]] && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXTS} && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXTS}.sig && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1
It's not a big deal, but I'd rather that be a separate commit, as it's fixing breakage that's unrelated to switching it from a plain glob to an extglob.
diff --git a/config b/config index 5bb3b16..0d33de0 100644 --- a/config +++ b/config @@ -25,7 +25,8 @@ TMPDIR="/var/tmp" ARCHES=(x86_64) DBEXT=".db.tar.gz" FILESEXT=".files.tar.gz" -PKGEXTS=".pkg.tar.?z" +# bash glob listing allowed extensions. Note that db-functions turns on extglob. +PKGEXTS=".pkg.tar.@(gz|bz2|xz|lzo|lrz|Z)" SRCEXT=".src.tar.gz"
Is there a reason you reject '.pkg.tar' (no compression, which makepkg accepts)? (I also found it curious that you swapped lzo and lrz from the order the extensions are in in the makepkg source.) (Also, I'd move it down a line, so that it's more obvious that the comment doesn't apply to SRCEXT.) -- Happy hacking, ~ Luke Shumaker
On 02/19/2018 04:59 PM, Luke Shumaker wrote:
Is there a reason you reject '.pkg.tar' (no compression, which makepkg accepts)?
I don't think there is any utility in supporting uncompressed packages in dbscripts. Anyone who wants to customize this in a non-Arch Linux deployment is free to do so... If someone wants to use some deviant compression type because they're positive it works better on those packaged files, I cannot think of a compelling reason to say "no you're wrong", which is why I listed everything else.
(I also found it curious that you swapped lzo and lrz from the order the extensions are in in the makepkg source.)
makepkg is inconsistent here, I pulled that from the makepkg.conf(5) source. :D -- Eli Schwartz Bug Wrangler and Trusted User
On Mon, 19 Feb 2018 15:11:45 -0500, Eli Schwartz via arch-projects wrote:
+# Check if a file exists, even if the file uses wildcards +is_globfile() { + [[ -f $1 ]] +} +
Dave's comment on my version of this patchset applies equally to this version:
Frankly, this function name and comment sucks, because it says nothing about quoting. As I read the comment, I'm lead to believe that given a file "foobar" existing, I can call: __isGlobfile "foo*", and this will succeed. To the naive reader, you might even believe this claim based on the unquotedness of $1 within the -f test.
(I had the same function, with the same comment, as is_globfile in db-functions and as __isGlobfile in common.bash) -- Happy hacking, ~ Luke Shumaker
The current glob `*.pkg.tar.?z` is both less restrictive and more restrictive than makepkg, as it accepts any valid unicode character. To be more exact, it's almost completely orthogonal to the one in makepkg. makepkg only accepts .tar.gz, .tar.bz2, .tar.xz, .tar.lzo, .tar.lrz, and .tar.Z and most of those fail to match against a two-char compression type. dbscripts accepts .pkg.tar.💩z which incidentally is what I think of cherry-picking xz and gz as supported methods. Since this can be anything makepkg.conf accepts, it needs to be able to match all that, unless we decide to perform additional restrictions in which case we should still explicitly list each allowed extension. Using bash extended globbing allows us to do this relatively painlessly. Document the fact that this has *always* been some sort of glob, and update the two cases where this was (not!) being evaluated by bash [[ ... ]], to use a not-elegant-at-all proxy function is_globfile() to evaluate globs *before* testing if they exist. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- v3: add comment describing the purpose of is_globfile() config | 3 ++- db-functions | 14 +++++++++++--- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/config b/config index 5bb3b16..0d33de0 100644 --- a/config +++ b/config @@ -25,7 +25,8 @@ TMPDIR="/var/tmp" ARCHES=(x86_64) DBEXT=".db.tar.gz" FILESEXT=".files.tar.gz" -PKGEXTS=".pkg.tar.?z" +# bash glob listing allowed extensions. Note that db-functions turns on extglob. +PKGEXTS=".pkg.tar.@(gz|bz2|xz|lzo|lrz|Z)" SRCEXT=".src.tar.gz" # Allowed licenses: get sourceballs only for licenses in this array diff --git a/db-functions b/db-functions index 394c7a2..8b71cae 100644 --- a/db-functions +++ b/db-functions @@ -3,7 +3,7 @@ . /usr/share/makepkg/util.sh # global shell options for enhanced bash scripting -shopt -s globstar nullglob +shopt -s extglob globstar nullglob # Some PKGBUILDs need CARCH to be set @@ -20,6 +20,14 @@ restore_umask () { umask $UMASK >/dev/null } +# Proxy function to check if a file exists. Using [[ -f ... ]] directly is not +# always wanted because we might want to expand bash globs first. This way we +# can pass unquoted globs to is_globfile() and have them expanded as function +# arguments before being checked. +is_globfile() { + [[ -f $1 ]] +} + # just like mv -f, but we touch the file and then copy the content so # default ACLs in the target dir will be applied mv_acl() { @@ -378,8 +386,8 @@ check_pkgrepos() { local pkgver="$(getpkgver ${pkgfile})" || return 1 local pkgarch="$(getpkgarch ${pkgfile})" || return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS} ]] && return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXTS}.sig ]] && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXTS} && return 1 + is_globfile "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXTS}.sig && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1 -- 2.16.2
participants (4)
-
Dave Reisner
-
Eli Schwartz
-
Emil Velikov
-
Luke Shumaker