[PATCH] paccache: Support cleaning many thousands of candidates

Daniel M. Capella polyzen at archlinux.org
Tue Jun 2 04:05:14 UTC 2020


On June 1, 2020 10:47:21 PM EDT, Leonid Bloch <lb.workbox at gmail.com> wrote:
> In situations when there are many thousands of candidates for
> deletion,
> paccache returns an "Argument list too long" error, because the
> expansion of the ${candidates[@]} array becomes too long for Bash.
> 
> This commit fixes that problem, by calculating the freed size and
> (re)moving the candidates one by one. The cost is much longer
> operation,
> but on a laptop with 38143 candidates (no cache cleaning since 2013)
> it
> still takes ~18 minutes, which seems to be better than throwing an
> error.
> 
> A more efficient way to achieve this would be to break the
> ${candidates[@]}
> array into several sub-arrays and operating on them, but it will
> introduce more complexity and thus might be more error-prone.
> 
> Signed-off-by: Leonid Bloch <lb.workbox at gmail.com>
> ---
>  src/paccache.sh.in | 24 ++++++++++++++++--------
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/src/paccache.sh.in b/src/paccache.sh.in
> index af901f4..212e20a 100644
> --- a/src/paccache.sh.in
> +++ b/src/paccache.sh.in
> @@ -29,7 +29,7 @@ LIBRARY=${LIBRARY:-'@libmakepkgdir@'}
>  
> declare -a cachedirs=() candidates=() cmdopts=() whitelist=()
> blacklist=()
> declare -i delete=0 dryrun=0 filecount=0 move=0 needsroot=0
> totalsaved=0 verbose=0
> -declare -i min_atime=0 min_mtime=0
> +declare -i min_atime=0 min_mtime=0 first_escalation=1
>  declare    delim=$'\n' keep=3 movedir= scanarch=
>  
>  QUIET=0
> @@ -138,7 +138,10 @@ m4_include(../lib/size_to_human.sh)
>  
>  runcmd() {
>  	if (( needsroot && EUID != 0 )); then
> -		msg "Privilege escalation required"
> +		(( first_escalation == 0 )) || {
> +			msg "Privilege escalation required"
> +			first_escalation=0
> +		}
>  		if sudo -v &>/dev/null && sudo -l &>/dev/null; then
>  			sudo "$@"
>  		else
> @@ -385,23 +388,28 @@ pkgcount=${#candidates[*]}
>  # copy the list, merging in any found sigs
>  for cand in "${candidates[@]}"; do
>  	candtemp+=("$cand")
> -	[[ -f $cand.sig ]] && candtemp+=("$cand.sig")
> +	totalsaved="$(( totalsaved + "$(stat -c %s "${cand}")" ))"
> +	if [[ -f "${cand}.sig" ]]; then
> +		candtemp+=("${cand}.sig")
> +		totalsaved="$(( totalsaved + "$(stat -c %s "${cand}.sig")" ))"
> +	fi
>  done
>  candidates=("${candtemp[@]}")
>  unset candtemp
>  
> -# do this before we destroy anything
> -totalsaved=$(@SIZECMD@ "${candidates[@]}" | awk '{ sum += $1 } END {
> print sum }')
> -
>  # Exit immediately if a pipeline returns non-zero.
>  set -o errexit
>  
>  # crush. kill. destroy.
>  (( verbose )) && cmdopts+=(-v)
>  if (( delete )); then
> -	printf '%s\0' "${candidates[@]}" | runcmd xargs -0 rm
> "${cmdopts[@]}"
> +	for cand in "${candidates[@]}"; do
> +		runcmd rm "${cmdopts[@]}" "${cand}"
> +	done
>  elif (( move )); then
> -	printf '%s\0' "${candidates[@]}" | runcmd xargs -0 mv
> "${cmdopts[@]}" -t "$movedir"
> +	for cand in "${candidates[@]}"; do
> +		runcmd mv "${cmdopts[@]}" -t "${movedir}" "${cand}"
> +	done
>  fi
>  
>  summarize "$pkgcount" "${candidates[@]}"

Thanks for the patch. Cleaning that many packages seems fairly niche. I'd prefer a more efficient solution than to slow this down for everybody.

--
Best,
Daniel <https://danielcapella.com>


More information about the pacman-contrib mailing list