[pacman-dev] [PATCH 0/2] Correct package size calculation in makepkg
From: Ronan Pigott
From: Ronan Pigott
On 10/26/19 11:11 PM, Ronan Pigott wrote:
From: Ronan Pigott
--- configure.ac | 6 +++--- meson.build | 6 +++--- scripts/libmakepkg/tidy/zipman.sh.in | 9 +++++---- 3 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/configure.ac b/configure.ac index d68b3591..fbb3b83b 100644 --- a/configure.ac +++ b/configure.ac @@ -370,18 +370,18 @@ GCC_VISIBILITY_CC
# Host-dependant definitions DEFAULT_SEDINPLACEFLAGS=" --follow-symlinks -i" -INODECMD="stat -c '%i %n'" +INODECMD="stat -c '%i'" STRIP_BINARIES="--strip-all" STRIP_SHARED="--strip-unneeded" STRIP_STATIC="--strip-debug" case "${host_os}" in *bsd*) - INODECMD="stat -f '%i %N'" + INODECMD="stat -f '%i'" DEFAULT_SEDINPLACEFLAGS=" -i \"\"" ;; darwin*) host_os_darwin=yes - INODECMD="/usr/bin/stat -f '%i %N'" + INODECMD="/usr/bin/stat -f '%i'" DEFAULT_SEDINPLACEFLAGS=" -i ''" STRIP_BINARIES="" STRIP_SHARED="-S" diff --git a/meson.build b/meson.build index c5704efe..648dc80e 100644 --- a/meson.build +++ b/meson.build @@ -222,7 +222,7 @@ add_project_arguments('-include', 'config.h', language : 'c')
filecmd = 'file' default_sedinplaceflags = ' --follow-symlinks -i' -inodecmd = 'stat -c \'%i %n\'' +inodecmd = 'stat -c \'%i\'' strip_binaries = '--strip-all' strip_shared = '--strip-unneeded' strip_static = '--strip-debug' @@ -236,13 +236,13 @@ endif
os = host_machine.system() if os.startswith('darwin') - inodecmd = '/usr/bin/stat -f \'%i %n\'' + inodecmd = '/usr/bin/stat -f \'%i\'' default_sedinplaceflags = ' -i \'\'' strip_binaries = '' strip_shared = '-s' strip_static = '-s' elif os.contains('bsd') or os == 'dragonfly' - inodecmd = 'stat -f \'%i %n\'' + inodecmd = 'stat -f \'%i\'' default_sedinplaceflags = ' -i \'\'' endif
diff --git a/scripts/libmakepkg/tidy/zipman.sh.in b/scripts/libmakepkg/tidy/zipman.sh.in index 3c2e261e..dd36ebef 100644 --- a/scripts/libmakepkg/tidy/zipman.sh.in +++ b/scripts/libmakepkg/tidy/zipman.sh.in @@ -33,9 +33,10 @@ tidy_modify+=('tidy_zipman') tidy_zipman() { if check_option "zipman" "y" && [[ -n ${MAN_DIRS[*]} ]]; then msg2 "$(gettext "Compressing man and info pages...")" - local file files inode link - while read -rd ' ' inode; do - read file + local file inode link + declare -A files + while read -rd $'\0' file; do + inode=$( @INODECMD@ "$file" )
There's really no need for this change, the stat command works fine as is. You mention:
To make supporting whitespace in filenames easier, I changed INODECMD to output only the inode and instead rely on find's -print0 to reliably parse complete filenames. This fixes some errors for files that begin or end in whitespace. It was noted in discussion that -print0 is not
But your replacement has the same issue. Whitespace is trimmed not by the -d ' ' in use (which reads in the first line of input until it hits the first space character, deletes the space character, and then declares the line to be over and assigns the line to $inode), but by the IFS in the second read (since IFS is used to split the line, even if there is only one variable to assign it will still strip leading and trailing $IFS). -- Eli Schwartz Bug Wrangler and Trusted User
On Sun Oct 27, 2019 at 2:42 AM Eli Schwartz wrote:
You mention:
To make supporting whitespace in filenames easier, I changed INODECMD to output only the inode and instead rely on find's -print0 to reliably parse complete filenames. This fixes some errors for files that begin or end in whitespace. It was noted in discussion that -print0 is not
But your replacement has the same issue. Whitespace is trimmed not by the -d ' ' in use (which reads in the first line of input until it hits the first space character, deletes the space character, and then declares the line to be over and assigns the line to $inode), but by the IFS in the second read (since IFS is used to split the line, even if there is only one variable to assign it will still strip leading and trailing $IFS).
Ah that does seem to be the case. Good catch. I'll remove this change and use the existing INODECMD format.
From: Ronan Pigott
On 27/10/19 1:11 pm, Ronan Pigott wrote:
From: Ronan Pigott
--- scripts/makepkg.sh.in | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index 997c8668..0725f582 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -584,7 +584,15 @@ write_kv_pair() { }
write_pkginfo() { - local size="$(find . -type f -exec cat {} + 2>/dev/null | wc -c)" + local inode size=0 + declare -A files + while read -rd $'\0' file; do + inode=$( @INODECMD@ "$file" ) + if [[ -z "${files[$inode]}" ]]; then + files[$inode]=$(wc -c < "$file") + size=$((size + ${files[$inode]})) + fi + done < <(find . -type f -print0)
I'm going to request a couple of changes... 1) can you put this function in a separate file like in the patch I submitted (just use that patch and adjust the function). Not that I expect this to be reused, but it will be a bit long to sit in write_pkginfo after... 2) we have some packages approaching 100,000 files! 67220 texlive-fontsextra-2019.50876-1/files 76595 papirus-icon-theme-20191009-1/files 80166 rocksndiamonds-data-4.1.3.0-1/files 82228 ceph-mgr-14.2.1-2/files 97821 nodejs-material-design-icons-3.0.1-2/files Most of those have no hard links, so a two pass approach has been discussed: find . -type f -links 1 ... with no requesting and storing inodes and then find . -type f -links +1 ... Having just checked out a couple of very large packages, this appears to be worth the effort. Thanks, Allan
participants (3)
-
Allan McRae
-
Eli Schwartz
-
Ronan Pigott