[pacman-dev] [PATCH 0/2] [WIP] Fix du filesize computation
This patchset is a WIP that tries to address FS#61717[1]. This is my first patch for pacman, so I may be missing a couple of the heuristics of proper pacman develoment. Namely, I'm worried about the use of xargs (I went down this road because I wanted to have the minimal set of changes). I'm not sure what're the heuristics for what I can assume (e.g., can findutils be assumed to exist?). As a followup of this, I removed the configure.ac stuff related to DUPATH and DUFLAGS. I'm not sure if we'd want to add XARGSPATH or so. I opted to not do that just yet. Thanks, -Santiago. [1] https://bugs.archlinux.org/task/61717 Santiago Torres (2): Make makepkg compute sizes properly configure.ac: drop DU* config configure.ac | 10 ---------- scripts/makepkg.sh.in | 3 +-- 2 files changed, 1 insertion(+), 12 deletions(-) -- 2.21.0
Makepkg used to use du --apparent-size to compute the size of the package. Unfortunately, this would result in different sizes depending on the filesystem used (e.g., btrfs vs ext4), which would affect reproducible builds. Use a wc-based approach to compute sizes Signed-off-by: Santiago Torres <santiago@archlinux.org> --- scripts/makepkg.sh.in | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index e12826af..899acae2 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -588,8 +588,7 @@ write_kv_pair() { } write_pkginfo() { - local size="$(@DUPATH@ @DUFLAGS@)" - size="$(( ${size%%[^0-9]*} * 1024 ))" + local size="$(list_package_files| xargs --null cat 2>/dev/null | wc -c)" merge_arch_attrs -- 2.21.0
Since DUFLAGS and DUPATH are not needed anymore remove them from the configuration script Signed-off-by: Santiago Torres <santiago@archlinux.org> --- configure.ac | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/configure.ac b/configure.ac index 2f345b5d..ac51fafa 100644 --- a/configure.ac +++ b/configure.ac @@ -345,7 +345,6 @@ AC_CHECK_MEMBERS([struct statfs.f_flags],,,[[#include <sys/param.h> GCC_VISIBILITY_CC # Host-dependant definitions -DEFAULT_DUFLAGS=" -sk --apparent-size" DEFAULT_SEDINPLACEFLAGS=" --follow-symlinks -i" INODECMD="stat -c '%i %n'" OWNERCMD="stat -c '%u:%g'" @@ -359,7 +358,6 @@ case "${host_os}" in OWNERCMD="stat -f '%u:%g'" MODECMD="stat -f '%Lp'" DEFAULT_SEDINPLACEFLAGS=" -i \"\"" - DEFAULT_DUFLAGS=" -sk" ;; darwin*) host_os_darwin=yes @@ -367,14 +365,12 @@ case "${host_os}" in OWNERCMD="/usr/bin/stat -f '%u:%g'" MODECMD="/usr/bin/stat -f '%Lp'" DEFAULT_SEDINPLACEFLAGS=" -i ''" - DEFAULT_DUFLAGS=" -sk" STRIP_BINARIES="" STRIP_SHARED="-S" STRIP_STATIC="-S" ;; esac AM_CONDITIONAL([DARWIN], test "x$host_os_darwin" = "xyes") -AC_PATH_PROGS([DUPATH], [du], [du], [/usr/bin$PATH_SEPARATOR/bin] ) AC_PATH_PROGS([SEDPATH], [sed], [sed], [/usr/bin$PATH_SEPARATOR/bin] ) AC_SUBST(INODECMD) AC_SUBST(OWNERCMD) @@ -383,12 +379,6 @@ AC_SUBST(STRIP_BINARIES) AC_SUBST(STRIP_SHARED) AC_SUBST(STRIP_STATIC) -# Flags for du -if test "${DUFLAGS+set}" != "set"; then - DUFLAGS="${DEFAULT_DUFLAGS}" -fi -AC_ARG_VAR(DUFLAGS, [flags for du, overriding the default]) - # Flags for sed in place if test "${SEDINPLACEFLAGS+set}" != "set"; then SEDINPLACEFLAGS="${DEFAULT_SEDINPLACEFLAGS}" -- 2.21.0
On 3/7/19 9:36 PM, Santiago Torres wrote:
This patchset is a WIP that tries to address FS#61717[1].
This is my first patch for pacman, so I may be missing a couple of the heuristics of proper pacman develoment. Namely, I'm worried about the use of xargs (I went down this road because I wanted to have the minimal set of changes). I'm not sure what're the heuristics for what I can assume (e.g., can findutils be assumed to exist?).
We explicitly use find -exec elsewhere, so that should be fine. xargs --null is a GNU-ism and does not work on supported targets: busybox xargs, and BSD xargs. Both support -0, but it is anyways a better idea to simply stick with POSIX-defined find -exec ... {} + (I think you can generally assume that long options will not work by default, actually. Neither busybox nor BSD utilities typically seem to implement them.)
As a followup of this, I removed the configure.ac stuff related to DUPATH and DUFLAGS. I'm not sure if we'd want to add XARGSPATH or so. I opted to not do that just yet.
We do not need this; the only reason du is configurable rather than relying on the PATH, is because people had wrapper programs that pretended to be du, but were broken: https://bugs.archlinux.org/task/19932 No one tries to colorize the output of find -exec or of xargs, I hope...
Thanks, -Santiago.
[1] https://bugs.archlinux.org/task/61717
Santiago Torres (2): Make makepkg compute sizes properly configure.ac: drop DU* config
configure.ac | 10 ---------- scripts/makepkg.sh.in | 3 +-- 2 files changed, 1 insertion(+), 12 deletions(-)
-- 2.21.0
-- Eli Schwartz Bug Wrangler and Trusted User
On Thu, Mar 07, 2019 at 10:07:12PM -0500, Eli Schwartz wrote:
On 3/7/19 9:36 PM, Santiago Torres wrote:
This patchset is a WIP that tries to address FS#61717[1].
This is my first patch for pacman, so I may be missing a couple of the heuristics of proper pacman develoment. Namely, I'm worried about the use of xargs (I went down this road because I wanted to have the minimal set of changes). I'm not sure what're the heuristics for what I can assume (e.g., can findutils be assumed to exist?).
We explicitly use find -exec elsewhere, so that should be fine.
xargs --null is a GNU-ism and does not work on supported targets: busybox xargs, and BSD xargs. Both support -0, but it is anyways a better idea to simply stick with POSIX-defined find -exec ... {} +
Oh ok, Ill move to a find-exec construct then.
(I think you can generally assume that long options will not work by default, actually. Neither busybox nor BSD utilities typically seem to implement them.)
TIL, thanks!
As a followup of this, I removed the configure.ac stuff related to DUPATH and DUFLAGS. I'm not sure if we'd want to add XARGSPATH or so. I opted to not do that just yet.
We do not need this; the only reason du is configurable rather than relying on the PATH, is because people had wrapper programs that pretended to be du, but were broken: https://bugs.archlinux.org/task/19932
Aha, noted!
No one tries to colorize the output of find -exec or of xargs, I hope...
Lol I hope so too. Let me follow up with a v2 using find -exec cat {} \;... Thanks! -Santiago.
On 3/7/19 10:18 PM, Santiago Torres-Arias wrote:
On Thu, Mar 07, 2019 at 10:07:12PM -0500, Eli Schwartz wrote:
On 3/7/19 9:36 PM, Santiago Torres wrote:
This patchset is a WIP that tries to address FS#61717[1].
This is my first patch for pacman, so I may be missing a couple of the heuristics of proper pacman develoment. Namely, I'm worried about the use of xargs (I went down this road because I wanted to have the minimal set of changes). I'm not sure what're the heuristics for what I can assume (e.g., can findutils be assumed to exist?).
We explicitly use find -exec elsewhere, so that should be fine.
xargs --null is a GNU-ism and does not work on supported targets: busybox xargs, and BSD xargs. Both support -0, but it is anyways a better idea to simply stick with POSIX-defined find -exec ... {} +
Oh ok, Ill move to a find-exec construct then.
(I think you can generally assume that long options will not work by default, actually. Neither busybox nor BSD utilities typically seem to implement them.)
TIL, thanks!
As a followup of this, I removed the configure.ac stuff related to DUPATH and DUFLAGS. I'm not sure if we'd want to add XARGSPATH or so. I opted to not do that just yet.
We do not need this; the only reason du is configurable rather than relying on the PATH, is because people had wrapper programs that pretended to be du, but were broken: https://bugs.archlinux.org/task/19932
Aha, noted!
No one tries to colorize the output of find -exec or of xargs, I hope...
Lol I hope so too. Let me follow up with a v2 using find -exec cat {} \;...
find . -type f -exec cat {} + Please use the + format, not \; as the former will be able to accept many filenames at once until it reaches the maximum length of a command line (and cat supports this mode), while \; will fork a new cat command for each file. This is functionally comparable to xargs, which defaults to being like find -exec {} +, but when using xargs -L 1, acts like find -exec {} \; -- Eli Schwartz Bug Wrangler and Trusted User
Makepkg used to use du --apparent-size to compute the size of the package. Unfortunately, this would result in different sizes depending on the filesystem used (e.g., btrfs vs ext4), which would affect reproducible builds. Use a wc-based approach to compute sizes Signed-off-by: Santiago Torres <santiago@archlinux.org> --- scripts/makepkg.sh.in | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index e12826af..d1724064 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -588,8 +588,8 @@ write_kv_pair() { } write_pkginfo() { - local size="$(@DUPATH@ @DUFLAGS@)" - size="$(( ${size%%[^0-9]*} * 1024 ))" + + local size="$(find . -type f -exec cat {} + 2>/dev/null | wc -c)" merge_arch_attrs @@ -1347,7 +1347,7 @@ if (( INFAKEROOT )); then else run_split_packaging fi - + create_debug_package msg "$(gettext "Leaving %s environment.")" "fakeroot" -- 2.21.0
Since DUFLAGS and DUPATH are not needed anymore remove them from the configuration script Signed-off-by: Santiago Torres <santiago@archlinux.org> --- configure.ac | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/configure.ac b/configure.ac index 2f345b5d..ac51fafa 100644 --- a/configure.ac +++ b/configure.ac @@ -345,7 +345,6 @@ AC_CHECK_MEMBERS([struct statfs.f_flags],,,[[#include <sys/param.h> GCC_VISIBILITY_CC # Host-dependant definitions -DEFAULT_DUFLAGS=" -sk --apparent-size" DEFAULT_SEDINPLACEFLAGS=" --follow-symlinks -i" INODECMD="stat -c '%i %n'" OWNERCMD="stat -c '%u:%g'" @@ -359,7 +358,6 @@ case "${host_os}" in OWNERCMD="stat -f '%u:%g'" MODECMD="stat -f '%Lp'" DEFAULT_SEDINPLACEFLAGS=" -i \"\"" - DEFAULT_DUFLAGS=" -sk" ;; darwin*) host_os_darwin=yes @@ -367,14 +365,12 @@ case "${host_os}" in OWNERCMD="/usr/bin/stat -f '%u:%g'" MODECMD="/usr/bin/stat -f '%Lp'" DEFAULT_SEDINPLACEFLAGS=" -i ''" - DEFAULT_DUFLAGS=" -sk" STRIP_BINARIES="" STRIP_SHARED="-S" STRIP_STATIC="-S" ;; esac AM_CONDITIONAL([DARWIN], test "x$host_os_darwin" = "xyes") -AC_PATH_PROGS([DUPATH], [du], [du], [/usr/bin$PATH_SEPARATOR/bin] ) AC_PATH_PROGS([SEDPATH], [sed], [sed], [/usr/bin$PATH_SEPARATOR/bin] ) AC_SUBST(INODECMD) AC_SUBST(OWNERCMD) @@ -383,12 +379,6 @@ AC_SUBST(STRIP_BINARIES) AC_SUBST(STRIP_SHARED) AC_SUBST(STRIP_STATIC) -# Flags for du -if test "${DUFLAGS+set}" != "set"; then - DUFLAGS="${DEFAULT_DUFLAGS}" -fi -AC_ARG_VAR(DUFLAGS, [flags for du, overriding the default]) - # Flags for sed in place if test "${SEDINPLACEFLAGS+set}" != "set"; then SEDINPLACEFLAGS="${DEFAULT_SEDINPLACEFLAGS}" -- 2.21.0
On 12/3/19 11:18 am, Santiago Torres wrote:
Since DUFLAGS and DUPATH are not needed anymore remove them from the configuration script
Signed-off-by: Santiago Torres <santiago@archlinux.org> ---
These variables had references throughout other build files. I have removed them and expanded your patch. Allan
Makepkg used to use du --apparent-size to compute the size of the package. Unfortunately, this would result in different sizes depending on the filesystem used (e.g., btrfs vs ext4), which would affect reproducible builds. Use a wc-based approach to compute sizes Signed-off-by: Santiago Torres <santiago@archlinux.org> --- scripts/makepkg.sh.in | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index e12826af..4f096a36 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -588,8 +588,7 @@ write_kv_pair() { } write_pkginfo() { - local size="$(@DUPATH@ @DUFLAGS@)" - size="$(( ${size%%[^0-9]*} * 1024 ))" + local size="$(find . -type f -exec cat {} + 2>/dev/null | wc -c)" merge_arch_attrs -- 2.21.0
participants (4)
-
Allan McRae
-
Eli Schwartz
-
Santiago Torres
-
Santiago Torres-Arias