[pacman-dev] [PATCH 2/2] makepkg: do not count hard linked file sizes multiple times
Allan McRae
allan at archlinux.org
Sun Oct 27 04:32:10 UTC 2019
On 27/10/19 1:11 pm, Ronan Pigott wrote:
> From: Ronan Pigott <rpigott at berkeley.edu>
>
> ---
> scripts/makepkg.sh.in | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
> index 997c8668..0725f582 100644
> --- a/scripts/makepkg.sh.in
> +++ b/scripts/makepkg.sh.in
> @@ -584,7 +584,15 @@ write_kv_pair() {
> }
>
> write_pkginfo() {
> - local size="$(find . -type f -exec cat {} + 2>/dev/null | wc -c)"
> + local inode size=0
> + declare -A files
> + while read -rd $'\0' file; do
> + inode=$( @INODECMD@ "$file" )
> + if [[ -z "${files[$inode]}" ]]; then
> + files[$inode]=$(wc -c < "$file")
> + size=$((size + ${files[$inode]}))
> + fi
> + done < <(find . -type f -print0)
>
I'm going to request a couple of changes...
1) Can you put this function in a separate file, as in the patch I
submitted? (Just use that patch and adjust the function.) I do not
expect this to be reused, but it will be a bit long to sit in
write_pkginfo after...
2) We have some packages approaching 100,000 files!
67220 texlive-fontsextra-2019.50876-1/files
76595 papirus-icon-theme-20191009-1/files
80166 rocksndiamonds-data-4.1.3.0-1/files
82228 ceph-mgr-14.2.1-2/files
97821 nodejs-material-design-icons-3.0.1-2/files
Most of those have no hard links, so a two-pass approach has been discussed:
find . -type f -links 1 ...
which needs no requesting or storing of inodes, and then
find . -type f -links +1 ...
Having just checked out a couple of very large packages, this appears to
be worth the effort.
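
For illustration, the two-pass idea could be sketched roughly as below. This is only a sketch under assumptions, not the final patch: the function name compute_pkg_size is hypothetical, and "stat -c %i" (GNU stat) stands in for whatever @INODECMD@ expands to on a given platform. It also assumes bash 4 or later for associative arrays.

```shell
#!/bin/bash
# Hypothetical helper; the name is not taken from the pacman source tree.
compute_pkg_size() {
	local dir=$1 size inode file bytes
	local -A seen=()

	# Pass 1: files with exactly one link cannot be counted twice,
	# so no inode lookup or bookkeeping is needed for them.
	size=$(( $(find "$dir" -type f -links 1 -exec cat {} + 2>/dev/null | wc -c) ))

	# Pass 2: only files with more than one link; count each inode once.
	while IFS= read -rd '' file; do
		inode=$(stat -c '%i' "$file")   # assumption: GNU stat, stand-in for @INODECMD@
		if [[ -z ${seen[$inode]} ]]; then
			seen[$inode]=1
			bytes=$(wc -c < "$file")
			size=$(( size + bytes ))
		fi
	done < <(find "$dir" -type f -links +1 -print0)

	printf '%d\n' "$size"
}
```

Calling it as compute_pkg_size . from the package directory would mirror the "find ." in the patch. The per-file stat and wc calls are then only paid for the (usually few) hard linked files.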
Thanks,
Allan