On 27/10/19 1:11 pm, Ronan Pigott wrote:
From: Ronan Pigott <rpigott@berkeley.edu>
---
 scripts/makepkg.sh.in | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index 997c8668..0725f582 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -584,7 +584,15 @@ write_kv_pair() {
 }

 write_pkginfo() {
-	local size="$(find . -type f -exec cat {} + 2>/dev/null | wc -c)"
+	local inode size=0
+	declare -A files
+	while read -rd $'\0' file; do
+		inode=$( @INODECMD@ "$file" )
+		if [[ -z "${files[$inode]}" ]]; then
+			files[$inode]=$(wc -c < "$file")
+			size=$((size + ${files[$inode]}))
+		fi
+	done < <(find . -type f -print0)

I'm going to request a couple of changes...

1) Can you put this function in a separate file, like in the patch I submitted? (Just use that patch and adjust the function.) Not that I expect this to be reused, but it will be a bit long to sit in write_pkginfo after...

2) We have some packages approaching 100,000 files!

67220 texlive-fontsextra-2019.50876-1/files
76595 papirus-icon-theme-20191009-1/files
80166 rocksndiamonds-data-4.1.3.0-1/files
82228 ceph-mgr-14.2.1-2/files
97821 nodejs-material-design-icons-3.0.1-2/files

Most of those have no hard links, so a two-pass approach has been discussed:

find . -type f -links 1 ...

which needs no requesting and storing of inodes, and then

find . -type f -links +1 ...

which handles only the hard-linked files. Having just checked out a couple of very large packages, this appears to be worth the effort.

Thanks,
Allan
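
For reference, the two-pass idea might look roughly like the sketch below. This is only an illustration, not the code that should land in makepkg: the function name pkg_size and the variable names are made up here, and it assumes bash 4+, GNU find, and GNU stat (stat -c '%i') where makepkg would use the @INODECMD@ substitution instead.

```bash
#!/bin/bash
# Hypothetical sketch of the two-pass size calculation.
# Assumptions: bash 4+ (associative arrays), GNU find, GNU stat.
# makepkg would use @INODECMD@ rather than calling stat directly.

pkg_size() {
	local dir=$1 size inode bytes file
	local -A seen=()

	# Pass 1: files with exactly one link. No inode lookups or
	# bookkeeping are needed, so one pipeline covers all of them.
	size=$(find "$dir" -type f -links 1 -exec cat {} + 2>/dev/null | wc -c)

	# Pass 2: files with more than one link. Count each inode once.
	while IFS= read -rd '' file; do
		inode=$(stat -c '%i' "$file")
		if [[ -z ${seen[$inode]} ]]; then
			seen[$inode]=1
			bytes=$(wc -c < "$file")
			size=$((size + bytes))
		fi
	done < <(find "$dir" -type f -links +1 -print0)

	printf '%s\n' "$size"
}
```

Calling pkg_size . from the package directory would print the total in bytes. The per-file stat and wc calls then only happen for the (usually few) hard-linked files, which is where the saving on very large packages would come from.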