Xavier wrote:
On Wed, Dec 16, 2009 at 11:07 PM, Xavier <shiningxc@gmail.com> wrote:
The results are slightly different, mostly because makepkg overestimates the uncompressed size by not using -a with du, while namcap's lotsofdocs rule computes the real size.
Now that I think about it, maybe the trick/hack I used in my first script would actually be a portable way to get the real uncompressed size: bsdtar tvf foo.pkg.tar.gz 2>/dev/null | awk '{ SUM += $5 } END { print SUM }'
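For illustration, here is a hedged sketch of the same listing-based trick using GNU tar (whose verbose output puts the size in field 3; bsdtar's listing mirrors ls -l and puts it in field 5, as in the command above). The file names and sizes are made up for the example:

```shell
set -e
tmp=$(mktemp -d)
printf 'aaaaa'      > "$tmp/a"   # 5 bytes
printf 'bbbbbbbbbb' > "$tmp/b"   # 10 bytes
tar -C "$tmp" -czf "$tmp/pkg.tar.gz" a b

# Sum the size column of the archive listing; with GNU tar the size
# is the third field of each "tar tvf" line.
size=$(tar tvf "$tmp/pkg.tar.gz" | awk '{ SUM += $3 } END { print SUM }')
echo "$size"   # expected: 15
rm -r "$tmp"
```

This reports the real uncompressed payload in bytes, independent of filesystem block size, which is what du -s cannot give without --apparent-size.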
But this is off-topic; I should dig up the makepkg bug report we had about this, which we probably rejected :P
http://bugs.archlinux.org/task/11225
I see, so we made a patch for repo-add: http://bugs.archlinux.org/task/11225?getfile=2429 and I also made one for makepkg: http://bugs.archlinux.org/task/11225?getfile=2426
But Dan rejected it with this reason: "Note that I did not touch makepkg because our size there is not critical- a size to the nearest K is just fine, and switching to a find/stat way of doing it would cause all hard links to get double-counted."
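To make the double-counting concern concrete, here is a hedged sketch (with made-up file names) of why a find/stat approach counts hard-linked data once per path, while GNU du tracks inodes and counts each once by default:

```shell
set -e
d=$(mktemp -d)
dd if=/dev/zero of="$d/orig" bs=1024 count=8 2>/dev/null   # one 8 KiB file
ln "$d/orig" "$d/link"                                     # hard link: same inode, same data

# Summing per-path sizes counts the shared data once per path:
stat_sum=$(find "$d" -type f -exec stat -c %s {} + | awk '{ S += $1 } END { print S }')

# GNU du remembers inodes it has already seen and counts each once
# (unless --count-links is given):
du_sum=$(du -cb "$d/orig" "$d/link" | awk 'END { print $1 }')

echo "$stat_sum $du_sum"
rm -r "$d"
```

So the find/stat total is exactly double the du total here, which is the bias Dan was pointing at.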
So it is a size to the nearest K for each file, and we accumulate the errors? I did not realize the difference was so big until today, while playing with the docsize stuff.
e.g. for libsigc++-2.0:
du -s = 12040 K
du -s --apparent-size = 10723 K
So that's about 1300 K, a 12% error if I am not mistaken.
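The per-file rounding error is easy to reproduce on a smaller scale. This is a hedged sketch (made-up file names, and the exact numbers depend on the filesystem's block size, commonly 4 KiB): each tiny file still occupies a whole block, which plain du reports, while --apparent-size sums the real byte counts:

```shell
set -e
d=$(mktemp -d)
# 100 one-byte files: 100 bytes of real data, but each file still
# occupies a full filesystem block on typical filesystems.
for i in $(seq 1 100); do printf 'x' > "$d/f$i"; done

alloc=$(du -s "$d" | awk '{ print $1 }')                    # allocated blocks, in 1 KiB units
apparent=$(du -s --apparent-size "$d" | awk '{ print $1 }') # real byte count, in 1 KiB units

echo "$alloc $apparent"
rm -r "$d"
```

With 4 KiB blocks the allocated figure is roughly 400 K against a few K of apparent size, so a package full of small files (docs, headers) accumulates exactly the kind of overestimate seen with libsigc++-2.0.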
So... anyone want to do the analysis of how much counting hard links twice biases the size versus how much bias there is using what we currently do? As long as the bias is making the package appear bigger than it is and it is not orders of magnitude different, I really do not care that much.

Allan