[pacman-dev] some more delta stats
I generated deltas for the last 20 updates on my system.
The total delta size is 12M, while the total package size is 73M.
So the overall ratio is pretty good imo.
Afaik, only one of the 20 deltas didn't reach the 0.7 ratio: attr
The ghostscript one is unbelievable: 5,7K vs 29M!

Here are the packages, deltas and their sizes:

 56K   attr-2.4.41-1_to_2.4.43-1-x86_64.delta
 79K   attr-2.4.43-1-x86_64.pkg.tar.gz
 3,6K  crda-0.9.5-2_to_1.0.1-1-x86_64.delta
 13K   crda-1.0.1-1-x86_64.pkg.tar.gz
 254K  curl-7.19.3-1_to_7.19.4-1-x86_64.delta
 555K  curl-7.19.4-1-x86_64.pkg.tar.gz
 444K  e2fsprogs-1.41.3-2_to_1.41.4-1-x86_64.delta
 1,3M  e2fsprogs-1.41.4-1-x86_64.pkg.tar.gz
 5,7K  ghostscript-8.64-2_to_8.64-3-x86_64.delta
 29M   ghostscript-8.64-3-x86_64.pkg.tar.gz
 2,5M  git-1.6.1.3-1_to_1.6.2-1-x86_64.delta
 6,4M  git-1.6.2-1-x86_64.pkg.tar.gz
 39K   hdparm-9.10-1_to_9.12-1-x86_64.delta
 56K   hdparm-9.12-1-x86_64.pkg.tar.gz
 6,1K  iw-0.9.8-1_to_0.9.9-1-x86_64.delta
 15K   iw-0.9.9-1-x86_64.pkg.tar.gz
 369K  kbd-1.14.1.20080309-2_to_1.15-1-x86_64.delta
 1013K kbd-1.15-1-x86_64.pkg.tar.gz
 569   libmad-0.15.1b-3_to_0.15.1b-4-x86_64.delta
 132K  libmad-0.15.1b-4-x86_64.pkg.tar.gz
 89K   libsndfile-1.0.18-1_to_1.0.19-1-x86_64.delta
 423K  libsndfile-1.0.19-1-x86_64.pkg.tar.gz
 17K   libv4l-0.5.7-1_to_0.5.8-1-x86_64.delta
 56K   libv4l-0.5.8-1-x86_64.pkg.tar.gz
 79K   libxi-1.1.4-2-x86_64.pkg.tar.gz
 47K   libxi-1.2.0-1_to_1.1.4-2-x86_64.delta
 4,1K  man-db-2.5.4-1_to_2.5.4-2-x86_64.delta
 959K  man-db-2.5.4-2-x86_64.pkg.tar.gz
 1,2M  man-pages-3.18-1_to_3.19-1-x86_64.delta
 4,4M  man-pages-3.19-1-x86_64.pkg.tar.gz
 1,5M  ncurses-5.7-2.1-x86_64.pkg.tar.gz
 263K  ncurses-5.7-2_to_5.7-2.1-x86_64.delta
 951K  nvidia-180.22-1_to_180.29-3-x86_64.delta
 2,6M  nvidia-180.29-3-x86_64.pkg.tar.gz
 6,1M  nvidia-utils-180.22-1_to_180.29-3-x86_64.delta
 11M   nvidia-utils-180.29-3-x86_64.pkg.tar.gz
 1,2K  pm-utils-1.2.4-1_to_1.2.4-3-x86_64.delta
 39K   pm-utils-1.2.4-3-x86_64.pkg.tar.gz
 5,6K  run-parts-2.30-1_to_2.31-1-x86_64.delta
 8,9K  run-parts-2.31-1-x86_64.pkg.tar.gz
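For anyone who wants to check the per-package ratios, here is a minimal Python sketch. It only covers a handful of entries copied from the listing, and it assumes that K and M are 1024-based (as `ls -h` or `du -h` would print them) and that 0.7 is the maximum delta/package ratio being referred to. The function names are made up for illustration and are not part of pacman.

```python
# A few (delta size, package size) pairs copied from the listing,
# kept in the printed form (decimal comma, K/M suffix, plain bytes).
SIZES = {
    "attr": ("56K", "79K"),
    "hdparm": ("39K", "56K"),
    "libmad": ("569", "132K"),
    "ghostscript": ("5,7K", "29M"),
    "run-parts": ("5,6K", "8,9K"),
}

# Assumption: suffixes are powers of 1024, as in `ls -h` output.
UNITS = {"K": 1024, "M": 1024 ** 2}

# Assumption: 0.7 is the maximum delta/package ratio mentioned in the message.
MAX_RATIO = 0.7


def parse_size(text):
    """Convert a size such as '5,7K', '29M' or '569' to a number of bytes."""
    text = text.strip().replace(",", ".")
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def delta_ratio(delta, package):
    """Return delta size divided by package size."""
    return parse_size(delta) / parse_size(package)


def over_threshold(sizes, max_ratio=MAX_RATIO):
    """Return the sorted names whose delta/package ratio exceeds max_ratio."""
    return sorted(
        name for name, (delta, package) in sizes.items()
        if delta_ratio(delta, package) > max_ratio
    )


if __name__ == "__main__":
    for name, (delta, package) in SIZES.items():
        print(f"{name:12s} {delta_ratio(delta, package):.4f}")
    print("over threshold:", over_threshold(SIZES))
```

With the rounded sizes from the listing, attr is the only one of these entries above 0.7, and hdparm falls just below it.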
On Friday 06 March 2009 09:24:17, Xavier wrote:
The ghostscript one is unbelievable : 5,7K vs 29M !
That's because I only changed a dep and just repackaged it (instead of recompiling).
--
Pierre Schmitz
Clemens-August-Straße 76
53115 Bonn
Phone 0228 9716608
Mobile 0160 95269831
Jabber pierre@jabber.archlinux.de
WWW http://www.archlinux.de
On Fri, Mar 6, 2009 at 11:01 AM, Pierre Schmitz <pierre@archlinux.de> wrote:
On Friday 06 March 2009 09:24:17, Xavier wrote:
The ghostscript one is unbelievable : 5,7K vs 29M !
That's because I only changed a dep and just repackaged it (instead of recompiling).
Ah, that's interesting to know :) So this probably skews the result a bit: without this package the ratio would be 12/44 instead of 12/73, but that's still nice. It also means that a minor repackaging change can be made to a big package without causing a big download for every user.
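To put numbers on the two ratios, here is a small Python sketch using only the rounded totals quoted in the thread (12M of deltas, 73M of packages, 29M for the ghostscript package). It assumes the 5,7K ghostscript delta is negligible next to the 12M delta total, which is why the numerator stays the same.

```python
# Rounded totals quoted in the thread, in megabytes.
DELTA_TOTAL = 12.0
PACKAGE_TOTAL = 73.0
GHOSTSCRIPT_PACKAGE = 29.0


def overall_ratio(delta_total, package_total):
    """Return total delta size divided by total package size."""
    return delta_total / package_total


# Ratio over all 20 updates.
with_ghostscript = overall_ratio(DELTA_TOTAL, PACKAGE_TOTAL)

# Ratio with the ghostscript package left out; its 5,7K delta is
# assumed to be too small to change the 12M delta total.
without_ghostscript = overall_ratio(DELTA_TOTAL, PACKAGE_TOTAL - GHOSTSCRIPT_PACKAGE)

if __name__ == "__main__":
    print(f"12/73 = {with_ghostscript:.3f}")
    print(f"12/44 = {without_ghostscript:.3f}")
```

So the deltas come to roughly 16% of the full download with ghostscript counted, and roughly 27% without it.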
Xavier wrote:
So probably this makes the result a bit wrong...
a minor repackaging change ... without causing a big download for every user.
I wouldn't consider this as a skew in the statistics for that exact reason. There are users who downloaded 29MB of updates for a (possibly) one-line change.
__________
Brendan Hide
[Xav, sorry for my direct reply.]
I generated deltas for the last 20 updates on my system. The total delta size is 12M, while the total package size is 73M. So the overall ratio is pretty good imo. Afaik, only one of the 20 deltas didn't reach the 0.7 ratio: attr. The ghostscript one is unbelievable: 5,7K vs 29M!
Here are the packages, deltas and their sizes:
Well, these results are cool. If deltas finally work perfectly in alpm, we may want to factor them out (move them) into a stand-alone xdelta client-server(?) thing, something similar to rsync (but we need vcs-like stuff here). I know this idea is very ambitious, but I have the impression that we (== Nathan and Xavier :) did something from which the whole open source community could profit. (Moreover, our alpm code would be cleaner :) At least I didn't find anything on the net similar to our low-bandwidth delta-based "vcs system".
Bye
P.S.: This is of course a utopian idea. :)
On Fri, Mar 6, 2009 at 3:38 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Well, these results are cool. If deltas finally work perfectly in alpm, we may want to factor them out (move them) into a stand-alone xdelta client-server(?) thing, something similar to rsync (but we need vcs-like stuff here). I know this idea is very ambitious, but I have the impression that we (== Nathan and Xavier :) did something from which the whole open source community could profit. (Moreover, our alpm code would be cleaner :) At least I didn't find anything on the net similar to our low-bandwidth delta-based "vcs system".
There would be a huge drawback with this idea: we couldn't simply re-use all the existing http/ftp mirrors.
However, I considered writing standalone apps in Python:
1) delta-add.py: analogous to repo-add, but for deltas.
   delta-add delta-database foo1.delta foo2.delta ...
2) delta-download.py: has a delta.conf file, analogous to pacman.conf, to specify delta mirrors.
   delta-download url: either use deltas from the delta mirrors to generate the wanted package, or download the url directly.
I was motivated by this solution until I realized that the repository system implemented by pacman is quite nice:
* pacman.conf specifies multiple possible urls for each repo (mirroring system)
* refresh only when needed, using .lastupdate
So I would have to reimplement / duplicate this system. Also, moving this to an external app means the database has to be refreshed for each download (for each download, the external client app has to be called).
So I lost my motivation and this project didn't go far.
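The "use deltas or download the url directly" choice described for delta-download could look roughly like the sketch below. This is only an illustration, not code from pacman or from the abandoned scripts: the `Delta` record, `find_delta_path`, `plan_download` and the 0.7 cut-off used as a default are all assumptions, and no networking or delta.conf parsing is shown.

```python
from dataclasses import dataclass


@dataclass
class Delta:
    """Hypothetical record for one entry of a delta database."""
    from_version: str
    to_version: str
    size: int  # bytes


def find_delta_path(deltas, installed, wanted):
    """Return the chain of deltas leading from installed to wanted, or None.

    Assumes at most one delta starts from any given version.
    """
    by_source = {d.from_version: d for d in deltas}
    path, current, seen = [], installed, set()
    while current != wanted:
        if current in seen or current not in by_source:
            return None  # loop in the database, or no delta from here
        seen.add(current)
        step = by_source[current]
        path.append(step)
        current = step.to_version
    return path


def plan_download(deltas, installed, wanted, package_size, max_ratio=0.7):
    """Choose between a delta chain and the full package.

    Deltas are used only when a chain exists and its total size stays
    below max_ratio times the full package size.
    """
    path = find_delta_path(deltas, installed, wanted)
    if path and sum(d.size for d in path) < max_ratio * package_size:
        return ("deltas", path)
    return ("full", [])


if __name__ == "__main__":
    git = [Delta("1.6.1.3-1", "1.6.2-1", 2_500_000)]
    print(plan_download(git, "1.6.1.3-1", "1.6.2-1", 6_400_000)[0])
    attr = [Delta("2.4.41-1", "2.4.43-1", 56_000)]
    print(plan_download(attr, "2.4.41-1", "2.4.43-1", 79_000)[0])
```

The sizes in the usage lines are approximations of the git and attr entries from the listing. Falling back to the full package whenever no chain is found means plain http/ftp mirrors keep working for users without deltas.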
On Fri, Mar 6, 2009 at 4:19 PM, Xavier <shiningxc@gmail.com> wrote:
On Fri, Mar 6, 2009 at 3:38 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Well, these results are cool. If deltas finally work perfectly in alpm, we may want to factor them out (move them) into a stand-alone xdelta client-server(?) thing, something similar to rsync (but we need vcs-like stuff here). I know this idea is very ambitious, but I have the impression that we (== Nathan and Xavier :) did something from which the whole open source community could profit. (Moreover, our alpm code would be cleaner :) At least I didn't find anything on the net similar to our low-bandwidth delta-based "vcs system".
There would be a huge drawback with this idea: we couldn't simply re-use all the existing http/ftp mirrors.
However, I considered writing standalone apps in Python:
1) delta-add.py: analogous to repo-add, but for deltas. delta-add delta-database foo1.delta foo2.delta ...
2) delta-download.py: has a delta.conf file, analogous to pacman.conf, to specify delta mirrors. delta-download url: either use deltas from the delta mirrors to generate the wanted package, or download the url directly.
I was motivated by this solution until I realized that the repository system implemented by pacman is quite nice: * pacman.conf specifies multiple possible urls for each repo (mirroring system) * refresh only when needed, using .lastupdate
So I would have to reimplement / duplicate this system. Also, moving this to an external app means the database has to be refreshed for each download (for each download, the external client app has to be called).
So I lost my motivation and this project didn't go far.
Hm, I stumbled on this by chance: http://tonelli.sns.it/pub/mennucc1/debdelta/README
It might be worth looking in more detail at how it works.
Nagy Gabor wrote:
... factorize (move) it to a stand-alone xdelta client-server(?) ...
P.S: This is of course an utopian idea. :)
I'd like to try to convince the powers-that-be here to sponsor a server in Atlanta.
Who would I need to speak to in order to get some real statistics on the project's resource requirements, so I can put together a proper proposal?
__________
Brendan Hide
participants (4)
- Brendan Hide
- Nagy Gabor
- Pierre Schmitz
- Xavier