On Mon, Feb 23, 2009 at 6:48 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
Questions which make the implementation complex:
* When do we generate deltas? As part of the db scripts?
Well, I think that would be practical. When a new package is added, grab the old one, generate a delta against it, and add the delta to the database. That seems doable.
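To make the "grab the old one, generate a delta" step a bit more concrete, here is a minimal sketch of just the naming part, assuming deltas are named like the one from my box (name, old version, "_to_", new version, arch). The helpers parse_package and delta_name are hypothetical, not anything in the db scripts, and the actual binary diff is left to whatever tool ends up being used.

```python
from typing import NamedTuple


class Package(NamedTuple):
    name: str
    version: str  # pkgver-pkgrel, e.g. "1.4.1-1"
    arch: str


def parse_package(filename: str) -> Package:
    """Split 'name-pkgver-pkgrel-arch.pkg.tar.gz' into its parts."""
    suffix = ".pkg.tar.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"not a package file: {filename}")
    stem = filename[: -len(suffix)]
    # The package name may itself contain dashes, so split from the right.
    name, pkgver, pkgrel, arch = stem.rsplit("-", 3)
    return Package(name, f"{pkgver}-{pkgrel}", arch)


def delta_name(old_file: str, new_file: str) -> str:
    """Build the delta filename for an upgrade from old_file to new_file."""
    old, new = parse_package(old_file), parse_package(new_file)
    if (old.name, old.arch) != (new.name, new.arch):
        raise ValueError("old and new package must share name and architecture")
    # Assumed naming scheme: <name>-<old version>_to_<new version>-<arch>.delta
    return f"{new.name}-{old.version}_to_{new.version}-{new.arch}.delta"
```

The db script would call something like delta_name(old, new) right before running the diff tool and registering the result in the database.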
* How long do we keep them? 10 previous versions? 5?
I would think 5 is more than enough. Allan suggested more complicated ways of cleaning up deltas, but we could indeed just use a simple limit like that. There is still the problem of working out which 5 deltas are the newest ones to keep.
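One simple way to pick the 5 newest could be file modification time. The sketch below assumes all deltas live in one directory, follow the "<name>-<old>_to_<new>-<arch>.delta" naming, and that mtime reflects the order they were generated in; comparing package versions would be more robust but needs a real version comparison. prune_deltas is a hypothetical helper.

```python
from pathlib import Path


def _is_delta_of(path: Path, pkgname: str) -> bool:
    """True if path looks like '<pkgname>-<pkgver>-<pkgrel>_to_...delta'."""
    if path.suffix != ".delta" or "_to_" not in path.name:
        return False
    old_part = path.name.split("_to_", 1)[0]
    prefix = pkgname + "-"
    # After the name only 'pkgver-pkgrel' may remain, i.e. exactly one dash.
    # This keeps 'foo' from matching deltas of a package called 'foo-bar'.
    return old_part.startswith(prefix) and old_part[len(prefix):].count("-") == 1


def prune_deltas(delta_dir: Path, pkgname: str, keep: int = 5) -> list[Path]:
    """Delete all but the `keep` newest deltas of pkgname; return the removed paths."""
    deltas = [p for p in delta_dir.iterdir() if _is_delta_of(p, pkgname)]
    # Assumption: newest delta == most recently modified file.
    deltas.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    stale = deltas[keep:]
    for path in stale:
        path.unlink()
    return stale
```

Running prune_deltas(Path("deltas"), "openjdk6") after each new delta is added would keep the directory bounded at 5 deltas per package.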
* How much additional space is this going to take? How do we set it up so that space-constrained mirrors can opt-out of the deltas?
That's a very good question that I hadn't considered. But well, I didn't expect to figure out and answer all the problems alone. I know nothing about mirror setup. It does seem that quite a few users are interested in deltas, though, so maybe some of them could help by providing numbers on how much space they would take.
I'm sure there's more, but that's just "off the cuff". In my eyes, this is a complex change that doesn't really seem to benefit too many people. If you download 3megs instead of 7, it's not that big of a deal and has so many more points of failure to contend with.
The benefit can be much greater than that. I just wrote a quick hack that generates a delta for each package upgrade on my box and stores them in a database. The first package that came in:
2,8M openjdk6-1.4-2_to_1.4.1-1-x86_64.delta
67M openjdk6-1.4.1-1-x86_64.pkg.tar.gz
On a decent 1MB/s line, that's a 1 minute difference for a single package. But yes, it is clearly more complex and there are clearly many more points of failure.
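For the record, the rough arithmetic behind that "1 minute" figure, as a small sketch. It assumes the sizes above are megabytes (2,8M read as 2.8 MB), a steady 1 MB/s, and it ignores the time needed to apply the delta locally.

```python
def download_seconds(size_mb: float, rate_mb_per_s: float = 1.0) -> float:
    """Time to fetch size_mb megabytes at a constant rate."""
    return size_mb / rate_mb_per_s


full = download_seconds(67.0)   # full openjdk6 package
delta = download_seconds(2.8)   # delta from 1.4-2 to 1.4.1-1
saved = full - delta

print(f"full: {full:.0f} s, delta: {delta:.1f} s, saved: {saved:.0f} s")
# prints "full: 67 s, delta: 2.8 s, saved: 64 s"
```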