[pacman-dev] delta support in libalpm

Xavier shiningxc at gmail.com
Mon Feb 23 05:02:09 EST 2009


On Mon, Feb 23, 2009 at 9:57 AM, Brendan Hide <brendan at swiftspirit.co.za> wrote:
> Xavier wrote:
>>
>> There has never been any real official interest in deltas. Supporting
>> them seems to require the ability to run a separate delta server,
>> which in turn seems to require a separate delta database. That implies
>> a new level of complexity and code bloat in pacman. Maybe it is worth
>> it, I don't know, but it still makes me wonder why we put all this
>> delta stuff in pacman to begin with. What was the problem with
>> XferCommand? It seemed like a great idea. Right now that
>> wget-xdelta.sh script is just a toy, but a much more powerful python
>> script could be written that has basically the same logic as pacman
>> currently has, plus the ability to fetch and parse a separate delta
>> database.
>
> Unless the server is out of disk space, I'm not too sure exactly why there's
> a requirement for a separate server. If pacman is distributed with the delta
> option turned on by default, the server doing the actual "serving" of the
> updates is probably going to have 60 to 85% less work to do.
>
> I will grant that there would be a new level of complexity involved, for
> example, if I've missed 4 updates, we'd have to "chain link" the tar.gz in
> my cache via 4 delta patches to get the current tar.gz.
>
> I believe that the following would be the simplest implementation both in
> terms of how much implementation work is needed and the probable
> effectiveness:
> Put delta files into a separate folder (thus also avoiding a snapshot from
> containing the deltas):
> http://archlinux.mirror.ac.za/delta/core/os/x86_64/kernel26-2.6.28.4-1-x86_64.kernel26-2.6.28.5-1.pkg.xd3.tar.gz
> Thus, I could do the following (bash pseudocode)
> base=http://archlinux.mirror.ac.za/delta/core/os/x86_64
> # assumes the listing yields one delta file name per line
> curl -s "$base/" > tmpfile
> grep "$pkgname" < tmpfile > listing
> curfile=$pkgname-$currentpkgversion-$pkgarch.tar.gz
> start=false
> while read -r delta
> do
>  # skip deltas until the one that starts from the cached version
>  case $delta in
>   "$pkgname-$currentpkgversion-$pkgarch".*) start=true ;;
>  esac
>  [ "$start" = true ] || continue
>  # fetch and apply each following delta; stop on any failure
>  wget "$base/$delta" && applydelta "$delta" "$curfile" || break
>  curfile=$(ls -rt | tail -n 1)
>  [ "$curfile" = "$pkgname-$newpkgversion-$pkgarch.tar.gz" ] && break
> done < listing
>
> The above requires no db implementation at all and can work well even using
> the above very simple logic.
> And yes, by my own standards, the above is very bad bash pseudo-code. :P
>
> Of the above, what is already implemented in pacman?
>

Everything is already implemented in pacman, with more complex logic
(which might be totally useless after all).
For each package in a sync db, there is a deltas file alongside the
depends and desc files, which basically contains the list of deltas for
that package and their sizes. With this information and the contents
of the file cache, pacman computes the shortest path (in terms of
download size) to the final package.
Applying that logic to an example:
if you have file v1 in your cache, you want to upgrade to v3, and
there are three deltas for this package: v1tov2, v2tov3 and v1tov3.
If v1tov2 + v2tov3 is smaller than v1tov3, it will download the first
two deltas and apply them to get v3. Otherwise it will download the
third one.
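
As a rough sketch of that selection (this is not pacman's actual C code;
the function name, the tuple layout and the sizes are made up for
illustration), the choice can be treated as a shortest-path search over
package versions, here using Dijkstra's algorithm in Python:

```python
import heapq

def cheapest_delta_path(deltas, cached, target):
    """Return (total_size, [(from, to), ...]) for the cheapest delta chain.

    deltas is a list of hypothetical (from_version, to_version, size)
    tuples; returns None when no chain leads from cached to target.
    """
    # Build an adjacency list: version -> [(next_version, delta_size)]
    graph = {}
    for src, dst, size in deltas:
        graph.setdefault(src, []).append((dst, size))

    # Dijkstra: always expand the version reachable with the least download
    heap = [(0, cached, [])]
    done = set()
    while heap:
        total, version, path = heapq.heappop(heap)
        if version == target:
            return total, path
        if version in done:
            continue
        done.add(version)
        for dst, size in graph.get(version, []):
            if dst not in done:
                heapq.heappush(heap, (total + size, dst, path + [(version, dst)]))
    return None

# The example above, with made-up sizes: the two small deltas win.
deltas = [("v1", "v2", 300), ("v2", "v3", 200), ("v1", "v3", 600)]
print(cheapest_delta_path(deltas, "v1", "v3"))
```

With these sizes the chain v1tov2 + v2tov3 (500) beats v1tov3 (600); if
v1tov3 were 400 instead, the single delta would be picked.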

The problem with this implementation (besides probably being overkill)
is that it requires information in the sync databases. So either it
takes a big official effort to integrate this stuff and add deltas
to all the official databases, or, otherwise, I don't know: you need to
fully mirror the repository you want to add deltas to, generate the
deltas (maybe during mirror sync), add them to your database, and then
host everything somewhere (the packages + the deltas + the database
with delta info).

