Re: [pacman-dev] Add delta creation to repo-add.

29 Feb 2008

On Thu, Feb 28, 2008 at 09:21:57PM +0100, Xavier wrote:
...
On Fri, Feb 15, 2008 at 08:34:52PM -0500, Nathan Jones wrote:
...
For the first patch, I added the -n flag to gzip in order to prevent
the md5sum problem. The apply-delta script is not as efficient as it
could be. There is an extra gunzip+gzip step that could be avoided
if we create a delta on the .tar rather than the .tar.gz. Doing so
would complicate makepkg a bit, so I stuck with the extra step.
Hm well, I have a stupid question: why is this better than the
previous way of recreating the package?
Before your patch, we only had overhead once at package / delta
creation time, while after, we have it every time the delta is
applied, right?
The problem is that the .gz file xdelta produces when it applies a
delta differs from the originally gzipped file, even though the
uncompressed contents are identical. Example:

$ xdelta delta old.gz new.gz delta
$ xdelta patch delta old.gz new2.gz
$ md5sum new*
2b77f5103345b4bd09ab09b4710d1a9d  new.gz
48df25a60c1c689e9392b086a2d40e5c  new2.gz
$ gunzip *.gz
$ md5sum new*
aa898bcbb59cdf9066b5fb4645440e2c  new
aa898bcbb59cdf9066b5fb4645440e2c  new2

The -n parameter is necessary because by default gzip stores the
mtime in the header, leading to different md5sums after the user
applies the delta.
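
To illustrate the mtime point, here is a minimal sketch that uses only
gzip and md5sum (no xdelta involved). The file name pkg.tar and the two
timestamps are made up for the example. The only claim is that
compressing the same bytes with -n should give the same checksum
regardless of the file's mtime, while without -n GNU gzip normally
does not.

```shell
#!/bin/sh
# Sketch only: pkg.tar is a stand-in payload, not a real package.
set -e
workdir=$(mktemp -d)
printf 'example package payload\n' > "$workdir/pkg.tar"

# Print the md5sum of pkg.tar compressed with the given gzip options.
gz_md5() {
    gzip "$@" -c "$workdir/pkg.tar" | md5sum | cut -d' ' -f1
}

# First compression: the file carries one mtime.
touch -t 200802150000 "$workdir/pkg.tar"
default_a=$(gz_md5)
nostamp_a=$(gz_md5 -n)

# Second compression: same bytes, different mtime.
touch -t 200802290000 "$workdir/pkg.tar"
default_b=$(gz_md5)
nostamp_b=$(gz_md5 -n)

echo "gzip    : $default_a  $default_b"
echo "gzip -n : $nostamp_a  $nostamp_b"

rm -rf "$workdir"
```

The two "gzip -n" checksums should match each other. Whether the plain
"gzip" pair differs depends on the gzip implementation storing the
mtime, which is the default behaviour described above.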

This wasn't really a problem before because the package would receive
the new md5sum during the makepkg step, and the old md5sum would never
be seen. Now, a packager would create the package with one md5sum, and
repo-add would change it.
...
My comment here is rather on the previous code that you refactored
than your patch itself. I really wish things could be much simpler
here, it's all pretty complicated.
What about dropping all this filename parsing / version comparison
stuff, and trying to reimplement this in a much more basic way in
repo-add?  Since repo-add removes an entry and adds a new one in the
database, we should be able to easily fetch the information from
these entries, and generate a delta between both. What do you think?
This would take away the ability for makepkg to create deltas. If that
won't be a problem then this sounds like a decent idea.
...
I think this should work perfectly when using the core / extra /
community repos.  Using the testing repo complicates everything
though, because packages are moved between repos. But is it really a
big problem if the tools don't handle this situation automatically?
I don't know exactly how testing works so I can't say for sure, but I
would guess it would work. Testing users would get 1.0-1.1, 1.1-1.2,
1.2-1.3, and core users would get 1.0-1.3 once the package moved.
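
As a rough illustration of that guess (not how repo-add works today),
the sketch below derives the delta names from a version history. The
version list is hypothetical and so are the variable names. It only
shows the naming pattern: one delta per consecutive pair for testing,
and a single first-to-last delta for core.

```shell
#!/bin/sh
# Hypothetical version history of one package while it sits in testing.
versions="1.0 1.1 1.2 1.3"

# Testing: one delta per consecutive pair of versions.
testing_deltas=""
prev=""
for ver in $versions; do
    if [ -n "$prev" ]; then
        testing_deltas="$testing_deltas $prev-$ver"
    fi
    prev=$ver
done
testing_deltas=${testing_deltas# }   # drop the leading space

# Core: a single delta from the version core had to the one that moved.
first=${versions%% *}
last=${versions##* }
core_delta="$first-$last"

echo "testing: $testing_deltas"
echo "core:    $core_delta"
```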
...
Maybe we could allow a way to let the packager specify from which
packages deltas should be created.
I think the automated way should work in almost all cases. Do you
have a scenario in mind where this would be useful?

If these patches are accepted, the md5sum problem will be gone, which
means a packager can manually run the xdelta command if it is truly
needed.