[pacman-dev] Add delta creation to repo-add.

Nathan Jones nathanj at insightbb.com
Thu Feb 28 19:31:39 EST 2008


On Thu, Feb 28, 2008 at 09:21:57PM +0100, Xavier wrote:
> On Fri, Feb 15, 2008 at 08:34:52PM -0500, Nathan Jones wrote:
> > For the first patch, I added the -n flag to gzip in order to prevent
> > the md5sum problem. The apply-delta script is not as efficient as it
> > could be. There is an extra gunzip+gzip step that could be avoided
> > if we create a delta on the .tar rather than the .tar.gz. Doing so
> > would complicate makepkg a bit, so I stuck with the extra step.
> > 
> 
> Hm well, I have a stupid question: why is this better than the
> previous way of recreating the package?
> Before your patch, we only had overhead once at package / delta
> creation time, while after, we have it every time the delta is
> applied, right?

The problem is that the file xdelta recreates when it applies a delta
to a gzipped file differs from the original gzipped file, even though
the uncompressed contents are identical. Example:

$ xdelta delta old.gz new.gz delta
$ xdelta patch delta old.gz new2.gz
$ md5sum new*
2b77f5103345b4bd09ab09b4710d1a9d  new.gz
48df25a60c1c689e9392b086a2d40e5c  new2.gz
$ gunzip *.gz
$ md5sum new*
aa898bcbb59cdf9066b5fb4645440e2c  new
aa898bcbb59cdf9066b5fb4645440e2c  new2

The -n parameter is necessary because, by default, gzip stores the
mtime in the header, which leads to different md5sums after the user
applies the delta.
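
To illustrate the mtime point, here is a minimal sketch (not part of
the patch). It compresses the same content twice with different
mtimes, once with gzip's defaults and once with -n. The directory and
file names (one/, two/, pkg.tar) are made up for the example, and cmp
stands in for md5sum to keep it portable.

```shell
#!/bin/sh
# Sketch only: paths and file names below are hypothetical.
set -e
tmp=$(mktemp -d)
mkdir "$tmp/one" "$tmp/two"

# Same content and same base name, but different modification times.
printf 'same payload\n' > "$tmp/one/pkg.tar"
printf 'same payload\n' > "$tmp/two/pkg.tar"
touch -t 200801010000 "$tmp/one/pkg.tar"
touch -t 200802010000 "$tmp/two/pkg.tar"

# Default: gzip records the file's mtime in the header.
gzip -c "$tmp/one/pkg.tar" > "$tmp/one/default.gz"
gzip -c "$tmp/two/pkg.tar" > "$tmp/two/default.gz"

# With -n: no original name or timestamp is stored.
gzip -n -c "$tmp/one/pkg.tar" > "$tmp/one/nostamp.gz"
gzip -n -c "$tmp/two/pkg.tar" > "$tmp/two/nostamp.gz"

# Compare the compressed files byte for byte.
if cmp -s "$tmp/one/default.gz" "$tmp/two/default.gz"; then
    default_same=yes
else
    default_same=no
fi
if cmp -s "$tmp/one/nostamp.gz" "$tmp/two/nostamp.gz"; then
    n_same=yes
else
    n_same=no
fi

echo "default identical: $default_same"
echo "-n identical:      $n_same"
rm -rf "$tmp"
```

With GNU gzip the -n pair should come out identical, while the default
pair usually does not, since only the stored timestamp differs.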

This wasn't really a problem before because the package would receive
the new md5sum during the makepkg step, and the old md5sum would never
be seen. Now, a packager would create the package with one md5sum, and
repo-add would change it.

> My comment here is rather on the previous code that you refactored
> than your patch itself. I really wish things could be much simpler
> here, it's all pretty complicated.
> What about dropping all this filename parsing / version comparison
> stuff, and trying to reimplement this in a much more basic way in
> repo-add?  Since repo-add removes an entry and adds a new one in the
> database, we should be able to easily fetch the information from
> these entries, and generate a delta between both. What do you think?

This would take away the ability for makepkg to create deltas. If that
won't be a problem then this sounds like a decent idea.

> I think this should work perfectly when using core / extra / community
> repos.  Using testing repo complicates everything though, because
> packages are moved between repos. But is it really a big problem if
> the tools don't handle this situation automatically?

I don't know exactly how testing works so I can't say for sure, but I
would guess it would work. Testing users would get 1.0-1.1, 1.1-1.2,
1.2-1.3, and core users would get 1.0-1.3 once the package moved.

> Maybe we could allow a way to let the packager specify from which
> packages deltas should be created.

I think the automated way should work in almost all cases. Do you have
a scenario in mind where this would be useful?

If these patches are accepted, the md5sum problem will be gone which
means a packager can manually run the xdelta command if it is truly
needed.



