[pacman-dev] Add delta creation to repo-add.

Xavier shiningxc at gmail.com
Tue Mar 4 17:28:51 EST 2008


On Thu, Feb 28, 2008 at 07:31:39PM -0500, Nathan Jones wrote:
> 
> The problem is that the file that xdelta creates after compression
> differs from the gzipped version. Example:
> 
> $ xdelta delta old.gz new.gz delta
> $ xdelta patch delta old.gz new2.gz
> $ md5sum new*
> 2b77f5103345b4bd09ab09b4710d1a9d  new.gz
> 48df25a60c1c689e9392b086a2d40e5c  new2.gz
> $ gunzip *.gz
> $ md5sum new*
> aa898bcbb59cdf9066b5fb4645440e2c  new
> aa898bcbb59cdf9066b5fb4645440e2c  new2
> 
> The -n parameter is necessary because by default gzip will store the
> mtime in the header leading to different md5sums after the user applies
> the delta. 
> 
> This wasn't really a problem before because the package would receive
> the new md5sum during the makepkg step, and the old md5sum would never
> be seen. Now, a packager would create the package with one md5sum, and
> repo-add would change it.
> 

Oh ok, I see, that makes sense. Well, it might be a good idea to do it that
way for now, and later see whether we would rather work directly on the
tar.
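
To make the mtime point concrete, here is a small sketch. The file names and
timestamps are made up for the example. It compresses the same content twice
with different mtimes and checks whether the outputs match when -n is used.

```shell
#!/bin/sh
# Sketch only: shows that "gzip -n" output does not depend on the input mtime.
tmp=$(mktemp -d) || exit 1
printf 'same content\n' > "$tmp/new"

# First pass: pretend the file was last modified on 2008-01-01.
touch -t 200801010000 "$tmp/new"
gzip -c    "$tmp/new" > "$tmp/a.gz"
gzip -n -c "$tmp/new" > "$tmp/a-n.gz"

# Second pass: same content, different mtime.
touch -t 200803040000 "$tmp/new"
gzip -c    "$tmp/new" > "$tmp/b.gz"
gzip -n -c "$tmp/new" > "$tmp/b-n.gz"

# Without -n the result depends on whether this gzip stores the mtime.
if cmp -s "$tmp/a.gz" "$tmp/b.gz"; then default=identical; else default=differ; fi
# With -n no name or timestamp is stored, so the two outputs should match.
if cmp -s "$tmp/a-n.gz" "$tmp/b-n.gz"; then with_n=identical; else with_n=differ; fi

echo "default: $default"
echo "with -n: $with_n"
rm -rf "$tmp"
```

If the files compressed with -n match, their md5sums match too, which is what
the packager needs.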

> > My comment here is rather on the previous code that you refactored
> > than your patch itself. I really wish things could be much simpler
> > here, it's all pretty complicated.
> > What about dropping all this filename parsing / version comparison
> > stuff, and trying to reimplement this in a much more basic way in
> > repo-add?  Since repo-add removes an entry and adds a new one in the
> > database, we should be able to easily fetch the information from
> > these entries, and generate a delta between both. What do you think?
> 
> This would take away the ability for makepkg to create deltas. If that
> won't be a problem then this sounds like a decent idea.
> 
> > I think this should work perfectly when using core / extra / community
> > repos.  Using testing repo complicates everything though, because
> > packages are moved between repos. But is it really a big problem if
> > the tools don't handle this situation automatically?
> 
> I don't know exactly how testing works so I can't say for sure, but I
> would guess it would work. Testing users would get 1.0-1.1, 1.1-1.2,
> 1.2-1.3, and core users would get 1.0-1.3 once the package moved.
> 
> > Maybe we could allow a way to let the packager specify from which
> > packages deltas should be created.
> 
> I think the automated way should work in most all cases. Do you have a
> scenario in mind where this would be useful?
> 
> If these patches are accepted, the md5sum problem will be gone which
> means a packager can manually run the xdelta command if it is truly
> needed.

Sorry if it wasn't clear, my last two comments only applied to the case where
xdelta support lives solely in repo-add.

I still think this is a good idea. In my opinion, it would allow a much
simpler and cleaner implementation, while still behaving decently. And
in the end, the packager might not need to do any manual work.

Suppose we have the following files in repo/:
repo.db.tar.gz (foo-2/{depends,desc,delta})
foo-2.pkg.tar.gz
foo_1_to_2.delta

Now, we put a new foo-3.pkg.tar.gz package in there.
We create the db entry with: repo-add repo.db.tar.gz foo-3.pkg.tar.gz
repo-add could then do the following steps:
1) find the current foo-2 entry
2) get the foo-2 filename from foo-2/desc : foo-2.pkg.tar.gz
3) if foo-2.pkg.tar.gz exists, generate the foo_2_to_3.delta
4) clean up the foo-2/delta file by removing obsolete entries (delta files
that are no longer available in the current repo directory)
5) add the new foo_2_to_3.delta to foo-2/delta, then save foo-2/delta
somewhere
6) remove the old foo-2/ entry and create the new foo-3/ entry
7) restore the delta metainfo file to foo-3/
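
A rough sketch of these steps in shell, just to make the idea concrete. It is
not the actual repo-add code. It assumes the db is already extracted to a
directory, that desc stores each value on the line after a %FIELD% marker,
and that the delta file simply lists one delta filename per line. The names
add_with_delta, get_field and DELTA_CMD are made up for the example.

```shell
#!/bin/bash
# Hypothetical sketch of steps 1-7; not the real repo-add implementation.

# Command that creates a delta: <cmd> old new out. Overridable for testing.
: "${DELTA_CMD:=xdelta delta}"

# get_field <desc-file> <FIELD>: print the line following the %FIELD% marker.
get_field() {
    awk -v key="%$2%" '$0 == key { getline; print; exit }' "$1"
}

# add_with_delta <dbdir> <repodir> <pkgname> <newver>
add_with_delta() {
    local dbdir repodir name newver entry old_entry old_file oldver
    local new_file delta saved line ret
    dbdir=$1; repodir=$2; name=$3; newver=$4
    new_file="$name-$newver.pkg.tar.gz"
    old_entry=""
    saved=$(mktemp) || return 1

    # 1) find the current entry through its %NAME% field, not its dir name
    for entry in "$dbdir"/*/; do
        [ -f "${entry}desc" ] || continue
        if [ "$(get_field "${entry}desc" NAME)" = "$name" ]; then
            old_entry=${entry%/}
            break
        fi
    done

    if [ -n "$old_entry" ]; then
        # 2) get the old package filename (and version) from desc
        old_file=$(get_field "$old_entry/desc" FILENAME)
        oldver=$(get_field "$old_entry/desc" VERSION)

        # 4) keep only the delta entries whose file still exists in the repo
        if [ -f "$old_entry/delta" ]; then
            while IFS= read -r line; do
                if [ -f "$repodir/$line" ]; then printf '%s\n' "$line"; fi
            done < "$old_entry/delta" > "$saved"
        fi

        # 3) and 5) if the old package is still there, create the new delta
        if [ -n "$old_file" ] && [ -f "$repodir/$old_file" ]; then
            delta="${name}_${oldver}_to_${newver}.delta"
            ret=0
            $DELTA_CMD "$repodir/$old_file" "$repodir/$new_file" \
                "$repodir/$delta" || ret=$?
            # Assumption: xdelta 1.x exits with 1 when the inputs differ,
            # so both 0 and 1 are treated as success here.
            if [ "$ret" -le 1 ] && [ -f "$repodir/$delta" ]; then
                printf '%s\n' "$delta" >> "$saved"
            fi
        fi

        # 6) remove the old entry
        rm -rf "$old_entry"
    fi

    # 6) create the new entry (only the fields this sketch needs)
    mkdir -p "$dbdir/$name-$newver"
    printf '%%FILENAME%%\n%s\n\n%%NAME%%\n%s\n\n%%VERSION%%\n%s\n' \
        "$new_file" "$name" "$newver" > "$dbdir/$name-$newver/desc"

    # 7) restore the saved delta metainfo into the new entry
    if [ -s "$saved" ]; then cp "$saved" "$dbdir/$name-$newver/delta"; fi
    rm -f "$saved"
}
```

With the repo above, this would be called as: add_with_delta db repo foo 3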

So at each package addition, we create and add at most one delta, and only if
an old package existed. We also keep the old deltas. And the code added for
delta support needs no filename guessing at all.

Well, I admit I am not sure it all makes sense yet. There might be things I
overlooked, or maybe it won't be as simple as I think. I would need to try
implementing it to be sure.



