[pacman-dev] Add delta creation to repo-add.
For the first patch, I added the -n flag to gzip in order to prevent the md5sum problem. The apply-delta script is not as efficient as it could be. There is an extra gunzip+gzip step that could be avoided if we create a delta on the .tar rather than the .tar.gz. Doing so would complicate makepkg a bit, so I stuck with the extra step. For the second patch, I created a second function, create_xdelta_latest, which creates a delta for the latest package with a given package name. I did this because otherwise a delta would be created for every package given on the command line which is annoying for me because I use *.pkg.tar.gz. Problems: * gzip upgrade could produce different md5sums - could have had the problem before if a zlib upgrade did the same thing * all delta files that currently exist will not work (probably not a problem)
Without the -n flag, gzip stores the modification time inside the header which can produce different md5sums for the same file contents. By using this flag, it will be possible for the package to be generated using a delta and have the same md5sum. A new script, apply-delta, is created to apply a delta file and recompress the resulting package. This will produce a package with the correct md5sum. Signed-off-by: Nathan Jones <nathanj@insightbb.com> --- lib/libalpm/sync.c | 8 ++++---- scripts/.gitignore | 1 + scripts/Makefile.am | 3 +++ scripts/apply-delta.sh.in | 30 ++++++++++++++++++++++++++++++ scripts/makepkg.sh.in | 24 ++++++------------------ 5 files changed, 44 insertions(+), 22 deletions(-) create mode 100644 scripts/apply-delta.sh.in diff --git a/lib/libalpm/sync.c b/lib/libalpm/sync.c index 4101158..e2e4ccb 100644 --- a/lib/libalpm/sync.c +++ b/lib/libalpm/sync.c @@ -738,14 +738,14 @@ static int apply_deltas(pmtrans_t *trans, alpm_list_t *patches) } /* an example of the patch command: (using /cache for cachedir) - * xdelta patch /cache/pacman_3.0.0-1_to_3.0.1-1-i686.delta \ - * /cache/pacman-3.0.0-1-i686.pkg.tar.gz \ - * /cache/pacman-3.0.1-1-i686.pkg.tar.gz + * apply-delta /cache/pacman_3.0.0-1_to_3.0.1-1-i686.delta \ + * /cache/pacman-3.0.0-1-i686.pkg.tar.gz \ + * /cache/pacman-3.0.1-1-i686.pkg.tar.gz */ /* build the patch command */ snprintf(command, PATH_MAX, - "xdelta patch" /* the command */ + "apply-delta" /* the command */ " %s/%s" /* the delta */ " %s/%s-%s-%s" PKGEXT /* the 'from' package */ " %s/%s-%s-%s" PKGEXT, /* the 'to' package */ diff --git a/scripts/.gitignore b/scripts/.gitignore index f2f19fd..214c23d 100644 --- a/scripts/.gitignore +++ b/scripts/.gitignore @@ -1,3 +1,4 @@ +apply-delta makepkg pacman-optimize rankmirrors diff --git a/scripts/Makefile.am b/scripts/Makefile.am index 3185a47..ddcbd7f 100644 --- a/scripts/Makefile.am +++ b/scripts/Makefile.am @@ -2,6 +2,7 @@ AUTOMAKE_OPTIONS = std-options bin_SCRIPTS = \ + apply-delta \ makepkg \ pacman-optimize \ rankmirrors \ @@ -9,6 +10,7 @@ bin_SCRIPTS = \ repo-remove EXTRA_DIST = \ + apply-delta.sh.in \ makepkg.sh.in \ pacman-optimize.sh.in \ rankmirrors.py.in \ @@ -44,6 +46,7 @@ $(bin_SCRIPTS): Makefile chmod a-w $@.tmp mv $@.tmp $@ +apply-delta: $(srcdir)/apply-delta.sh.in makepkg: $(srcdir)/makepkg.sh.in pacman-optimize: $(srcdir)/pacman-optimize.sh.in rankmirrors: $(srcdir)/rankmirrors.py.in diff --git a/scripts/apply-delta.sh.in b/scripts/apply-delta.sh.in new file mode 100644 index 0000000..9dca177 --- /dev/null +++ b/scripts/apply-delta.sh.in @@ -0,0 +1,30 @@ +#!/bin/bash -e +# +# apply-delta - apply xdelta patch and then recompress using gzip +# @configure_input@ +# +# Copyright (c) 2008 by Judd Vinet <jvinet@zeroflux.org> +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +withoutgz="$(echo $3 | sed -e 's/.gz$//')" + +# patch +xdelta patch "$1" "$2" "$3" + +# now re-gzip +gzip -d "$3" +gzip -f -n "$withoutgz" + +# vim: set ts=2 sw=2 noet: diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index cecda1d..b9b3ed7 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -869,7 +869,7 @@ create_package() { local pkg_file="$PKGDEST/${pkgname}-${pkgver}-${pkgrel}-${CARCH}${PKGEXT}" - if ! bsdtar -czf "$pkg_file" $comp_files $(ls); then + if ! bsdtar -cf - $comp_files $(ls) | gzip -n > "$pkg_file"; then error "$(gettext "Failed to create package file.")" exit 1 # TODO: error code fi @@ -910,28 +910,16 @@ create_xdelta() { if [ "$base_file" != "" ]; then msg "$(gettext "Making delta from version %s...")" "$latest_version" + msg2 "$(gettext "NOTE: the delta should ONLY be distributed with this tarball")" + local delta_file="$PKGDEST/$pkgname-${latest_version}_to_$pkgver-$pkgrel-$CARCH.delta" local ret=0 - # xdelta will decompress base_file & pkg_file into TMP_DIR (or /tmp if - # TMP_DIR is unset) then perform the delta on the resulting tars + # xdelta will decompress base_file & pkg_file into TMPDIR (or /tmp if + # TMPDIR is unset) then perform the delta on the resulting tars xdelta delta "$base_file" "$pkg_file" "$delta_file" || ret=$? - if [ $ret -eq 0 -o $ret -eq 1 ]; then - # Generate the final gz using xdelta for compression. xdelta will be our - # common denominator compression utility between the packager and the - # users. makepkg and pacman must use the same compression algorithm or - # the delta generated package may not match, producing md5 checksum - # errors. - msg2 "$(gettext "Recreating package tarball from delta to match md5 signatures")" - msg2 "$(gettext "NOTE: the delta should ONLY be distributed with this tarball")" - ret=0 - xdelta patch "$delta_file" "$base_file" "$pkg_file" || ret=$? - if [ $ret -ne 0 ]; then - error "$(gettext "Could not generate the package from the delta.")" - exit 1 - fi - else + if [ $ret -ne 0 -a $ret -ne 1 ]; then warning "$(gettext "Delta was not able to be created.")" fi else -- 1.5.4
This adds a -d|--delta flag to repo-add which will create delta files for the packages being added to the repository. Deltas will only be created for the package with the latest version, so that `repo-add db *.pkg.tar.gz' will not generate a delta for older packages. The delta creation code was refactored out of makepkg into a new file create-delta. This new file is sourced into both makepkg and repo-add. Signed-off-by: Nathan Jones <nathanj@insightbb.com> --- doc/repo-add.8.txt | 10 +++- scripts/.gitignore | 1 + scripts/Makefile.am | 6 ++ scripts/create-delta.sh.in | 141 ++++++++++++++++++++++++++++++++++++++++++++ scripts/makepkg.sh.in | 59 ++---------------- scripts/repo-add.sh.in | 54 ++++++++++++++++- 6 files changed, 213 insertions(+), 58 deletions(-) create mode 100644 scripts/create-delta.sh.in diff --git a/doc/repo-add.8.txt b/doc/repo-add.8.txt index 80faef4..283aa2c 100644 --- a/doc/repo-add.8.txt +++ b/doc/repo-add.8.txt @@ -16,7 +16,7 @@ repo-add - package database maintenance utility Synopsis -------- -repo-add <path-to-db> <package> ... +repo-add [options] <path-to-db> <package> ... repo-remove <path-to-db> <packagename> ... @@ -34,6 +34,14 @@ specified on the command line. Multiple packages to remove can be specified on the command line. +Options +------- +*-d, --delta* (repo-add only):: + Generate delta files for added packages. Delta files will only be + generated for the latest version of the package. A delta file + will only be generated if it does not already exist. + + See Also -------- linkman:makepkg[8], linkman:pacman[8] diff --git a/scripts/.gitignore b/scripts/.gitignore index 214c23d..5c0bf5a 100644 --- a/scripts/.gitignore +++ b/scripts/.gitignore @@ -1,4 +1,5 @@ apply-delta +create-delta makepkg pacman-optimize rankmirrors diff --git a/scripts/Makefile.am b/scripts/Makefile.am index ddcbd7f..e6194b9 100644 --- a/scripts/Makefile.am +++ b/scripts/Makefile.am @@ -1,8 +1,11 @@ # enforce that all scripts have a --help and --version option AUTOMAKE_OPTIONS = std-options +cachedir = ${localstatedir}/cache/pacman/pkg/ + bin_SCRIPTS = \ apply-delta \ + create-delta \ makepkg \ pacman-optimize \ rankmirrors \ @@ -11,6 +14,7 @@ bin_SCRIPTS = \ EXTRA_DIST = \ apply-delta.sh.in \ + create-delta.sh.in \ makepkg.sh.in \ pacman-optimize.sh.in \ rankmirrors.py.in \ @@ -30,6 +34,7 @@ edit = sed \ -e 's|@PACKAGE_BUGREPORT[@]|$(PACKAGE_BUGREPORT)|g' \ -e 's|@PACKAGE_NAME[@]|$(PACKAGE_NAME)|g' \ -e 's|@DBEXT[@]|$(DBEXT)|g' \ + -e 's|@CACHEDIR[@]|$(cachedir)|g' \ -e 's|@configure_input[@]|Generated from $@.in; do not edit by hand.|g' ## All the scripts depend on Makefile so that they are rebuilt when the @@ -47,6 +52,7 @@ $(bin_SCRIPTS): Makefile mv $@.tmp $@ apply-delta: $(srcdir)/apply-delta.sh.in +create-delta: $(srcdir)/create-delta.sh.in makepkg: $(srcdir)/makepkg.sh.in pacman-optimize: $(srcdir)/pacman-optimize.sh.in rankmirrors: $(srcdir)/rankmirrors.py.in diff --git a/scripts/create-delta.sh.in b/scripts/create-delta.sh.in new file mode 100644 index 0000000..38711d3 --- /dev/null +++ b/scripts/create-delta.sh.in @@ -0,0 +1,141 @@ +#!/bin/bash -e +# +# create-delta - create xdelta patches for packages +# @configure_input@ +# +# Copyright (c) 2008 by Judd Vinet <jvinet@zeroflux.org> +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +# This file is meant to be sourced into other scripts. The calling script must +# define the functions msg, msg2, warning, and error. + +# create_xdelta_latest - will find the latest package with the given pkgname +# and create a delta for it. +# +# params: +# $1 - the name of the package +# $2 - the arch of the package +# $3 - the directory where the package is located +# $4 - the extension of packages +create_xdelta_latest() { + local pkgname=$1 + local arch=$2 + local pkgdest=$3 + local pkgext=$4 + local cache_dir="@CACHEDIR@" + local pkginfo="$(mktemp -t xdelta-pkginfo.XXXXXXXXX)" + + local old_file old_version latest_version latest_file prev_version prev_file + for old_file in $(ls {"$cache_dir","$pkgdest"}/${pkgname}-*-*$pkgext 2>/dev/null); do + bsdtar -xOf "$old_file" .PKGINFO > "$pkginfo" || continue + if [ "$(cat "$pkginfo" | grep '^pkgname = ')" != "pkgname = $pkgname" ]; then + continue # Package name does not match. + elif [ "$(cat "$pkginfo" | grep '^arch = ')" != "arch = $arch" ] ; then + continue # Not same arch. + fi + + old_version="$(cat "$pkginfo" | grep '^pkgver = ' | sed 's/^pkgver = //')" + + local vercmp=$(vercmp "$old_version" "$latest_version") + if [ $vercmp -gt 0 ]; then + prev_version=$latest_version + prev_file=$latest_file + latest_version=$old_version + latest_file=$old_file + fi + done + + rm -f "$pkginfo" + + if [ -n "$latest_file" -a -n "$prev_file" ]; then + create_xdelta_file "$latest_file" "$arch" "$latest_version" "$pkgdest" "$pkgext" 0 "$prev_file" "$prev_version" + fi +} + + +# create_xdelta_file - will create a delta for the package filename given. +# +# params: +# $1 - the filename of the package +# $2 - the arch of the package +# $3 - the version and release of the package +# $4 - the directory where the package is located +# $5 - the extension of packages +# $6 - 0 if an existing delta file should not be overwritten +# $7 - the filename of the previous package (blank if not known) +# $8 - the version of the previous package (blank if not known) +create_xdelta_file() { + if [ ! "$(type -p xdelta)" ]; then + error "$(gettext "Cannot find the xdelta binary! Is xdelta installed?")" + return + fi + + local pkg_file=$1 + local arch=$2 + local pkgverrel=$3 + local pkgdest=$4 + local pkgext=$5 + local overwrite=$6 + local prev_file=$7 + local prev_version=$8 + local cache_dir="@CACHEDIR@" + local pkginfo="$(mktemp -t xdelta-pkginfo.XXXXXXXXX)" + + if [ -z "$prev_file" -o -z "$prev_version" ]; then + local old_file old_version + for old_file in $(ls {"$cache_dir","$pkgdest"}/${pkgname}-*-*$pkgext 2>/dev/null); do + bsdtar -xOf "$old_file" .PKGINFO > "$pkginfo" || continue + if [ "$(cat "$pkginfo" | grep '^pkgname = ')" != "pkgname = $pkgname" ]; then + continue # Package name does not match. + elif [ "$(cat "$pkginfo" | grep '^arch = ')" != "arch = $arch" ] ; then + continue # Not same arch. + fi + + old_version="$(cat "$pkginfo" | grep '^pkgver = ' | sed 's/^pkgver = //')" + + # old_version may include the target package, only use the old versions + local vercmp=$(vercmp "$old_version" "$prev_version") + if [ "$old_version" != "$pkgverrel" -a $vercmp -gt 0 ]; then + prev_version=$old_version + prev_file=$old_file + fi + done + fi + + rm -f "$pkginfo" + + if [ -n "$prev_file" ]; then + local delta_file="$pkgdest/$pkgname-${prev_version}_to_$pkgverrel-$arch.delta" + local ret=0 + + # return if not overwriting an existing delta + if [ $overwrite -eq 0 -a -e "$delta_file" ]; then + return + fi + + msg "$(gettext "Creating delta from %s to %s...")" "$pkgname-$prev_version" "$pkgname-$pkgverrel" + msg2 "$(gettext "NOTE: the delta should ONLY be distributed with this tarball")" + + # xdelta will decompress prev_file & pkg_file into TMPDIR (or /tmp if + # TMPDIR is unset) then perform the delta on the resulting tars + xdelta delta "$prev_file" "$pkg_file" "$delta_file" || ret=$? + + if [ $ret -ne 0 -a $ret -ne 1 ]; then + warning "$(gettext "Delta was not able to be created.")" + fi + fi +} + +# vim: set ts=2 sw=2 noet: diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in index b9b3ed7..ce4d960 100644 --- a/scripts/makepkg.sh.in +++ b/scripts/makepkg.sh.in @@ -65,6 +65,9 @@ FORCE_VER="" PACMAN_OPTS= +# load the delta generation functions +source "$(dirname $0)/create-delta" + ### SUBROUTINES ### plain() { @@ -875,58 +878,6 @@ create_package() { fi } -create_xdelta() { - if [ "$(check_buildenv xdelta)" != "y" ]; then - return - elif [ ! "$(type -p xdelta)" ]; then - error "$(gettext "Cannot find the xdelta binary! Is xdelta installed?")" - return - fi - - local pkg_file=$1 - local cache_dir="/var/cache/pacman/pkg" # TODO: autoconf me - local pkginfo="$(mktemp "$startdir"/xdelta-pkginfo.XXXXXXXXX)" - - local old_file old_version - for old_file in $(ls {"$cache_dir","$PKGDEST"}/${pkgname}-*-*{,-$CARCH}$PKGEXT 2>/dev/null); do - bsdtar -xOf "$old_file" .PKGINFO > "$pkginfo" || continue - if [ "$(cat "$pkginfo" | grep '^pkgname = ')" != "pkgname = $pkgname" ]; then - continue # Package name does not match. - elif [ "$(cat "$pkginfo" | grep '^arch = ')" != "arch = $CARCH" ] ; then - continue # Not same arch. - fi - - old_version="$(cat "$pkginfo" | grep '^pkgver = ' | sed 's/^pkgver = //')" - - # old_version may include the target package, only use the old versions - local vercmp=$(vercmp "$old_version" "$latest_version") - if [ "$old_version" != "$pkgver-$pkgrel" -a $vercmp -gt 0 ]; then - local latest_version=$old_version - local base_file=$old_file - fi - done - - rm -f "$pkginfo" - - if [ "$base_file" != "" ]; then - msg "$(gettext "Making delta from version %s...")" "$latest_version" - msg2 "$(gettext "NOTE: the delta should ONLY be distributed with this tarball")" - - local delta_file="$PKGDEST/$pkgname-${latest_version}_to_$pkgver-$pkgrel-$CARCH.delta" - local ret=0 - - # xdelta will decompress base_file & pkg_file into TMPDIR (or /tmp if - # TMPDIR is unset) then perform the delta on the resulting tars - xdelta delta "$base_file" "$pkg_file" "$delta_file" || ret=$? - - if [ $ret -ne 0 -a $ret -ne 1 ]; then - warning "$(gettext "Delta was not able to be created.")" - fi - else - warning "$(gettext "No previous version found, skipping xdelta.")" - fi -} - create_srcpackage() { cd "$startdir" msg "$(gettext "Creating source package...")" @@ -1468,7 +1419,9 @@ else fi fi - create_xdelta "$PKGDEST/${pkgname}-${pkgver}-${pkgrel}-${CARCH}${PKGEXT}" + if [ "$(check_buildenv xdelta)" = "y" ]; then + create_xdelta_file "$PKGDEST/${pkgname}-${pkgver}-${pkgrel}-${CARCH}${PKGEXT}" "$CARCH" "$pkgver-$pkgrel" "$PKGDEST" "$PKGEXT" 1 "" "" + fi fi msg "$(gettext "Finished making: %s")" "$pkgname ($(date))" diff --git a/scripts/repo-add.sh.in b/scripts/repo-add.sh.in index 00eec7e..6d61aa4 100644 --- a/scripts/repo-add.sh.in +++ b/scripts/repo-add.sh.in @@ -27,6 +27,9 @@ confdir='@sysconfdir@' REPO_DB_FILE="" +# load the delta generation functions +source "$(dirname $0)/create-delta" + # ensure we have a sane umask set umask 0022 @@ -53,10 +56,12 @@ error() { # print usage instructions usage() { printf "repo-add (pacman) %s\n\n" "$myver" - printf "$(gettext "Usage: %s <path-to-db> <package> ...\n\n")" "$0" + printf "$(gettext "Usage: %s [options] <path-to-db> <package> ...\n\n")" "$0" printf "$(gettext "\ repo-add will update a package database by reading a package file.\n\ Multiple packages to add can be specified on the command line.\n\n")" + printf "$(gettext "Options:\n")" + printf "$(gettext " -d, --delta Generate delta files for packages\n\n")" echo "$(gettext "Example: repo-add /path/to/repo.db.tar.gz pacman-3.0.0.pkg.tar.gz")" } @@ -123,6 +128,38 @@ db_write_delta() echo -e "$fromver $tover $csize $filename $md5sum" >>deltas } # end db_write_delta +# create a delta for the latest package +# arg1 - path to package +create_xdelta() { + # blank out all variables and set pkgfile + local pkgfile=$(readlink -f "$1") + local pkgname pkgver pkgdesc url builddate packager csize size \ + group depend backup arch license replaces provides conflict force \ + _groups _depends _backups _licenses _replaces _provides _conflicts \ + pkgdir + + local OLDIFS="$IFS" + # IFS (field separator) is only the newline character + IFS=" +" + + # read info from the zipped package + local line + for line in $(bsdtar -xOf "$pkgfile" .PKGINFO | \ + grep -v "^#" | sed 's|\(\w*\)\s*=\s*\(.*\)|\1="\2"|'); do + eval "$line" + done + + IFS=$OLDIFS + + # ensure $pkgname and $pkgver variables were found + if [ -z "$pkgname" -o -z "$pkgver" ]; then + return 1 + fi + + pkgdir=$(pwd) + create_xdelta_latest "$pkgname" "$arch" "$pkgdir" "$PKGEXT" +} # write an entry to the pacman database # arg1 - path to package @@ -131,7 +168,7 @@ db_write_entry() # blank out all variables and set pkgfile local pkgfile=$(readlink -f "$1") local pkgname pkgver pkgdesc url builddate packager csize size \ - group depend backup license replaces provides conflict force \ + group depend backup arch license replaces provides conflict force \ _groups _depends _backups _licenses _replaces _provides _conflicts \ startdir @@ -229,9 +266,9 @@ db_write_entry() # write this delta entry if db_write_delta "$delta"; then - msg2 "$(gettext "Added delta '%s'")" "$(basename "$delta")" + msg2 "$(gettext "Adding delta '%s'...")" "$(basename "$delta")" else - msg2 "$(gettext "Could not add delta '%s'")" "$(basename "$delta")" + msg2 "$(gettext "Could not add delta '%s'.")" "$(basename "$delta")" fi fi done @@ -283,11 +320,18 @@ gstmpdir=$(mktemp -d /tmp/repo-add.XXXXXXXXXX) || (\ exit 1) success=0 +optdelta="" # parse arguments for arg in "$@"; do if [ "$arg" == "--force" -o "$arg" == "-f" ]; then warning "$(gettext "the -f and --force options are no longer recognized")" msg2 "$(gettext "use options=(force) in the PKGBUILD instead")" + elif [ "$arg" == "--delta" -o "$arg" == "-d" ]; then + if [ ! "$(type -p xdelta)" ]; then + error "$(gettext "Cannot find the xdelta binary! Is xdelta installed?")" + exit 1 + fi + optdelta="y" elif [ -z "$REPO_DB_FILE" ]; then REPO_DB_FILE=$(readlink -f "$arg") if ! test_repo_db_file; then @@ -302,6 +346,8 @@ for arg in "$@"; do if ! bsdtar -tf "$arg" .PKGINFO 2>&1 >/dev/null; then error "$(gettext "'%s' is not a package file, skipping")" "$arg" else + [ -n "$optdelta" ] && create_xdelta "$arg" + msg "$(gettext "Adding package '%s'")" "$arg" if db_write_entry "$arg"; then -- 1.5.4
On Fri, Feb 15, 2008 at 08:34:52PM -0500, Nathan Jones wrote:
For the first patch, I added the -n flag to gzip in order to prevent the md5sum problem. The apply-delta script is not as efficient as it could be. There is an extra gunzip+gzip step that could be avoided if we create a delta on the .tar rather than the .tar.gz. Doing so would complicate makepkg a bit, so I stuck with the extra step.
Hm well, I have a stupid question : why is this better than the previous way of recreating the package? Before your patch, we only had overhead once at package / delta creation time, while after, we have it every time the delta is applied, right?
For the second patch, I created a second function, create_xdelta_latest, which creates a delta for the latest package with a given package name. I did this because otherwise a delta would be created for every package given on the command line which is annoying for me because I use *.pkg.tar.gz.
My comment here is rather on the previous code that you refactored than your patch itself. I really wish things could be much simpler here, it's all pretty complicated. What about dropping all this filenames parsing / version comparisons stuff, and try to reimplement this in a much more basic way in repo-add? Since repo-add removes an entry and add a new one in the database, we should be able to easily fetch the informations from these entries, and generate a delta between both. What do you think? I think this should work perfectly when using core / extra / community repos. Using testing repo complicates everything though, because packages are moved between repos. But is it really a big problem if the tools don't handle this situation automatically? Maybe we could allow a way to let the packager specify from which packages deltas should be created.
On Thu, Feb 28, 2008 at 09:21:57PM +0100, Xavier wrote:
On Fri, Feb 15, 2008 at 08:34:52PM -0500, Nathan Jones wrote:
For the first patch, I added the -n flag to gzip in order to prevent the md5sum problem. The apply-delta script is not as efficient as it could be. There is an extra gunzip+gzip step that could be avoided if we create a delta on the .tar rather than the .tar.gz. Doing so would complicate makepkg a bit, so I stuck with the extra step.
Hm well, I have a stupid question : why is this better than the previous way of recreating the package? Before your patch, we only had overhead once at package / delta creation time, while after, we have it every time the delta is applied, right?
The problem is that the file that xdelta creates after compression differs from the gzipped version. Example: $ xdelta delta old.gz new.gz delta $ xdelta patch delta old.gz new2.gz $ md5sum new* 2b77f5103345b4bd09ab09b4710d1a9d new.gz 48df25a60c1c689e9392b086a2d40e5c new2.gz $ gunzip *.gz $ md5sum new* aa898bcbb59cdf9066b5fb4645440e2c new aa898bcbb59cdf9066b5fb4645440e2c new2 The -n parameter is necessary because by default gzip will store the mtime in the header leading to different md5sums after the user applies the delta. This wasn't really a problem before because the package would receive the new md5sum during the makepkg step, and the old md5sum would never be seen. Now, a packager would create the package with one md5sum, and repo-add would change it.
My comment here is rather on the previous code that you refactored than your patch itself. I really wish things could be much simpler here, it's all pretty complicated. What about dropping all this filenames parsing / version comparisons stuff, and try to reimplement this in a much more basic way in repo-add? Since repo-add removes an entry and add a new one in the database, we should be able to easily fetch the informations from these entries, and generate a delta between both. What do you think?
This would take away the ability for makepkg to create deltas. If that won't be a problem then this sounds like a decent idea.
I think this should work perfectly when using core / extra / community repos. Using testing repo complicates everything though, because packages are moved between repos. But is it really a big problem if the tools don't handle this situation automatically?
I don't know exactly how testing works so I can't say for sure, but I would guess it would work. Testing users would get 1.0-1.1, 1.1-1.2, 1.2-1.3, and core users would get 1.0-1.3 once the package moved.
Maybe we could allow a way to let the packager specify from which packages deltas should be created.
I think the automated way should work in most all cases. Do you have a scenario in mind where this would be useful? If these patches are accepted, the md5sum problem will be gone which means a packager can manually run the xdelta command if it is truly needed.
On Thu, Feb 28, 2008 at 07:31:39PM -0500, Nathan Jones wrote:
The problem is that the file that xdelta creates after compression differs from the gzipped version. Example:
$ xdelta delta old.gz new.gz delta $ xdelta patch delta old.gz new2.gz $ md5sum new* 2b77f5103345b4bd09ab09b4710d1a9d new.gz 48df25a60c1c689e9392b086a2d40e5c new2.gz $ gunzip *.gz $ md5sum new* aa898bcbb59cdf9066b5fb4645440e2c new aa898bcbb59cdf9066b5fb4645440e2c new2
The -n parameter is necessary because by default gzip will store the mtime in the header leading to different md5sums after the user applies the delta.
This wasn't really a problem before because the package would receive the new md5sum during the makepkg step, and the old md5sum would never be seen. Now, a packager would create the package with one md5sum, and repo-add would change it.
Oh ok, I see, that makes sense. Well, it might be a good idea to do it that way for now, and then later see if we rather want to work directly on the tar.
My comment here is rather on the previous code that you refactored than your patch itself. I really wish things could be much simpler here, it's all pretty complicated. What about dropping all this filenames parsing / version comparisons stuff, and try to reimplement this in a much more basic way in repo-add? Since repo-add removes an entry and add a new one in the database, we should be able to easily fetch the informations from these entries, and generate a delta between both. What do you think?
This would take away the ability for makepkg to create deltas. If that won't be a problem then this sounds like a decent idea.
I think this should work perfectly when using core / extra / community repos. Using testing repo complicates everything though, because packages are moved between repos. But is it really a big problem if the tools don't handle this situation automatically?
I don't know exactly how testing works so I can't say for sure, but I would guess it would work. Testing users would get 1.0-1.1, 1.1-1.2, 1.2-1.3, and core users would get 1.0-1.3 once the package moved.
Maybe we could allow a way to let the packager specify from which packages deltas should be created.
I think the automated way should work in most all cases. Do you have a scenario in mind where this would be useful?
If these patches are accepted, the md5sum problem will be gone which means a packager can manually run the xdelta command if it is truly needed.
Sorry if it wasn't clear, my two last comments were only in the case where xdelta support is only in repo-add. I am still thinking this is a good idea. In my opinion, it would allow a much simpler and cleaner implementation, while still having a decent behavior. And finally, there might not be any manual work required from the packager. Suppose we have the following files in a repo/ : repo.db.tar.gz (foo-2/{depends,desc,delta}) foo-2.pkg.tar.gz foo_1_to_2.delta Now, we put a new foo-3.pkg.tar.gz package in there. We create the db entry with : repo-add repo.db.tar.gz foo-3.pkg.tar.gz repo-add could do the following steps : 1) find the current foo-2 entry 2) get the foo-2 filename from foo-2/desc : foo-2.pkg.tar.gz 3) if foo-2.pkg.tar.gz exists, generate the foo_2_to_3.delta 4) clean up the foo-2/delta file by removing obsolete entries (delta files that are no longer available in the current repo directory) 5) add the new foo_2_to_3.delta to foo-2/delta, then save foo-2/delta somewhere 6) remove the old foo-2/ entry and create the new foo-3/ entry 7) restore the delta metainfo file to foo-3/ So at each package addition, we create and add at most one delta, if an old package existed. And we keep the old deltas. Also, in the code added for delta support, there is no need for any filenames guessing. Well, I admit I am not sure it all makes sense yet, there might be things I overlooked, or maybe it won't be as simple as I think it will. I would need to try implementing it to be sure.
participants (2)
-
Nathan Jones
-
Xavier