[pacman-dev] [PATCH] Write mtree of package files to local database

Dan McGee dpmcgee at gmail.com
Thu Feb 23 02:09:38 EST 2012


On Tue, Feb 21, 2012 at 10:02 PM, Allan McRae <allan at archlinux.org> wrote:
> When installing a package, write an mtree of the package files into
> the local database.  This will be useful for doing validation of all
> files on a system.
>
> Signed-off-by: Allan McRae <allan at archlinux.org>
> ---
>
> Query: should we keep the info on .INSTALL and .CHANGELOG files?  Changing a
> .INSTALL file would be an interesting tactic, but if someone is doing that then
> they can already adjust the mtree file...
>
> Also, from http://goo.gl/Uq6X5 it appears that this could be made more efficient
> by reusing the file descriptor, but I could not get that working after many, many,
> many attempts.
Did you rewind the file descriptor? You should just have to call
`lseek(fd, 0, SEEK_SET)` first. Of course, since the current version
of _alpm_open_archive does both the open() and archive_read_new()
business, the abstraction there would have to change.

With that said, not having to decompress everything twice would also
be a win; I saw some chatter about this on IRC but I would definitely
prefer to not iterate again; removing the iteration from the diskspace
sped it up enough that I enabled that by default; I don't want to lose
those gains.

-Dan

>  lib/libalpm/add.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 68 insertions(+), 0 deletions(-)
>
> diff --git a/lib/libalpm/add.c b/lib/libalpm/add.c
> index b007766..31313a7 100644
> --- a/lib/libalpm/add.c
> +++ b/lib/libalpm/add.c
> @@ -449,6 +449,68 @@ needbackup_cleanup:
>        return errors;
>  }
>
> +static int write_package_mtree(alpm_handle_t *handle, alpm_pkg_t *pkg, const char *file)
> +{
> +       char *path;
> +       struct archive *archive, *mtree;
> +       struct archive_entry *entry;
> +       struct stat buf;
> +       char buffer[ALPM_BUFFER_SIZE];
> +       int fd, ret = 0, read = 0;
> +
> +       _alpm_log(handle, ALPM_LOG_DEBUG, "creating package mtree\n");
> +
> +       fd = _alpm_open_archive(handle, file, &buf,
> +                       &archive, ALPM_ERR_PKG_OPEN);
> +       if(fd < 0) {
> +               ret = ALPM_ERR_PKG_OPEN;
> +               goto error;
> +       }
> +
> +       if((mtree = archive_write_new()) == NULL) {
> +               ret = ALPM_ERR_LIBARCHIVE;
> +               archive_read_finish(archive);
> +               goto error;
You're leaking the read archive here.

> +       }
> +
> +       archive_write_set_format_mtree(mtree);
> +       archive_write_set_compression_gzip(mtree);
> +
> +       /* output the type, uid, gid, mode, size, time, md5 and link fields */
> +       archive_write_set_options(mtree, "use-set,!device,!flags,!gname,!nlink,!uname,md5");
Did 'use-set' end up being a net-win on size and/or speed?

> +
> +       path = _alpm_local_db_pkgpath(handle->db_local, pkg, "files.mtree");
> +
> +       if(archive_write_open_filename(mtree, path) != ARCHIVE_OK) {
> +               _alpm_log(handle, ALPM_LOG_ERROR, _("could not open file %s: %s\n"),
> +                               path, archive_error_string(archive));
> +               ret =  ALPM_ERR_DB_WRITE;
Extra space snuck in here.

> +               goto cleanup;
> +       }
> +
> +       while(archive_read_next_header(archive, &entry) == ARCHIVE_OK) {
> +               const char *entryname = archive_entry_pathname(entry);
> +               if(*entryname != '.') {
> +                       archive_write_header(mtree, entry);
> +                       while ((read = archive_read_data(archive, &buffer, ALPM_BUFFER_SIZE)) > 0) {
> +                               archive_write_data(mtree, buffer, read);
Something tells me we should be checking the return value of this call
and handling it accordingly. We check the read calls every time, but I
would expect the write calls to have a higher chance of failing in
real life.
> +                       }
> +               }
> +       }
> +
> +cleanup:
> +       archive_read_finish(archive);
> +       archive_write_finish(mtree);
> +       free(path);
You're leaking the fd.

> +
> +       if(ret == 0) {
> +               return 0;
> +       }
> +
> +error:
> +       RET_ERR(handle, ret, -1);
This had me confused for a bit; I wouldn't call the variable that ends
up being stashed on pm_errno "ret"; when in fact -1 is always returned
from here, not ret.
> +}
> +
>  static int commit_single_pkg(alpm_handle_t *handle, alpm_pkg_t *newpkg,
>                size_t pkg_current, size_t pkg_count)
>  {
> @@ -589,6 +651,12 @@ static int commit_single_pkg(alpm_handle_t *handle, alpm_pkg_t *newpkg,
>                archive_read_finish(archive);
>                CLOSE(fd);
>
> +               if(write_package_mtree(handle, newpkg, pkgfile) == -1)
> +               {
> +                       _alpm_log(handle, ALPM_LOG_WARNING,
> +                               _("could not create mtree file (%s)\n"), alpm_strerror(handle->pm_errno));
> +               }
> +
>                /* restore the old cwd if we have it */
>                if(cwdfd >= 0) {
>                        if(fchdir(cwdfd) != 0) {
> --
> 1.7.9.1
>
>


More information about the pacman-dev mailing list