[pacman-dev] [PATCH 0/2] Remove REQUIREDBY usage
This is mainly an RFC on the whole idea. Nagy, you made me think enough about it that I just went ahead and whipped something up. I'd first like to say that I did this in the quickest way possible, so there is no guarantee I did everything right. I also did nothing to optimize for efficiency. In some cases, we could be calling the alpm_pkg_compute_requiredby() function way more than we should, because I basically replaced any call to alpm_pkg_get_requiredby() with a corresponding alpm_pkg_compute_requiredby() and free(). Note that we should definitely find a way to pactest or otherwise test the computed requiredby entries. Some quick observations: $ time pacman -Qt > /dev/null (pacman 3, latest GIT release) real 0m0.084s user 0m0.020s sys 0m0.043s $ time ./src/pacman/pacman -Qt > /dev/null (with compute_requiredby switch) real 0m1.893s user 0m1.800s sys 0m0.037s Obviously a slowdown, but is it all that bad in the big scheme of things when corrupted requiredby entries no longer hurt us? Also: $ pacman -Qt > orphans-old $ ./src/pacman/pacman -Qt > orphans-new $ diff orphans-old orphans-new 5a6
agg 2.5-2 121a123 libnet 1.1.2.1-1 144a147 mysql 5.0.45-1 184a188 python-eyed3 0.6.14-1 237a242 ttf-bitstream-vera 1.10-5 306a312 xmlsec 1.2.10-3
At first, I was worried. Then I see that the new code is much better! openoffice-base has at least two stale entries in the requiredby fields of agg and xmlsec, while libnet is stale from being an old depend of ettercap. So yeah, this new code helps us in a lot of places (everywhere but speed). -Dan
From: Dan McGee <dan@archlinux.org> Instead of using the often-busted REQUIREDBY entries in the pacman database, compute them each time they are required. This should help many things: 1. Simplify the codebase 2. Prevent future database corruption 3. Ensure when we do use requiredby, it is always correct 4. Shrink the pmpkg_t memory overhead Signed-off-by: Dan McGee <dan@archlinux.org> --- lib/libalpm/add.c | 7 ----- lib/libalpm/alpm.h | 1 - lib/libalpm/be_files.c | 11 -------- lib/libalpm/db.c | 1 - lib/libalpm/deps.c | 23 ++++++++++++---- lib/libalpm/package.c | 23 ----------------- lib/libalpm/package.h | 2 - lib/libalpm/remove.c | 9 +------ lib/libalpm/sync.c | 49 ------------------------------------ lib/libalpm/trans.c | 65 ------------------------------------------------ lib/libalpm/trans.h | 1 - src/pacman/package.c | 4 ++- src/pacman/query.c | 5 +++- src/util/testdb.c | 8 +++--- 14 files changed, 29 insertions(+), 180 deletions(-) diff --git a/lib/libalpm/add.c b/lib/libalpm/add.c index 277293f..d242134 100644 --- a/lib/libalpm/add.c +++ b/lib/libalpm/add.c @@ -804,10 +804,6 @@ static int commit_single_pkg(pmpkg_t *newpkg, int pkg_current, int pkg_count, } } - /* Update the requiredby field by scanning the whole database - * looking for packages depending on the package to add */ - _alpm_pkg_update_requiredby(newpkg); - /* make an install date (in UTC) */ newpkg->installdate = time(NULL); @@ -827,9 +823,6 @@ static int commit_single_pkg(pmpkg_t *newpkg, int pkg_current, int pkg_count, alpm_pkg_get_name(newpkg)); } - /* update dependency packages' REQUIREDBY fields */ - _alpm_trans_update_depends(trans, newpkg); - if(is_upgrade) { PROGRESS(trans, PM_TRANS_PROGRESS_UPGRADE_START, alpm_pkg_get_name(newpkg), 100, pkg_count, pkg_current); diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h index cdece2d..98d7ba1 100644 --- a/lib/libalpm/alpm.h +++ b/lib/libalpm/alpm.h @@ -211,7 +211,6 @@ alpm_list_t *alpm_pkg_get_licenses(pmpkg_t *pkg); alpm_list_t *alpm_pkg_get_groups(pmpkg_t *pkg); alpm_list_t *alpm_pkg_get_depends(pmpkg_t *pkg); alpm_list_t *alpm_pkg_get_optdepends(pmpkg_t *pkg); -alpm_list_t *alpm_pkg_get_requiredby(pmpkg_t *pkg); alpm_list_t *alpm_pkg_get_conflicts(pmpkg_t *pkg); alpm_list_t *alpm_pkg_get_provides(pmpkg_t *pkg); alpm_list_t *alpm_pkg_get_deltas(pmpkg_t *pkg); diff --git a/lib/libalpm/be_files.c b/lib/libalpm/be_files.c index 2f88a16..a6cde68 100644 --- a/lib/libalpm/be_files.c +++ b/lib/libalpm/be_files.c @@ -461,10 +461,6 @@ int _alpm_db_read(pmdb_t *db, pmpkg_t *info, pmdbinfrq_t inforeq) while(fgets(line, 512, fp) && strlen(_alpm_strtrim(line))) { info->optdepends = alpm_list_add(info->optdepends, strdup(line)); } - } else if(!strcmp(line, "%REQUIREDBY%")) { - while(fgets(line, 512, fp) && strlen(_alpm_strtrim(line))) { - info->requiredby = alpm_list_add(info->requiredby, strdup(line)); - } } else if(!strcmp(line, "%CONFLICTS%")) { while(fgets(line, 512, fp) && strlen(_alpm_strtrim(line))) { info->conflicts = alpm_list_add(info->conflicts, strdup(line)); @@ -685,13 +681,6 @@ int _alpm_db_write(pmdb_t *db, pmpkg_t *info, pmdbinfrq_t inforeq) } fprintf(fp, "\n"); } - if(local && info->requiredby) { - fputs("%REQUIREDBY%\n", fp); - for(lp = info->requiredby; lp; lp = lp->next) { - fprintf(fp, "%s\n", (char *)lp->data); - } - fprintf(fp, "\n"); - } if(info->conflicts) { fputs("%CONFLICTS%\n", fp); for(lp = info->conflicts; lp; lp = lp->next) { diff --git a/lib/libalpm/db.c b/lib/libalpm/db.c index 599d24d..150b365 100644 --- a/lib/libalpm/db.c +++ b/lib/libalpm/db.c @@ -475,7 +475,6 @@ alpm_list_t SYMEXPORT *alpm_db_get_upgrades(void) pm_errno = PM_ERR_MEMORY; goto error; } - dummy->requiredby = alpm_list_strdup(alpm_pkg_get_requiredby(lpkg)); pmsyncpkg_t *syncpkg; syncpkg = _alpm_sync_find(syncpkgs, alpm_pkg_get_name(spkg)); diff --git a/lib/libalpm/deps.c b/lib/libalpm/deps.c index cefffe5..aa05dd2 100644 --- a/lib/libalpm/deps.c +++ b/lib/libalpm/deps.c @@ -254,12 +254,13 @@ alpm_list_t *_alpm_checkdeps(pmdb_t *db, pmtranstype_t op, } if(op == PM_TRANS_TYPE_UPGRADE) { - /* PM_TRANS_TYPE_UPGRADE handles the backwards dependencies, ie, the packages - * listed in the requiredby field. + /* PM_TRANS_TYPE_UPGRADE handles the backwards dependencies, ie, + * the packages listed in the requiredby field. */ for(i = packages; i; i = i->next) { pmpkg_t *newpkg = i->data; pmpkg_t *oldpkg; + alpm_list_t *requiredby; if(newpkg == NULL) { _alpm_log(PM_LOG_DEBUG, "null package found in package list\n"); continue; @@ -272,7 +273,9 @@ alpm_list_t *_alpm_checkdeps(pmdb_t *db, pmtranstype_t op, alpm_pkg_get_name(newpkg)); continue; } - for(j = alpm_pkg_get_requiredby(oldpkg); j; j = j->next) { + + requiredby = alpm_pkg_compute_requiredby(oldpkg); + for(j = requiredby; j; j = j->next) { pmpkg_t *p; found = 0; @@ -340,6 +343,7 @@ alpm_list_t *_alpm_checkdeps(pmdb_t *db, pmtranstype_t op, FREE(depend); } } + FREE(requiredby); } } if(op == PM_TRANS_TYPE_ADD || op == PM_TRANS_TYPE_UPGRADE) { @@ -394,12 +398,15 @@ alpm_list_t *_alpm_checkdeps(pmdb_t *db, pmtranstype_t op, /* check requiredby fields */ for(i = packages; i; i = i->next) { pmpkg_t *rmpkg = alpm_list_getdata(i); + alpm_list_t *requiredby; if(rmpkg == NULL) { _alpm_log(PM_LOG_DEBUG, "null package found in package list\n"); continue; } - for(j = alpm_pkg_get_requiredby(rmpkg); j; j = j->next) { + + requiredby = alpm_pkg_compute_requiredby(rmpkg); + for(j = requiredby; j; j = j->next) { pmpkg_t *p; found = 0; if(_alpm_pkg_find(j->data, packages)) { @@ -446,6 +453,7 @@ alpm_list_t *_alpm_checkdeps(pmdb_t *db, pmtranstype_t op, FREE(depend); } } + FREE(requiredby); } } @@ -559,7 +567,7 @@ pmdepend_t SYMEXPORT *alpm_splitdep(const char *depstring) static int can_remove_package(pmdb_t *db, pmpkg_t *pkg, alpm_list_t *targets, int include_explicit) { - alpm_list_t *i; + alpm_list_t *i, *requiredby; if(_alpm_pkg_find(alpm_pkg_get_name(pkg), targets)) { return(0); @@ -581,12 +589,15 @@ static int can_remove_package(pmdb_t *db, pmpkg_t *pkg, alpm_list_t *targets, * if checkdeps detected it would break something */ /* see if other packages need it */ - for(i = alpm_pkg_get_requiredby(pkg); i; i = i->next) { + requiredby = alpm_pkg_compute_requiredby(pkg); + for(i = requiredby; i; i = i->next) { pmpkg_t *reqpkg = _alpm_db_get_pkgfromcache(db, i->data); if(reqpkg && !_alpm_pkg_find(alpm_pkg_get_name(reqpkg), targets)) { + FREE(requiredby); return(0); } } + FREE(requiredby); /* it's ok to remove */ return(1); diff --git a/lib/libalpm/package.c b/lib/libalpm/package.c index 7c2870c..89785a1 100644 --- a/lib/libalpm/package.c +++ b/lib/libalpm/package.c @@ -405,20 +405,6 @@ alpm_list_t SYMEXPORT *alpm_pkg_get_optdepends(pmpkg_t *pkg) return pkg->optdepends; } -alpm_list_t SYMEXPORT *alpm_pkg_get_requiredby(pmpkg_t *pkg) -{ - ALPM_LOG_FUNC; - - /* Sanity checks */ - ASSERT(handle != NULL, return(NULL)); - ASSERT(pkg != NULL, return(NULL)); - - if(pkg->origin == PKG_FROM_CACHE && !(pkg->infolevel & INFRQ_DEPENDS)) { - _alpm_db_read(pkg->origin_data.db, pkg, INFRQ_DEPENDS); - } - return pkg->requiredby; -} - alpm_list_t SYMEXPORT *alpm_pkg_get_conflicts(pmpkg_t *pkg) { ALPM_LOG_FUNC; @@ -695,7 +681,6 @@ pmpkg_t *_alpm_pkg_dup(pmpkg_t *pkg) memcpy(newpkg, pkg, sizeof(pmpkg_t)); newpkg->licenses = alpm_list_strdup(alpm_pkg_get_licenses(pkg)); - newpkg->requiredby = alpm_list_strdup(alpm_pkg_get_requiredby(pkg)); newpkg->conflicts = alpm_list_strdup(alpm_pkg_get_conflicts(pkg)); newpkg->files = alpm_list_strdup(alpm_pkg_get_files(pkg)); newpkg->backup = alpm_list_strdup(alpm_pkg_get_backup(pkg)); @@ -729,7 +714,6 @@ void _alpm_pkg_free(pmpkg_t *pkg) FREELIST(pkg->depends); FREELIST(pkg->optdepends); FREELIST(pkg->conflicts); - FREELIST(pkg->requiredby); FREELIST(pkg->groups); FREELIST(pkg->provides); FREELIST(pkg->replaces); @@ -1098,13 +1082,6 @@ pmpkg_t *_alpm_pkg_find(const char *needle, alpm_list_t *haystack) return(NULL); } -/* fill in requiredby field of package, - * used when we want to install or add a package */ -void _alpm_pkg_update_requiredby(pmpkg_t *pkg) -{ - pkg->requiredby = alpm_pkg_compute_requiredby(pkg); -} - /* TODO this should either be public, or done somewhere else */ int _alpm_pkg_istoonew(pmpkg_t *pkg) { diff --git a/lib/libalpm/package.h b/lib/libalpm/package.h index ab93fd8..b125e06 100644 --- a/lib/libalpm/package.h +++ b/lib/libalpm/package.h @@ -72,7 +72,6 @@ struct __pmpkg_t { alpm_list_t *backup; alpm_list_t *depends; alpm_list_t *optdepends; - alpm_list_t *requiredby; alpm_list_t *conflicts; alpm_list_t *provides; alpm_list_t *deltas; @@ -98,7 +97,6 @@ int alpm_pkg_compare_versions(pmpkg_t *local_pkg, pmpkg_t *pkg); pmpkg_t *_alpm_pkg_load(const char *pkgfile, unsigned short full); pmpkg_t *_alpm_pkg_find(const char *needle, alpm_list_t *haystack); int _alpm_pkg_istoonew(pmpkg_t *pkg); -void _alpm_pkg_update_requiredby(pmpkg_t *pkg); int _alpm_pkg_should_ignore(pmpkg_t *pkg); #endif /* _ALPM_PACKAGE_H */ diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index c4e4439..cf4fbff 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -249,7 +249,7 @@ static void unlink_file(pmpkg_t *info, alpm_list_t *lp, pmtrans_t *trans) int _alpm_remove_commit(pmtrans_t *trans, pmdb_t *db) { - pmpkg_t *info, *infodup; + pmpkg_t *info; alpm_list_t *targ, *lp; int pkg_count; @@ -331,9 +331,6 @@ int _alpm_remove_commit(pmtrans_t *trans, pmdb_t *db) } } - /* duplicate the package so we can remove the requiredby fields later */ - infodup = _alpm_pkg_dup(info); - /* remove the package from the database */ _alpm_log(PM_LOG_DEBUG, "updating database\n"); _alpm_log(PM_LOG_DEBUG, "removing database entry '%s'\n", pkgname); @@ -347,10 +344,6 @@ int _alpm_remove_commit(pmtrans_t *trans, pmdb_t *db) pkgname); } - /* update dependency packages' REQUIREDBY fields */ - _alpm_trans_update_depends(trans, infodup); - _alpm_pkg_free(infodup); - /* call a done event if this isn't an upgrade */ if(trans->type != PM_TRANS_TYPE_REMOVEUPGRADE) { EVENT(trans, PM_TRANS_EVT_REMOVE_DONE, info, NULL); diff --git a/lib/libalpm/sync.c b/lib/libalpm/sync.c index 336b584..e5748e6 100644 --- a/lib/libalpm/sync.c +++ b/lib/libalpm/sync.c @@ -137,7 +137,6 @@ static int find_replacements(pmtrans_t *trans, pmdb_t *db_local, pm_errno = PM_ERR_MEMORY; goto error; } - dummy->requiredby = alpm_list_strdup(alpm_pkg_get_requiredby(lpkg)); /* check if spkg->name is already in the packages list. */ sync = _alpm_sync_find(trans->packages, alpm_pkg_get_name(spkg)); if(sync) { @@ -563,7 +562,6 @@ int _alpm_sync_prepare(pmtrans_t *trans, pmdb_t *db_local, alpm_list_t *dbs_sync asked = alpm_list_add(asked, strdup(miss->depend.name)); if(doremove) { pmpkg_t *q = _alpm_pkg_dup(local); - q->requiredby = alpm_list_strdup(alpm_pkg_get_requiredby(local)); if(sync->type != PM_SYNC_TYPE_REPLACE) { /* switch this sync type to REPLACE */ sync->type = PM_SYNC_TYPE_REPLACE; @@ -1241,53 +1239,6 @@ int _alpm_sync_commit(pmtrans_t *trans, pmdb_t *db_local, alpm_list_t **data) _alpm_trans_free(tr); tr = NULL; - /* propagate replaced packages' requiredby fields to their new owners */ - if(replaces) { - _alpm_log(PM_LOG_DEBUG, "updating database for replaced packages' dependencies\n"); - for(i = trans->packages; i; i = i->next) { - pmsyncpkg_t *sync = i->data; - if(sync->type == PM_SYNC_TYPE_REPLACE) { - alpm_list_t *j; - pmpkg_t *new = _alpm_db_get_pkgfromcache(db_local, alpm_pkg_get_name(sync->pkg)); - for(j = sync->data; j; j = j->next) { - alpm_list_t *k; - pmpkg_t *old = j->data; - /* merge lists */ - for(k = alpm_pkg_get_requiredby(old); k; k = k->next) { - if(!alpm_list_find_str(alpm_pkg_get_requiredby(new), k->data)) { - /* replace old's name with new's name in the requiredby's dependency list */ - alpm_list_t *m; - pmpkg_t *depender = _alpm_db_get_pkgfromcache(db_local, k->data); - if(depender == NULL) { - /* If the depending package no longer exists in the local db, - * then it must have ALSO conflicted with sync->pkg. If - * that's the case, then we don't have anything to propagate - * here. */ - continue; - } - for(m = alpm_pkg_get_depends(depender); m; m = m->next) { - if(!strcmp(m->data, alpm_pkg_get_name(old))) { - FREE(m->data); - m->data = strdup(alpm_pkg_get_name(new)); - } - } - if(_alpm_db_write(db_local, depender, INFRQ_DEPENDS) == -1) { - _alpm_log(PM_LOG_ERROR, _("could not update requiredby for database entry %s-%s\n"), - alpm_pkg_get_name(new), alpm_pkg_get_version(new)); - } - /* add the new requiredby */ - new->requiredby = alpm_list_add(alpm_pkg_get_requiredby(new), strdup(k->data)); - } - } - } - if(_alpm_db_write(db_local, new, INFRQ_DEPENDS) == -1) { - _alpm_log(PM_LOG_ERROR, _("could not update new database entry %s-%s\n"), - alpm_pkg_get_name(new), alpm_pkg_get_version(new)); - } - } - } - } - return(0); error: diff --git a/lib/libalpm/trans.c b/lib/libalpm/trans.c index a5fba84..bcd0707 100644 --- a/lib/libalpm/trans.c +++ b/lib/libalpm/trans.c @@ -438,71 +438,6 @@ int _alpm_trans_commit(pmtrans_t *trans, alpm_list_t **data) return(0); } -/* A depends on B through n depends <=> A listed in B's requiredby n times - * n == 0 or 1 in almost all cases */ -int _alpm_trans_update_depends(pmtrans_t *trans, pmpkg_t *pkg) -{ - alpm_list_t *i, *j; - alpm_list_t *depends = NULL; - const char *pkgname; - pmdb_t *localdb; - - ALPM_LOG_FUNC; - - /* Sanity checks */ - ASSERT(trans != NULL, RET_ERR(PM_ERR_TRANS_NULL, -1)); - ASSERT(pkg != NULL, RET_ERR(PM_ERR_PKG_INVALID, -1)); - - pkgname = alpm_pkg_get_name(pkg); - depends = alpm_pkg_get_depends(pkg); - - if(depends) { - _alpm_log(PM_LOG_DEBUG, "updating dependency packages 'requiredby' fields for %s-%s\n", - pkgname, pkg->version); - } else { - _alpm_log(PM_LOG_DEBUG, "package has no dependencies, no other packages to update\n"); - } - - localdb = alpm_option_get_localdb(); - for(i = depends; i; i = i->next) { - if(!i->data) { - continue; - } - pmdepend_t* dep = alpm_splitdep(i->data); - if(dep == NULL) { - continue; - } - for(j = _alpm_db_get_pkgcache(localdb); j; j = j->next) { - pmpkg_t *deppkg = j->data; - if(deppkg && alpm_depcmp(deppkg, dep)) { - /* this is cheating... we call this function to populate the package */ - alpm_list_t *rqdby = alpm_pkg_get_requiredby(deppkg); - - _alpm_log(PM_LOG_DEBUG, "updating 'requiredby' field for package '%s'\n", - alpm_pkg_get_name(deppkg)); - - if(trans->type == PM_TRANS_TYPE_REMOVE - || trans->type == PM_TRANS_TYPE_REMOVEUPGRADE) { - void *data = NULL; - rqdby = alpm_list_remove(rqdby, pkgname, _alpm_str_cmp, &data); - FREE(data); - deppkg->requiredby = rqdby; - } else { - rqdby = alpm_list_add(rqdby, strdup(pkgname)); - deppkg->requiredby = rqdby; - } - - if(_alpm_db_write(localdb, deppkg, INFRQ_DEPENDS)) { - _alpm_log(PM_LOG_ERROR, _("could not update 'requiredby' database entry %s-%s\n"), - alpm_pkg_get_name(deppkg), alpm_pkg_get_version(deppkg)); - } - } - } - FREE(dep); - } - return(0); -} - /* A cheap grep for text files, returns 1 if a substring * was found in the text file fn, 0 if it wasn't */ diff --git a/lib/libalpm/trans.h b/lib/libalpm/trans.h index f357589..79b8c20 100644 --- a/lib/libalpm/trans.h +++ b/lib/libalpm/trans.h @@ -78,7 +78,6 @@ int _alpm_trans_sysupgrade(pmtrans_t *trans); int _alpm_trans_addtarget(pmtrans_t *trans, char *target); int _alpm_trans_prepare(pmtrans_t *trans, alpm_list_t **data); int _alpm_trans_commit(pmtrans_t *trans, alpm_list_t **data); -int _alpm_trans_update_depends(pmtrans_t *trans, pmpkg_t *pkg); int _alpm_runscriptlet(const char *root, const char *installfn, const char *script, const char *ver, const char *oldver, pmtrans_t *trans); diff --git a/src/pacman/package.c b/src/pacman/package.c index caaed46..7b52b5b 100644 --- a/src/pacman/package.c +++ b/src/pacman/package.c @@ -82,7 +82,9 @@ void dump_pkg_full(pmpkg_t *pkg, int level) list_display(_("Optional Deps :"), alpm_pkg_get_optdepends(pkg)); /* Only applicable if installed */ if(level > 0) { - list_display(_("Required By :"), alpm_pkg_get_requiredby(pkg)); + alpm_list_t *requiredby = alpm_pkg_compute_requiredby(pkg); + list_display(_("Required By :"), requiredby); + free(requiredby); } list_display(_("Conflicts With :"), alpm_pkg_get_conflicts(pkg)); list_display(_("Replaces :"), alpm_pkg_get_replaces(pkg)); diff --git a/src/pacman/query.c b/src/pacman/query.c index b55221a..df98a29 100644 --- a/src/pacman/query.c +++ b/src/pacman/query.c @@ -248,9 +248,12 @@ static int is_foreign(pmpkg_t *pkg) static int is_orphan(pmpkg_t *pkg) { - if(alpm_pkg_get_requiredby(pkg) == NULL) { + alpm_list_t *requiredby = alpm_pkg_compute_requiredby(pkg); + if(requiredby == NULL) { + free(requiredby); return(1); } + free(requiredby); return(0); } diff --git a/src/util/testdb.c b/src/util/testdb.c index 421a553..c0f9d55 100644 --- a/src/util/testdb.c +++ b/src/util/testdb.c @@ -39,7 +39,7 @@ int str_cmp(const void *s1, const void *s2) return(strcmp(s1, s2)); } -static void diffrqdby(const char *pkgname, alpm_list_t *oldrqdby, alpm_list_t *newrqdby) +/*static void diffrqdby(const char *pkgname, alpm_list_t *oldrqdby, alpm_list_t *newrqdby) { oldrqdby = alpm_list_msort(oldrqdby, alpm_list_count(oldrqdby), str_cmp); newrqdby = alpm_list_msort(newrqdby, alpm_list_count(newrqdby), str_cmp); @@ -73,7 +73,7 @@ static void diffrqdby(const char *pkgname, alpm_list_t *oldrqdby, alpm_list_t *n j = j->next; } } -} +}*/ static void cleanup(int signum) { if(alpm_release() == -1) { @@ -187,12 +187,12 @@ int main(int argc, char **argv) } /* check requiredby */ - for(i = alpm_db_getpkgcache(db); i; i = alpm_list_next(i)) { + /*for(i = alpm_db_getpkgcache(db); i; i = alpm_list_next(i)) { pmpkg_t *pkg = alpm_list_getdata(i); const char *pkgname = alpm_pkg_get_name(pkg); alpm_list_t *rqdby = alpm_pkg_compute_requiredby(pkg); diffrqdby(pkgname, alpm_pkg_get_requiredby(pkg), rqdby); - } + }*/ cleanup(retval); } -- 1.5.3.5
From: Dan McGee <dan@archlinux.org> Remove any checks dealing with requiredby from pactest (but not actually from the pactests themselves). Of course, we should probably find a new way to check requiredby values of packages since there is no guarantee our code is working perfectly. Signed-off-by: Dan McGee <dan@archlinux.org> --- pactest/pmdb.py | 7 +------ pactest/pmpkg.py | 1 - pactest/pmrule.py | 3 --- 3 files changed, 1 insertions(+), 10 deletions(-) diff --git a/pactest/pmdb.py b/pactest/pmdb.py index ebde324..f927b4b 100755 --- a/pactest/pmdb.py +++ b/pactest/pmdb.py @@ -189,8 +189,6 @@ class pmdb: pkg.depends = _getsection(fd) elif line == "%OPTDEPENDS%": pkg.optdepends = _getsection(fd) - elif line == "%REQUIREDBY%": - pkg.requiredby = _getsection(fd) elif line == "%CONFLICTS%": pkg.conflicts = _getsection(fd) elif line == "%PROVIDES%": @@ -288,16 +286,13 @@ class pmdb: pkg.mtime["files"] = getmtime(filename) # depends - # for local db entries: depends, requiredby, conflicts, provides + # for local db entries: depends, conflicts, provides # for sync ones: depends, conflicts, provides data = [] if pkg.depends: data.append(_mksection("DEPENDS", pkg.depends)) if pkg.optdepends: data.append(_mksection("OPTDEPENDS", pkg.optdepends)) - if self.treename == "local": - if pkg.requiredby: - data.append(_mksection("REQUIREDBY", pkg.requiredby)) if pkg.conflicts: data.append(_mksection("CONFLICTS", pkg.conflicts)) if pkg.provides: diff --git a/pactest/pmpkg.py b/pactest/pmpkg.py index 410423d..7b8f81d 100755 --- a/pactest/pmpkg.py +++ b/pactest/pmpkg.py @@ -54,7 +54,6 @@ class pmpkg: # depends self.depends = [] self.optdepends = [] - self.requiredby = [] # local only self.conflicts = [] self.provides = [] # files diff --git a/pactest/pmrule.py b/pactest/pmrule.py index c5682dc..8b49f5c 100755 --- a/pactest/pmrule.py +++ b/pactest/pmrule.py @@ -86,9 +86,6 @@ class pmrule: elif case == "OPTDEPENDS": if not value in newpkg.optdepends: success = 0 - elif case == "REQUIREDBY": - if not value in newpkg.requiredby: - success = 0 elif case == "REASON": if newpkg.reason != int(value): success = 0 -- 1.5.3.5
On Mon, Nov 12, 2007 at 08:07:25PM -0600, Dan McGee wrote:
From: Dan McGee <dan@archlinux.org>
Instead of using the often-busted REQUIREDBY entries in the pacman database, compute them each time they are required. This should help many things:
1. Simplify the codebase 2. Prevent future database corruption 3. Ensure when we do use requiredby, it is always correct 4. Shrink the pmpkg_t memory overhead
Signed-off-by: Dan McGee <dan@archlinux.org> --- lib/libalpm/add.c | 7 ----- lib/libalpm/alpm.h | 1 - lib/libalpm/be_files.c | 11 -------- lib/libalpm/db.c | 1 - lib/libalpm/deps.c | 23 ++++++++++++---- lib/libalpm/package.c | 23 ----------------- lib/libalpm/package.h | 2 - lib/libalpm/remove.c | 9 +------ lib/libalpm/sync.c | 49 ------------------------------------ lib/libalpm/trans.c | 65 ------------------------------------------------ lib/libalpm/trans.h | 1 - src/pacman/package.c | 4 ++- src/pacman/query.c | 5 +++- src/util/testdb.c | 8 +++--- 14 files changed, 29 insertions(+), 180 deletions(-)
It all looks good to me :) 150 lines removed, eheh. The only change I found a bit strange was the testdb one, but I see you already changed that on your branch. As you noticed, a lot of code in testdb is obsolete now. So even more lines removed :)
On Nov 12, 2007 8:01 PM, Dan McGee <dpmcgee@gmail.com> wrote:
Some quick observations: $ time pacman -Qt > /dev/null (pacman 3, latest GIT release) real 0m0.084s user 0m0.020s sys 0m0.043s
$ time ./src/pacman/pacman -Qt > /dev/null (with compute_requiredby switch) real 0m1.893s user 0m1.800s sys 0m0.037s
Obviously a slowdown, but is it all that bad in the big scheme of things when corrupted requiredby entries no longer hurt us?
I ran valgrind --tool=callgrind on the above command to see where our hotpoint is in the code, and it is quite clear- alpm_splitdeps takes almost 70% of the total time. That is wildly inefficient for such a small function, and that is after I even did a little optimization to it. So if we are looking to focus our energy on anything, it is this one function that needs it OR at least figuring out how we could better implement it. Perhaps we should split all the deps upon package loading instead? Considering I doubt their are 1.3 million dep entries in our DB (the amount of times alpm_splitdeps is called), this would really help if it is possible. -Dan
On Nov 12, 2007 9:24 PM, Dan McGee <dpmcgee@gmail.com> wrote:
On Nov 12, 2007 8:01 PM, Dan McGee <dpmcgee@gmail.com> wrote:
Some quick observations: $ time pacman -Qt > /dev/null (pacman 3, latest GIT release) real 0m0.084s user 0m0.020s sys 0m0.043s
$ time ./src/pacman/pacman -Qt > /dev/null (with compute_requiredby switch) real 0m1.893s user 0m1.800s sys 0m0.037s
Obviously a slowdown, but is it all that bad in the big scheme of things when corrupted requiredby entries no longer hurt us?
I ran valgrind --tool=callgrind on the above command to see where our hotpoint is in the code, and it is quite clear- alpm_splitdeps takes almost 70% of the total time. That is wildly inefficient for such a small function, and that is after I even did a little optimization to it. So if we are looking to focus our energy on anything, it is this one function that needs it OR at least figuring out how we could better implement it. Perhaps we should split all the deps upon package loading instead? Considering I doubt their are 1.3 million dep entries in our DB (the amount of times alpm_splitdeps is called), this would really help if it is possible.
And continuing the trend of replying to myself, and showing why optimizing the small little parts of code can make a huge difference: $ time pacman -Qt > /dev/null real 0m0.089s user 0m0.023s sys 0m0.040s $ time ./src/pacman/pacman -Qt > /dev/null real 0m0.377s user 0m0.277s sys 0m0.063s This is a 5x increase in speed by making the change I described above in the previous email. I'll push a branch to my GIT repo tonight detailing the changes. -Dan
On Mon, Nov 12, 2007 at 11:28:33PM -0600, Dan McGee wrote:
And continuing the trend of replying to myself, and showing why optimizing the small little parts of code can make a huge difference: $ time pacman -Qt > /dev/null real 0m0.089s user 0m0.023s sys 0m0.040s
$ time ./src/pacman/pacman -Qt > /dev/null real 0m0.377s user 0m0.277s sys 0m0.063s
This is a 5x increase in speed by making the change I described above in the previous email. I'll push a branch to my GIT repo tonight detailing the changes.
Wow, that's an impressive speed up you made there. Great work! The inefficiency of Qt will be much less noticeable now.
Idézés Dan McGee <dpmcgee@gmail.com>:
On Nov 12, 2007 9:24 PM, Dan McGee <dpmcgee@gmail.com> wrote:
On Nov 12, 2007 8:01 PM, Dan McGee <dpmcgee@gmail.com> wrote:
Some quick observations: $ time pacman -Qt > /dev/null (pacman 3, latest GIT release) real 0m0.084s user 0m0.020s sys 0m0.043s
$ time ./src/pacman/pacman -Qt > /dev/null (with compute_requiredby switch) real 0m1.893s user 0m1.800s sys 0m0.037s
Obviously a slowdown, but is it all that bad in the big scheme of things when corrupted requiredby entries no longer hurt us?
I ran valgrind --tool=callgrind on the above command to see where our hotpoint is in the code, and it is quite clear- alpm_splitdeps takes almost 70% of the total time. That is wildly inefficient for such a small function, and that is after I even did a little optimization to it. So if we are looking to focus our energy on anything, it is this one function that needs it OR at least figuring out how we could better implement it. Perhaps we should split all the deps upon package loading instead? Considering I doubt their are 1.3 million dep entries in our DB (the amount of times alpm_splitdeps is called), this would really help if it is possible.
And continuing the trend of replying to myself, and showing why optimizing the small little parts of code can make a huge difference: $ time pacman -Qt > /dev/null real 0m0.089s user 0m0.023s sys 0m0.040s
$ time ./src/pacman/pacman -Qt > /dev/null real 0m0.377s user 0m0.277s sys 0m0.063s
This is a 5x increase in speed by making the change I described above in the previous email. I'll push a branch to my GIT repo tonight detailing the changes.
Hmm. Great job. I'm a bit doubter about alpm_splitdep: did you empty disk cache before retest? If you did so, this is really surprising to me: this is obviously a speed-up, but that is _much_ greater than I expected. Hmm. http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=7653bb93997f52848... we can spare the first strdup, if you backup *ptr then reset before return. Some other suggestions: 1. compute_requiredby now can return with an alpm_list of pmpkg_t* -s, and then we needn't scan pkgache to find 'pkgname' in deptest 2. -Qt needn't compute all the requiredby packages, it can stop if he find the first requiredby (no orphan). ... Summary: the results are very promising for me ;-) Bye, ngaba ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Tue, Nov 13, 2007 at 02:05:27PM +0100, Nagy Gabor wrote:
Hmm. http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=7653bb93997f52848... we can spare the first strdup, if you backup *ptr then reset before return.
That doesn't seem to help much, no measurable differences.
Some other suggestions: 2. -Qt needn't compute all the requiredby packages, it can stop if he find the first requiredby (no orphan).
That would help indeed : 280ms -> 200ms. I doubt anyone could notice that though :) And I don't know how to do it properly.
On Tue, Nov 13, 2007 at 02:05:27PM +0100, Nagy Gabor wrote:
Hmm.
http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=7653bb93997f52848...
we can spare the first strdup, if you backup *ptr then reset before return.
That doesn't seem to help much, no measurable differences.
Some other suggestions: 2. -Qt needn't compute all the requiredby packages, it can stop if he find the first requiredby (no orphan).
That would help indeed : 280ms -> 200ms. I doubt anyone could notice that though :) And I don't know how to do it properly.
compute_requiredby_first? ;-) Bye, ngaba ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Nov 13, 2007 7:05 AM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Hmm. Great job. I'm a bit doubter about alpm_splitdep: did you empty disk cache before retest? If you did so, this is really surprising to me: this is obviously a speed-up, but that is _much_ greater than I expected. Hmm. http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=7653bb93997f52848... we can spare the first strdup, if you backup *ptr then reset before return.
Some other suggestions: 1. compute_requiredby now can return with an alpm_list of pmpkg_t* -s, and then we needn't scan pkgache to find 'pkgname' in deptest 2. -Qt needn't compute all the requiredby packages, it can stop if he find the first requiredby (no orphan). ... Summary: the results are very promising for me ;-)
pacman is the last pacman-git release (before any requiredby changes), src/pacman/.libs/pacman is with the compute_requiredby only. [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time pacman -Qt > /dev/null real 0m10.698s user 0m0.060s sys 0m0.160s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time ./src/pacman/.libs/lt-pacman -Qt > /dev/null real 0m11.327s user 0m0.607s sys 0m0.127s The above results tell me at least one thing. We don't have to worry about optimizing pacman -Qt by stopping early- it isn't worth the hack. [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time pacman -Qi pacman-git > /dev/null real 0m0.686s user 0m0.020s sys 0m0.027s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time ./src/pacman/.libs/pacman -Qi pacman-git > /dev/null real 0m10.712s user 0m0.047s sys 0m0.147s [root@dublin pacman]# time ./src/pacman/.libs/pacman -Qi pacman-git > /dev/null real 0m0.060s user 0m0.033s sys 0m0.023s OK, so -Qi performance takes a big hit when nothing is cached, because we actually have to load the whole DB to show the one requiredby line. However, note that its performance when the kernel FS cache is available is quite fast, so we have nothing to worry about there. [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time pacman -Su > /dev/null real 0m4.573s user 0m0.490s sys 0m0.533s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time ./src/pacman/.libs/lt-pacman -Su > /dev/null real 0m4.848s user 0m0.487s sys 0m0.550s No difference here (note that this was a sysupgrade with no upgrades available, so I don't think requiredby is even needed here). [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time yes "no" | pacman -Su > /dev/null Proceed with installation? [Y/n] real 0m13.979s user 0m0.533s sys 0m0.723s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time yes "no" | ./src/pacman/.libs/lt-pacman -Su > /dev/null Proceed with installation? [Y/n] real 0m14.467s user 0m0.507s sys 0m0.623s No huge difference here (sysupgrade with one upgrade available- qt3). If anyone else can think of places where this switch to compute_requiredby code would really hurt us, I'd be glad to hear it. -Dan
Idézés Dan McGee <dpmcgee@gmail.com>:
On Nov 13, 2007 7:05 AM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Hmm. Great job. I'm a bit doubter about alpm_splitdep: did you empty disk cache before retest? If you did so, this is really surprising to me: this is obviously a speed-up, but that is _much_ greater than I expected. Hmm.
http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=7653bb93997f52848...
we can spare the first strdup, if you backup *ptr then reset before return.
Some other suggestions: 1. compute_requiredby now can return with an alpm_list of pmpkg_t* -s, and then we needn't scan pkgache to find 'pkgname' in deptest 2. -Qt needn't compute all the requiredby packages, it can stop if he find the first requiredby (no orphan). ... Summary: the results are very promising for me ;-)
pacman is the last pacman-git release (before any requiredby changes), src/pacman/.libs/pacman is with the compute_requiredby only.
[root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time pacman -Qt > /dev/null real 0m10.698s user 0m0.060s sys 0m0.160s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time ./src/pacman/.libs/lt-pacman -Qt > /dev/null real 0m11.327s user 0m0.607s sys 0m0.127s
The above results tell me at least one thing. We don't have to worry about optimizing pacman -Qt by stopping early- it isn't worth the hack. Any notable optimization worth the hack. And implement compute_requiredby_first is quite trivial. My estimated speed-up (only compared to compute_requiredby) is at least 2x. [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time pacman -Qi pacman-git > /dev/null real 0m0.686s user 0m0.020s sys 0m0.027s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time ./src/pacman/.libs/pacman -Qi pacman-git > /dev/null real 0m10.712s user 0m0.047s sys 0m0.147s [root@dublin pacman]# time ./src/pacman/.libs/pacman -Qi pacman-git > /dev/null real 0m0.060s user 0m0.033s sys 0m0.023s
OK, so -Qi performance takes a big hit when nothing is cached, because we actually have to load the whole DB to show the one requiredby line. However, note that its performance when the kernel FS cache is available is quite fast, so we have nothing to worry about there.
[root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time pacman -Su > /dev/null real 0m4.573s user 0m0.490s sys 0m0.533s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time ./src/pacman/.libs/lt-pacman -Su > /dev/null real 0m4.848s user 0m0.487s sys 0m0.550s
No difference here (note that this was a sysupgrade with no upgrades available, so I don't think requiredby is even needed here).
[root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time yes "no" | pacman -Su > /dev/null Proceed with installation? [Y/n] real 0m13.979s user 0m0.533s sys 0m0.723s [root@dublin pacman]# sync; echo 3 > /proc/sys/vm/drop_caches; time yes "no" | ./src/pacman/.libs/lt-pacman -Su > /dev/null Proceed with installation? [Y/n] real 0m14.467s user 0m0.507s sys 0m0.623s
No huge difference here (sysupgrade with one upgrade available- qt3).
If anyone else can think of places where this switch to compute_requiredby code would really hurt us, I'd be glad to hear it.
compute_requiredby mostly has negative impact in case of removals: I would try -Rs, [-Rc <- why do we have such dangerous stuffs ? ;-], -R "many packages"; and so on. Bye, ngaba ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Nov 13, 2007 12:47 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Any notable optimization worth the hack.
I just need to comment here that I'd prefer we _not_ do this in pacman. "Any optimization is worth it" is a REALLY bad mantra when it comes to code, in my experience. Code should be clean and elegant _first_ and as optimized as possible _second_. So if optimizations break code clarity and add unnecessary stuff for a 0.1 second improvement, it's not worth it. Sure large improvements are worth a loss in code clarity, but, as Dan actually proved with the timings, there's very little room for improvement here.
On Nov 13, 2007 12:47 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
Any notable optimization worth the hack.
I just need to comment here that I'd prefer we _not_ do this in pacman. "Any optimization is worth it" is a REALLY bad mantra when it comes to code, in my experience. Code should be clean and elegant _first_ and as optimized as possible _second_. So if optimizations break code clarity and add unnecessary stuff for a 0.1 second improvement, it's not worth it.
Sure large improvements are worth a loss in code clarity, but, as Dan actually proved with the timings, there's very little room for improvement here.
First of all, I don't want to start a flame (again): I wouldn't have replied anything if you just had said: "I don't like this. Forget it.". You are the leader, you can do it. No question. (I forgot it ;-). But you should see, that your reasoning is simply impossible to keep, because "clean code" and "elegance" are _subjetive_ things: -_imho_ sync.c is not elegant at all -_imho_ your alpm_list patch (first prev == tail) is not elegant (I would have preferred alpm_list_add_first) -_imho_ a function which we could _radically_ speed-up with a 'first' parameter and an 'if(first) return ...' line, is not elegant ... We are different, that's it. Well, maybe I misled you with this: "Any notable optimization worth the hack.", sry. Notable and hack are subjective things too ;-) Bye, ngaba ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Nov 13, 2007 2:01 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
First of all, I don't want to start a flame (again) So next time, please don't
On Nov 13, 2007 2:01 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote: > > On Nov 13, 2007 12:47 PM, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote: > > > Any notable optimization worth the hack. I disagree, just as Aaron stated below. The biggest reason? This is code that crosses the libalpm/pacman barrier. When that happens, interfaces should be as clean as possible. > > I just need to comment here that I'd prefer we _not_ do this in > > pacman. "Any optimization is worth it" is a REALLY bad mantra when it > > comes to code, in my experience. Code should be clean and elegant > > _first_ and as optimized as possible _second_. So if optimizations > > break code clarity and add unnecessary stuff for a 0.1 second > > improvement, it's not worth it. > > > > Sure large improvements are worth a loss in code clarity, but, as Dan > > actually proved with the timings, there's very little room for > > improvement here. > > But you should see, that your reasoning is simply impossible to keep, > because "clean code" and "elegance" are _subjetive_ things: Sure they are. And unfortunately since Aaron and I are the committers to the projects.al.org GIT tree, you will be held to our definitions of clean code and elegance. And I'm not trying to sound nasty here- I'm just laying out the facts. > -_imho_ sync.c is not elegant at all I don't believe we have ever disagreed with you here, so no need to fan the flames on this. Aaron and I didn't write this code, we've inherited it. At the same time, it works, and a monster patch that I can't digest isn't the solution to fixing it unless there is a great deal of discussion on how the new code works. > -_imho_ your alpm_list patch (first prev == tail) is not elegant (I would have > preferred alpm_list_add_first) A few issues here. 1. This is completely inside libalpm- there is no noticeable interface change to the "end user" of libalpm. 2. This is a huge performance increase. Thus, we can sacrifice some elegance for what we did. And it was commented well. 3. alpm_list_add_first would not help when it comes to printing the filelists of all the packages, etc. It would print them all in reverse order, which would confuse users. > -_imho_ a function which we could _radically_ speed-up with a 'first' > parameter and an 'if(first) return ...' line, is not elegant Hmm? Please keep this constructive if you (anyone on the list) respond to this thread. We all want to work on pacman, hear bitching about what "elegant" means. -Dan
On Tue, Nov 13, 2007 at 03:12:30PM -0600, Dan McGee wrote:
-_imho_ a function which we could _radically_ speed-up with a 'first' parameter and an 'if(first) return ...' line, is not elegant Hmm?
He's referring to the 2. there : http://www.archlinux.org/pipermail/pacman-dev/2007-November/010033.html To Nagy: No one said (yet ;)) that it wasn't elegant. It's probably easier to comment on the patch than just on the idea anyway.
-_imho_ sync.c is not elegant at all I don't believe we have ever disagreed with you here, so no need to fan the flames on this. Aaron and I didn't write this code, we've inherited it. At the same time, it works, and a monster patch that I can't digest isn't the solution to fixing it unless there is a great deal of discussion on how the new code works. Well, if you referred here to my pending alpm_sync_prepare patch: if you can't digest something, you can ask me (on this ML or in private), I will gladly answer to you (if I can ;-). [If you don't ask anything, I cannot figure out what your problem is.] Sometimes monster patches are needed: -if you change a data structure -back to sync.c: alpm_sync_prepare was so chaotic to me, that I have rewritten from ~scratch <- this was easier to me than computing the results of my changes in that sync_prepare monster ;-)
Bye ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Nov 12, 2007 11:28 PM, Dan McGee <dpmcgee@gmail.com> wrote:
And continuing the trend of replying to myself, and showing why optimizing the small little parts of code can make a huge difference: $ time pacman -Qt > /dev/null real 0m0.089s user 0m0.023s sys 0m0.040s
$ time ./src/pacman/pacman -Qt > /dev/null real 0m0.377s user 0m0.277s sys 0m0.063s
This is a 5x increase in speed by making the change I described above in the previous email. I'll push a branch to my GIT repo tonight detailing the changes.
Dan, my hero. Looks like pacman 3.1 is going to be significantly better - yay. One question while we're on this - we're probably going to need a small script to clean out the requiredby entries from the local DB, just to be concise - of course we should back up the DB first 8)
On Nov 13, 2007 12:26 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
One question while we're on this - we're probably going to need a
*comment That's not a question :roll:
On Nov 13, 2007 12:26 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
Dan, my hero. Looks like pacman 3.1 is going to be significantly better - yay.
One question while we're on this - we're probably going to need a small script to clean out the requiredby entries from the local DB, just to be concise - of course we should back up the DB first 8)
Go right ahead, use libalpm for something useful! We should be able to do the following- read in the whole localdb, then write every entry back. The writeback will effectively clean out the DB. -Dan
Idézés Dan McGee <dpmcgee@gmail.com>:
On Nov 13, 2007 12:26 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
Dan, my hero. Looks like pacman 3.1 is going to be significantly better - yay.
One question while we're on this - we're probably going to need a small script to clean out the requiredby entries from the local DB, just to be concise - of course we should back up the DB first 8)
Go right ahead, use libalpm for something useful! We should be able to do the following- read in the whole localdb, then write every entry back. The writeback will effectively clean out the DB. Well, unneeded %REQUIREDBY% entries in localdbs won't hurt anything. They will slowly disappear with package updates.
---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Mon, Nov 12, 2007 at 08:01:02PM -0600, Dan McGee wrote:
This is mainly an RFC on the whole idea. Nagy, you made me think enough about it that I just went ahead and whipped something up.
I'd first like to say that I did this in the quickest way possible, so there is no guarantee I did everything right. I also did nothing to optimize for efficiency. In some cases, we could be calling the alpm_pkg_compute_requiredby() function way more than we should, because I basically replaced any call to alpm_pkg_get_requiredby() with a corresponding alpm_pkg_compute_requiredby() and free().
I wonder if it couldn't be later improved by a sort of runtime cache. The first time, doing pkg->requiredby = alpm_pkg_compute_requiredby(), and then just returning this. But that would add some complexity again, and some difficulties we already have with the pkgcache (that is, when do we update it).
Note that we should definitely find a way to pactest or otherwise test the computed requiredby entries.
I didn't think about that at all. It looks problematic at first sight. I wonder though if problems in the requiredby code couldn't be detected by the other rules (return code, installed / removed packages, etc).
Some quick observations: $ time pacman -Qt > /dev/null (pacman 3, latest GIT release) real 0m0.084s user 0m0.020s sys 0m0.043s
$ time ./src/pacman/pacman -Qt > /dev/null (with compute_requiredby switch) real 0m1.893s user 0m1.800s sys 0m0.037s
Obviously a slowdown, but is it all that bad in the big scheme of things when corrupted requiredby entries no longer hurt us?
Indeed, that's a big slowdown, and for -Qt, it's really not surprising at all. The only thing it does is looking at requiredby. I was mostly thinking about a more common operation like -Su.
Also: $ pacman -Qt > orphans-old $ ./src/pacman/pacman -Qt > orphans-new $ diff orphans-old orphans-new 5a6
agg 2.5-2 121a123 libnet 1.1.2.1-1 144a147 mysql 5.0.45-1 184a188 python-eyed3 0.6.14-1 237a242 ttf-bitstream-vera 1.10-5 306a312 xmlsec 1.2.10-3
At first, I was worried. Then I see that the new code is much better! openoffice-base has at least two stale entries in the requiredby fields of agg and xmlsec, while libnet is stale from being an old depend of ettercap. So yeah, this new code helps us in a lot of places (everywhere but speed).
Well, yes, that's mostly why I made testdb. To find the incorrect requiredby entries in the database, and fix them manually by reinstalling the affected packages with pacman. That tool becomes quite obsolete now, but not totally. It still looks for missing dependencies (caused by Rd for example). And it could be extended for conflicts checking too.
Idézés Dan McGee <dpmcgee@gmail.com>:
This is mainly an RFC on the whole idea. Nagy, you made me think enough about it that I just went ahead and whipped something up.
I'd first like to say that I did this in the quickest way possible, so there is no guarantee I did everything right. I also did nothing to optimize for efficiency. In some cases, we could be calling the alpm_pkg_compute_requiredby() function way more than we should, because I basically replaced any call to alpm_pkg_get_requiredby() with a corresponding alpm_pkg_compute_requiredby() and free().
Note that we should definitely find a way to pactest or otherwise test the computed requiredby entries.
Some quick observations: $ time pacman -Qt > /dev/null (pacman 3, latest GIT release) real 0m0.084s user 0m0.020s sys 0m0.043s
$ time ./src/pacman/pacman -Qt > /dev/null (with compute_requiredby switch) real 0m1.893s user 0m1.800s sys 0m0.037s
Obviously a slowdown, but is it all that bad in the big scheme of things when corrupted requiredby entries no longer hurt us?
Also: $ pacman -Qt > orphans-old $ ./src/pacman/pacman -Qt > orphans-new $ diff orphans-old orphans-new 5a6
agg 2.5-2 121a123 libnet 1.1.2.1-1 144a147 mysql 5.0.45-1 184a188 python-eyed3 0.6.14-1 237a242 ttf-bitstream-vera 1.10-5 306a312 xmlsec 1.2.10-3
At first, I was worried. Then I see that the new code is much better! openoffice-base has at least two stale entries in the requiredby fields of agg and xmlsec, while libnet is stale from being an old depend of ettercap. So yeah, this new code helps us in a lot of places (everywhere but speed).
Huh, you were really fast. I just read and answer today's tons of mail in chronology, and this mail was a big surprise ;-) Thanks. Bye ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Mon, Nov 12, 2007 at 08:01:02PM -0600, Dan McGee wrote:
I'd first like to say that I did this in the quickest way possible, so there is no guarantee I did everything right. I also did nothing to optimize for efficiency. In some cases, we could be calling the alpm_pkg_compute_requiredby() function way more than we should, because I basically replaced any call to alpm_pkg_get_requiredby() with a corresponding alpm_pkg_compute_requiredby() and free().
Should it be FREE(requiredby) or FREELIST(requiredby) ? It looks like in your patch submitted on the ML, FREE(requiredby) is used, but in the patch commited in git, FREELIST(requiredby) is used, except in one place : lib/libalpm/deps.c , can_remove_package line 585.
On Nov 15, 2007 6:01 AM, Xavier <shiningxc@gmail.com> wrote:
Should it be FREE(requiredby) or FREELIST(requiredby) ? It looks like in your patch submitted on the ML, FREE(requiredby) is used, but in the patch commited in git, FREELIST(requiredby) is used, except in one place : lib/libalpm/deps.c , can_remove_package line 585.
Damn, looks like I missed one. It should be FREELIST, which is why I updated the patch after posting it here. I'll fix it locally, let me know if you find any other problems (like the one I did last night with alpm_list_copy_data that made remove045 fail). -Dan
participants (4)
-
Aaron Griffin
-
Dan McGee
-
Nagy Gabor
-
Xavier