[pacman-dev] little error when installing/upgrading the first package in a new database
There was an odd error when the first package installed in a database had a scriptlet:

    LANG=C sudo src/pacman/pacman --conf pacman.conf -Ud /var/cache/pacman/pkg/filesystem-0.8-3.pkg.tar.gz
    loading package data... done.
    (1/1) checking for file conflicts [####################################################################################################] 100%
    error: could not open file /home/xav/dev/pacman/foo/var/lib/pacman/local//filesystem-0.8-3/depends: No such file or directory----------------------------] 0%
    (1/1) installing filesystem [####################################################################################################] 100%

This doesn't happen with a package that has no scriptlet, and it doesn't happen either when there is at least one other package in the database. But it still happened after running the same pacman command several times (i.e. upgrading the single package in the database). I finally found out why.

In add.c, in the alpm_add_commit function:

    _alpm_log(PM_LOG_DEBUG, _("removing old package first (%s-%s)"), oldpkg->name, oldpkg->version);

That part removes the filesystem package if it already exists (in the case of an upgrade), and also removes its entry from the database (the filesystem directory in $dbpath/local).

    } else if(strcmp(entryname, ".INSTALL") == 0) {
        /* the install script goes inside the db */
        snprintf(filename, PATH_MAX, "%s/%s-%s/install", db->path,
                newpkg->name, newpkg->version);

That extracts the scriptlet to $dbpath/local//filesystem-0.8-3/install.

Then later, we have this:

    /* Update the requiredby field by scanning the whole database
     * looking for packages depending on the package to add */
    _alpm_pkg_update_requiredby(newpkg);

and later, this:

    _alpm_log(PM_LOG_DEBUG, _("updating database"));
    _alpm_log(PM_LOG_DEBUG, _("adding database entry '%s'"), newpkg->name);

So when update_requiredby is called, the entry hasn't been completely written to the database: we only have the install file, but no depends or desc file.

And here is some code from update_requiredby in package.c:

    pmdb_t *localdb = alpm_option_get_localdb();
    for(i = _alpm_db_get_pkgcache(localdb); i; i = i->next) {
    ...
        for(j = alpm_pkg_get_depends(cachepkg); j; j = j->next) {

So _alpm_db_get_pkgcache will return the filesystem package, because the directory exists in $dbpath/local/, but alpm_pkg_get_depends will try to read the depends file, which doesn't exist yet, resulting in the error message.

Finding this didn't take too long, but then this error should appear in every case, as long as the package has a scriptlet. It didn't: as soon as there was more than one package in the db, the error disappeared. That's because of the pkg cache. In package.c, in the _alpm_db_get_pkgcache function, we have:

    if(!db->pkgcache) {
        _alpm_db_load_pkgcache(db);
    }

So the package database is loaded again only if db->pkgcache is NULL. The first time we install filesystem, pkgcache is NULL when that function is called, so it reads the database and finds the partial filesystem entry. And when we upgrade filesystem, the old filesystem package is removed first, and at that moment the _alpm_db_remove_pkgfromcache function from cache.c is called, doing this:

    _alpm_log(PM_LOG_DEBUG, _("removing entry '%s' from '%s' cache"),
            alpm_pkg_get_name(pkg), db->treename);

    db->pkgcache = alpm_list_remove(db->pkgcache, pkg, _alpm_pkg_cmp, &vdata);

which results in a NULL db->pkgcache too. There is no difference between an uninitialized list and an empty list, unfortunately...

When there is another package in the database, pkgcache won't be NULL after the package being upgraded is removed from the pkgcache list. So the pkgcache won't be reloaded from the database when update_requiredby is called, and it won't fail on the broken partial filesystem entry, because that entry was removed from pkgcache earlier.

That issue is totally harmless and practically never happens in real use, because there is always more than one package in the db, but it was annoying to get it every time when debugging without knowing where it came from. Fixing it shouldn't be too hard, but I'm not sure yet how. Maybe we can delay the extraction of the scriptlet to $dbpath/, and only install it at the same time as the database entry is created. Or maybe we can just call update_requiredby after the database entry is fully created. Or maybe we can distinguish between an empty pkgcache and an uninitialized one.
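To make the NULL ambiguity concrete, here is a minimal, self-contained sketch. It is not libalpm code: toy_db_t, toy_load and both accessors are hypothetical names made up for illustration, and the "loaded" flag is only one possible way to tell an unloaded cache from an empty one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not libalpm's real types. */
typedef struct node {
    const char *name;
    struct node *next;
} node_t;

typedef struct {
    node_t *pkgcache; /* NULL: never loaded, or loaded and now empty */
    int loaded;       /* set once the database has been scanned */
    int disk_reads;   /* how many times the database was scanned */
} toy_db_t;

/* Pretend to scan the database directory; here it only counts the scan. */
static void toy_load(toy_db_t *db)
{
    assert(db != NULL);
    db->disk_reads++;
    db->loaded = 1;
}

/* Same test as the quoted `if(!db->pkgcache)`: an empty cache gets reloaded. */
static node_t *get_cache_ambiguous(toy_db_t *db)
{
    if(!db->pkgcache) {
        toy_load(db);
    }
    return db->pkgcache;
}

/* Assumed alternative: reload only if the database was never scanned. */
static node_t *get_cache_flagged(toy_db_t *db)
{
    if(!db->loaded) {
        toy_load(db);
    }
    return db->pkgcache;
}

/* Simulate upgrading the only package; return the total number of scans. */
int scans_after_removing_only_package(int use_flag)
{
    node_t only = { "filesystem", NULL };
    toy_db_t db = { &only, 1, 1 }; /* cache was already loaded once */

    db.pkgcache = db.pkgcache->next; /* old package removed: list is NULL */
    if(use_flag) {
        get_cache_flagged(&db);
    } else {
        get_cache_ambiguous(&db);
    }
    return db.disk_reads;
}
```

With the ambiguous accessor the toy database is scanned a second time after the only package is removed, which corresponds to the moment the partial entry gets picked up; with the flag it is not scanned again.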
On 7/4/07, Xavier <shiningxc@gmail.com> wrote:
That issue is totally harmless and practically never happens in real use, because there is always more than one package in the db, but it was annoying to get it every time when debugging without knowing where it came from. Fixing it shouldn't be too hard, but I'm not sure yet how. Maybe we can delay the extraction of the scriptlet to $dbpath/, and only install it at the same time as the database entry is created. Or maybe we can just call update_requiredby after the database entry is fully created. Or maybe we can distinguish between an empty pkgcache and an uninitialized one.
I don't know if you have ever peeked at the kernel source before, but they use some tricks for this kind of thing. Instead of setting the PKGCACHE to NULL to represent both empty and uninitialized, they #define a separate value such as EMPTY to be some other memory address. Thus you can test for either NULL or EMPTY. Can't remember exactly where this is done, but look at the kernel's list.h. Does this seem like a smart thing to do? I'm sure there are other cases where this could be useful, but I'm not sure.

-Dan
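A rough sketch of the sentinel idea, as I read it from the description above. This is not taken from the kernel or from libalpm: CACHE_EMPTY, cache_state, cache_remove_head and cache_count are hypothetical names, and the list type is a plain singly linked list assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list node, for illustration only. */
typedef struct node {
    const char *name;
    struct node *next;
} node_t;

/* A dedicated object whose address means "loaded, but empty". */
static node_t empty_marker;
#define CACHE_EMPTY (&empty_marker)

typedef enum {
    CACHE_UNLOADED,
    CACHE_IS_EMPTY,
    CACHE_HAS_ENTRIES
} cache_state_t;

/* Classify a list head: NULL and the sentinel are now distinct states. */
cache_state_t cache_state(const node_t *head)
{
    if(head == NULL) {
        return CACHE_UNLOADED;
    }
    if(head == CACHE_EMPTY) {
        return CACHE_IS_EMPTY;
    }
    return CACHE_HAS_ENTRIES;
}

/* Drop the first entry; yield the sentinel, not NULL, when none is left. */
node_t *cache_remove_head(node_t *head)
{
    assert(head != NULL && head != CACHE_EMPTY);
    return head->next ? head->next : CACHE_EMPTY;
}

/* Every walker has to treat the sentinel as "no entries". */
size_t cache_count(const node_t *head)
{
    size_t n = 0;
    if(head == NULL || head == CACHE_EMPTY) {
        return 0;
    }
    for(; head; head = head->next) {
        n++;
    }
    return n;
}
```

The cost of this approach shows in cache_count: every piece of code that iterates over the list has to know about the sentinel, otherwise it would treat the marker as a real entry.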
On Wed, Jul 04, 2007 at 11:51:22PM -0400, Dan McGee wrote:
I don't know if you have ever peeked at the kernel source before, but they use some tricks for this kind of thing. Instead of setting the PKGCACHE to NULL to represent both empty and uninitialized, they #define a separate value such as EMPTY to be some other memory address. Thus you can test for either NULL or EMPTY. Can't remember exactly where this is done, but look at the kernel's list.h. Does this seem like a smart thing to do? I'm sure there are other cases where this could be useful, but I'm not sure.
-Dan
Well, I guess that's the main thing to find out: whether it could be useful in other cases. But if the plan is to eventually move to the kernel list, would it allow this distinction? I looked at your kernel_list branch; here is part of the commit:

    +/**
    + * list_empty - tests whether a list is empty
    + * @head: the list to test.
    + */

But I'm quite confused by this:

    +/**
    + * list_del - deletes entry from list.
    + * @entry: the element to delete from the list.
    + * Note: list_empty() on entry does not return true after this, the entry is
    + * in an undefined state.
    + */

Anyway, this would fix the problem I was describing, but I wonder if it wouldn't just hide it. Is it really safe to rely on an outdated cache (one that doesn't contain the partial entry we are installing)? Or maybe we should just check when alpm_pkg_get_depends returns NULL, and skip these partial entries? Otherwise, we could try to avoid partial entries, i.e. install .INSTALL and .CHANGELOG in the database only when we are actually writing the entry to the db. But I'm not sure how easy that is.
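For what it's worth, here is how I understand the two doc comments, as a simplified re-implementation of a circular doubly linked list in the style of the kernel's list.h. It is a sketch, not the code from the kernel_list branch: list_init and the demo function are my own names, and setting the removed entry's pointers to NULL is an assumption standing in for whatever "undefined state" the real code leaves behind.

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly linked list: an empty list is a head pointing to itself. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Empty means "head points to itself", never "head is NULL". */
static int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry; its own pointers are NOT reset to point to itself. */
static void list_del(struct list_head *entry)
{
    assert(entry->next != NULL && entry->prev != NULL);
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL; /* assumption: stand-in for the "undefined state" */
    entry->prev = NULL;
}

/* Returns 1 if the head is empty and the removed entry is not "empty". */
int head_empty_after_removing_only_entry(void)
{
    struct list_head head, pkg;

    list_init(&head);
    list_add(&pkg, &head);
    list_del(&pkg);
    return list_empty(&head) && !list_empty(&pkg);
}
```

So the note on list_del seems to say only that the removed entry is not turned back into a valid empty list of its own; the head it was removed from still reports empty correctly. And since an initialized head is never NULL, an empty cache would be distinguishable from one that was never set up.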
2007/7/5, Xavier <shiningxc@gmail.com>:
There was an odd error when the first package installed in a database had a scriptlet:

    LANG=C sudo src/pacman/pacman --conf pacman.conf -Ud /var/cache/pacman/pkg/filesystem-0.8-3.pkg.tar.gz
    loading package data... done.
    (1/1) checking for file conflicts [####################################################################################################] 100%
    error: could not open file /home/xav/dev/pacman/foo/var/lib/pacman/local//filesystem-0.8-3/depends: No such file or directory----------------------------] 0%
    (1/1) installing filesystem [####################################################################################################] 100%
Just for reference: http://bugs.archlinux.org/task/7520

--
Roman Kyrylych (Роман Кирилич)
2007/7/5, Roman Kyrylych <roman.kyrylych@gmail.com>:
Just for reference: http://bugs.archlinux.org/task/7520
Hm, that issue seemed like a bigger one, and he said it was working fine using the same database on a normal partition instead of a loopback device, and that running fsck on the loopback device fixed the problem. So, like Dan said, it looked more like a filesystem issue, and it's not clear pacman did anything wrong in that case. But for the error I'm describing, it's clearly pacman which is at fault, and by the way, this error appeared in a bug you reported :) (but for a different issue: wrong permissions on /tmp): http://bugs.archlinux.org/task/7197
2007/7/5, Xavier <shiningxc@gmail.com>:
But for the error I'm describing, it's clearly pacman which is at fault, and by the way, this error appeared in a bug you reported :) (but for a different issue: wrong permissions on /tmp): http://bugs.archlinux.org/task/7197
LOL, I did notice it at that time, but because of another issue I just forgot about it. :-)

--
Roman Kyrylych (Роман Кирилич)